Show HN: Tracking the FISA Court in real time with a bit of code (twitter.com/fisacourt)
82 points by konklone on July 14, 2013 | 17 comments


Background: the FISA Court created a public docket only last month. It's at http://www.uscourts.gov/uscourts/courts/fisc/index.html, and it is a tiny flat page with links to scanned PDFs. They clearly update it by hand whenever something becomes public.

I wrote a Ruby script to watch the page every 5 minutes. When there's a change, it texts me, emails me, and tweets as @FISACourt with a link to a diff of the changes.
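For anyone curious what the change-detection step might look like, here is a minimal sketch. It is not the code from the repository; the helper names (`fingerprint`, `added_lines`, `check`) are made up for illustration, and the "diff" is just a naive list of added lines.

```ruby
require "digest"

# Hypothetical helper: fingerprint a page body so two fetches can be compared.
def fingerprint(body)
  Digest::SHA256.hexdigest(body)
end

# Returns the lines present in new_body but not in old_body.
# A naive stand-in for a real diff, enough to show what was added.
def added_lines(old_body, new_body)
  new_body.lines.map(&:strip) - old_body.lines.map(&:strip)
end

# One polling step: compare the latest body against the previous one.
# Returns nil when nothing changed, otherwise the (possibly empty)
# list of added lines.
def check(previous_body, current_body)
  return nil if fingerprint(previous_body) == fingerprint(current_body)
  added_lines(previous_body, current_body)
end
```

In a real watcher, `current_body` would come from an HTTP fetch of the docket page inside a loop that sleeps 300 seconds between checks, and a non-nil result would trigger the text, email, and tweet.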

When I get notified, I read the new documents and follow up the automatic tweet with a hand-written one that explains the update, usually within just a few minutes of the posting.

Simple, but it breaks news faster than the blogs and papers do. The code is here: https://github.com/konklone/fisa

And I have some further explanation on my blog: http://konklone.com/post/following-the-fisa-court-the-advanc...


If you aren't archiving every newly linked document, I'd like to suggest you do so. You never know when a site like that is going to publish something that they later retract, maybe even within minutes. It's good to have a copy for cases like that.
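A rough sketch of how the "which documents are new" part could work, assuming the page links directly to PDFs. The helper names and the regex are illustrative assumptions, and a real script would probably use an HTML parser instead.

```ruby
require "set"

# Hypothetical helper: pull every href ending in .pdf out of an HTML string.
# The regex is a simplification and will miss unusual markup.
def pdf_links(html)
  html.scan(/href=["']([^"']+\.pdf)["']/i).flatten.uniq
end

# Returns the PDF links on the page that are not yet in the archive set.
def unarchived(html, archived)
  pdf_links(html).reject { |link| archived.include?(link) }
end
```

Each link returned by `unarchived` would then be downloaded and saved with a timestamp before being added to the set, so a later retraction still leaves a copy behind.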


Good call. I'll do that.


Very cool! I will follow @FISACourt on Twitter after a suitable amount of time has elapsed so that this account and my Twitter account are less likely to be chronologically collated.

Thanks for your public service!


I'm following them right now. If you're afraid to do what is perfectly legal because you may come to NSA attention, they won.


Thanks for doing this! I don't know what to expect, but it's nice to have the cameras rolling.


I second the previous comment. Thank you for your contributions.



Awesome! You might consider removing the years before 1978 on the FISA Court page though:

http://www.plainsite.org/courts/index.html?id=223


I guess we need to look for the buzzword "push" rather than "realtime" nowadays. At least, that's closer to how I use "realtime", as opposed to mechanical polling every 5 minutes.

"Sure, realtime-nazi, but what would you suggest?"

Personally, I feel a delay of up to 30 seconds is acceptable, but I go more by the underlying method of delivery (push or long polling for changes vs. polling and comparing for changes). That is close to the definition here: https://en.wikipedia.org/wiki/Real-time_web

> receive information as soon as it is published by its authors, rather than requiring that they or their software check a source periodically

For a consensus, see http://stackoverflow.com/tags/real-time/info and/or http://stackoverflow.com/a/5286985

Or am I wrong? What is a better label for what Wikipedia defines as the real-time web?


This is why I put the crucial space between "real" and "time". I cannot hold up to intensive buzzword scrutiny. :)

It's every 5 minutes because I don't want them to ban me. And in practice, this is an acceptable delay.


To clarify, I wasn't critiquing your approach, just the phrasing.

What you're doing is how it should be done for sources you don't have direct access to.

I'd use 10 minutes though, as http://www.uscourts.gov/robots.txt points to a crawl delay of 10. Whether it's checked is the risk you take.


Right, now I remember - just double-checked, and Crawl-delay refers to seconds, not minutes.

http://en.wikipedia.org/wiki/Robots_exclusion_standard#Crawl...
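For reference, a small sketch of reading that directive out of a robots.txt body and treating the value as seconds. The helper name is made up, and it deliberately ignores which User-agent group the line belongs to, so treat it as an illustration only.

```ruby
# Hypothetical helper: return the first Crawl-delay value (in seconds)
# found in a robots.txt body, or nil if there is none.
# Simplification: does not check which User-agent group the line is in.
def crawl_delay(robots_txt)
  robots_txt.each_line do |line|
    match = line.match(/\A\s*crawl-delay\s*:\s*(\d+(?:\.\d+)?)/i)
    return match[1].to_f if match
  end
  nil
end
```

With a value of 10 (seconds), a 5-minute polling interval of 300 seconds is comfortably above the requested delay.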


Oh, my bad. Sorry. Thanks for correcting me.


I never thought to check that! Thanks, I may adjust my rate.


Great idea! Watching the watchers is a valuable public service. Cheers!


This is a smart idea. All the power to you.



