Ask HN: Any HN stats analysis been done?
22 points by lifeisstillgood on Aug 12, 2015 | 5 comments
I was just thinking of some questions about HN discussions, such as: what does the average discussion look like? How long a gap between replies is common or polite? (I seem to leave the other end hanging for hours.)

I don't particularly care about the best time to post (for me, that's whenever I am struck by something), but there are many interesting questions one could ask of the corpus.

(Which leads to the other question - is there a good way to get hold of the corpus?)



There's the main Hacker News API [1] via firebase.com, and there's also the Algolia HN Search API [2]. Over the years I've seen quite a few collections of data [3, 4, 5, 6], but I don't know how complete they are or whether they've been maintained.

[1] https://github.com/HackerNews/API

[2] https://github.com/algolia/hn-search

[3] https://archive.org/details/HackerNewsStoriesAndCommentsDump

[4] https://ia902503.us.archive.org/33/items/HackerNewsStoriesAn...

[5] http://shitalshah.com/p/downloading-all-of-hacker-news-posts...

[6] https://news.ycombinator.com/item?id=7835605
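
For the reply-timing question, here is a minimal sketch of one way to approach it with the official API [1]. It assumes the documented item endpoint and the `time`, `kids`, `type`, and `deleted` item fields. The helper names `fetch_item` and `reply_gaps` are made up for illustration, and skipping deleted comments (along with their replies) is a simplification, not something the API requires.

```python
import json
from urllib.request import urlopen

# Item endpoint of the official Firebase-hosted HN API (see [1]).
ITEM_URL = "https://hacker-news.firebaseio.com/v0/item/{}.json"


def fetch_item(item_id):
    """Fetch one story or comment as a dict (one network request per item)."""
    with urlopen(ITEM_URL.format(item_id)) as resp:
        return json.load(resp)


def reply_gaps(root_id, get_item=fetch_item):
    """Return the delay in seconds between each comment and each direct reply.

    Walks the tree under root_id depth-first. Only comment-to-comment
    gaps are counted, so the delay between a story and its top-level
    comments is ignored. get_item is injectable (hypothetical design
    choice) so the walk can run against a local dump instead of the API.
    """
    gaps = []
    stack = [get_item(root_id)]
    while stack:
        parent = stack.pop()
        if not parent:  # the API returns null for unknown ids
            continue
        for kid_id in parent.get("kids", []):
            kid = get_item(kid_id)
            # Simplification: skip missing or deleted replies entirely.
            if not kid or kid.get("deleted") or "time" not in kid:
                continue
            if parent.get("type") == "comment" and "time" in parent:
                gaps.append(kid["time"] - parent["time"])
            stack.append(kid)
    return gaps
```

Something like `statistics.median(reply_gaps(7835605))` would then give a typical gap for one thread. It makes one request per comment, so for corpus-wide questions one of the dumps above is probably a better starting point than the live API.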


Last week there was a dump of the 10M comments and posts so far: https://news.ycombinator.com/item?id=10002791


I particularly like Andrej Karpathy's analysis: https://cs.stanford.edu/people/karpathy/hn_analysis.html


I have a repository of scripts for downloading all Hacker News data using Python and storing it in PostgreSQL: https://github.com/minimaxir/get-all-hacker-news-submissions...




