Hacker News | MilkMp's comments

Hi there, yes I used AI to help build this website. I personally don't have the time nor the talent to build something like this from scratch! I do have knowledge of how historical and crime data are supposed to be parsed, viewed, analyzed, and presented to the world :) If someone would like to take this work and improve it, please do!

Was originally just supposed to be a data archive/download place for the parsed data. Thought a website could help! Will look into the standards.

Accessibility matters.

In 2026, tools like WAVE, Lighthouse, and a real screen reader should be part of any website design process. They catch issues early. A stitch in time saves nine.

I know you may not be a designer. That’s fine. Starting with a solid, off-the-shelf CSS framework can get you much closer to Web Content Accessibility Guidelines (WCAG) compliance from day one. It sets a baseline so you’re not reinventing solved problems.

Building from scratch is absolutely valid. It’s cool, even. But right now it reads less like an intentional design choice and more like missing fundamentals.

I’m not trying to be a dick; the project has potential! A few design improvements would make it usable for a lot more people.

Cheers!


Thanks! I am definitely not a front-end web designer lol, and I for sure don't want to limit people's access. I will look into the standards and see how best to implement them into the website :)

Thanks! Will look into it

Yeah. Please don't. This is such a breath of fresh air. Dense data should be presented like a book, not a pamphlet-like hyperlinked website.

I agree. I love the current design. Personally, I think it's just about perfect.

Hey there, yeah, definitely. I maintain .txt change logs for all data modifications. To be clear, no information is added or altered — the Factbook content is exactly what the CIA published. The parsing process structures the raw text into fields (removing formatting artifacts, sectioning headers, and deduplicating noise lines), but the actual data values are untouched. What I've added on top are lookup tables that map the CIA's FIPS 10-4 codes to ISO Alpha-2/3 and a unified MasterCountryID, so the different code systems can be joined and queried together.
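For anyone curious how such a lookup table enables cross-code joins, here is a minimal sketch using SQLite. The table and column names are hypothetical illustrations, not the project's actual schema (FIPS 10-4 really does use 'GM' for Germany, where ISO uses 'DE'/'DEU', which is exactly why a mapping table is needed):

```python
import sqlite3

# Hypothetical schema sketch: a lookup table mapping CIA FIPS 10-4 codes
# to ISO 3166-1 alpha-2/3 codes plus a unified MasterCountryID, so
# Factbook rows keyed by FIPS can be joined with ISO-keyed datasets.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE country_codes (
    master_country_id INTEGER PRIMARY KEY,
    fips_10_4 TEXT,      -- CIA code, e.g. 'GM' for Germany
    iso_alpha2 TEXT,     -- ISO 3166-1 alpha-2, e.g. 'DE'
    iso_alpha3 TEXT      -- ISO 3166-1 alpha-3, e.g. 'DEU'
);
CREATE TABLE factbook (
    fips_10_4 TEXT,
    year INTEGER,
    population INTEGER   -- illustrative value only
);
INSERT INTO country_codes VALUES (1, 'GM', 'DE', 'DEU');
INSERT INTO factbook VALUES ('GM', 2020, 80159662);
""")

# Query Factbook data by ISO code via the lookup table.
row = conn.execute("""
    SELECT c.iso_alpha3, f.year, f.population
    FROM factbook f
    JOIN country_codes c ON c.fips_10_4 = f.fips_10_4
    WHERE c.iso_alpha2 = 'DE'
""").fetchone()
print(row)  # ('DEU', 2020, 80159662)
```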

I will add them to the GitHub :)


Awesome. Thanks so much!

Will check out! Thank you!

Hi there, thanks for linking this! My GitHub and website both link to and use this source! I just thought putting it in a SQL database and making the entire 1990-2025 queryable was needed since I couldn't find one anywhere :)

It is a lot of fun and rewarding to do this! I've done it several times for medium-sized datasets, like Wikipedia dumps and the entire geospatial dataset, map-reduced into pgsql. The Wikipedia one was great: I had it set up to query things like "show me all ammunition manufactured after 1950 that is between .30 and .40" and it could return results nearly instantly. The Wikimedia dumps keep the infoboxes and relations intact, so you can do queries like this easily.
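The kind of query described above could look something like this once infobox fields are in a database. This is a hedged, invented sketch (table names, columns, and sample rows are all illustrative), not the commenter's actual pipeline:

```python
import sqlite3

# Hypothetical table of parsed Wikipedia infobox fields for cartridges.
# All names and the exact schema are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE infobox_cartridge (
    page_title TEXT,
    year_introduced INTEGER,
    bullet_diameter_in REAL
)""")
conn.executemany(
    "INSERT INTO infobox_cartridge VALUES (?, ?, ?)",
    [(".308 Winchester", 1952, 0.308),
     (".45 ACP", 1904, 0.452),
     ("9x19mm Parabellum", 1902, 0.355)],
)

# "All ammunition introduced after 1950 between .30 and .40 caliber"
rows = conn.execute("""
    SELECT page_title FROM infobox_cartridge
    WHERE year_introduced > 1950
      AND bullet_diameter_in BETWEEN 0.30 AND 0.40
""").fetchall()
print(rows)  # [('.308 Winchester',)]
```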

Do you have a write-up of this somewhere? When I last looked at the Wikipedia dumps, they looked like a mess to parse. How were you getting structured information?

You'd presumably have to run some part of the transclusion pipeline to properly handle template/module/page transclusion.

Unfortunately, I consider it proprietary.

Hi there! If you have anything you want added to the site, just let me know :) I can definitely try.

Ohh, that is a great idea! And we already have the political field in SQL! I will start working on some of this and update the website this week. Thank you for the awesome suggestions!

Thanks so much!

Found the problem: the total regex doesn't handle magnitude suffixes.

2018: total: 17,856,024 → parses as 17856024 (correct raw count)
2020: total: 18.17 million → parses as 18.17 (WRONG - drops "million")
2025: total: 39.3 million → parses as 39.3 (WRONG)

So the chart jumps from ~18 million down to ~18, making it wrong. The fix is to handle "million/billion/trillion" after the total.
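A fix along these lines would work. This is a hedged sketch, not the site's actual code; the function and pattern names are invented:

```python
import re

# Multipliers for the magnitude suffixes the Factbook text uses.
MULTIPLIERS = {
    "million": 1_000_000,
    "billion": 1_000_000_000,
    "trillion": 1_000_000_000_000,
}

# Match "total: <number>" with an optional magnitude suffix.
TOTAL_RE = re.compile(
    r"total:\s*([\d,.]+)\s*(million|billion|trillion)?",
    re.IGNORECASE,
)

def parse_total(text):
    """Parse a 'total:' value, scaling by any magnitude suffix."""
    m = TOTAL_RE.search(text)
    if not m:
        return None
    value = float(m.group(1).replace(",", ""))
    suffix = m.group(2)
    if suffix:
        value *= MULTIPLIERS[suffix.lower()]
    return round(value)  # round, not int(): avoids float truncation

print(parse_total("total: 17,856,024"))   # 17856024
print(parse_total("total: 18.17 million"))  # 18170000
print(parse_total("total: 39.3 million"))   # 39300000
```

Using `round()` rather than `int()` matters here: `39.3 * 1_000_000` can land a hair below the exact value in floating point, and `int()` would truncate it to 39299999.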

Just deployed a new bug fix.

Thanks for bringing this to my attention!


