Hacker News

I couldn't quickly find the limits for parsing these .csv files. How many lines can a file have and still be OK?


Things will definitely start slowing down roughly linearly after 100k lines, but we often see files with millions of lines, and most files shouldn't break us as long as they're no larger than ~500MB.

Edit: Fleshed out explanation


Do you try to check a resource with an HTTP HEAD request to ensure it's under 500MB before ingesting?
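To make the question concrete, here is a minimal sketch of the kind of pre-check I mean, using only Python's standard library. The names `within_limit`, `head_size_ok`, and `MAX_BYTES` are made up for illustration, and the 500MB figure is just the rough ceiling mentioned above. It also assumes the server answers HEAD with a Content-Length header, which not every server does, so an unknown size is reported as None rather than guessed.

```python
import urllib.request

# Hypothetical limit: the ~500MB ceiling mentioned in this thread.
MAX_BYTES = 500 * 1024 * 1024


def within_limit(content_length, max_bytes=MAX_BYTES):
    """Interpret a Content-Length header value.

    Returns True or False when the size is known, and None when the
    header is missing or cannot be parsed as a non-negative integer.
    """
    if content_length is None:
        return None
    try:
        size = int(content_length)
    except ValueError:
        return None
    if size < 0:
        return None
    return size <= max_bytes


def head_size_ok(url, max_bytes=MAX_BYTES, timeout=10):
    """Send a HEAD request and check the advertised size against the limit."""
    req = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return within_limit(resp.headers.get("Content-Length"), max_bytes)
```

A None result would still need a fallback, since the header is only what the server advertises and can be absent.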


Nope. Folks who sign up for the API generally have some awareness of what size files work and what don't. And since they're uploading for their users, we're aligned around wanting those users to have a good experience.

We haven't worried about it in this particular implementation around the API because we haven't run across many raw GitHub files that were big. And even when we do get the odd big one, we just refuse to process it once we receive it, so it doesn't hurt us much if someone sends us a few GB.
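For anyone who wants a similar guard, here is a rough sketch of one way to do it in Python. This is not our actual code: `read_capped`, `FileTooLarge`, and `MAX_BYTES` are hypothetical names, and this variant stops reading as soon as the cap is exceeded instead of waiting for the whole file to arrive, which is a slightly different technique from refusing after receipt.

```python
import io

# Hypothetical limit: the ~500MB ceiling mentioned in this thread.
MAX_BYTES = 500 * 1024 * 1024


class FileTooLarge(Exception):
    """Raised when an upload exceeds the configured size cap."""


def read_capped(stream, max_bytes=MAX_BYTES, chunk_size=64 * 1024):
    """Read a binary stream in chunks, refusing it once it passes max_bytes."""
    chunks = []
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break  # end of stream
        total += len(chunk)
        if total > max_bytes:
            # Stop early so an oversized file is never buffered in full.
            raise FileTooLarge(f"more than {max_bytes} bytes received")
        chunks.append(chunk)
    return b"".join(chunks)


# Usage with an in-memory stream standing in for an upload.
small = read_capped(io.BytesIO(b"a,b\n1,2\n"), max_bytes=100)
```

Any file-like object with a `read(n)` method should work here, such as an open file or an HTTP response body.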



