simlevesque 1 day ago | on: Why DuckDB is my first choice for data processing

But when indexing your JSON or CSV, if you have, say, 10 columns, each column is stored separately on disk instead of all interleaved together. So a scan for one column only needs to read about a tenth of the disk space used for the data. Obviously this depends on the columns' content.
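
To make the "a tenth of the disk space" point concrete, here is a minimal sketch in plain Python. It is not how DuckDB actually stores data; it assumes a hypothetical table of 10 fixed-width 64-bit integer columns, packs it once row by row and once column by column, and counts how many bytes a single-column scan has to touch in each layout. The names `row_store`, `col_store`, `scan_row_store` and `scan_col_store` are made up for illustration.

```python
import struct

NUM_ROWS = 1000
NUM_COLS = 10   # assumption: 10 columns, all fixed-width int64
FIELD_SIZE = 8  # bytes per int64 value

# table[r][c] is the value in row r, column c (synthetic data).
table = [[r * NUM_COLS + c for c in range(NUM_COLS)] for r in range(NUM_ROWS)]

# Row-oriented layout: all fields of one row sit next to each other,
# which is roughly how a CSV or JSON-lines file is laid out.
row_store = b"".join(struct.pack(f"<{NUM_COLS}q", *row) for row in table)

# Column-oriented layout: all values of one column sit next to each other.
col_store = [
    struct.pack(f"<{NUM_ROWS}q", *(row[c] for row in table))
    for c in range(NUM_COLS)
]

def scan_row_store(col):
    """Sum one column from the row layout; every row is read in full."""
    row_size = NUM_COLS * FIELD_SIZE
    total = 0
    bytes_read = 0
    for offset in range(0, len(row_store), row_size):
        row = struct.unpack_from(f"<{NUM_COLS}q", row_store, offset)
        bytes_read += row_size
        total += row[col]
    return total, bytes_read

def scan_col_store(col):
    """Sum one column from the column layout; only that column is read."""
    data = col_store[col]
    values = struct.unpack(f"<{NUM_ROWS}q", data)
    return sum(values), len(data)

row_total, row_bytes = scan_row_store(3)
col_total, col_bytes = scan_col_store(3)
print(row_total == col_total, row_bytes, col_bytes)
```

Both scans return the same sum, but the row layout touches 10 times as many bytes. With columns of unequal width (say, one long text column next to nine small integers) the ratio would differ, which is the "depends on the columns' content" caveat.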

gdulli 1 day ago

But you can have a surprisingly large amount of data before the inefficiency you're talking about becomes untenable.
