
I'll do extensive testing, but I need to scan a lot of data (aggregations, basically). I'd be comfortable even with an index several times the size of the data if it delivered real-time queries. Have you evaluated anything else?


We also checked MongoDB. We dropped it mainly because the index size was getting too big.

If your data is read-only then Cloudera Impala is worth a try. It's really fast.


I was looking at Impala (Cassandra) as well as keeping an eye on Drill's progress. My data is only written during the ETL stage, so it seems this could be the right way. Lots of testing ahead! Thanks.



