1. Scale. A 10 GB file has a few million 4 kB blocks; a 10 GB database may have a billion records. Worse, moving a block costs a read (typically with a seek) and a write, and moving a record costs the same, because the disk only does I/O at block granularity (this could change with flash memory). Also, part of the strategy for fighting fragmentation is to grow a file by a cluster of several blocks at a time, which wastes quite a bit of space if the file stays small. That would be too much overhead for a database with small records. (Back-of-envelope arithmetic after point 2.)

2. MVCC is similar to a log-structured file system, and I don't think fragmentation is a solved issue there. Certainly, Wikipedia doesn't suggest it is (https://sarwiki.informatik.hu-berlin.de/Log-Structured_Files..., https://en.m.wikipedia.org/wiki/Log-structured_File_System_(...). Reading the LFS page makes me think somebody should implement a generational garbage collector for it; a toy sketch of what I mean is below.
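
For point 1, the back-of-envelope arithmetic as a quick Python snippet; the 4 kB block and 10-byte record sizes are illustrative assumptions, not numbers from the article:

    # Back-of-envelope numbers for point 1; block and record sizes are
    # illustrative assumptions.
    FILE_SIZE = 10 * 10**9    # 10 GB
    BLOCK_SIZE = 4 * 1024     # 4 kB filesystem block
    RECORD_SIZE = 10          # a small database record

    print(f"{FILE_SIZE // BLOCK_SIZE:,} blocks")    # 2,441,406: "a few million"
    print(f"{FILE_SIZE // RECORD_SIZE:,} records")  # 1,000,000,000: "a billion"

    # Moving either one costs a full block read + write, since disks only do
    # block-sized I/O, so per useful byte moved a record is ~400x worse.
    print(f"amplification per record move: ~{BLOCK_SIZE / RECORD_SIZE:.0f}x")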
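
And for point 2, a toy sketch of what I mean by a generational collector for an LFS: new writes land in a young generation that is cleaned aggressively, and blocks that survive a cleaning pass get promoted to an old generation that is cleaned rarely. The two generations, the promote-on-survive policy, and the utilization threshold are all my own guesses, not anything from the LFS papers:

    # Toy generational cleaner for a log-structured store (my own guesses
    # at the policy, not anything from the LFS page).

    class Segment:
        def __init__(self, gen: str):
            self.gen = gen                # "young" or "old"
            self.live: set[int] = set()   # block ids still live in this segment
            self.written = 0              # blocks ever appended here

    class Log:
        SEG_BLOCKS = 256                  # segment capacity, in blocks

        def __init__(self):
            self.segments: list[Segment] = []
            self.where: dict[int, Segment] = {}  # block id -> segment holding it
            self.head = {g: self._new_segment(g) for g in ("young", "old")}

        def _new_segment(self, gen: str) -> Segment:
            seg = Segment(gen)
            self.segments.append(seg)
            return seg

        def write(self, block_id: int, gen: str = "young") -> None:
            # Log-structured: an overwrite just kills the old copy where it
            # sits and appends the new copy at the head of the generation.
            prev = self.where.get(block_id)
            if prev is not None:
                prev.live.discard(block_id)
            head = self.head[gen]
            if head.written == self.SEG_BLOCKS:
                head = self.head[gen] = self._new_segment(gen)
            head.live.add(block_id)
            head.written += 1
            self.where[block_id] = head

        def clean_young(self, max_utilization: float = 0.5) -> None:
            # The generational bet: a block that survives a young-generation
            # clean is probably long-lived, so promote it to "old", which we
            # clean far more rarely.
            for seg in list(self.segments):
                if seg.gen != "young" or seg is self.head["young"]:
                    continue
                if len(seg.live) / self.SEG_BLOCKS <= max_utilization:
                    for block_id in list(seg.live):
                        self.write(block_id, gen="old")  # copy survivor out
                    self.segments.remove(seg)            # segment is free again

    log = Log()
    for i in range(100_000):
        log.write(i % 1_000)   # hot working set: most copies die young
    log.clean_young()
    print(f"{len(log.segments)} segments still in use")

Most copies in the young generation are dead by the time it's cleaned, so cleaning it is cheap, while the old generation stays densely live and rarely needs touching, which is exactly the property a generational GC exploits.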


