Came here to post this. Fantastic paper, and probably the most comprehensive thing available on the internet about this.
Something else I've taken away from research about columnar and vectorized databases: there doesn't seem to be a good reason why they aren't fit for OLTP workloads.
Analytics from SaaS/line-of-business apps shows about a 90/10 read/write ratio for CRUD apps.
Pavlo et al. have a solid paper on an HTAP database that compares OLTP, OLAP, and a novel HTAP storage on varying read/write workloads:
> "Arulraj, J., Pavlo, A., & Menon, P. (2016). Bridging the Archipelago between Row-Stores and Column-Stores for Hybrid Workloads. Proceedings of the 2016 International Conference on Management of Data - SIGMOD ’16. doi:10.1145/2882903.2915231 "
See "Figure 17: Concurrent Hybrid Workloads – The impact of the storage layout on the query processing time under different concurrent hybrid workloads."
The chart contains 4 workload types, one of which is "• Read-Heavy: 90% scans, 10% inserts"
You can see that the columnar storage model outperforms the OLTP-oriented N-ary (row) storage there.
So my question is -- why don't people use columnar databases for CRUD apps?
For scan-heavy apps, columnstores will be much, much faster. For simple CRUD apps that just want to read/write a few rows at a time (with high concurrency), they have a number of inefficiencies vs. a rowstore (a toy sketch after this list makes the lookup cost concrete):
- Columnstores often don't support indexing at all (or have weak support for it). This means a scan is needed to find any row (with min/max metadata or segment elimination to avoid opening files with no matching rows at all). Even if this scan is very fast, it's still going to use more CPU than an index seek.
- Most columnstores don't use compression schemes that are incremental. To grab a single row, the columnstore likely decompresses many adjacent rows (could be millions of rows, depending on the particular columnstore).
- Most columnstores don't support fine-grained locking (row-level locking). They often lock the entire table or an entire segment (millions of rows?) whenever a single row is written. This damages concurrency.
- The row reconstruction costs are high (gluing the columns back together).
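To make the indexing point concrete, here's a toy sketch in plain Python (not modeled on any real engine; the segment size, data, and function names are made up for illustration). The rowstore side resolves a key with a single seek, while the columnstore side only has per-segment min/max metadata, so it must scan every row in whichever segments survive elimination:

    SEGMENT_SIZE = 65_536
    ids = list(range(1_000_000))  # a sorted integer id column

    # Rowstore-style: an index maps a key straight to its row.
    index = {v: i for i, v in enumerate(ids)}

    def rowstore_lookup(key):
        return index[key]  # one seek: O(1) here, O(log n) for a B-tree

    # Columnstore-style: keep only min/max per segment, then scan the
    # segments whose range could contain the key.
    segments = [ids[i:i + SEGMENT_SIZE] for i in range(0, len(ids), SEGMENT_SIZE)]
    zone_map = [(seg[0], seg[-1]) for seg in segments]  # per-segment (min, max)

    def columnstore_lookup(key):
        for seg_no, (lo, hi) in enumerate(zone_map):
            if lo <= key <= hi:  # segment elimination via min/max
                for offset, value in enumerate(segments[seg_no]):
                    if value == key:
                        return seg_no * SEGMENT_SIZE + offset
        return None

    # Both find the same row, but the columnstore burns CPU on the scan.
    assert rowstore_lookup(123_456) == columnstore_lookup(123_456) == 123_456

Even with perfect elimination, the columnstore side touches up to SEGMENT_SIZE values to return one row, which is the CPU overhead the first bullet describes.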
We have a SingleStoreDB paper in this year's SIGMOD that goes into these implementation details (not publicly available just yet). The columnstore in SingleStoreDB is reasonably good at OLTP without sacrificing its OLAP performance (see some benchmarking results here: https://www.singlestore.com/blog/tpc-benchmarking-results/).
Also, I just noticed that you used the TPC-C benchmark here.
Have you considered re-benchmarking with TPC-E? It's the updated version of the OLTP test that more accurately represents these sorts of apps:
> "In February 2007, the new TPC-E benchmark [7] became a TPC standard. It is designed to be a more realistic OLTP benchmark than TPC-C, e.g., incorporating realistic data skews and referential integrity constraints."
> "We find that (i) TPC-E is more read intensive with a 9.7:1 I/O read to write ratio, while TPC-C sees a 1.9:1 read-to-write ratio; and (ii) although TPC-E uses pseudo-realistic data, TPC-E’s I/O access pattern is as random as TPC-C."
It's the difference between roughly a 10:1 read/write ratio (about 91% reads) and a 2:1 ratio (about 66% reads). FWIW, I've never worked on a line-of-business/SaaS app with a ratio lower than 80% reads.
Yeah, TPC-E is a better (more advanced) OLTP benchmark. TPC-C is trivially scalable by sharding on warehouse id everywhere. The problem with TPC-E is that not many companies publish (official or unofficial) results for it, so it's not as useful when comparing systems, which is what we were after in our blog post.
> The problem with TPC-E is that not many companies publish (official or unofficial) results for it, so it's not as useful when comparing systems, which is what we were after in our blog post.
Really interesting, going to have a look at that paper -- thanks for sharing your insight
Let me ask this: if you have something like a GraphQL API, which often does sparse column selection from multiple different tables, would that also be a good fit for a columnar database?
In most GraphQL queries, you're grabbing a portion of the fields from one or more tables, i.e. something like:
    query JoesCompletedTodos {
      users(where: { name: { _eq: "Joe" } }) {
        id
        name
        todos(where: { completed: { _eq: true } }) {
          text
        }
      }
    }
This will get translated to something along the lines of:
    SELECT users.id, users.name, json_arrayagg(todos)
    FROM users
    INNER JOIN todos ON todos.user_id = users.id
    WHERE users.name = 'Joe' AND todos.completed = true
    GROUP BY users.id, users.name
Many times you'll see queries spanning 3-4 relation levels deep, plucking something like 2-6 columns from each table.
Curious how well a columnar DB would do with something like this?
Also, on this point:
> "Most columnstores don't use compression schemes that are incremental. To grab a single row the columnstore likely decompresses many adjacent rows (could be millions of rows - depends on the particular columnstore)."
I think this is something that could be avoided if the data were in, e.g., Arrow, and you used Arrow Flight/FlightSQL as the transport mechanism, right? But this isn't my area of expertise, for sure.
> WHERE users.name = 'Joe' AND todos.completed = true
Queries with filters like this will have most of the problems I mentioned above unless "many" rows match the filter.
Lack of ability to seek to specific rows in the data hurts.
RE: Arrow... I haven't read too much about the compression scheme(s) it uses. If it only does dictionary compression, that's reasonably easy to make incremental, so it's possible Arrow doesn't have that specific problem.
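For what it's worth, here's a minimal pyarrow sketch (assuming the pyarrow package is available; this demonstrates only the in-memory dictionary encoding, not Flight/FlightSQL as a transport) of why dictionary-encoded data keeps single-element reads cheap: each row holds an integer code into a shared dictionary, so fetching one value doesn't require decoding a block of neighbors.

    import pyarrow as pa

    # Dictionary-encode a string column: Arrow stores one integer code per
    # row plus a shared dictionary of the distinct values.
    statuses = pa.array(["done", "pending", "done", "pending", "done"])
    encoded = statuses.dictionary_encode()

    print(encoded.type)        # dictionary<values=string, indices=int32, ...>
    print(encoded.dictionary)  # the shared values: ["done", "pending"]
    print(encoded.indices)     # per-row codes: [0, 1, 0, 1, 0]
    print(encoded[3])          # single-element random access, no block decode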
Making one like that (with a hybrid model like PAX), along with other things I consider important for improving the situation (algebraic type support, a fully relational model, etc.), would be desirable, but funding it is another thing. Building a database is a major effort, and it's not easy to get proper support for it.
But I'd kill for the chance, and I think the market is ready to go beyond the current crop of NoSQL/NewSQL...