I’ve also had some pretty clear successes moving a ~2TB Postgres database into a 70GB ClickHouse instance (though in this case a good 2/3 of the problem was schema design on the PG side; there was still another order of magnitude of compression to be had from ClickHouse).
Queries went from a 12 hour table scan to ~2 minutes, and production queries from 45 sec to 400ms.
This is without doing anything particularly tricky on the ClickHouse side, nor even putting it on a particularly large machine. When benchmarking it, I put it on a 10-year-old laptop and was still seeing production queries in the seconds range.
I also had excellent results when I experimented with ClickHouse (in my case, the table with the most rows I ever created had 6 columns using a total of ~55 bytes per row; after inserting ~35 billion rows the table was 450MiB on disk). My use cases involving big aggregations and mini-searches (searching/retrieving just a few rows) worked out excellently (the big aggregations) and well (the small searches: not as fast as a normal row-oriented DB, but certainly acceptable for me).
For whoever is interested in trying it out: be aware that you really should understand how "parts" and "merges" work, and your write workload should take that into consideration (for example, dynamically lower the write rate when merges are running, account for the I/O needed to delete pending old parts, etc.). If you don't, you're very likely to get bad performance if your workload is write-heavy. The number of columns and the sort order are extremely important as well.
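To make the "parts"/"merges" point concrete, here is a rough sketch (the table name and the idea of polling before writing are mine, not an official ClickHouse recommendation) of the system-table queries a write scheduler could use to decide when to back off:

    -- Merges currently in flight; a writer could pause or shrink its batches
    -- while this count is high or progress is slow.
    SELECT database, table, elapsed, progress, num_parts
    FROM system.merges;

    -- Active part count per partition for a hypothetical "metrics" table;
    -- lots of small parts usually means inserts are too frequent or too small.
    SELECT partition, count() AS active_parts
    FROM system.parts
    WHERE table = 'metrics' AND active
    GROUP BY partition
    ORDER BY active_parts DESC;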
Last year I decided to start using ClickHouse to host collectd's metrics ( https://www.segfault.digital/how-to-directly-use-the-clickho... ) and so far it's working well. I upgraded the CH software twice, and once killed its VM by mistake, but afterwards I never had problems ("so far", hehe).
I have one column that's an incrementing integer per partition, and for a million rows the total storage for that field is ~20k. That's a savings of something like 10% over the whole dataset.
> That's a savings of something like 10% over the whole dataset.
Sorry, I'm probably misunderstanding you - I'm not sure about that "10%": what does it refer to?
Anyway, one thing I really like in ClickHouse is being able to chain codecs (I personally like to think of the Delta/DoubleDelta/Gorilla/T64 codecs as "encodings" and the general-purpose LZ4/LZ4HC/ZSTD codecs as "compression codecs").
I don't have a good math background, so for Delta & DoubleDelta I liked this explanation ( https://altinity.com/blog/2019/7/new-encodings-to-improve-cl... ), which describes "delta" as tracking the "distance" and "doubledelta" as tracking the "acceleration" between consecutive column values.
In the end I tested, using relatively small datasets (which differed between use cases), all combinations of encodings (Delta/DoubleDelta/Gorilla/T64) and ZSTD compression levels (mostly 1/3/9); I ignored LZ4*. My takeaways:
- "Delta"&"DoubleDelta" were often interesting (but in general for my data using "Delta"+ZSTD was already good enough compared to the rest).
- "Gorilla" somehow never gave me any benefits if compared to other codecs and/or compression algos.
- "T64" is a bit a mistery for me, anyway in some tests it delivered excellent results compared to the other combinations, therefore I'm currently using just T64 for some columns, and for some other columns as T64+ZSTD(9).
EDIT: sorry, I think I got it - you probably meant something like "just by doing that on that specific column, the overall storage needs were reduced by 10% for the whole dataset", right?
The system tables are incredibly helpful for optimizing ClickHouse performance. The system.columns table gives exact compression ratios for every column and system.parts shows similar stats for table parts. It's possible to tune schema very effectively without a lot of deep knowledge of how codecs and table partitioning work internally.
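For example, something along these lines (just a sketch; 'metrics_example' is a placeholder table name) gives a per-column compression overview:

    -- Compressed vs. uncompressed bytes and the resulting ratio, per column
    SELECT
        name,
        compression_codec,
        formatReadableSize(data_compressed_bytes)   AS compressed,
        formatReadableSize(data_uncompressed_bytes) AS uncompressed,
        round(data_uncompressed_bytes / data_compressed_bytes, 2) AS ratio
    FROM system.columns
    WHERE database = currentDatabase() AND table = 'metrics_example'
    ORDER BY data_compressed_bytes DESC;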
In addition to what OP mentions in the sibling comment, ClickHouse supports a whole bunch of encodings that take advantage of the column-based layout to further compress columns in very clever ways.
Parts may have been, but it was ~2B rows at ~1K/row in PG, with a couple of embedded JSON fields (which were probably compressed, but... that's coming from a bad starting point anyway).
It got converted to a table with ~25 Float32 columns + a couple of key columns.
> Queries went from a 12 hour table scan to ~2 minutes, and production queries from 45 sec to 400ms.
Hmm, why do you have such large tables and then do full scans?
Edit: and if it's analytics, why were you using a row store instead of a columnar database? It's like saying a Ford Mustang was 10x faster than a rusty John Deere combine.
> and if it's analytics why were you using a row store instead of a columnar database?
This point confuses me: the parent is telling us how they went from a row-based DB (Postgres) to a column-based one (ClickHouse), and you're asking them why they didn't?
There are many, many ways to avoid a full scan: fixing a dumb query, proper indexing (partial indexes if a full index is too expensive), materialized views, and several more. It's very common that developers either resist or simply aren't aware of how to use a database. A full scan is dumb: it reads all the data in this big table without any filtering or even ordering. This is highly suspicious.
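To illustrate the kind of thing I mean (plain Postgres, with a hypothetical 'events' table and columns), a partial index plus a materialized view can often replace a recurring full scan:

    -- Index only the slice of rows the hot query actually touches
    CREATE INDEX events_failed_recent_idx
        ON events (created_at)
        WHERE status = 'failed';

    -- Precompute an expensive aggregate instead of rescanning the table
    CREATE MATERIALIZED VIEW daily_event_counts AS
    SELECT date_trunc('day', created_at) AS day,
           status,
           count(*) AS n
    FROM events
    GROUP BY 1, 2;

    -- Refresh on whatever schedule the dashboards can tolerate
    REFRESH MATERIALIZED VIEW daily_event_counts;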
And in the very, very rare case it is absolutely necessary to do a full scan, then Postgres might not be a good idea. Why is it being bashed? I read "Postgres was 12h slow" not "we picked the wrong tool and of course Postgres was slow". Don't blame the hammer.
IME, it's common for programmers not to be educated on what modern databases can do. This was made worse by all the NoSQL mania, culminating in "MongoDB is web scale". We are still dealing with the fallout of that.
FWIW, this was a project I came into at the deployment stage, and the original requirements were for something about 10x smaller. Budget gone. Shit happens.
There were many places where the right choices weren't made, and it could have been done better. But even with schema optimizations speeding things up by a factor of 3, we would have been looking at something that still didn't hit the performance needs. I'm guessing that doing the reorg on the full DB would have taken O(week). Effective indexes are great, but when one optimized index is tens of GB and takes a day to create, you start to look elsewhere.
Data is the size it happens to be for many businesses. This could easily have been a large user view-history table or a similar online use case that folks happened to need analytic capabilities on.
It would be interesting to learn whether other options were considered. I would definitely have tried Presto (or AWS Athena) querying ORC data, which would amount to columnar storage with a sparse index, just like ClickHouse.
I’ve used Presto and Athena and neither could really hold a candle to ClickHouse.
Presto was handy, but I’d personally much rather go to the effort of getting things into a proper columnar database than put up with the quirks of the likes of Presto.
If you want, there’s a detailed analysis of various DBs’ performance on various tasks here:
I have evaluated these different options, and by far I prefer ClickHouse when we're talking about less than a petabyte of data. It's super simple to install and set up, and it's a lot faster.
I am currently building an internal analytics platform for web event data on top of CH, and I recommend it too.
I am migrating a lot of complex Scala queries to CH SQL and I am surprised at everything I can do: a very deep and rich API backed by great performance. ClickHouse is a bit more operationally involved than a solution like BigQuery, but at a fraction of the cost for the performance you get.
(Mux co-founder here) The Postgres system described as the predecessor to ClickHouse here was actually cstore_fdw with a pretty heavily customized (and outdated) CitusDB. CitusDB is a great system for sharding and distributing Postgres, but we found that compression and query performance for this specific analytics use case were much better served by ClickHouse.
The only place Citus is mentioned in the post is in the title of a chart. It might be worth updating the post to include this; it would also clarify note #3:
"ClickHouse is still faster than the sharded Postgres setup at retrieving a single row, despite being column-oriented and using sparse indices. I'll describe how we optimized this query in a moment."
To be transparent, we included Citus in original versions of the post, but decided to take it out because it didn't feel like it was a fair representation of Citus™. As Adam mentioned, what we were using was heavily customized and based on a pretty outdated fork by the time we transitioned.
This should be clarified in the post, otherwise you lead the reader to believe you're representing stock Postgres performance instead of the performance of a forked Citus. This is important information for an informed reader.
ClickHouse has to be one of my favourite databases and pieces of technology.
It’s stupid fast, and I know it’s capable of workload sizes well past where I’ve pushed it so far. I’ve used it in anger, and even then I was able to solve the issues I had easily. The out-of-the-box integrations (like the Kafka one) have made my life so much easier. I get excited whenever I get to use the instance I manage at work, or when I demo something about it to someone, because I know I’ll often blow their minds a bit.
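For anyone curious about the Kafka integration, it roughly boils down to a Kafka engine table plus a materialized view pumping rows into a MergeTree table; the broker, topic and column names below are made up for the sketch:

    -- Consumes JSON messages from a hypothetical 'events' topic
    CREATE TABLE events_queue
    (
        ts      DateTime,
        user_id UInt64,
        action  String
    )
    ENGINE = Kafka
    SETTINGS kafka_broker_list = 'kafka:9092',
             kafka_topic_list  = 'events',
             kafka_group_name  = 'clickhouse_events',
             kafka_format      = 'JSONEachRow';

    -- Destination table for long-term storage
    CREATE TABLE events
    (
        ts      DateTime,
        user_id UInt64,
        action  String
    )
    ENGINE = MergeTree()
    PARTITION BY toYYYYMM(ts)
    ORDER BY (user_id, ts);

    -- The materialized view continuously moves rows from the queue into storage
    CREATE MATERIALIZED VIEW events_mv TO events AS
    SELECT ts, user_id, action FROM events_queue;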
Controversial question: can you trust ClickHouse? After the JetBrains hack I'm starting to wonder whether I should question any software coming from Eastern Europe (even if the org is legit).
AFAIK there was no evidence that the SolarWinds fiasco was because of JetBrains [0]. Also, ClickHouse is open source [1] under the Apache 2.0 license, so you can check the source and question it yourself.
I don't understand the obsession with ClickHouse. While it seems to fit this particular use case, it still has the usual limitations and challenges of columnar DBs: your counts/averages will be very fast, but there's a tradeoff elsewhere; inserts are only efficient in bulk, deletes and updates are slow, there are no secondary indexes...
While Clickhouse can be lightning fast, is it really designed to be a main backend database?
Clickhouse is not something you use for a CRUD backend.
The obsession with Clickhouse is the phenomenal performance for the OLAP use case, a scenario where there were not many open source, easy to install and maintain options. For the most part you can treat it as a “normal” database, insert data into it and query it without messing about with file format conversions and so on.
The fact that it is blindingly fast is a big bonus!
ClickHouse supports transactional (ACID) properties for INSERT queries. It can also replicate data across geographically distributed locations with automatic support for consistency, failover and recovery. Quorum writes are supported as well.
This makes it safe to use ClickHouse even for billing data.
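As a rough illustration of the quorum-write part (the table, ZooKeeper path and values below are hypothetical), an insert can be made to wait until a majority of replicas have acknowledged it:

    -- A replicated table; the ZooKeeper path and {replica} macro are deployment-specific
    CREATE TABLE billing_events
    (
        ts      DateTime,
        account UInt64,
        amount  Decimal(18, 4)
    )
    ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/billing_events', '{replica}')
    PARTITION BY toYYYYMM(ts)
    ORDER BY (account, ts);

    -- Require each insert to be confirmed by at least 2 replicas before returning
    SET insert_quorum = 2;
    INSERT INTO billing_events VALUES (now(), 42, 19.99);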