Well, they did say that Cassandra is better than all the others, especially compared to HBase, on nearly all measurements (except latency in high-write scenarios).
This makes the decision of Facebook to go with HBase for their new messaging platform back in 2010 all the more strange. Though that was two years ago so things might have changed in Cassandra's favor since then.
> This makes the decision of Facebook to go with HBase for their new messaging platform back in 2010 all the more strange.
Speed isn't everything to a database. AFAIK they chose HBase over Cassandra because of consistency guarantees: eventual consistency is a bad choice for a messaging platform.
> eventual consistency is a bad choice for a messaging platform
Strange you would mention that in the context of Cassandra, since it allows per-read/write configuration of consistency, from "eventual" to "strong". You get exactly what you ask for with Cassandra, whether it's availability or consistency.
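To make the per-operation tuning concrete, here is a rough sketch of the usual quorum-overlap rule: a read is guaranteed to see the latest acknowledged write when the replicas contacted for the read plus those contacted for the write exceed the replication factor. The function names are hypothetical helpers for illustration, not part of any Cassandra driver; only the level names ONE, QUORUM and ALL mirror Cassandra's consistency levels.

```python
# Hypothetical helpers for illustration; not a Cassandra driver API.

def replicas_required(level: str, replication_factor: int) -> int:
    """Number of replicas that must respond for a given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unknown consistency level: {level}")


def is_strongly_consistent(read_level: str, write_level: str,
                           replication_factor: int) -> bool:
    """True when read and write replica sets must overlap (R + W > RF)."""
    r = replicas_required(read_level, replication_factor)
    w = replicas_required(write_level, replication_factor)
    return r + w > replication_factor


if __name__ == "__main__":
    rf = 3
    for read, write in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ALL")]:
        print(read, write, is_strongly_consistent(read, write, rf))
```

With a replication factor of 3, ONE/ONE trades consistency for availability, while QUORUM/QUORUM or ONE/ALL forces the read and write sets to overlap.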
This is not the main reason people would choose HBase. For time series data, HBase is much better at load balancing and at storing related data in sequential blocks, so that a single operation can fetch all of the data points you are interested in. Cassandra supports range queries, but the last time I looked it wasn't great at load balancing data across nodes when using the OrderedPartitioner. Do I remember wrong?
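A toy model of the trade-off, assuming four nodes and made-up range boundaries (this is not Cassandra's actual token ring or HBase's region splitting): with an order-preserving placement, one day's worth of timestamp keys lands on a single node, which is good for range scans but creates a write hotspot, while hashing the key spreads the same writes out and scatters the range.

```python
import bisect
import hashlib
from collections import Counter

# Assumed cluster layout for illustration: 4 nodes, ranges split by quarter.
NODE_COUNT = 4
RANGE_UPPER_BOUNDS = ["2012-03-31", "2012-06-30", "2012-09-30"]


def ordered_node(key: str) -> int:
    """Order-preserving placement: the node owning the key's sorted range."""
    return bisect.bisect_left(RANGE_UPPER_BOUNDS, key)


def hashed_node(key: str) -> int:
    """Hash placement: MD5 of the key, reduced modulo the node count."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % NODE_COUNT


# One day of per-minute time series keys.
keys = [f"2012-06-15T{h:02d}:{m:02d}" for h in range(24) for m in range(60)]

ordered_load = Counter(ordered_node(k) for k in keys)
hashed_load = Counter(hashed_node(k) for k in keys)

if __name__ == "__main__":
    print("ordered:", dict(ordered_load))
    print("hashed: ", dict(hashed_load))
```

In the ordered case a scan over the day touches one node, but so does every write for that day, which is the balancing problem described above.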
I won't put too many words into the mouths of the Facebook fellows, but I meet with them every now and then, and they very much care about "speed" (latency and throughput, best and worst case).
This paper does not reflect what I have seen in production setups.