Why would you choose to use a system that doesn't scale by default?
Single user local applications? Fair.
Web applications? Very strange choice imo.
Redis is great, but it is *not* a database, and it's thoroughly rubbish at high-load concurrency without clustering, which is (still) a massive pain in the ass to set up manually.
Of course, you can just use a hosted version from a cloud provider... but it's generally about 10x more expensive than a plain old database.
/shrug
I mean, sure, it's (arguably...) a step up from just using sqlite, and it's easy, which is good... but it isn't good enough as a general replacement for having a real database.
(To be fair, sqlite has got some pretty sophisticated functionality too, even some support for concurrency; it's probably a step up from redis in many circumstances).
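For instance, the concurrency support mentioned above is SQLite's WAL journal mode (a standard pragma), which lets reader connections proceed alongside a writer. A minimal sketch using Python's stdlib `sqlite3`; the table and key names are made up:

```python
import os
import sqlite3
import tempfile

# WAL mode allows concurrent readers alongside a single writer.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
conn = sqlite3.connect(path)
mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print(mode)  # file-backed databases report "wal" here

conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
conn.execute("INSERT INTO kv VALUES ('greeting', 'hello')")
conn.commit()

# A second connection can read concurrently without blocking the writer.
reader = sqlite3.connect(path)
print(reader.execute("SELECT v FROM kv WHERE k = 'greeting'").fetchone()[0])
```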
Postgres is not as automatic as other tools, but that's mostly an artifact of it having been around so long, with the focus being on other things. Few projects have been around as long and stayed as relevant as postgres.
Most of the time, you really don't need to scale postgres more than vertically (outside of the usual read replicas), and if you have tons of reads (that aren't hitting cache, I guess), then you can scale reads relatively easily. The problem is that the guarantees postgres gives you around your data are research-level hard to preserve when scaling out -- you either run a quorum or you do 2PC.
Once you start looking into solutions that scale easily, if they don't ding you on performance, things get murky really quick and all of a sudden you hear a lot of "read-your-writes" or "eventual consistency" -- they're weakening the problem so it can be solved easily.
All that said -- Citus and Postgres-XL do exist. They're not perfect by any means, but you also have solutions that scale at the table level like TimescaleDB and others. You can literally use Postgres for something it was never designed for and still be in a manageable situation -- try that with other tools.
All that said, KeyDB[0] looks pretty awesome. Multithreaded, easy clustering, and flash-as-memory in a pinch -- I'm way more excited to roll that out than I am Redis these days.
It really looks absolutely amazing. I almost feel guilty wanting to run a service on it -- there's almost no downside to running it everywhere you'd normally run Redis.
It doesn't have the feature set that KeyDB has, but both of these pieces of software feel like they could be the basis of a cloud redis product that would be really efficient and fast. I've got some plans to do just that.
Redis is inherently lossy as a matter of basic design, and that's not even touching on the many other issues born of the NIH solutions rampant within it. You may not hit the behavior until you push real loads through it. If you talk to anyone who has, I'm confident they'll agree with the criticism that while it may be an excellent cache, it should never be treated as a ground-truth database. It's excellent as a slower memcached with richer features. It's not a database. You can also read Aphyr's reports over the years, which, to be utterly frank, bend over backwards to be charitable.
Data loss can occur between flushes to disk, for example (by default every 2 seconds / every I_FORGOT megabytes). It is most likely possible to fine-tune the configuration to make redis a very reliable data store, but it doesn't come with such settings by default, unlike most RDBMSes.
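The knobs involved are standard redis.conf persistence directives; the values below are illustrative, not recommendations, so check the docs for your version:

```
# redis.conf -- illustrative persistence settings
appendonly yes          # enable the append-only file (AOF)
appendfsync everysec    # fsync roughly once per second; 'always' is safest but slowest
save 900 1              # RDB snapshot if >=1 change in 900s
save 60 10000           # ...or >=10000 changes in 60s
```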
Not all use cases require reliable data storage, and it's OK to lose a few seconds of data. Think simple discussion forums or internal chat applications. There are some scenarios where ease of use and single-server scalability pay off in faster development and lower devops cost.
Mostly it boils down to Redis having a richer API, and memcached being faster / more efficient. The new extstore stuff allows you to leverage fast SSDs to cache stupidly huge datasets. Memcached is also one of the most battle-tested things out there in open source. I've used them both plenty over the years, but I tend to lean towards memcached now unless I really need some Redis API feature.
It only disappears if there is a catastrophic failure. The likelihood of that happening while you write a huge comment is lower than hitting the jackpot in Las Vegas -- a sensible risk tradeoff for better development experience and cost.
Note the tradeoff doesn't make sense as soon as you're operating at a meaningful scale. A small likelihood of failure at small scale translates to "I expect a failure a million years from now", whereas at large scale it's more like "a month from now". Accepting the same percent risk of data loss in the former case might be OK, but in the latter case is irresponsible. Provided whatever you're storing is not transient data.
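The back-of-envelope version of that argument, with made-up numbers (the per-write loss probability and both write rates below are purely illustrative):

```python
# Back-of-envelope: the same per-write loss probability gives wildly
# different outcomes depending on scale. All numbers are made up.
p = 1e-10  # probability that any single write is lost

def years_to_first_loss(writes_per_sec: float) -> float:
    expected_writes = 1 / p                    # mean of a geometric distribution
    seconds = expected_writes / writes_per_sec
    return seconds / (3600 * 24 * 365)

small = years_to_first_loss(30 / 86400)        # hobby app: ~30 writes/day
large = years_to_first_loss(4000)              # busy service: 4k writes/sec
print(f"small scale: ~{small:,.0f} years to first expected loss")   # ~a million years
print(f"large scale: ~{large * 365:.0f} days to first expected loss")  # ~a month
```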
You are correct. I framed the original comment around "single server", so I assume that doesn't mean a meaningful scale, and failures can be dealt with more effectively via a support ticket. Not everything needs to be a growth-trajectory SaaS.
I suppose it's a bit more suitable for networked services than sqlite is, since it natively speaks a network protocol, and sqlite is natively a local-only solution.
...but, I started writing about clustering and the network API, and I can't really articulate why those are actually superior in any meaningful way to simply using sqlite, especially given the irritation I've had maintaining them in production in the past...
I guess you're probably right. If I had to pick, I'd probably use sqlite.
Hey, I assume this is like a joke and not too serious -- we'd all switch off when things got a bit hairy -- but I sure hope other readers can tell.
I am literally in the middle of digging a company out of this mistake (keeping Redis too long) right now. If your software/data is worth something, take a week or a month and figure out a reasonable schema, use an auto-generation tool, an ORM, or hire a DBA for a little bit to do it for you. Even MongoDB is better than redis if you're gonna do something like this.
If you store protos in your Redis keys (like most people using “NoSQL” for data storage), this comment doesn’t have much punch. Pretty sure we all can think of some pretty high profile examples of NoSQL + structured data working very very well at scale.
I'm not trying to get on people who are using redis as a cache (for photos, or any other ephemeral data).
The idea I was trying to get at was using redis to store data traditionally reserved for OLTP workloads.
> Pretty sure we all can think of some pretty high profile examples of NoSQL + structured data working very very well at scale.
Well that's the thing: you very rarely hear from the companies that cursed their early decision to use NoSQL once they realized their data was structured, but in 20 different ways over the lifetime of the product. Some datasets only need light structure (key/value, a loosely defined document, schema-included documents), and other things should probably have a schema and be stored in a database with a tight grip on that schema and on data consistency/correctness. Please don't use redis in that latter case.
Operations aside, the big problem in my experience dealing with these systems is that you are extremely limited (on purpose) and can't do much sorting/filtering/aggregation/querying. That's what really makes true DBs powerful. I love redis for what it does, I just don't think it replaces a DB well in the many cases where it's non-transient data.
1) 99.9% of internet-facing/adjacent businesses are not Google and will never reach even 1% of Google's scale
2) Proto + BigTable is very different from just throwing stuff in redis/mongo. Proto schemas are compile-time enforced, which is great for some teams and might slow others down. Google enforces more discipline than your average engineering team -- that level of rigor is overkill for most.
> take a week or a month and figure out a reasonable schema, use an auto-generation tool, an ORM, or hire a DBA for a little bit to do it for you.
Sorry, but am I the only one who is very worried about the state of software? There are people who drank so much of the schemaless Kool-Aid (schemas were not an actual issue for any dev worth her salt to begin with) that you have to dispense this kind of advice? I find it borders on criminal if someone who carries the title of programmer did that to you.
Again, maybe that is just me.
Edit: not an attack on the parent: good advice. Just didn't know it was that bad. And sad.
Redis can be made durable. The WAIT command allows you to guarantee writes to a quorum of nodes, and it can be configured for on-disk persistence rather easily.
That said, due to its single-threaded nature, blocking on quorum writes is likely to bottleneck your application under any kind of significant load. It really shines with volatile data, and while it can work for valuable data, there are better tools for the job.
Postgres, SQLite and many others are durable by default. Almost all so-called databases are like that. When you need a database, 90% of the time, you want durable. People make mistakes, developers are people, developers make mistakes, and one such mistake is assuming that Redis is like other databases in being durable by default when it's not. It's not conjecture, I've seen it done in production.
When you commit to Postgres and the server acknowledges it, you know for sure that it's been written to disk and that it will survive anything but a hardware disk loss (or, obviously, system/FS bug). When clustering is enabled with synchronous writes, you can also be confident that the data has been recorded to another node as well.
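The relevant knobs are standard postgresql.conf settings; the standby name below is a placeholder:

```
# postgresql.conf on the primary -- illustrative values
synchronous_commit = on                 # commit returns only after WAL is flushed to disk
synchronous_standby_names = 'standby1'  # commits also wait for this named standby to confirm
```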
With redis clustering, there's no guarantee the data has been replicated. I'm not even sure there's any guarantee that the data you just asked to be recorded will be stored even once if a power outage happens immediately after the request.
I sorta do this, but my approach is more Redis-first than _just_ Redis. I try to see if I can use Redis for 99.999% of my operations and have a more durable store (like Postgres or something) as a "backup". The nature of Redis is such that even with some persistence features, we kinda have to assume that the data could go away at any minute, so I always build some way to rebuild Redis as fast as possible.
But I've run billions of queries per day against a single Redis instance with zero failures, serving traffic to large, enterprise-level customers with no downtime and no major issues (knock on wood). The only minor issue we've run into was some high-concurrency writes that triggered save events and caused primaries to fail over to replicas, resulting in a few minutes of downtime at the beginning of our experiments with Redis-first approaches. That was easily mitigated once we realized what was happening, and we haven't had a problem since.
Huge fan of a Redis-first approach, and while the haters have _some_ valid points, I think they're overstated and are missing out on a cool way to solve problems with this piece of tech.
If you want transactions for multi-table updates, you are probably best off using a proper RDBMS. Not to mention read consistency. If you only have one table, Redis may be fine, but I usually find my work grows beyond one table. Redis does make an unbeatable cache.