Funny. My approach is usually the other way around: Can I get away with *just* R...

wokwokwok · on June 12, 2021

Why would you choose to use a system that doesn't scale by default?

Single user local applications? Fair.

Web applications? Very strange choice imo.

Reddis is great, but it is *not* a database, and it's thoroughly rubbish at high load concurrency without clustering, which is (still) a massive pain in the ass to setup manually.

Of course, you can just use a hosted version off a cloud provider... but, it's generally about 10x more expensive than just a plain old database.

/shrug

I mean, sure, it's (arguably...) step up from just using sqlite, but... really, it's easy, and that's good... but it isn't good enough as a general replacement for having a real database.

(To be fair, sqlite has got some pretty sophisticated functionality too, even some support for concurrency; it's probably a step up from redis in many circumstances).

tehbeard · on June 12, 2021

> Why would you choose to use a system that doesn't scale by default?

By all accounts Postgres seems to be a pain to scale off a single machine, much more so than redis.

hardwaresofton · on June 12, 2021

Postgres is not as automatic as other tools but is mostly an artifact of it being around so long, and focus being on other things. Few projects have been around and stayed as relevant as postgres.

Most of the time, you really don't need to scale postgres more than vertically (outside of the usual read replicas), and if you have tons of reads (that aren't hitting cache, I guess), then you can scale reads relatively easily. The problem is that the guarantees that postgres gives you around your data are research-level hard -- you either quorum or you 2pc.

Once you start looking into solutions that scale easily, if they don't ding you on performance, things get murky really quick and all of a sudden you hear a lot of "read-your-writes" or "eventual consistency" -- they're weakening the problem so it can be solved easily.

All that said -- Citus and PostgresXL do exist. They're not perfect by any means, but you also have solutions that scale at the table-level like TimescaleDB and others. You can literally use Postgres for something it was never designed for and still be in a manageable situation -- try that with other tools.

All that said, KeyDB[0] looks pretty awesome. Multithreaded, easy clustering, and flash-as-memory in a pinch, I'm way more excited to roll that out than I am Redis these days.

[0]: https://github.com/EQ-Alpha/KeyDB

victor106 · on June 12, 2021

KeyDB is really good. We use it in production to achieve millisecond response times on millions of requests per second.

hardwaresofton · on June 12, 2021

It really looks absolutely amazing, I feel guilty because I want to run a service on it, there's almost no downside to running it everywhere you'd normally run Redis.

Also in the cool-redis-stuff category:

https://github.com/twitter/pelikan

Doesn't have the feature set that KeyDB has but both of these pieces of software feel like they could the basis of a cloud redis product that would be really efficient and fast. I've got some plans to do just that.

killingtime74 · on June 12, 2021

Are you Google search? How do you have millions of requests per second?

manigandham · on June 12, 2021

Lots of industries and applications can get to that scale. My last few companies were in adtech where that is common.

killingtime74 · on June 13, 2021

Thanks, I had no idea!

qeternity · on June 12, 2021

It's likely millions of internal requests, which as another comment mentions, is common in a number of industries.

cbsmith · on June 12, 2021

Which PostreSQL scaling pain point would you be referring to? Citus?

arpa · on June 12, 2021

redis is not a database. It's a key-value based cache. If you're using it as a database, you're gonna have a bad time.

cube2222 · on June 12, 2021

Why so? It has persistence and I'm not aware of any reported data loss happening with it.

It's also got loads of complex and useful instructions.

jasonwatkinspdx · on June 12, 2021

Redis is inherently lossy as a matter of basic design, and that's not even touching on the many other issues born of NIH solutions rampant within it. You may not hit the behavior until you push real loads through it. If you talk to anyone who has, I'm confident they'll agree with the criticism that while it may be an excellent cache, it should never be treated as a ground truth database. It's excellent as a slower memcachd with richer features. It's not a database. You can also read Aphyr's reports over the years, which to be utterly frank, bent over backwards to be charitable.

arpa · on June 12, 2021

Data loss can occur between flushes to disk, for example (by default every 2 seconds / every I_FORGOT megabytes). Perhaps (most likely) it is possible to fine-tune the configuration to have redis as a very reliable data store, but it doesn't come with such settings by default, unlike most of RDBMSes.

miohtama · on June 12, 2021

Not all use cases require reliable data storage and it is ok lose few seconds of data. Think simple discussion forums, internal chat applications. There are some scenarios where ease of use and a single server scalability pays off in the faster development and devops cost.

arpa · on June 12, 2021

GP was asking why redis is not a reliable storage solution/database. Redis is great as an unreliable (not source-of-truth) storage.

kwdc · on June 12, 2021

For that temporary use case, how does it compare to memcached?

jasonwatkinspdx · on June 12, 2021

Mostly boils down to Redis having a richer API, and memcached being faster / more efficient. The new EXT store stuff allows you to leverage fast ssd's to cache stupid huge datasets. Memcached is also one of the most battle tested things out there in open source. I've used them both plenty over the years, but tend to lean towards memcached now unless I really need some Redis API feature.

charrondev · on June 12, 2021

I work on a SaaS community forums service and I can assure you data loss is not acceptable to our clients.

As a result we use MySQL w/ memcached, although we are considering a swap to redis for the caching layer.

EvilEy3 · on June 12, 2021

> Not all use cases require reliable data storage and it is ok lose few seconds of data. Think simple discussion forums, internal chat applications.

That is definitely not ok. I'd be really pissed as a user if I wrote a huge comment and it suddenly disappeared.

miohtama · on June 12, 2021

It only disappears if there is a catastrophic failure. The likelihood for such thing to happen when you write a huge comment are less than jackpot in Las Vegas, a sensible risk tradeoff for better development experience and cost.

obstacle1 · on June 12, 2021

> a sensible risk tradeoff

Note the tradeoff doesn't make sense as soon as you're operating at a meaningful scale. A small likelihood of failure at small scale translates to "I expect a failure a million years from now", whereas at large scale it's more like "a month from now". Accepting the same percent risk of data loss in the former case might be OK, but in the latter case is irresponsible. Provided whatever you're storing is not transient data.

miohtama · on June 13, 2021

You are correct. I wrote the original comment as "single server" so I assume it does not mean a meaningful scale and can be more effectively dealt with a support ticket. Not everything needs to be growth trajectory SaaS.

TheCoelacanth · on June 12, 2021

How is that a sensible tradeoff compared to just using something that was actually designed to be a database when you need a database?

miohtama · on June 13, 2021

You do not need a relational database for a simple chat/forum application.

Hendrikto · on June 12, 2021

> sure, it's (arguably...) step up from just using sqlite

How so? What‘s wrong with SQLite?

wokwokwok · on June 12, 2021

I suppose it's a bit more suitable to networked services than sqlite is, since it's natively a web api, and sqlite is natively a local-only solution.

...but, I started writing about clustering and the network API, but, I can't really articulate why those are actually superior in any meaningful way to simply using sqlite, and given the irritation I've had in maintaining them in production in the past...

I guess you're probably right. If I had to pick, I'd probably use sqlite.

zigzag312 · on June 12, 2021

I would say Redis with RediSearch is a database.

hardwaresofton · on June 12, 2021

Hey I assume this is like a joke and not too serious, and we'd all switch off when things got a bit hairy, but I sure hope other readers can tell.

I am literally in the middle of digging a company out of this mistake (keeping Redis too long) right now. If your software/data is worth something, take a week or a month and figure out a reasonable schema, use an auto-generation tool, ORM, or hire a DB for a little bit to do something for you. Even MongoDB is better than redis if your're gonna do something like this.

chrisdinn · on June 12, 2021

If you store protos in your Redis keys (like most people using “NoSQL” for data storage), this comment doesn’t have much punch. Pretty sure we all can think of some pretty high profile examples of NoSQL + structured data working very very well at scale.

hardwaresofton · on June 12, 2021

I'm not trying to get on people who are using redis as a cache (for photos, or any other ephemeral data).

The idea I was trying to get at was using redis to store data traditionally reserved for OLTP workloads.

> Pretty sure we all can think of some pretty high profile examples of NoSQL + structured data working very very well at scale.

Well that's the thing, you very rarely hear of companies who cursed their decision early on to use NoSQL when they realized that their data was structured but in 20 different ways over the lifetime of the product. Some datasets only need light structure (key/value, a loosely defined document, schema-included documents), and other things should probably have a schema and be stored in a database with a tight grip on that schema and data consistency/correctness. Please don't use redis in that latter case.

dionian · on June 12, 2021

operations aside, the big problem in my experience dealing with these systems is you are extremely limited (on purpose) and cant do much sorting/filtering/aggregation/querying. that's what really makes true db's powerful. I love redis for what it does, i just dont think it replaces a DB well in many cases where its non-transient data

chrisdinn · on June 12, 2021

I mean, Google was built on protos in a “NoSQL” database (BigTable). I think maybe you are overindexing on personal experience.

hardwaresofton · on June 12, 2021

Sure, but:

1) 99.9% of internet-facing/adjacent businesses are not Google and will never reach even 1% of Google's scale

2) Proto + BigTable is very different from just throwing stuff in redis/mongo. Proto schemas are compile-time enforced, which is great for some teams and might slow others down. Google enforces more discipline than your average engineering team -- this is overkill for most engineering teams.

tluyben2 · on June 12, 2021

> take a week or a month and figure out a reasonable schema, use an auto-generation tool, ORM, or hire a DB for a little bit to do something for you.

Sorry but am I the only one who is very worried about the state of software? There are people who drank so much of the schemaless (which was not an actual issue for any dev worth her salt to begin with) that you have to dispense this kind of advice? I find that bordering on criminal if someone did that to you and carries the title programmer.

Again, maybe that is just me.

Edit: not an attack on the parent: good advice. Just didn't know it was that bad. And sad.

himinlomax · on June 12, 2021

If your data has no value whatsoever, sure.

cheald · on June 12, 2021

Redis can be made durable. The WAIT command allows you to guarantee writes to a quorum of nodes, and it can be configured for on-disk persistence rather easily.

That said, due to it's single-threaded nature, blocking on quorum writes is likely to bottleneck your application under any kind of significant load. It really shines at volatile data, and while it can work for valuable data, there are better tools for the job.

himinlomax · on June 12, 2021

> Redis can be made durable

Postgres, SQLite and many others are durable by default. Almost all so-called databases are like that. When you need a database, 90% of the time, you want durable. People make mistakes, developers are people, developers make mistakes, and one such mistake is assuming that Redis is like other databases in being durable by default when it's not. It's not conjecture, I've seen it done in production.

YetAnotherNick · on June 12, 2021

Why does by default matters so much to you? Redis has persistence support and it can be easily turned on.

himinlomax · on June 12, 2021

vi has persistence support

gizdan · on June 12, 2021

Why? Redis has persistent data stores, and doesn't necessarily need to be memory only.

himinlomax · on June 12, 2021

When you commit to Postgres and the server acknowledges it, you know for sure that it's been written to disk and that it will survive anything but a hardware disk loss (or, obviously, system/FS bug). When clustering is enabled with synchronous writes, you can also be confident that the data has been recorded to another node as well.

With redis clustering, there's no guarantee the data has been replicated. I'm not even sure there's any guarantee the data you just asked to be recorded be stored even once if a power outage happens immediately after the request.

Jemaclus · on June 12, 2021

I sorta do this, but my approach is more Redis-first than _just_ Redis. I try to see if I can use Redis for 99.999% of my operations and have a more durable store (like Postgres or something) as a "backup". The nature of Redis is such that even with some persistence features, we kinda have to assume that the data could go away at any minute, so I always build some way to rebuild Redis as fast as possible.

But I've run billions of queries per day against a single Redis instance with zero failures serving up traffic to large, enterprise-level customers with no downtime and no major issues (knock on wood). The only minor issues we've run into were some high-concurrency writes that caused save events and primaries to failover to replicas, and resulted in a few minutes downtime at the beginning of our experiments with Redis-first approaches, but that was easily mitigated once we realized what was happening and we haven't had a problem since.

Huge fan of a Redis-first approach, and while the haters have _some_ valid points, I think they're overstated and are missing out on a cool way to solve problems with this piece of tech.

grenoire · on June 12, 2021

My arsenal is Redis to SQLite to pg.

mianos · on June 12, 2021

If you want to use transactions for multi table updates you are probably best to use a proper rdbms. Not to mention read consistency. If you only have one table Redis may be fine. I usually find my work grows beyond one table. Redis does make an unbeatable cache.

siscia · on June 12, 2021

It was the reason with https://zeesql.com formerly know as https://RediSQL.com was written.

My clients seems rather happy with the project.

dionian · on June 12, 2021

well... how much uptime do you need and how much resources do you have to devote towards achieving your desired uptime