
Redis EXPIRE doesn't actually delete any data after it expires though. Active deletion happens at random, so you can easily still have expired values in memory months later:

> Redis keys are expired in two ways: a passive way, and an active way.

> A key is passively expired simply when some client tries to access it, and the key is found to be timed out.

> Of course this is not enough as there are expired keys that will never be accessed again. These keys should be expired anyway, so periodically Redis tests a few keys at random among keys with an expire set. All the keys that are already expired are deleted from the keyspace.

> Specifically this is what Redis does 10 times per second:

> 1. Test 20 random keys from the set of keys with an associated expire.
> 2. Delete all the keys found expired.
> 3. If more than 25% of keys were expired, start again from step 1.
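
The three quoted steps can be sketched as a small simulation. This is a simplified model, not Redis's actual implementation: the function name `active_expire_cycle`, the plain dict of key to expiry timestamp, and the parameter names are all assumptions made for illustration, and it omits details of the real server such as per-cycle time limits.

```python
import random


def active_expire_cycle(expires, now, sample_size=20, threshold=0.25, rng=random):
    """Run one simplified active-expiry cycle.

    `expires` maps key -> expiry timestamp and is mutated in place.
    Returns the number of keys deleted during this cycle.
    """
    deleted = 0
    while expires:
        # Step 1: test up to `sample_size` random keys that have an expiry set.
        sample = rng.sample(list(expires), min(sample_size, len(expires)))
        # Step 2: delete every sampled key that has already expired.
        expired = [key for key in sample if expires[key] <= now]
        for key in expired:
            del expires[key]
        deleted += len(expired)
        # Step 3: repeat only if more than 25% of the sample was expired.
        if len(expired) / len(sample) <= threshold:
            break
    return deleted
```

In this model, a keyspace where most keys are expired is drained quickly because step 3 keeps looping, while a small fraction of expired keys among many live ones can survive many cycles, which is the situation the parent comment is worried about.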

So really it's not much better than doing `SELECT value FROM keys WHERE key = ? AND expires > now()` with manual deletion. Though I agree that it can be more convenient.



I would contend that it really depends on what one prioritizes most in that scenario. In my experience, a Redis EXPIRE means the key is no longer selectable, and that is the primary requirement for a lot of development around EXPIRE/TTL. It is OK if the value is still in memory in some form; it still won't be accessible through an application's SDK or the CLI. Since Redis 2.6 the expire error is from 0 to 1 milliseconds, which is accurate enough for many use cases. Not to mention, Redis handles the deletion for you: you don't need to run a deletion job or add an extra condition to every query.
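
The "expired means not selectable, even if still in memory" behaviour can be illustrated with a toy store that expires keys only on access. This is a sketch of the idea rather than how Redis is implemented; the class `LazyTTLStore`, its methods, and the injectable `clock` are hypothetical names chosen for the example.

```python
import time


class LazyTTLStore:
    """Toy key-value store with passive (on-access) expiry only."""

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> (value, deadline or None)
        self._clock = clock  # injectable so the example can be tested

    def set(self, key, value, ex=None):
        # `ex` is a time-to-live in seconds; None means the key never expires.
        deadline = None if ex is None else self._clock() + ex
        self._data[key] = (value, deadline)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, deadline = entry
        if deadline is not None and self._clock() >= deadline:
            # The key is only removed now, when a client touches it.
            del self._data[key]
            return None
        return value

    def resident_keys(self):
        # Counts keys still held in memory, expired or not.
        return len(self._data)
```

A caller never sees an expired value through `get`, but `resident_keys()` can still count it until something reads the key.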

Additionally, expire/ttl/get/set in Redis are incredibly easy to use (and abuse, hence the OP article). Some teams' main criterion is limiting the number of moving parts, and that's great: don't use Redis, and use a relational database for everything, as you mentioned. Use it as a queue, a cache, a message broker, etc.

Other teams may care less about an extra moving part if it means their code will look simpler and they leverage relational databases for their more common use cases.


The fewer moving parts bit is key.

It was a government project, written by one team (us) to be maintained by another.

The data that needed to be expunged was user signup data. Upon completion, the record was sent to a CRM and the Redis record was destroyed. If the signup wasn't finished, the record was automatically removed after 12 hours.

Referential integrity wasn't really a problem: emails are unique, and if two records clash they are auto-merged by the CRM.

Setting up scheduled tasks, triggers, partitioning, cron, etc. just adds more things that can go wrong. If they go wrong _and_ go unnoticed, we end up with piles of data we shouldn't have. That would be many different kinds of bad.

Doing `redis.set(k, v, ex: 12.hours)` or whatever is just easier.
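
For comparison, here is roughly what that flow might look like in Python. It is a sketch under assumptions: the `signup:` key prefix, the function names, and the `send_to_crm` callback are hypothetical, and `client` is expected to be something like a redis-py `redis.Redis()` instance, whose `set` accepts an `ex` expiry in seconds or as a `timedelta`.

```python
import json
from datetime import timedelta

SIGNUP_TTL = timedelta(hours=12)  # unfinished signups disappear after 12 hours


def signup_key(email):
    # Hypothetical key scheme; emails are unique, so they make a natural key.
    return f"signup:{email.lower()}"


def save_partial_signup(client, email, fields):
    # One call stores the record and schedules its expiry.
    client.set(signup_key(email), json.dumps(fields), ex=SIGNUP_TTL)


def complete_signup(client, email, send_to_crm):
    """Forward a finished signup to the CRM, then destroy the stored record."""
    key = signup_key(email)
    raw = client.get(key)
    if raw is None:
        return False  # never started, or already expired
    send_to_crm(json.loads(raw))
    client.delete(key)
    return True
```

There is no scheduled job anywhere in this sketch; the only cleanup code is the explicit delete on the success path.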


You could very easily create a database view that applies the where query, and even prevent your db user from selecting from the underlying table.

You could also use a library like pg-boss to handle the cleanup task.


> Redis EXPIRE doesn't actually delete any data after it expires though.

I guess OP likes the simplicity that built-in expiration provides. In your example - all selects reading the value will need to have this expiration check. And also some scheduled process will have to be written to actually delete the values.


I would access the table through a view that had that query built into it.

  create table all_items(id integer, value text, expires timestamp);
  create index all_item_expiry on all_items(expires);
  create view items as (select id, value, expires from all_items where expires > now());

Then you can treat `items` as your base table, and Postgres neatly allows INSERT/UPDATE/DELETE on it. You'll need a job to clean up items with `expires < now()`, but it can run at whatever arbitrary interval you like; it could even be a trigger in PG if you were feeling spicy.
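
To see the view-plus-cleanup idea run end to end, here is a self-contained sketch that swaps in SQLite (through Python's `sqlite3` module) for Postgres, purely so it needs no server. Differences from the comment's schema are assumptions of this example: `expires` is stored as integer epoch seconds, "now" comes from `strftime('%s', 'now')`, and writes go to `all_items` directly because SQLite views are not writable without extra triggers. `purge_expired` is a hypothetical name for the cleanup job.

```python
import sqlite3
import time

NOW_SQL = "CAST(strftime('%s', 'now') AS INTEGER)"  # current epoch seconds in SQLite

conn = sqlite3.connect(":memory:")
conn.executescript(f"""
CREATE TABLE all_items(id INTEGER PRIMARY KEY, value TEXT, expires INTEGER);
CREATE INDEX all_item_expiry ON all_items(expires);
-- Readers query the view, so the expiry check lives in exactly one place.
CREATE VIEW items AS
    SELECT id, value, expires FROM all_items WHERE expires > {NOW_SQL};
""")

now = int(time.time())
conn.executemany(
    "INSERT INTO all_items(id, value, expires) VALUES (?, ?, ?)",
    [(1, "stale", now - 3600), (2, "fresh", now + 3600)],
)
conn.commit()

# The expired row is hidden from readers but still present in the table.
visible = conn.execute("SELECT id, value FROM items ORDER BY id").fetchall()


def purge_expired(connection):
    """Cleanup job: physically delete expired rows; run at any interval."""
    cur = connection.execute(f"DELETE FROM all_items WHERE expires <= {NOW_SQL}")
    connection.commit()
    return cur.rowcount
```

As with Redis, expired rows stop being visible immediately and are reclaimed later; the difference is that the reclaiming job is yours to schedule and monitor.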


Hmmm, I disagree that it's not better. A SELECT most likely implies an index scan at O(log n), while a GET is essentially O(1). And you also have to run a DELETE in SQL to remove the expired keys.
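
As a rough illustration of the O(log n) side of that comparison, the sketch below counts probes in a binary search over sorted keys. It is only a stand-in for an index descent: a real B-tree has a much higher fan-out, so it touches far fewer nodes than this count suggests, and the function name is hypothetical.

```python
def binary_search_probes(sorted_keys, target):
    """Return (found, probes): how many comparisons a sorted lookup needs."""
    lo, hi, probes = 0, len(sorted_keys), 0
    while lo < hi:
        mid = (lo + hi) // 2
        probes += 1
        if sorted_keys[mid] < target:
            lo = mid + 1
        else:
            hi = mid
    found = lo < len(sorted_keys) and sorted_keys[lo] == target
    return found, probes
```

For about a million keys this needs on the order of 20 probes, whereas a hash lookup takes a roughly constant number of steps regardless of size. Whether that gap matters next to network and disk latency is a separate question.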

Oh, and I'm not entirely sure about the part about Redis active expiry (disabled by default; the default is to remove expired keys lazily, on lookup). You're talking about key eviction, which applies to all deleted keys and, AFAIR, happens only when certain watermarks are hit. Since it happens in RAM, it's also faaaast, unlike a SQL DELETE, which will definitely involve disk...



