I always thought there should be a two-arg overload of new, so you could write new(bool, true) or new(int, 20). Would solve the problem without any trickery.
> You might think adding a UNIQUE index would cause the "losing" xact to get constraint errors, but instead both xacts succeed and no longer have a race condition.
This is not true. What happens is that the (sub)transaction that loses the race to the index is aborted:
=# INSERT INTO foo (bar) (SELECT max(bar) + 1 FROM foo);
ERROR: duplicate key value violates unique constraint "foo_bar_idx"
DETAIL: Key (bar)=(2) already exists.
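For anyone who wants to reproduce it, a minimal setup looks something like this (table and index names are just illustrative). The second session blocks until the first commits, then fails with the duplicate-key error above:

CREATE TABLE foo (bar int);
CREATE UNIQUE INDEX foo_bar_idx ON foo (bar);
INSERT INTO foo (bar) VALUES (1);

-- session A
BEGIN;
INSERT INTO foo (bar) (SELECT max(bar) + 1 FROM foo);

-- session B, concurrently: blocks on A's uncommitted row, then errors once A commits
BEGIN;
INSERT INTO foo (bar) (SELECT max(bar) + 1 FROM foo);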
> Imagine booting your computer and having to give an email to login offline.
As much as I hate what modern Windows has become, this is not actually true. If you know the correct sequence of clicks you can avoid this. Not defending this bullshit, though.
They've made it much more difficult than it was even a couple of years ago, when you could just install without the computer connected to the internet. Now you have to start with an internet connection, then disconnect partway through and jump through command-line hoops to make the installer let you keep installing.
While that's kind of convenient in a lot of ways, it also makes querying the database really annoying, since you have to remember to add the filtering to every single query or you're screwed. Personally, I wish there was a database-implementation-acknowledged "deleted" flag, which could even expose a "history table"-like interface.
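The closest workaround I know of is something like this (names are made up), but then every client has to remember to use the view instead of the base table:

ALTER TABLE orders ADD COLUMN deleted_at timestamptz;

CREATE VIEW live_orders AS
SELECT * FROM orders WHERE deleted_at IS NULL;

-- "delete" by stamping the row; the base table keeps the full history
UPDATE orders SET deleted_at = now() WHERE id = 42;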
How does using a timestamp column make querying the database more annoying than using a boolean column? Aren't the places you have to use filters exactly the same either way?
Sequences kind of have the same issue, because you don't know whether a gap is due to a rollback or an uncommitted transaction. With some extra logic you can do a pretty good job of it with sequences. And then you're not in the realm of "simple" anymore, at all.
I've had pretty much the exact same problem and what I went for in my low-volume case was to simply add advisory locks such that I can guarantee the transaction start times provide correct ordering.
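Roughly like this (the lock key and table are made up). Because the lock is taken before BEGIN and released after COMMIT, only one of these transactions is in flight at a time, so the now()/transaction-start timestamps come out in the same order as the commits:

SELECT pg_advisory_lock(42);
BEGIN;
INSERT INTO events (created_at, payload) VALUES (now(), '...');
COMMIT;
SELECT pg_advisory_unlock(42);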
Woah, that's news to me. Is that true even if triggers are used to update a column?
CREATE OR REPLACE FUNCTION update_updated_at_function()
RETURNS TRIGGER AS $$
BEGIN
    -- now() is the start time of the current transaction
    NEW.updated_at = now();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER update_updated_at_trigger
BEFORE INSERT OR UPDATE ON "my_schema"."my_table"
FOR EACH ROW EXECUTE PROCEDURE update_updated_at_function();
Is it possible for two rows to have `updated_at` timestamps that are different from the transaction commit order even if the above function and trigger are used? It's alright if `updated_at` and the commit timestamp are not the same, but the `updated_at` must represent commit order accurate to the millisecond/microsecond.
To confirm your fear: you can't use the updated_at timestamp as a proxy for commit order. Commits can happen in a different order than the updated_at values, and the two can be arbitrarily far apart, like hours or days, depending on how long your transactions can last.
now() is the timestamp the transaction began at. There is no way to write the commit timestamp into the row, because the value has to be written before you commit. (pg_xact_commit_timestamp() can report it after the fact if track_commit_timestamp is enabled, but that doesn't help a trigger.)
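Easy to see in psql, if anyone wants to check:

BEGIN;
SELECT now(), clock_timestamp();
SELECT pg_sleep(2);
SELECT now(), clock_timestamp();  -- now() hasn't moved, clock_timestamp() has
COMMIT;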
For polling, instead of updated_at, I use a _txid column that gets set by a trigger to txid_current(). Then, when polling, compare against txid_current_snapshot() to see which transactions have finished and which are still in flight. It's a little dicey and super easy to hit fencepost errors, but it's been running smoothly in production for a few years.
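A rough sketch of the shape of it, with made-up names; the real thing has more bookkeeping around the watermark:

ALTER TABLE jobs ADD COLUMN _txid bigint;

CREATE FUNCTION set_txid() RETURNS trigger AS $$
BEGIN
    NEW._txid = txid_current();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER jobs_set_txid
BEFORE INSERT OR UPDATE ON jobs
FOR EACH ROW EXECUTE PROCEDURE set_txid();

-- poller: every transaction below the snapshot's xmin has finished, so rows
-- between the previous watermark and this one are final and safe to process
SELECT txid_snapshot_xmin(txid_current_snapshot()) AS new_watermark;
SELECT * FROM jobs WHERE _txid >= :last_watermark AND _txid < :new_watermark;  -- watermarks held by the app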
> Adding a column, changing column's nullability and adding/changing constraints is already zero-downtime in PG.
Making a previously nullable column NOT NULL is not zero-downtime, and neither is adding constraints -- except in some cases with NOT VALID, which isn't quite the same thing.
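For reference, the NOT VALID dance looks like this (constraint name made up); the ADD takes only a brief lock, and the VALIDATE scans the table without blocking normal reads and writes:

ALTER TABLE accounts
    ADD CONSTRAINT accounts_balance_nonneg CHECK (balance >= 0) NOT VALID;
ALTER TABLE accounts VALIDATE CONSTRAINT accounts_balance_nonneg;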
> Dropping a column and changing the data type are not zero-downtime.
I don't really like solutions that force everything into a single schema just to do migrations. They shouldn't be that difficult.
In this particular case, rather than making everything a view all the time, you could just use views during a migration window to get the same effect. Replace the table with a view which has all the columns plus the new name as an alias for the old one, migrate all the clients, replace the view with the table with the column renamed. No special permanent crap necessary.
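Sketched out for a column rename, it's roughly this (names illustrative; if clients also write through the view you may additionally need INSTEAD OF triggers on it):

-- 1. hide the table behind a view that exposes both names
ALTER TABLE users RENAME TO users_tbl;
CREATE VIEW users AS
SELECT *, full_name AS display_name FROM users_tbl;

-- 2. migrate clients from full_name to display_name at their own pace

-- 3. once everyone has switched, drop the shim and do the real rename
DROP VIEW users;
ALTER TABLE users_tbl RENAME COLUMN full_name TO display_name;
ALTER TABLE users_tbl RENAME TO users;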
The idea is not for all the different versions to be permanent, just temporary while the clients are being updated. Check out the top-level comment I wrote for a more fleshed-out explanation!
It sounds like the 640K-of-memory quote. Yes, it did not age well, but you know what, sometimes people can't predict the future, and at the time the quote was said, it made some sense. The internet is full of "smarter than everyone else" types who have never misspoken about anything, ever.
Bill Gates did not misspeak, because he never said it in the first place. Not to rain on your parade, but people make mistakes all the time, and everyone has a duty not to escalate mistakes.
" Gates himself has strenuously denied making the comment. In a newspaper column that he wrote in the mid-1990s, Gates responded to a student's question about the quote: "I've said some stupid things and some wrong things, but not that. No one involved in computers would ever say that a certain amount of memory is enough for all time." Later in the column, he added, "I keep bumping into that silly quotation attributed to me that says 640K of memory is enough. There's never a citation; the quotation just floats like a rumor, repeated again and again." "
I specifically left the 640K quote unattributed for this reason. However, it is an (in)famous quote on the internet, and that's all I was referencing. So it's not my parade you're raining on; I hope you don't have a big jump from your soapbox. You kind of proved my point that the internet is full of "people smarter than everyone else".
When a client connects to Postgres, Postgres creates a new process just for that client. Separate processes are great for isolation, but it means Postgres connections are a bit expensive (this is an active area of improvement). Postgres really starts to struggle once you have a few thousand connections, even if those connections aren't doing anything.
The common workaround is to use a connection pooler like PgBouncer so that clients reuse connections. This approach doesn't work for LISTEN, because typically a client will listen for its entire lifecycle, so you can't share connections in a pool.
You need one connection per worker thread, and realistically you would only have one worker per CPU core. So how many LISTENing connections do you really need?
You generally have more than one database connection per thread. As an example, consider Node.js, which acts like a single-threaded process. You'd probably want to be able to handle more than one database query concurrently, since network latency tends to dominate OLTP requests.
How you set up LISTEN and NOTIFY is app-dependent. In a multi-tenant database, you could have 1 NOTIFY channel per tenant.
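Something like this, if it helps (the channel naming is just an example):

-- worker for tenant 42
LISTEN tenant_42;

-- writer side, e.g. right after making the change
NOTIFY tenant_42, 'order:1234';
-- or, when the channel name is built dynamically:
SELECT pg_notify('tenant_' || 42, 'order:1234');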
As you scale, you probably do something that listens in a smarter way, maybe 1 Go channel per client with a single LISTEN instead of 1 database connection per client. The downside is that now the app code is responsible for tenant isolation instead of the database.
> [...] since network latency tends to dominate OLTP requests.
You are absolutely right about that. However, in this case we are talking about background workers. I would argue that background workers, such as an image resizing worker, are typically CPU-bound and do not perform high-volume OLTP queries. Therefore, a background worker does not require multiple database connections per thread as a web server would.