I expect to have at least 15 more years in the workforce and I hate that I have to live through this "revolution". I worry about what the final balance of lives improved vs. lives worsened will be.
Congrats on the progress!
What is the behavior of PgDog if it receives some sort of query it can't currently handle properly? Is there a linter/static analysis tool I can use to evaluate whether my query will work?
The current behavior unfortunately is to just let it through and return an incorrect result. We are adding more checks here and rely heavily on early adopters to have a decent test suite before launching their apps to prod.
That being said, we do have this [1]:
[general]
expanded_explain = true
This will modify the output of EXPLAIN queries to return routing decisions made by PgDog. If you see that your query is "direct-to-shard", i.e. goes to only one shard, you can be certain that it'll work as expected. These queries will talk to only one database and don't require us to manipulate the result or assemble results from multiple shards.
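For illustration, assuming a hypothetical "orders" table sharded on customer_id (the table and column names are made up here), the routing distinction roughly looks like this:

```sql
-- Hypothetical schema: "orders" sharded on customer_id.

-- Direct-to-shard: the sharding key appears in the predicate,
-- so the router can compute the single target shard.
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- Cross-shard: no sharding key in the predicate,
-- so the query has to fan out to every shard.
EXPLAIN SELECT * FROM orders ORDER BY created_at LIMIT 10;
```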
For cross-shard queries, you'll need your own integration tests, for now. We'll add checks here shortly. We have a decent CI suite as well, but it doesn't cover everything. Every time we look at that part of the code, we just end up adding more features, like the recent support for LIMIT x OFFSET y (PgDog rewrites it to LIMIT x + y and applies the offset calculation in memory).
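The LIMIT/OFFSET rewrite described above can be sketched roughly like this (a simplified illustration, not PgDog's actual code): each shard is asked for LIMIT x + y rows, the pre-sorted per-shard results are merged, and the offset is applied in memory:

```python
import heapq

def cross_shard_limit_offset(shard_results, limit, offset):
    """Merge per-shard result sets for `ORDER BY ... LIMIT limit OFFSET offset`.

    Each shard was queried with the rewritten `LIMIT limit + offset`,
    so every list in shard_results is sorted and holds at most
    limit + offset rows. We merge the sorted streams, then apply the
    offset and limit in memory.
    """
    merged = list(heapq.merge(*shard_results))
    return merged[offset:offset + limit]

# Hypothetical query: ORDER BY id LIMIT 2 OFFSET 3, across two shards.
# Each shard already applied the rewritten LIMIT 2 + 3 = 5.
shard_a = [1, 4, 5, 7, 9]
shard_b = [2, 3, 6, 8, 10]
print(cross_shard_limit_offset([shard_a, shard_b], limit=2, offset=3))
# → [4, 5]
```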
The distinction is clearer when indexing actual text and applying tokenization. A "typical" index on a database column goes "column:(value => rows)". When people mention inverted indexes it's usually in the context of full-text search, where the column value goes through tokenization and you build an index entry for each of the N tokens of a column: "column:(token 1 => rows)", "column:(token 2 => rows)", ... "column:(token N => rows)".
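A minimal sketch of the token-to-rows mapping described above (naive whitespace tokenization; real engines add stemming, stop words, normalization, etc.):

```python
from collections import defaultdict

def build_inverted_index(docs):
    """docs: {row_id: text}. Returns {token: set of row ids}."""
    index = defaultdict(set)
    for row_id, text in docs.items():
        # Naive tokenization: lowercase + whitespace split.
        for token in text.lower().split():
            index[token].add(row_id)
    return index

docs = {1: "full text search", 2: "text indexing basics"}
idx = build_inverted_index(docs)
print(sorted(idx["text"]))  # → [1, 2]
```

A query for a token is then just a set lookup, and multi-token queries intersect or union the posting sets.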
Not the person you asked, but at work (we are a CRM platform) we allow our clients to arbitrarily query their userbase to find matching users for marketing campaigns (email, SMS, WhatsApp). These campaigns can sometimes target a few hundred thousand people. We are on a really ancient version of ES, and it sucks at this job in terms of throughput. Some experimenting with BigQuery indicates it is much better at mass exporting.
Fair; my question was mostly in the context of ANN, since that was the discussion point - I have to assume ES (as a search engine) would not necessarily be the right tool for data warehousing types of workloads.