
Another pretty good reason to do distributed computing is to move the computation closer to where the data is or where the data will be consumed.

These are good suggestions, but I'm apprehensive they might come back and say they have 64 GB (or less) of RAM, or that they're using PostgreSQL RDS on AWS or something.

I asked them for specifics.


I don't think it really matters for their question, though, given that MySQL on the same specs doesn't have the problem and Postgres does. Quite clearly it has something to do with indexes, and the question is what wall Postgres is running into that causes the drop-off at quite low row counts. If the answer is just "get more RAM", it kind of implies Postgres is not really that scalable, especially if the drop-off is proportional to the number of rows.

Why are you using hash indexes? They're much less widely used than standard B-Tree indexes. The bucket split code likely isn't very scalable [1].

I suggest testing the same workload with your existing hash indexes replaced with equivalent B-Trees.
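
Concretely, the swap can even be done online. Here's a minimal sketch, assuming psycopg (v3); the index/table/column names are hypothetical:

```python
# Swap a hash index for an equivalent B-tree. The names below are
# made up - adjust them to your schema. CREATE/DROP INDEX CONCURRENTLY
# cannot run inside a transaction block, hence autocommit.
import psycopg

with psycopg.connect("dbname=mydb", autocommit=True) as conn:
    conn.execute(
        "CREATE INDEX CONCURRENTLY events_key_btree "
        "ON events USING btree (key)"
    )
    conn.execute("DROP INDEX CONCURRENTLY IF EXISTS events_key_hash")
```

Building the B-tree first and only then dropping the hash index keeps the column indexed throughout.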

[1] https://github.com/postgres/postgres/blob/master/src/backend...


Last time I almost used a hash index in Postgres, I learned it was an incomplete feature and not yet crash-safe. This was v9.3? At the same time, MySQL had them and they were OK to use.

That got fixed later, but I haven't tried again since; I've just been using B-tree because Postgres seemed to favor it, and it has theoretical advantages too.


You've given us some idea of the volume of your data but there's no mention of what's ingesting it or how.

> during these stress tests the hardware is nowhere close to over-encumbered, and there's consistent headroom on both memory, CPU and disk I/O

This assertion is likely wrong - you're probably skipping over some metrics that hold clues to what we need to know. Here are some questions to get the discussion moving.

- Is this PostgreSQL managed or self-hosted?

Your mention of "consistent headroom on both memory, CPU and disk I/O" gives me hope that you're self-hosting it, but I've heard the same thing in the past from people attempting to use RDS and wondering the same thing you are, so no assumptions.

- Are you using COPY or multi-row INSERT statements?

- How much RAM does that server have?

- What are the fillfactor, max_wal_size, and checkpoint_timeout settings?

- Is the WAL on NVMe?

- What does iostat (or the wa column in top) show during the slowdown?

- Are random UUIDs (part of) the index?

Have you posted this to https://dba.stackexchange.com/ yet?

If I were you, I would create a GitHub repo that has scripts that synthesize the data and reproduce the issues you're seeing.
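
Even something this small is usually enough for people to reproduce and bisect. A rough sketch, assuming psycopg (v3), a scratch database, and a hypothetical schema:

```python
# Synthesize rows and report ingest throughput per batch so the
# drop-off point (if any) becomes visible. Schema is hypothetical.
import time
import uuid
import psycopg

BATCH, BATCHES = 100_000, 50

with psycopg.connect("dbname=scratch") as conn, conn.cursor() as cur:
    # Capture the settings that matter for write-heavy workloads.
    for setting in ("max_wal_size", "checkpoint_timeout"):
        cur.execute(f"SHOW {setting}")
        print(setting, "=", cur.fetchone()[0])

    cur.execute("DROP TABLE IF EXISTS events")
    cur.execute("CREATE TABLE events (id uuid PRIMARY KEY, payload text)")
    conn.commit()

    for n in range(BATCHES):
        start = time.monotonic()
        with cur.copy("COPY events (id, payload) FROM STDIN") as copy:
            for _ in range(BATCH):
                copy.write_row((uuid.uuid4(), "x" * 100))
        conn.commit()
        rate = BATCH / (time.monotonic() - start)
        print(f"batch {n:3d}: {rate:,.0f} rows/s")
```

If throughput falls off a cliff after some batch count, that's the graph to post alongside the questions above (random UUID primary keys, as here, are a classic way to trigger it).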


You might be thinking of Pigsty?

At least I hope you are! Nothing else has been as well battle-tested. Unfortunately, perhaps because of its name, it gets no face time on HN. Its last few mentions here barely received the attention they deserved.



DBOS is amazing when it comes to Durable Workflows. There are others in the space - the most popular being Temporal - but I'd argue Temporal is also the most complicated one. I often say Temporal is like Kubernetes while DBOS is like `docker compose`. (And for those taking me literally: you can use DBOS in Kubernetes!)

I don't understand why DBOS is not nearly as popular as Temporal, but it has made a world of difference building Durable Queues and long-running, Durable Workflows in Python (it supports other languages too).
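
For a sense of the programming model, here's a minimal sketch from memory of their Python docs - treat the exact names and signatures as assumptions and verify them against https://docs.dbos.dev:

```python
# Hedged sketch of a DBOS durable workflow plus a durable queue.
# Every completed step is checkpointed in Postgres, so a crashed
# process resumes the workflow instead of re-running finished steps.
from dbos import DBOS, Queue

DBOS()  # connection settings come from dbos-config.yaml

queue = Queue("example_queue")

@DBOS.step()
def fetch_data(task_id: int) -> str:
    return f"result-{task_id}"

@DBOS.workflow()
def pipeline(task_id: int) -> str:
    return fetch_data(task_id)

if __name__ == "__main__":
    DBOS.launch()
    handle = queue.enqueue(pipeline, 42)  # durable, queue-backed call
    print(handle.get_result())
```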

As they show in this article, Postgres scales impressively well (4 billion workflows per day on a db.m7i.24xlarge, enough for most applications). That's why, if you have your PostgreSQL backup/restore strategy knocked out and dialed in, you should really take a close look at DBOS to handle your cloud-agnostic or self-hosted Durable Queues and Durable Workflows. It's an amazing piece of software founded by the original author of Ingres (the precursor to Postgres) - the story of DBOS itself is captivating; I believe it started from being unable to scale Spark job scheduling.


DBOS looks simple (good), but from the docs below, executor elasticity appears to be locked behind a license purchase. So it truly is like docker compose, good parts and bad?

https://docs.dbos.dev/production/workflow-recovery#recovery-...

>When self-hosting in a distributed setting without Conductor, it is important to manage workflow recovery so that when an executor crashes, restarts, or is shut down, its workflows are recovered. You should assign each executor running a DBOS application an executor ID through DBOS configuration. Each workflow is tagged with the ID of the executor that started it. When an application with an executor ID restarts, it only recovers pending workflows assigned to that executor ID.

https://docs.dbos.dev/production/hosting-conductor

> Self-hosted Conductor is released under a proprietary license. Self-hosting Conductor for commercial or production use requires a paid license key.


This is a good question! No, it's not like docker compose (I imagine you're alluding to Swarm and the Hub pull limits?).

DBOS Conductor is an out-of-band management service that, IIRC, mainly helps you observe your DBOS deployment and recover from failures seamlessly. As far as I can tell, it's not necessary in order to use DBOS workflows and queues. Don't quote me though - reach out to their forum and verify, in case I'm missing certain use cases.

Personally, I do not use DBOS Conductor - I have my own observability setup using Grafana/VictoriaMetrics, as my workflows are instrumented with OTel. I had initially set Conductor up for development (it looked to be free for development, although I recall some major limitations on how many workflows etc., which is why I put together my own alternate monitoring setup).

They also have a very reasonably priced cloud-hosted DBOS Conductor. I think my first 30 days were completely free, and then they moved me to a "hobby" tier. It's a fantastic way to decide whether it's for you.

I believe DBOS Conductor is how DBOS pays the bills, but you can use DBOS workflows and queues without limits and without DBOS Conductor. If you don't want to pay for Conductor - their out-of-band management service - you can put together your own just fine, like I did. My own Grafana/VictoriaMetrics setup answers my questions, but I would imagine Claude/Codex/Cursor could put together something fairly useful if you didn't want to go down my route.

> executor elasticity appears to be locked behind license purchase

DBOS has designed their system to be extremely flexible and extensible. While yes, Conductor can absolutely manage your executors for you, it's not the only thing that can - you're not limited to using Conductor. As I said, I manage my own; everything you need to know to do so is in the code and documentation. They even have a document for LLMs and agents. I have had to interact with the DBOS team zero times to set everything up.

I prefer this business model (an optional tool, Conductor, is paid) over DBOS offering everything across the stack on a "free tier" but with caps on DBOS workflows and queues. In their current business model, DBOS workflows and queues are completely uncapped (at least from what I can make out).

If you do reach out to them, I would appreciate if you let me know anything to the contrary.


Are you using Temporal with distributed workers?

We have a simple worker setup, and Temporal is pretty easy to set up.

Our only issue is really needing an intermediary data store for task result storage.

We are using DBOS in new projects as it's even simpler, and the downside (the task log interface is behind SaaS) is easily remedied with a Copilot-generated task viewer.


The reason DBOS isn't as popular is that it's younger. DBOS launched in the form we know it in 2024. Temporal is much older: it's technically a fork of Cadence, which originally released in 2017, with Temporal forking off and releasing back in 2020. When all three are trying to be "the same sort of thing" and that thing is new, it's hard to show up 7-8 years after the trailblazers and say "oh yeah, we're clearly better" when the existing thing works and is used by tons of folks.

Temporal is a dumpster fire. They've gotten so much VC funding (recently a Series D, $300M at a $5B valuation) with... nothing to build except ways to trap customers into their SaaS.

I give them about a year or two before the wheels fall off, then it's off to Broadcom and friends.

But I could be wrong, as they're now not in the 'durable execution' space at all; it's 'durable execution for AI' according to their latest conference.

Got to spend that VC dosh somewhere I suppose, they're certainly not spending it on making a good product.


Temporal employee here. I'm very surprised by your comment.

It's true that we recently had a Series D and that VC firms recognize the value of what we do. The Temporal Server software is 100% open source (MIT license: https://github.com/temporalio/temporal/blob/main/LICENSE). It's totally free and you don't even need to fill out a registration form; just download precompiled binaries from GitHub or clone the repo and build it yourself. You can self-host it anywhere you like, with no restrictions on scale or commercial usage. We offer SaaS (Temporal Cloud), which customers can choose as an alternative to self-hosting, based on their needs. The migration path is bi-directional, so not a trap by any definition.

Regarding AI, Temporal is widely used in that space, but that does not negate the thousands of other companies that use Temporal for other things (e.g., order management systems, customer onboarding, loan origination, money movement, cloud infrastructure management, and so on). In fact, our growth in the AI market came about because companies who were already using Temporal for other use cases realized that it also solved the problems they encountered in their AI projects.

And to your last point, we've made dozens of enhancements to the product (here's a small sample: https://temporal.io/blog/categories/product-news). I'd encourage you to follow the news from next week's Replay conference (https://replay.temporal.io/) because we'll be announcing many more.


Maybe. But as someone who is happily self-hosting pretty big Temporal workloads for my day job (I inherited it from early adopters circa 2022), it definitely does not feel like a dumpster fire. It chugs along unglamorously and I enjoy working on it.

MOSS-Audio is an open-source audio understanding model from MOSI.AI, the OpenMOSS team, and Shanghai Innovation Institute. It performs unified modeling over complex real-world audio, supporting speech understanding, environmental sound understanding, music understanding, audio captioning, time-aware QA, and complex reasoning.

The current release has four models: MOSS-Audio-4B-Instruct, MOSS-Audio-4B-Thinking, MOSS-Audio-8B-Instruct, and MOSS-Audio-8B-Thinking.

The Instruct variants are optimized for direct instruction following, while the Thinking variants provide stronger chain-of-thought reasoning capabilities.


Simon, I really enjoy your live coding sessions. If you do another one, would you mind showing this part as well? It would be extremely educative.

I haven't been able to do without an `.md` file - no agent (CC, Codex, OpenHands) was smart enough to figure out my layout unguided. So much so that, a few weeks ago, I had Claude write the guideline below to document the way I like to lay out my tests and modules. I make extensive use of uv workspaces and don't ship tests to production deployments:

```

- uv Workspace Architecture (`uv` v0.11.8+, `packages/` members):

  **Build tool:** Exclusively `uv_build`. Never `hatchling` or any other build backend.
  Pin as `uv_build>=0.6` in every `[build-system]` block.

  **Naming convention — flat, distinct package names (NOT a shared namespace):**
  Each workspace member uses a *flat* Python package name that is unique across the workspace.
  The `uv_build` backend auto-discovers the module by converting the project name (hyphens → underscores):
  `base-constants` → `src/base_constants/__init__.py`
  `base-domain`    → `src/base_domain/__init__.py`
  `base-geometry`  → `src/base_geometry/__init__.py`
  etc.
  No `[tool.uv.build-backend] module-name` override is needed because the project name already maps directly.

  **Why NOT a `base.*` namespace package:**
  `uv_build` cannot support PEP 420-style namespace packages across workspace members.
  It maps each project name to exactly one module root; only one member can own `base/__init__.py`.
  Attempting `module-name = "base.constants"` treats the dotted name as a nested directory,
  not a namespace — it looks for `src/base/constants/__init__.py`. Confirmed by binary string
  inspection of the `uv` binary. NEVER attempt namespace packages with this build backend.

  **Import style (locked, never change):**
  GOOD: `from base_constants import CONSTANT_A`
  BAD:  `from base.constants import CONSTANT_A`   (namespace layout - abandoned)

  **Tests member:** `package = false` in `[tool.uv]`, no `[build-system]` block at all.
  Tests are never shipped in production; the member exists solely to isolate test dependencies.

  **Microservice split story:** When a member needs to become a standalone repository,
  only the `[tool.uv.sources]` entry in the consuming `pyproject.toml` changes
  (workspace source → PyPI or VCS source). The package code itself is unchanged.
- *Future-phase features: stub, NEVER implement.* When a feature is explicitly scoped to a later phase (e.g., "Phase 4"), write a one-line stub that raises `NotImplementedError` plus a docstring describing the Phase 4 contract. A full implementation spends tokens on untested code that may never ship in its current form. Exception: if the full implementation is ≤ 5 trivial lines and directly validates the current phase's math, implement it outright.

```
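
In practice, the stub rule from that last bullet comes out looking like this (the function and docstring are hypothetical):

```python
def simplify_mesh(mesh: "Mesh") -> "Mesh":
    """Phase 4 contract: collapse coplanar faces, return a new Mesh.

    Stub only - scoped to Phase 4, so do not implement earlier.
    """
    raise NotImplementedError("Phase 4: mesh simplification")
```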

Similarly, I find it annoying that every agent uses f-strings inside logging calls. Since I added this, that hasn't been a problem:

```

- NEVER use f-strings or .format() inside logging calls. This forces the string to be interpolated immediately, even if the log level (like DEBUG) is currently disabled. You should NEVER do this, and if you notice it in existing code, FLAG IT immediately! By passing the string and the variables separately, you allow the logging library to perform lazy interpolation only when the message is actually written to the logs. Using f-strings also increases the cardinality of your Structured Logging, rendering observability useless!

  BAD:

  ```python
  # The f-string is evaluated BEFORE the logging level is checked.
  # This:
  # - wastes CPU cycles if the log level is higher than INFO
  # - increases the cardinality of Structured Logging, rendering observability useless
  log.info(f"denominator {denominator} is negative!")
  ```

  GOOD:

  ```python
  # The ONLY right way - logging module only merges the variable into the string if 
  # the INFO level is actually enabled.
  log.info("denominator %s is negative!", denominator)
  ```

  Note: Using this "Good" pattern ALSO helps with Structured Logging. Tools like Sentry or ELK can group logs by the template string ("denominator %s is negative!") rather than seeing every unique f-string as a completely different error type.
```
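
The stdlib actually keeps the template and the arguments separate on the LogRecord, which is exactly what those grouping tools latch onto. A small illustration (the filter is hypothetical, standard library only):

```python
import logging

class ShowTemplate(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        # record.msg is the raw template, record.args the variables;
        # record.getMessage() interpolates them only on demand.
        print("template:", record.msg, "| args:", record.args)
        return True

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("demo")
log.addFilter(ShowTemplate())
log.info("denominator %s is negative!", -3)
# prints: template: denominator %s is negative! | args: (-3,)
```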

You have been a tremendous influence on my professional life. Vagrant made VMs easy to use. You were very gentle with my Vagrant PRs. We disagreed a bit and I migrated some of those rejected Vagrant PRs into VeeWee. Then HashiCorp happened and I was over the moon. (Full transparency: not everything was perfect - I lost 50% of my HashiCorp equity, which hurt real bad, but that's not your fault; just saying there were ups and downs!)

This is all to say I have tremendous respect for you. Which is why I say:

You also have the resources to fix this. Not only do you have the resources and the skill, Mitchell, to make it happen - you know everything it takes to be the CEO of a billion-dollar unicorn - you have the connections, you have the vision.

More importantly, Mitchell, you care.

Make it happen. You have done it a few times before. Do it again.


> Streaming replication whether from RDS

Are you using AWS RDS Custom to receive the WAL streams, or are you using something like Pigsty? I'm really curious about the actual specifics.


> It works, I've shipped this as a "local inference"/poor person's ollama for low-end llm tasks like search

Fantastic!

> the model download is orders of magnitude greater than downloading the browser itself, and something that needs to happen before you get your first token back

Sure, but does this mean the model is lazily downloaded? That is, if I used this and a user triggered the model for the first time, would they be stuck waiting until the model finished downloading at that point?

That sounds like a horrible user experience - maybe Chrome reduces the confusion by showing a download status dialog or similar?

Also, any idea what the on-disk impact is?


The model download is lazy and cached, so it's a one-time cost presumably across all origins (I assume so since the alternative would be a trivial DoS waiting to happen).

So it's once per browser, not once per site.

You can track the download state yourself and pop whatever UI you want.


chrome://on-device-internals reports "Model Name: v3Nano Version: 2025.06.30.1229 Folder size: 4,072.13 MiB" on a random Windows machine I just checked.

Thank you, stranger! I would have assumed the size would vary based on whether your hardware supports the high-quality GPU backend (4 GB) or defaults to a smaller CPU-compatible version (3 GB), but the 22 GB note on that page is really confusing. Even if it includes the model server, where is the remaining 18 GB going?

I'd imagine the 22 GB figure was decided by modelling various scenarios. For a start, it's not just one 4 GB current model; it's 2x4 GB so the model can be updated without a window where the computer has no model at all. That's up to 8 GB already.

Then it's possible the model you get scales with the CPU/GPU/RAM available, so if you have a 12 GB GPU you probably get a better model - perhaps a 10-11 GB one? At 2x, that's 22 GB.

Then consider that a machine is not static: GPUs and other hardware come and go, VRAM allocation in integrated graphics changes, etc. You end up just needing to pick one number that won't confuse users.


(Former Chrome built-in AI team member here.)

This is part of it, and also we just didn't want to use up the last of the user's disk space! It's disrespectful to use up 3 GB if the user only has 4 GB left; it's sketchy if the user only has 10 GB. At 22 GB, we felt there was more room to breathe.

One could argue that users should have more agency and transparency into these decisions, and for power users I agree... some kind of neato model management UI in chrome://settings would have been cool. But 99% of users would never see that, so I don't think it ever got built.


> Storage: At least 22 GB of free space on the volume that contains your Chrome profile.

Yes, but that is then followed by:

  > Built-in models should be significantly smaller. The exact size may vary slightly with updates.

Lmao and here I am still staunchly treating Blazor’s 2MB runtime as a deal-breaker

Emacs had long ago exceeded eight megs!

If it doesn't fit on a floppy...!

> Storage: At least 22 GB of free space on the volume that contains your Chrome profile.

Yes, I can read and comprehend English, and you should assume I read the page. Because of the "At least" wording, I was curious what a person who has actually used the feature has noticed - i.e., learning from people who have already done it.


Doesn't sound great, but consider how much better this is than every webpage trying to load their own models.

If it turns out useful enough I'm sure browsers will just start including it as (perhaps optional?) part of installation.

