The article talks about OS capabilities in the second part when it discusses Moj...

tome · 2026-05-08T13:25:15 1778246715

> > Haskell already passes a type object as an argument to anything which does IO. They don't do it for security. Turns out having pure functions separated from non-pure functions is a beautiful thing.

> But almost nobody uses Haskell

Sad, but true

> partly because of poor ergonomics like this!

I'm somewhat dubious that's the reason, partly because I find such ergonomics excellent! Especially those provided by my capability system Bluefin: https://hackage.haskell.org/package/bluefin

josephg · 2026-05-08T22:28:05 1778279285

> We're talking about critical bugs in the filesystem so what the FS processes idea of a file handle is doesn't really matter.

The copyfail bug wasn’t a bug in the filesystem code. It was a bug in the crypto algorithm code, which wrote to the filesystem page table without checking if the process invoking it had permission to write to the passed file handle. In a monolithic kernel like Linux, every subsystem can access the memory of every other subsystem by default. It’s up to each subsystem to be careful. As we keep discovering, “be really careful” is not a successful security strategy.

A capability based OS like seL4 is more secure. With seL4, you would put the crypto algorithms and filesystem in separate user space processes. These processes would only communicate by RPC, by invoking capabilities. We can imagine how the copyfail scenario would play out: A user process has a capability representing its (read only) access to some privileged file on disk. It passes that capability to the crypto algorithm process. A bug in, or even complete takeover of, the crypto algorithm process still doesn’t change that the file cap is read only. The crypto algorithm process doesn’t have direct access to the memory representing that file. It only has the read only file handle. All it can do with that handle is invoke it, which will only give it read access. Even with a bug in the crypto algorithm process, the OS would stay secure.
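A toy model of that scenario, as a minimal sketch in Python. Every name here (`FileServer`, `ReadCap`, `WriteCap`, `buggy_crypto_service`) is hypothetical and none of it is the seL4 API. Python also does not enforce the boundary (closures can be introspected), so this only illustrates the shape of the argument; in a real capability OS the isolation comes from separate address spaces.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class ReadCap:
    """Read-only capability: the only thing a holder can do is invoke read()."""
    read: Callable[[], bytes]


@dataclass(frozen=True)
class WriteCap:
    """Read-write capability, issued only to callers the file server trusts."""
    read: Callable[[], bytes]
    write: Callable[[int, bytes], None]


class FileServer:
    """Stand-in for the filesystem process; it alone owns the file buffers."""

    def __init__(self) -> None:
        self._files = {}

    def create(self, name: str, data: bytes) -> None:
        self._files[name] = bytearray(data)

    def read_cap(self, name: str) -> ReadCap:
        buf = self._files[name]
        return ReadCap(read=lambda: bytes(buf))

    def write_cap(self, name: str) -> WriteCap:
        buf = self._files[name]

        def write(offset: int, data: bytes) -> None:
            buf[offset:offset + len(data)] = data

        return WriteCap(read=lambda: bytes(buf), write=write)


def buggy_crypto_service(cap):
    """Stand-in for the crypto process. The 'bug' tries to write through
    whatever handle it was given, without checking permissions."""
    data = cap.read()
    try:
        cap.write(0, b"pwned")
        wrote = True
    except AttributeError:
        # A read-only capability has no write operation to invoke.
        wrote = False
    return hashlib.sha256(data).hexdigest(), wrote


server = FileServer()
server.create("shadow", b"root:secret")
digest, wrote = buggy_crypto_service(server.read_cap("shadow"))
```

With a `ReadCap` the buggy service has nothing to invoke that could modify the file; the same bug handed a `WriteCap` would succeed, which is why the file server, not the crypto service, decides which capability to issue.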

Yes, capability OSes aren't a magic bullet. A bug in the filesystem process could still result in filesystem corruption. But better is better. OS capabilities provide defence in depth. They would have prevented copyfail.

As far as I can tell, your argument against capabilities is that they might be slow. Some implementations have poor ergonomics. They don’t magically solve every possible security bug. You also, personally, used a bad implementation of capabilities this one time years ago in Java. Is that accurate?

You must see how unconvincing I find your argument. What are you even trying to do? Convince people to not explore different ideas in computer science? When I close my eyes I see an old man yelling: “Hey you kids! What are you doing up there, trying new things? You stop that right now!”

mike_hearn · 2026-05-09T11:27:09 1778326029

I don't recall making a performance argument against capabilities, but I think we're conflating microkernels and capability based languages. You can have capabilities without microkernels and that's often what people mean when they talk about passing caps into main(). Context switching at the hardware level does have a performance cost, so if you want to use lots of capabilities without a special programming language then you're going to pay for that yes.
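For readers unfamiliar with "passing caps into main()", here is a rough sketch in Python. `DirCap`, `plugin` and `main` are made-up names for illustration. The important caveat is that Python still has ambient authority (any code can call `open()` directly), so this is a convention rather than an enforced boundary; enforcing it is exactly the job a capability based language or a process boundary would have to do.

```python
from pathlib import Path
import tempfile


class DirCap:
    """Hypothetical directory capability: grants access beneath one root only."""

    def __init__(self, root: Path) -> None:
        self._root = root.resolve()

    def _resolve(self, rel: str) -> Path:
        target = (self._root / rel).resolve()
        # Path.is_relative_to requires Python 3.9+.
        if not target.is_relative_to(self._root):
            raise PermissionError(f"{rel!r} escapes this capability")
        return target

    def read_text(self, rel: str) -> str:
        return self._resolve(rel).read_text()

    def write_text(self, rel: str, text: str) -> None:
        self._resolve(rel).write_text(text)

    def subdir(self, rel: str) -> "DirCap":
        """Attenuate: hand out a narrower capability rooted at a subdirectory."""
        return DirCap(self._resolve(rel))


def plugin(logs: DirCap) -> None:
    # The plugin only ever sees the logs directory.
    logs.write_text("run.log", "started\n")


def main(home: DirCap) -> None:
    # main() receives its authority as an argument and narrows it
    # before passing it on to less trusted code.
    plugin(home.subdir("logs"))


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "logs").mkdir()
    home = DirCap(Path(tmp))
    main(home)
    logged = home.read_text("logs/run.log")
    try:
        home.subdir("logs").read_text("../secret.txt")
        escaped = True
    except PermissionError:
        escaped = False
```

No context switch is involved here, which is the cheap part; the cost shows up when you want the same guarantee against code that does not play along.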

I don't think there are any good mainstream capability based programming languages. At least I've never seen one. Actually, the SecurityManager is, I think, the best implementation that has existed. I've not yet seen a credible proposal that's better. Stuff like Mojo and seL4 is at least deployed to production, but those aren't programming languages.

> What are you even trying to do? Convince people to not explore different ideas in computer science?

No. Please go read the opening of the article again, which says: "In this essay I want to show you the challenges that you’ll face if you want to walk that path. This isn’t meant to put anyone off, just to draw a map of the territory you’re about to enter and explain why it’s currently deserted."

Lots of people have proposed capabilities as some silver bullet over the years, yet real systems hardly use them. Anyone who is serious about their own ideas should want to understand why that is, and that's the goal of the article. It doesn't say nobody can do better! The whole point of writing it is the hope that someone will. But to do better you have to understand why existing systems failed. It wasn't (primarily) about performance.

jason_oster · 2026-05-09T06:46:53 1778309213

> If you can confuse or buffer overflow the FS process by sending it messages, you can then edit state inside that process you weren't supposed to be able to access, and as that process controls the security system for everything it's game over.

The assumption here is that the FS is the root of trust for the kernel. (A claim I consider dubious, but what do I know about knowing things?) It's another way to say that if you don't harden your root of trust, you're SOL. Which, ok, fair enough. But that's frankly irrelevant because hardening the root of trust is table stakes. The system cannot be secured without it, regardless of the threat model.

All of the concerns about a definition of "getting hacked" fall out of ignoring the hardening of the root of trust. I don't wish to put words in your mouth, but my interpretation of the argument is essentially, "we can't have nice things because the root of trust cannot be hardened sufficiently to prevent all intrusions."

Iff the FS is the root of trust, and it is not possible to confuse the FS by sending it messages, then there is no game over. You have a root of trust that cannot be broken.

> Microkernels have no way to stop this, which is one reason very few operating systems move the core FS out into a separate process.

My reading of the history reaches a very different conclusion. First, the primary reason that very few operating systems in practice use a microkernel design is that Linus Torvalds believed it was too slow for early '90s hardware [1]. And everyone else just does whatever Linux is doing.

Second, security through surface area reduction (and more broadly, defense-in-depth) was always the point of the microkernel design [2]. Trivially, the principle of least privilege is how one arrives at a secure system. Monolithic kernels, to this very day, continue to prove that they cannot be secured in any practical manner. I can only assume we need things to get worse before kernel developers will tighten up and take security seriously.

> So you might as well just run it in-kernel and reap the performance benefits.

There's that same mentality. Apparently "speed at all costs" is the willful trading of security for performance. That position is just as flawed as trading essential liberty for temporary safety [3]. It doesn't matter how fast the thing is when the slightest bump always causes it to explode, killing everyone on board.

[1]: https://web.archive.org/web/20040210002251/http://people.flu...

[2]: https://www.cosy.sbg.ac.at/~clausen/PVSE2006/linus-rebuttal....

[3]: https://old.reddit.com/r/todayilearned/comments/k0c8o6/til_b...

mike_hearn · 2026-05-09T11:21:46 1778325706

Ah, I'm not saying we can't have nice things or build more secure software. I think we can build more secure software! But the argument I'm responding to is one that I've seen many times over the years on HN and elsewhere, which is some form of "capability based programming languages fix everything". It's always posited as obvious and easy, as if merely saying "capability based language" is the only explanation required and somehow the entire software industry just missed the memo. Microkernels often come along for the ride, but not always.

You're completely right that the root of trust has to be secured. I argue that the core filesystem is indeed a part of the ROT, which is why e.g. Apple has put so much effort into making it immutable and fully tied to a cryptographic root hash that's checked by the secure boot process. Moving the FS out of the core kernel wouldn't change much though - if you have a bug in your FS code at runtime then you're just SOL even if everything is arranged in a Merkle tree.
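To make the Merkle tree point concrete, a minimal sketch in Python of a root hash over file blocks. This is not Apple's actual scheme; the block contents, the `merkle_root` helper, the leaf/node prefixes and the duplicate-last-node rule are all assumptions chosen for illustration.

```python
import hashlib


def merkle_root(blocks: list[bytes]) -> bytes:
    """Compute a SHA-256 Merkle root over a non-empty list of data blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    # Distinct prefixes keep leaf hashes and interior hashes in separate domains.
    level = [hashlib.sha256(b"\x00" + block).digest() for block in blocks]
    while len(level) > 1:
        if len(level) % 2:
            # Odd number of nodes: pair the last one with itself.
            level.append(level[-1])
        level = [
            hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


blocks = [b"kernel", b"libsystem", b"launchd"]
trusted_root = merkle_root(blocks)

tampered = [b"kernel", b"libsystem", b"launchd-evil"]
tamper_detected = merkle_root(tampered) != trusted_root
```

The check catches any change to the data at rest, because a modified block changes the root. It says nothing about a bug in the code that parses those blocks at runtime, which is the distinction being drawn above.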

The argument being made by josephg in the sibling comment is that in seL4 or similar the page cache would be separated from the crypto code. And maybe he's right, but the better way to get the same outcome is to not have IPsec in the kernel rather than not have the core FS, as the latter is a ROT and IPsec isn't.

I disagree that the question of what "getting hacked" means is a reformulation of trust roots. A threat model isn't the same thing as a root of trust. The argument over what appears to be minor semantics is important because it scopes your goals and effort. One of the most common failure modes I've seen in security projects is not defining a threat model up front, often leading to an automatic fallback to "the threat model contains everything" followed by despondency and failure when it turns out to be impossible.

I don't think Apple or Microsoft care much about Linus' opinions tbh. Both NeXT/macOS and Windows NT started out as microkernel designs and both have oscillated back and forth over the years. The original concept was indeed far too slow and a lot of functionality went back to monolithic. Then over time some functionality got lifted back out, e.g. the GUI subsystem on Windows. Core FS remains in-kernel in any OS though, as the cost/benefit ratio of moving it is so poor.

jason_oster · 2026-05-11T05:26:30 1778477190

> "capability based programming languages fix everything"

There is some truth to this idea, though. Setting aside the unsafe boundary (FFI, direct MMIO access, etc.), a capability system in a programming language would solve some kinds of these problems. Not all; it doesn't solve logic bugs when a capability is in scope.

> It's always posited as obvious and easy

I do believe it's probably pretty obviously true, by now. But not at all easy.

> Moving the FS out of the core kernel wouldn't change much though - if you have a bug in your FS code at runtime then you're just SOL even if everything is arranged in a Merkle tree.

Perhaps, but that's only because traditional file systems are global state. A capability system turns that notion on its head specifically because global state is really the problem. The combination of capabilities and user mode file access would be quite a strong isolation boundary. The bug(s) would have to be "trivially flawed" in a way that these subtle exploits are not.

> A threat model isn't the same thing as a root of trust.

Ah, I didn't say that. I said (roughly) that security relies on a strong root of trust for every threat model. I think the distinction is important. They are not the same, but the threat model can be completely ignored (because it doesn't matter) until the root of trust is secured. In other words, a weak root of trust fails all threat models.

> I don't think Apple or Microsoft care much about Linus' opinions tbh.

True. macOS and NT are (or were?) "microkernel-ish" the last time I was in those weeds. No idea how they've evolved since.

You've made some good points, as well. I see where you are coming from.

mike_hearn · 2026-05-11T08:10:00 1778487000

We agree that a properly sandboxable capability-capable (ugh, lol) language would indeed be a really good security upgrade. I was sad when the SecurityManager died for that reason, even though the reasoning was very understandable.

But those claims have also got to be moderated. As no such thing has ever existed, we can't truly know how well it'd work in practice. Only experience can tell us that.

Global state is one of the key issues. Joe-E simply banned it, which is far too harsh and breaks almost everything. Mobile operating systems locked down filesystem access behind permissions and capabilities quite dramatically and were much more secure, but that came with a lot of 'vigorous' debate over owner control and power for productivity/pro-grade applications. macOS has taken an incremental approach and sandboxes off parts of the FS from apps whilst retaining what looks on the surface like a classical global shared state $HOME and / directory (although it's not).

macOS, iOS, Android and Windows have all been steadily moving code out of the kernel over the years. Apple doesn't run the core FS in a userspace process but every other FS that's not as performance sensitive is now a userspace daemon, for instance. They developed their own FUSE equivalent to do this. In Windows a lot moved out in Vista. Graphics, audio, printing, a lot of drivers are out of kernel now.

Linux has lagged behind quite badly in this respect partly because a microkernel design requires close cooperation between userspace and kernel space but the Linux design philosophy is that the kernel is a self-contained artifact.