we've run into this problem. when you're running 4-5 Codex/Claude Code sessions in parallel across worktrees, the port collisions suck. will have to check this out
We’ve been building in this space for a while, and the issues listed here are exactly the hard parts: session connectivity, reconnection logic, multi-session UX, and keeping state in sync across devices, especially for long-running tasks and the edge cases that show up in real use.
Good question. We don't have E2EE yet (it's on the roadmap), so some level of trust in Omnara is required today. All repo operations happen locally on your machine. For messages/chat history: we store those encrypted at rest because we need access to sync across devices, send notifications, and resume agents. Cloud sandboxing is opt-in and would require syncing codebase state.
Totally get it, we're trying to minimize subscriptions too. Free tier gives you 10 sessions/month with no length limits, so you can actually get a decent amount done before deciding if it's worth paying
Thanks for the shout! Happy looks solid - always great to see more options here. Anecdotally from users who've tried both, we've heard Omnara has better reliability and latency. We also layer on some features like web support, worktrees, sandboxing, richer git management (diffs, checkpoints), and preview URLs. Would love to hear what you think if you give it a spin :)
Fair concern. We don't have true E2EE yet because our service needs access to message content for cross-device sync, notifications, and agent execution. Everything is encrypted in transit and at rest, and all repo operations happen locally on your machine.
We've heard this from other users and it's on our roadmap. The challenge is we're building features like voice coding agents and hosted sandboxes that require plaintext inputs, so we'd need two execution models. Doable, but adds complexity for our team size.
That said, it's something we're prioritizing as we grow. No promises on timing, but it's coming.
> We don't have true E2EE yet because our service needs access to message content
That means you don't have E2EE, period. Implying that there is such a thing as "true E2EE" (as opposed to "E2EE") either indicates that you don't know what E2EE means, or that you're scammily trying to do what Apple does with iMessage and say that something that isn't E2EE is, for marketing purposes.
E2EE means that nobody except the endpoints has keys. There is no such thing as "true E2EE" any more than there is such a thing as "true pregnancy".
Yep, you're right, we don't have E2EE period (and we don't claim to have it anywhere), for the reasons I mention above (our cloud sandbox agent and voice agent need plaintext messages, so we'd need access to the keys, which defeats the purpose of E2EE). Apologies for the incorrect wording!
What do you mean by syncing? Happy coder syncs sessions between all my happy coder clients. I can even see in real time how happy coder in my browser's conversations progress as well as on my phone, in parallel.
Omnara also displays realtime conversations between all Omnara clients. What I mean by syncing is syncing your conversation and code changes to a cloud sandbox, which is useful if you're using Omnara on your laptop and you close your computer (as explained in the original post). If you run your agents on a persistent cloud VM, then this is less of a value add.
I can voice chat with Happy coder.
We use https://docs.livekit.io/agents/ which runs the voice agent in the cloud (to enable the above use case, and a better experience on your phone even when the screen is off), whereas I believe Happy runs a client-side voice agent.
Thanks for answering my questions! I see that Happy Coder is not far from Omnara. I hope Omnara is not too far from E2E encryption. The lack of E2E encryption was why I didn't choose Omnara.
Also, I run Happy Coder in a sandbox of mine on my computer.
Happy is abandonware, unfortunately. It's a great app and the dev could capitalize on it a lot, but for some reason he hasn't been seen or heard from in months, since the last release.
There are attempts to create a fork maintained by other developers, but they have yet to launch.
that's a great stab :) We dig into this in the post, but the key distinction we landed on is that the trigger can be asynchronous without the agent itself being async. A cron job, webhook, or autonomous trigger is really about scheduling, not a property of the agent’s execution model.
In other words: triggering without a human ≠ async by itself. What matters is whether the caller blocks on the agent’s work, as opposed to how or when it was kicked off.
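To make that concrete, here's a minimal sketch using only the Python standard library (the function names are illustrative, not anything from the post): the trigger and the execution model vary independently, so a scheduler-fired job can still be synchronous, and an interactively-started job can still be async.

```python
import threading, queue, time

def agent_work() -> str:
    time.sleep(0.01)          # stand-in for a long-running agent task
    return "done"

# Async trigger, sync agent: a cron-style scheduler fires the job,
# but the caller still blocks until the agent finishes.
def cron_fire_sync() -> str:
    return agent_work()       # caller blocks here

# Sync trigger, async agent: a human kicks it off interactively, but
# the result lands on a queue; the caller is free in the meantime.
def human_fire_async(results: queue.Queue) -> threading.Thread:
    t = threading.Thread(target=lambda: results.put(agent_work()))
    t.start()
    return t                  # caller returns immediately

results: queue.Queue = queue.Queue()
t = human_fire_async(results)
# ... caller does other work here ...
t.join()
print(cron_fire_sync(), results.get())  # prints: done done
```

The point of the sketch is that "who fired it" appears nowhere in the blocking behavior; only the second function frees the caller.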
hey, ishaan here (kartik's cofounder). this post came out of a lot of back-and-forth between us trying to pin down what people actually mean when they say "async agents."
the analogy that clicked for me was a turn-based telephone call—only one person can talk at a time. you ask, it answers, you wait. even if the task runs for an hour, you're waiting for your turn.
we kept circling until we started drawing parallels to what async actually means in programming. using that as the reference point made everything clearer: it's not about how long something runs or where it runs. it's about whether the caller blocks on it.
that's the user-facing definition but the implementation distinction matters more.
"takes longer than you're willing to wait" describes the UX, not the architecture. the engineering question is: does the system actually free up the caller's compute/context to do other work, or is it just hiding a spinner?
most agent frameworks i've worked with are the latter - the orchestrator is still holding the full conversation context in memory, burning tokens on keep-alive, and can't actually multiplex. real async means the agent's state gets serialized, the caller reclaims its resources, and resumption happens via event - same as the difference between setTimeout with a polling loop vs. actual async/await with an event loop.
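a minimal sketch of that difference in asyncio terms (names and the JSON state blob are illustrative, not any framework's real API): the anti-pattern pins the caller in a poll loop, while the event-driven version serializes the agent's state and lets the caller's loop do other work until resumption.

```python
import asyncio, json, time

# Anti-pattern, shown only for contrast: the caller stays resident,
# burning cycles until the flag flips ("hiding a spinner").
def blocking_poll(check_done, interval=0.01):
    while not check_done():        # caller's context is pinned the whole time
        time.sleep(interval)

# Event-driven: the agent serializes its state to a store, the caller's
# loop is freed at each await point, and completion is observed via the
# task handle rather than a poll loop.
async def run_agent(task_id: str, state_store: dict):
    state_store[task_id] = json.dumps({"task": task_id, "status": "running"})
    await asyncio.sleep(0)                      # caller's loop is free here
    state_store[task_id] = json.dumps({"task": task_id, "status": "done"})

async def main() -> str:
    store: dict[str, str] = {}
    handle = asyncio.create_task(run_agent("t1", store))  # kick off, don't block
    # ... the caller can schedule other work here ...
    await handle                                # resume when the task completes
    return json.loads(store["t1"])["status"]

print(asyncio.run(main()))  # prints: done
```

The serialized state in the store is what makes real resumption possible: nothing about the in-flight conversation has to live in the orchestrator's memory between events.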
IMO this feels sorta like Simon Willison's definition of agents: "LLMs in a loop with a goal" feels super obvious in hindsight, but I'm not sure I would have described it that way myself.
One nuance that helps: “async” in the turn-based-telephone sense (you ask, it answers, you wait) is only one way agents can run.
Another is many turns inside a single LLM call — multiple agents (or voices) iterating and communicating dozens or hundreds of times in one epoch, with no API round-trips between them.
That’s “speed of light” vs “carrier pigeon”: no serialization across the boundary until you’re done. We wrote this up here: Speed of Light – MOOLLM (the README has the carrier-pigeon analogy and a 33-turn-in-one-call example).
Speed of Light vs Carrier Pigeon: the fundamental architectural divide in AI agent systems.

The core insight: there are two ways to coordinate multiple AI agents.

Carrier Pigeon:
- where agents interact: between LLM calls
- latency: 500 ms+ per hop
- precision: degrades each hop
- cost: high (re-tokenize everything)

Speed of Light:
- where agents interact: during one LLM call
- latency: instant
- precision: perfect
- cost: low (one call)
MCP = Carrier Pigeon. Each tool call: stop generation → wait for the external response → start a new completion. N tool calls ⇒ N round-trips.
MOOLLM Skills and agents can run at the Speed of Light. Once loaded into context, skills iterate, recurse, compose, and simulate multiple agents — all within a single generation. No stopping. No serialization.
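A toy Python model of the round-trip accounting (this is a sketch of the cost structure only; `call_llm` is a hypothetical stand-in for a real completion API, not MOOLLM's or MCP's actual interface):

```python
# Counter tracks how many completion-API round-trips each style incurs.
calls = {"n": 0}

def call_llm(prompt: str) -> str:
    calls["n"] += 1              # one network round-trip per invocation
    return f"response to: {prompt}"

def carrier_pigeon(turns: int) -> int:
    """Each agent turn is a separate completion: N turns => N round-trips."""
    msg = "start"
    for _ in range(turns):
        msg = call_llm(msg)      # stop, wait, start a new completion
    return calls["n"]

def speed_of_light(turns: int) -> int:
    """All turns are simulated inside one generation: 1 round-trip total."""
    call_llm(f"simulate {turns} agent turns internally, then answer")
    return 1

calls["n"] = 0
print(carrier_pigeon(33))   # prints: 33
calls["n"] = 0
print(speed_of_light(33))   # prints: 1
```

The 33-turn example mirrors the one the README mentions: the carrier-pigeon path pays serialization and re-tokenization 33 times, the single-call path pays it once.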
Maybe, but that's what I thought while reading the "what actually is async?" part of the post, so I don't think I got biased towards the answer by that point.
i just imagine it as the swap between "human watching agent while it runs"
vs "agent runs for a long time, tells the user over human interfaces when it's done", e.g. sends a slack message, or something like gemini deep research.
an extension would be that they are triggered by events and complete autonomously, with human interfaces only when they get stuck.
there's a bit of a quality difference rather than a strictly functional one, in that the agent mostly doesn't need human interaction beyond a starting prompt and a notification of completion or stuckness. even if i'm not blocking on a result, it can't immediately need babying or i can't actually leave it alone
we’ve been trying hard to get the Android version out, Google’s been giving us a tough time before approving it for the Play Store. I can send you an internal app link if you’d like; just share your email (I’m at [email protected])