Agreed, had the same experience. Codex feels lazy - I have to explicitly tell it to research existing code before it stops giving hand-wavy answers. Doc lookup is particularly bad; I even gave it access to a Context7 MCP server for documentation and it barely made a difference. The personality also feels off-putting, even after tweaking the experimental flag settings to make it friendlier.
For people suggesting it’s a skill issue: I’ve been using Claude Code for the past 6 months and I genuinely want to make Codex work - it was highly recommended by peers and friends. I’ve tried different model settings, explicitly instructed it to plan first and only execute after my approval, tested it on both Python and TypeScript backend codebases. Results are consistently underwhelming compared to Claude Code.
Claude Code just works for me out of the box. My default workflow is plan mode - a few iterations to nail the approach, then Claude one-shots the implementation after I approve. Haven’t been able to replicate anything close to that with Codex
+1 to this. Been using Codex the last few months, and this morning I asked it to plan a change. It gave me generic instructions like 'Check if you're using X' or 'Determine if logic is doing Y' - I was like WTF.
Curious - are you doing the same planning with Codex, out-of-band or otherwise? To get a comparable outcome you'd need to either put Codex in a plan state (there are experimental settings for this - not recommended) or use other means, like an explicit, detailed, reusable prompt for planning a change. If your preference is planning inside the CLI, that's a missing feature (I don't prefer it myself).
You are correct in that this mode isn't "out of the box" as it is with Claude (but I don't use it in Claude either).
My preference is to have smart models generate a plan from provided source. I wrote (with AI) a simple Python tool that filters a codebase and lets me select all files or just a subset. I attach that as context and have smart models with large context windows (usually Opus, GPT-5.2, and Gemini 3 Pro in parallel) each give me their version of a plan. I then take the best parts of each plan, slap them into a single markdown file, and have Codex execute it in a phased manner. I usually specify that the plan should be phased.
I prefer out-of-CLI planning because frankly, no matter how deeply Codex or Claude Code dig in, they always miss something unless they read every single file and config - and if they do that, they tip over. Doing it out of band with specialized tools, I can ensure they give me a high-quality plan that aligns with the code and my expectations, in a single shot (much faster).
Then Claude/Codex/Gemini implement the phased plan - either all at once - or stepwise with me testing the app at each stage.
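For illustration, the codebase-filter helper described above could look roughly like this. This is a hypothetical minimal sketch, not the author's actual tool - the skip-directory and extension lists are my own assumptions:

```python
from pathlib import Path

# Directories that are vendored/generated and rarely useful as plan context.
SKIP_DIRS = {".git", "node_modules", "__pycache__", ".venv", "dist"}
# File types worth feeding to the planning model (adjust per project).
KEEP_EXTS = {".py", ".ts", ".tsx", ".md", ".toml", ".json"}

def collect_files(root: str) -> list[Path]:
    """Walk the repo, skipping noisy directories, keeping source files."""
    files = []
    for path in sorted(Path(root).rglob("*")):
        if any(part in SKIP_DIRS for part in path.parts):
            continue
        if path.is_file() and path.suffix in KEEP_EXTS:
            files.append(path)
    return files

def bundle(files: list[Path]) -> str:
    """Concatenate the selected files into one blob to paste as model context."""
    parts = []
    for f in files:
        parts.append(f"### {f}\n{f.read_text(errors='replace')}")
    return "\n\n---\n\n".join(parts)

# Example: print(bundle(collect_files(".")))
```

From there you'd hand the same bundle to each model, collect the plans, and merge the best parts by hand.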
But yeah, it's not a skill issue on your part if you're used to Plan -> Implement within Claude Code. The experimental /collab feature does this, but it's unsupported and even more experimental than the experimental settings.
Came across an official Anthropic repo for GitHub Actions that's very relevant to what you mentioned. Your idea of scheduled doc updates using an LLM is brilliant - I'm stealing it.
https://github.com/anthropics/claude-code-action
No shortcut for busy people, unfortunately. Learn everything from the ground up, from containers to Compose to k3s, maybe on to kubeadm or a hosted offering. Kubernetes's huge abstraction layers serve their purpose well, but they can screw you up when anything goes slightly wrong at an upper layer.
For a start, ignore operators, ignore custom CSI/CNI, ignore IAM/RBAC. Once you feel good about the basics, you can expand.
Spin up a cluster with k3sup, then ask an AI how to serve a static nginx site on it using Traefik, and have it explain every step and what it does (it should produce: a ConfigMap, a Deployment, a Service, and an Ingress).
k3s provides the CSI/CNI basics (Container Storage Interface, Container Network Interface): Flannel for networking, and a local-path provisioner that just maps volumes (PVCs) to directories on disk.
Traefik is what routes your traffic from outside into your cluster (to an Ingress resource).
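A minimal sketch of those four resources, in case it helps to see them side by side. Resource names, the hostname, and the image tag are placeholders I've picked, not anything canonical; on k3s, the bundled Traefik picks up the Ingress automatically:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: static-site
data:
  index.html: "<h1>hello</h1>"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: static-site
spec:
  replicas: 1
  selector:
    matchLabels: {app: static-site}
  template:
    metadata:
      labels: {app: static-site}
    spec:
      containers:
        - name: nginx
          image: nginx:alpine
          volumeMounts:
            - {name: content, mountPath: /usr/share/nginx/html}
      volumes:
        - name: content
          configMap: {name: static-site}
---
apiVersion: v1
kind: Service
metadata:
  name: static-site
spec:
  selector: {app: static-site}
  ports: [{port: 80, targetPort: 80}]
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: static-site
spec:
  rules:
    - host: site.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service: {name: static-site, port: {number: 80}}
```

Apply it with `kubectl apply -f site.yaml` and have the AI walk you through what each resource does.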
Is there a comprehensive leaderboard like ClickBench but for vector DBs? Something that measures both the qualitative (precision/recall) and quantitative aspects (query perf at 95th/99th percentile, QPS at load, compression ratios, etc.)?
ANN-Benchmarks exists but it's algorithm-focused rather than full-stack database testing, so it doesn't capture real-world ops like concurrent writes, filtering, or resource management under load.
Would be great to see something more comprehensive and vendor-neutral emerge, especially testing things like: tail latencies under concurrent load, index build times vs quality tradeoffs, memory/disk usage, and behavior during failures/recovery
Seconding Fullmetal Alchemist. I hear the remake (Fullmetal Alchemist: Brotherhood) is usually regarded as the better version. More suggestions: Neon Genesis Evangelion, Death Note, Sousou no Frieren, Cowboy Bebop, Nichijou, Tengen Toppa Gurren Lagann, Bakemonogatari. There's also quite a few good movies, anything by Studio Ghibli is great, and so are Akira, Perfect Blue, and Ghost in the Shell.
Some of those aren't really going to appeal to people unfamiliar with the conventions of the genre and some of its big personalities - e.g. Evangelion is a deconstruction of the once-popular giant-robot genre and Hideaki Anno's personal couch trip rolled into one.
Brotherhood follows the plot of the source material comic, which is regarded as having a better ending. The original series aired concurrently with the comic and had to diverge once it caught up with the ongoing comic and ran out of chapters to adapt.
Slightly tangential but this was a learning moment for me.
This reminds me of a story where Sage Mandavya established the first juvenile law in Hindu mythology.
<story starts>
Long ago, there lived a great sage named Mandavya who had taken a vow of silence and spent his days in deep meditation. One day, while he sat motionless beneath a tree with his arms raised in penance, a group of thieves being pursued by the king’s soldiers fled into his hermitage. They hid their stolen loot near the sage and escaped through the other side.
When the king’s soldiers arrived, they found the stolen goods but the sage—deep in meditation and bound by his vow of silence—neither confirmed nor denied their presence. The soldiers arrested him and brought him before the king, accusing him of harboring criminals.
Despite his spiritual stature, the king ordered a severe punishment: Mandavya was to be impaled on a stake (shula)—a horrific execution where a wooden spike was driven through the body. However, due to his immense yogic powers and detachment from the physical world, the sage did not die. He remained alive on the stake, enduring the agony with superhuman patience.
Eventually, other sages intervened, the king realized his grave error, and Mandavya was freed. But the damage was done. When the sage finally left his mortal body, he went directly to Yamaloka—the realm of Yama, the god of death and justice—to demand an explanation.
“Why did I have to suffer such a gruesome fate?” Sage Mandavya asked Lord Yama. “What terrible sin did I commit to deserve impalement?”
Yama consulted his records and replied, “When you were a child, you caught a dragonfly and pierced it with a needle through its body, watching it suffer for your amusement. That act of cruelty resulted in your punishment - you experienced the same suffering you inflicted on that innocent creature.”
Sage Mandavya was furious. “That was when I was a child!” he protested. “I was too young to understand the difference between right and wrong, between sin and virtue. How can you punish an ignorant child with the same severity as a knowing adult?”
Yama tried to explain that karma operates impartially, but Mandavya would not accept this. In his righteous anger, the sage cursed Yama himself: “For this unjust judgment, you shall be born as a human on Earth and experience mortality yourself!”
This curse led to Yama being born as Vidura, the wise and virtuous counselor in the Mahabharata - a human who, despite his wisdom and righteousness, had to endure the limitations and sufferings of mortal life.
But Mandavya didn’t stop there. Using his spiritual authority, he proclaimed a new divine law: “No sin committed by a child below the age of fourteen shall count toward their karmic debt equivalent to that of an adult. Children who do not yet understand dharma and adharma shall not be punished for their ignorant actions.”
This became the first “juvenile law” in Hindu mythology—a recognition that children, in their innocence and ignorance, deserve compassion and correction rather than severe punishment.
<story ends>
When I was a child, I too wanted to catch a dragonfly and tie a thread to it so it would fly around like a little pet. But my mother stopped me. She told me this very story of Sage Mandavya, and it scarred me for life. I never forgot it, and I never tried to catch and bind a dragonfly again.
1. If it were possible for an ordinary mortal to impose arbitrary curses on the god of death and justice, the world would quickly descend into utter chaos.
2. If children are completely free from accountability, adults will form them into an army and convince them to commit crimes on their behalf, leading to an intolerable situation. This may already be a standard way of doing business in some parts of the world.
> If children are completely free from accountability, adults will form them into an army and convince them to commit crimes on their behalf, leading to an intolerable situation. This may already be a standard way of doing business in some parts of the world.
This is an ongoing problem in Norway now and I think it has been in Sweden for some time.
If you want to read more, search for the foxtrot network.
> 1. If it were possible for an ordinary mortal to impose arbitrary curses on the god of death and justice, the world would quickly descend into utter chaos.
Mandavya is not just any mortal; he is an enlightened sage. In Hinduism, enlightened beings are considered superior to gods. There's another story about Sage Markandeya (one of the nine immortals, the Chiranjeevis) who caused the death of Yama, the God of Death. In Hindu cosmology, all the gods hold honorary responsibilities, and nothing is permanent - not even the position of Brahma, the Creator.
> 2. If children are completely free from accountability, adults will form them into an army and convince them to commit crimes on their behalf, leading to an intolerable situation. This may already be a standard way of doing business in some parts of the world
I believe he introduced a juvenile law, which involves reduced sentences or milder punishments rather than granting complete immunity from consequences.
> 1. If it were possible for an ordinary mortal to impose arbitrary curses on the god of death and justice, the world would quickly descend into utter chaos.
Opportunity myth? Mortals are simply temporarily embarrassed gods?
Interesting - this seems to target a different layer than services like Tinker (https://thinkingmachines.ai/blog/announcing-tinker/). Monarch provides the infrastructure primitives while Tinker is a managed finetuning service. Could someone build something like Tinker on top of Monarch?
Nice, so the open source equivalent now exists. Meta basically commoditized Tinker's ($12B valuation) value prop by giving away the infra (Monarch) and the RL framework (TorchForge). Will be interesting to see how a managed service competes with free + open source at this layer.
Can someone help me understand Zed's pricing? $10 per month includes $5 of AI credits. Can those credits be used for Claude Code / Codex inside Zed, or do I need to manage separate API keys for Codex/Claude Code?
There are 2 modes of operation - an editor AI mode and a dedicated agent mode. For agent mode with Claude Code or Codex, you don't have to pay Zed, only the CLI tool providers. The Zed subscription is for people who don't want to deal with AI vendors, CLI tools, etc., and just want to use AI in the editor.