Nice - I do something similar in a semi-manual way.
I do find Codex very good at reviewing work marked as completed by Claude, especially when I get Claude to write up its work with a why, where & how doc.
It’s very rare that Claude has fully completed the task successfully and Codex doesn’t find issues.
The feedback loop is faster. But PR reviews are still useful as they are multiplayer (meaning that you and another human reviewer can talk about a specific agent's comment directly on the diff, which is very useful sometimes).
I find both to be true. I use Claude for most of the implementation, and Codex always catches mistakes. Always. But both of them benefit from being asked if they’re sure they did everything.
I’ve found Claude in particular to be very good at this sort of thing. As for whether it’s a good thing, I’d say it’s a net positive - your own reporting of this probably saved a bigger issue!
We wrote up the why/what happened on our blog twice… the second based on the LiteLLM issue:
Author here. The point of this post is not “LiteLLM was compromised” since that was already covered on HN, but the chain behind it.
We tried to connect the February 27, 2026 Trivy CI compromise to the later Trivy release/tag issues, the trivy-action poisoning, the npm/Checkmarx follow-on activity, and finally the LiteLLM 1.82.7/1.82.8 package on March 24, 2026.
What made it look like one campaign to us was the repeated overlap in operator attribution, payload structure, and artifacts like tpcp.tar.gz, plus the LiteLLM maintainer saying it appears to have come from Trivy in their CI/CD pipeline.
If anyone spots gaps or overreach in the timeline, I’d be interested in corrections.
An autonomous AI agent exploited a CI misconfiguration in Trivy (32k+ stars, 100M+ annual downloads), stole publishing tokens, deleted all 178 releases, and published a weaponized VS Code extension - in 44 minutes.
The extension's payload targeted five AI coding agents (Claude Code, Codex, Cursor, Windsurf, Copilot) with tool-specific flags to bypass their permission systems. First documented case of an AI agent attacking a supply chain and then using the compromised artifact to target other AI agents. CVE-2026-28353, CVSS 10.0.
That is the biggest threat - and likely where things will end up eventually. The open questions are when that “eventually” arrives and what the server-based providers can pivot to in that time.