It seems like we're hitting a solid plateau of LLM performance with only slight ...

aoeusnth1 · 2026-04-16T15:48:32 1776354512

SWE-bench Pro is ~20% higher than the previous .1 generation, which was released 2 months ago. On their SWE benchmark, token consumption at iso-performance is down 2x from the model they released 2 months ago.

If this is a plateau I struggle to imagine what you consider fast progress.

abstracthinking · 2026-04-16T15:50:58 1776354658

Your comment doesn't make any sense. Opus 4.6 was released two months ago; what jump would you expect?

lta · 2026-04-16T15:13:20 1776352400

Every night praying for tomorrow

NickNaraghi · 2026-04-16T15:54:32 1776354872

The generations are two months apart now though…