czhu12 30 days ago \| on: Exploiting the most prominent AI agent benchmarks

I wonder if this calls into question the mythos benchmark, which smashed basically all coding benchmarks to a staggering degree.