Hacker News
Codex Daily Benchmarks for Degradation Tracking (Marginlab.ai) (marginlab.ai)
1 point by wendgeabos 7 days ago
Claude Code daily benchmarks for degradation tracking (marginlab.ai)
759 points by qwesr123 7 days ago | 354 comments
No one is evaluating AI coding agents in the way they are used (marginlab.ai)
1 point by qwesr123 23 days ago
Claude Code Daily Degradation Tracker (marginlab.ai)
3 points by qwesr123 27 days ago | 3 comments
Anatomy of a Coding Agent: A step-by-step illustration (marginlab.ai)
3 points by qwesr123 45 days ago
How are coding assistants evaluated? SWE-Bench Pro Explorer (marginlab.ai)
2 points by qwesr123 47 days ago
SWE-Bench: The $500B Benchmark (marginlab.ai)
5 points by qwesr123 49 days ago