Hacker News
What happens when you run evals on brainrot? (scorecard.io)
2 points by yash1hi 6 days ago | past | discuss
Annotation is all you need (scorecard.io)
1 point by yash1hi 61 days ago | past
Zero-Code Tracing Setup for Claude Agent SDK (scorecard.io)
1 point by gk1 3 months ago | past
You can't QA your way to the frontier (scorecard.io)
1 point by gk1 4 months ago | past
Show HN: Scorecard – Evaluate LLMs like Waymo simulates cars (scorecard.io)
7 points by Rutledge 8 months ago | past
Agenteval.org: An Open-Source Benchmarking Initiative for AI Agent Evaluation (scorecard.io)
6 points by Rutledge on Feb 27, 2025 | past | 1 comment
