
So this one is 3x the size but only 7% better on MMLU? Given that Moore's law is mostly dead, this trend is going to make compute for next-gen AI models even more expensive.


That's 25% fewer errors.
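For anyone who wants to check the arithmetic, here is a rough sketch. It assumes the "7%" means 7 percentage points of accuracy, which the thread does not actually state. Under that assumption, a 25% drop in errors implies a baseline of about 72% (28% wrong going to 21% wrong). The 72% and 79% figures are back-derived for illustration, not reported scores, and the function names are made up.

```python
def relative_error_reduction(acc_old: float, acc_new: float) -> float:
    """Fraction of the old errors that the new model no longer makes."""
    err_old = 1.0 - acc_old
    err_new = 1.0 - acc_new
    return (err_old - err_new) / err_old


def implied_baseline(gain: float, reduction: float) -> float:
    """Baseline accuracy at which an absolute `gain` equals a relative error `reduction`."""
    return 1.0 - gain / reduction


# Hypothetical numbers: 7 points of gain assumed to be absolute, not relative.
baseline = implied_baseline(gain=0.07, reduction=0.25)
reduction = relative_error_reduction(baseline, baseline + 0.07)
print(f"implied baseline: {baseline:.0%}, error reduction: {reduction:.0%}")
```

The same 7 points would be a smaller relative improvement from a lower baseline and a larger one from a higher baseline, which is why the two framings can sound so different.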


True, I was in too distracting an environment to do that calculation, but it still feels like it's a logarithmic return on extra compute. How long before the oceans start to boil? (Figuratively, that is.)



