So this one is 3x the size but only 7% better on MMLU? Given Moore's law is mostl... | Hacker News


kristianp on April 17, 2024 | on: Mixtral 8x22B

So this one is 3x the size but only 7% better on MMLU? Given that Moore's law is mostly dead, this trend is going to make compute for next-gen AI models even more expensive.

GaggiX on April 17, 2024

That's 25% fewer errors.
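
To make the arithmetic behind this reply concrete, here is a minimal sketch of how a gain in accuracy translates into a relative reduction in errors. The function name `relative_error_reduction` and the two scores are hypothetical, chosen only for illustration; they are not the models' reported MMLU results.

```python
def relative_error_reduction(old_acc: float, new_acc: float) -> float:
    """Fraction of errors eliminated when accuracy rises from old_acc to new_acc.

    Both accuracies are fractions in [0, 1].
    """
    old_err = 1.0 - old_acc
    new_err = 1.0 - new_acc
    if old_err <= 0:
        raise ValueError("old accuracy must be below 1.0")
    return (old_err - new_err) / old_err


# Illustrative scores only (assumed, not from the thread): 72% -> 79% is a 7-point gain.
reduction = relative_error_reduction(0.72, 0.79)
print(f"{reduction:.0%} fewer errors")
```

The same 7-point gain removes a larger share of the remaining errors the closer the starting accuracy is to 100%, which is why the two framings can feel so different.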

kristianp on April 17, 2024

True, I was in too distracting an environment to do that calculation, but it still feels like it's a logarithmic return on extra compute. How long before the oceans start to boil? (Figuratively, that is.)
