
Until they've undergone preference tuning and RL post-training (which is expected to start using more training compute than next-token-prediction pre-training).

