And who will pay for all the expensive AI hardware? We're entering the crazy phase of hundred-billion-dollar data centers.
Just because R1 was trained cheaply doesn't mean this architecture can't be trained in a very expensive data center to get much bigger and better models.
R1 stands out not just because of efficient training, but because it generated its own training data. It works similarly to AlphaGo: it attempts to solve problems and has a way to check when a result is correct. The trick is to let it run longer, producing better training data. I bet those data centers will spend more time on problem solving than on training.
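Roughly, that loop looks like rejection sampling: generate many attempts, keep only the ones a verifier accepts, and feed those back as training data. Here's a minimal toy sketch; `model_solve` and `check_answer` are hypothetical stand-ins (in the real pipeline the model emits a reasoned answer and a rule-based checker, e.g. exact match on a math result, scores it):

```python
import random

def model_solve(problem, rng):
    # Placeholder "model": guesses an answer near the true sum.
    a, b = problem
    return a + b + rng.choice([0, 0, 1, -1])

def check_answer(problem, answer):
    # Verifier with known ground truth -- cheap to run, unlike training.
    a, b = problem
    return answer == a + b

def collect_training_data(problems, attempts_per_problem=8, seed=0):
    """Keep only (problem, answer) pairs the verifier accepts."""
    rng = random.Random(seed)
    dataset = []
    for problem in problems:
        for _ in range(attempts_per_problem):
            answer = model_solve(problem, rng)
            if check_answer(problem, answer):
                dataset.append((problem, answer))
                break  # one verified solution per problem is enough
    return dataset

data = collect_training_data([(2, 3), (10, 7), (1, 1)])
```

The "let it run longer" point is the `attempts_per_problem` knob: more compute per problem means more problems end up with a verified solution, so the dataset gets bigger and harder without any human labeling.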