This has been said about every LLM product from every provider since ChatGPT4. I...

measurablefunc · 2026-01-24T01:17:52 1769217472

They are constantly trying to reduce costs which means they're constantly trying to distill & quantize the models to reduce the energy cost per request. The models are constantly being "nerfed", the reduction in quality is a direct result of seeking profitability. If they can charge you $200 but use only half the energy then they pocket the difference as their profit. Otherwise they are paying more to run their workloads than you are paying them which means every request loses them money. Nerfing is inevitable, the only question is how much it reduces response quality & what their customers are willing to put up with.