
This is the problem: you need the best model, not just a good one, for:

- Good architecture, which requires reading specs, code, etc. In practice that means lots of tokens in and out.
- Bug fixing: same, plus logs, e.g. from Datadog.

Once you've found the path, patches are trivial and the savings are tiny unless you're doing refactoring/cleanup.

Testing gets more and more complicated. Take a look at opencode go, and you see this:

> Includes GLM-5.1, GLM-5, Kimi K2.5, Kimi K2.6, MiMo-V2-Pro, MiMo-V2-Omni, MiMo-V2.5-Pro, MiMo-V2.5, Qwen3.5 Plus, Qwen3.6 Plus, MiniMax M2.5, MiniMax M2.7, DeepSeek V4 Pro, and DeepSeek V4 Flash

And now you're on your own with the bugs that all of these models can produce at scale. Am I missing anything in this picture? What is the real use of cheaper models?



I'd argue that you need the model that's good enough, not the best.


Any missed bug, any wrong architecture decision, is a huge loss. Sure, if you run it as autocomplete on steroids, you can use any Chinese model. But if you try to move faster, and that is a conscious choice, any hiccup is a productivity loss and tons of tokens burned.



