OpenAI's and Anthropic's real moat is hardware. For local LLMs, context length and hardware performance are the limiting factors. Qwen3 4B with a 32,768-token context window is great, right up until the window starts filling and performance drops off quickly.
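As a rough illustration, here is a minimal sketch of watching that window fill, assuming a local Ollama instance serving a `qwen3:4b` tag at the default endpoint (both are assumptions, not a prescription):

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint (assumed running locally)

def ask(prompt: str, num_ctx: int = 32768) -> dict:
    """Send a prompt to a local model with an explicit context window."""
    resp = requests.post(OLLAMA_URL, json={
        "model": "qwen3:4b",          # assumed local model tag
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx},
    })
    resp.raise_for_status()
    return resp.json()

result = ask("Summarize the tradeoffs of long context windows.")
# prompt_eval_count reports how many tokens the prompt consumed;
# watch this creep toward num_ctx as a conversation grows.
used = result.get("prompt_eval_count", 0)
print(f"{used} / 32768 context tokens used ({used / 32768:.1%})")
```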
I use local models when possible. MCPs work well, but the amount of context their tool definitions inject into every request makes switching to an online provider the no-brainer.
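To see why that injection hurts, here's a rough sketch that counts the tokens a set of MCP tool schemas would consume on every request. The schema below is hypothetical, and tiktoken's `cl100k_base` encoding is only an approximation of a local model's tokenizer:

```python
import json
import tiktoken  # OpenAI's tokenizer, used here as a rough approximation

# Hypothetical MCP tool schema; a real server often exposes dozens of these,
# and each one rides along in every single request.
TOOL_SCHEMAS = [
    {
        "name": "search_files",
        "description": "Search the workspace for files matching a glob pattern.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "pattern": {"type": "string", "description": "Glob pattern to match."},
                "max_results": {"type": "integer", "description": "Cap on results returned."},
            },
            "required": ["pattern"],
        },
    },
    # ...imagine 30 more tools like this...
]

enc = tiktoken.get_encoding("cl100k_base")
overhead = sum(len(enc.encode(json.dumps(schema))) for schema in TOOL_SCHEMAS)

# With a 32,768-token window, a few thousand tokens of tool definitions
# is context you never get back for the actual conversation.
print(f"{overhead} tokens of tool definitions per request "
      f"({overhead / 32768:.1%} of a 32k window)")
```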