> and only calling out to large models when they actually need the extra knowledge
When would you want lossy encoding of lots of data bundled together with your reasoning? If it is true that reasoning can be done efficiently with fewer parameters, it seems like you would always want the model driving ordinary data search and retrieval tools to access knowledge, rather than risking hallucination.
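Something like this toy loop is what I mean (a hand-wavy Python sketch; `small_model` and `search_corpus` are made-up stand-ins, not any real API):

```python
# Minimal sketch of the "small reasoner + retrieval tool" loop. The model
# and the search backend are fake stand-ins; the point is only the control
# flow: delegate factual lookups instead of recalling them from weights.

def small_model(prompt: str) -> str:
    """Stand-in for a compact reasoning model: asks for facts it lacks."""
    if "7 October 1885" not in prompt:
        return "SEARCH: birth year of Niels Bohr"
    return "Niels Bohr was born in 1885."

def search_corpus(query: str) -> str:
    """Stand-in for an external search index or database."""
    return "Niels Bohr, Danish physicist, born 7 October 1885."

def answer(question: str, max_steps: int = 5) -> str:
    context = ""
    for _ in range(max_steps):
        reply = small_model(f"{context}\nQuestion: {question}")
        if reply.startswith("SEARCH:"):
            # Grounded lookup instead of parametric recall: slower, but no
            # risk of the model confabulating the fact.
            context += "\n" + search_corpus(reply.removeprefix("SEARCH:").strip())
        else:
            return reply
    return reply

print(answer("When was Niels Bohr born?"))  # fact comes from the tool, not the weights
```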
And re: this discussion of large data centers versus local models, do recall that we already know it's possible to make a pretty darn clever reasoning model that's small and portable and made out of meat.
I find it difficult to understand the distinction between parametric knowledge and reasoning skills in LLMs. I still think of them as distinct, but I understand there is significant overlap. Arguably, they are the same thing in LLMs. So I would assume that if reasoning is high quality, using RAG could be logical (if much slower). However, if the lack of parametric knowledge impairs reasoning, then use of larger models seems warranted. A dumb LLM wouldn't produce sufficient results even with all the RAG in the world.
I guess we can imagine a pure reasoning model (if that's even the right word any more) with almost zero world-knowledge. How does it know what to look for? How does it do any meaningful communication at all?
So I think it's useful to have an imprecise-but-fairly-accurate set of world knowledge as part of an otherwise reasoning-heavy model. It's a cache.
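In cache terms, something like this (a toy sketch; the data and confidence scores are made up purely for illustration):

```python
# Parametric knowledge as a fast, lossy first try; retrieval as the slow,
# authoritative fallback, i.e. a cache with a miss path.

PARAMETRIC = {  # "weights": cheap to query, imprecise, possibly stale
    "capital of france": ("Paris", 0.99),
    "boiling point of water": ("around 100 C", 0.7),
}

CORPUS = {  # external store: slower, but grounded
    "capital of france": "Paris",
    "boiling point of water": "100 C at 1 atm (varies with pressure)",
}

def lookup(query: str, threshold: float = 0.9) -> str:
    answer, confidence = PARAMETRIC.get(query, ("", 0.0))
    if confidence >= threshold:
        return answer      # cache hit: good enough, skip the slow path
    return CORPUS[query]   # cache miss: fall back to retrieval

print(lookup("capital of france"))       # served from the "weights"
print(lookup("boiling point of water"))  # low confidence -> retrieved
```

Like any cache, it trades freshness and precision for speed, which seems like exactly the trade-off parametric knowledge makes.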
And if it's an LLM, or something like that, I think it basically has to have world-knowledge built in, because what is natural language if not communication about the world?
> we already know it's possible to make a pretty darn clever reasoning model
There is a problem though: we know that it is possible, but we don't know how (at least not yet, as far as I am aware). So we know the answer to the "what?" question, but we don't know the answer to the "how?" question.