> What kind of time frame is "any time soon" I'm guessing that solving run-time ...

ryandvm · on June 14, 2024

But even now, LLMs absolutely have limited problem-solving capability.

For example, yesterday I asked GPT-4o to write multiple alternate endings to the short story "The Last Equation". They weren't dramatically compelling, but they were logical and functional.

How is that not problem solving? And so help me, before anyone tells me it's just stringing together the next most likely tokens - I don't care. Clearly that is at least a primitive form of intelligence. Actually it's not even apparent to me that that isn't exactly what human intelligence is doing...

HarHarVeryFunny · on June 14, 2024

I would define intelligence as "degree of ability to use past experience to predict future outcomes" (which includes reasoning, aka problem-solving ability, via repeated what-if prediction, then backtracking/learning on failure, etc.).

So, intelligence exists on a spectrum - some things are easier to predict given a set of learnt facts and methods than others. The easiest thing to predict (the most basic form of intelligence) is "next time will be the same as last time", which is basically memorization and pattern matching, which is mostly what LLMs are able to do thanks to brute-force pattern/rule extraction via gradient descent.

Going beyond "next time will be the same as last time" is where reasoning comes in - where you have the tools (experience) to solve a problem, but it requires a problem-specific decomposition into sub-problems and trial-and-error planning/testing to apply learnt techniques to make progress on the problem...
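As a toy illustration of that what-if / backtrack loop (not a claim about how any LLM or brain actually works), here is a minimal Python sketch. The function name `solve` and the subset-sum task are made up for the example, and it assumes all the numbers are positive integers.

```python
def solve(target, options, chosen=None):
    """Return a list of numbers from `options` that sums to `target`, or None.

    Hypothetical toy problem: assumes every number in `options` is a
    positive integer, so overshooting the target is always a dead end.
    """
    chosen = [] if chosen is None else chosen
    if target == 0:
        return list(chosen)            # sub-problems all solved
    if target < 0 or not options:
        return None                    # dead end: this line of guesses failed

    first, rest = options[0], options[1:]

    chosen.append(first)               # what-if: suppose `first` is part of the answer
    result = solve(target - first, rest, chosen)
    if result is not None:
        return result

    chosen.pop()                       # the guess failed, so backtrack...
    return solve(target, rest, chosen) # ...and try again without it


print(solve(10, [7, 5, 3]))  # tries 7+5 first, backs out, then finds 7+3
```

Nothing here is memorized: the answer only falls out of guessing, checking the consequence, and undoing the guesses that don't pan out.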

Certainly a lot of human behavior (applied intelligence) is of the shallow "system 1" pattern-matching variety, but I think the extent of this is overstated. Not only is "system 2" problem-solving needed for on-the-job training, but I think we're using it all the time when we're doing anything more than reacting to the current situation in a mindless fashion.

So, sure, LLMs have limited intelligence, but it's only "system 1" shallow intelligence, gestalt pattern recognition, based on training-time gradient descent learning. What they are missing is run-time "system 2" problem-solving.