Yes - how exactly are you going to conjure new data to increase the model’s real world fidelity?
The most common-sense approach will always need humans in the loop, because humans are the ones judging the output.
If I try to entertain the idea of a human-less loop, I end up with something like an AI creating real-world products, selling them, then tracking each product's usage and popularity. Essentially, creating and launching a firm and a product, only to update its weights?
Perhaps there are some subsets of tasks that can be recursively self-improved - and for those tasks: Holy hell, that's awesome!
For general tasks? How are you going to get that data validated?
> Yes - how exactly are you going to conjure new data to increase the model’s real world fidelity?
Robotics and access to surveillance tech. There's an unlimited amount of real-world data which hasn't been used yet. Language models have focused on just the small slice of data that is human-written text. If there's a way to combine this with image and video models, as well as robots collecting other sensor data, why would they be limited in learning anything about the world?
Real-world data, obviously. When you have robots out and about everywhere, there is going to be an endless stream of audiovisual data. This is in addition to all the real-world textual data constantly streaming out of our millions of systems. If an LLM gets “smart” enough to figure out how to filter out the massive amount of noise and use that to improve itself, then we will have AGI. Not consciousness, just a self-improving, very intelligent machine.
You just brute force it with compute. What is the goal? Evaluating a function. So, given enough compute, you can just generate random functions, plot the output distribution, select the function that meets the threshold, and move on.
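For what it's worth, here is a rough sketch of what that brute-force loop could look like on a toy problem. Everything in it is made up for illustration: `random_search`, `sample_line`, `neg_mse`, the hidden target line, and the threshold are all hypothetical. Note that it assumes you already have a `score` function to judge candidates with, which is exactly the validation question raised earlier in the thread.

```python
import random

def random_search(score, sample_candidate, threshold, max_trials=10_000, seed=0):
    """Sample random candidates until one scores at or above `threshold`.

    Returns (candidate, its_score, trials_used, all_scores). If nothing meets
    the threshold, candidate is None and the best score seen is returned.
    """
    rng = random.Random(seed)
    scores = []  # kept so the score distribution can be inspected or plotted
    for trial in range(1, max_trials + 1):
        candidate = sample_candidate(rng)
        s = score(candidate)
        scores.append(s)
        if s >= threshold:
            return candidate, s, trial, scores
    return None, max(scores, default=float("-inf")), max_trials, scores

# Toy setup (assumption): the "goal" is a hidden line f(x) = 2x + 1,
# and a candidate function is a pair (a, b) standing for a*x + b.
XS = [x / 10 for x in range(-10, 11)]
TARGET = [2 * x + 1 for x in XS]

def sample_line(rng):
    """Draw a random candidate function as (slope, intercept)."""
    return (rng.uniform(-5, 5), rng.uniform(-5, 5))

def neg_mse(params):
    """Score a candidate: negative mean squared error, so higher is better."""
    a, b = params
    return -sum((a * x + b - t) ** 2 for x, t in zip(XS, TARGET)) / len(XS)

if __name__ == "__main__":
    best, best_score, trials, scores = random_search(
        neg_mse, sample_line, threshold=-0.05
    )
    print(f"accepted {best} with score {best_score:.4f} after {trials} trials")
```

This only works here because the search space is two numbers and the scorer is free to call. The number of trials needed blows up quickly as the candidate space grows, and the loop says nothing about where a trustworthy scorer comes from for general tasks.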
Love talking about 'recursive self-improvement' of a prediction engine. What about 'recursive self-degradation'?