The whole point of Deep Learning is that we don't want to describe the math behind object recognition; that was the failed "classical" approach, where people spent decades figuring out complex features which worked horribly. Deep Learning is actually pretty simple, well understood and parallelizable, and it's basically a billion-dimensional non-linear optimization. As optimization is infested with NP-hard problems, it's as difficult as it gets. It's actually amazing what we can do with it in the real world right now (and we are still far away from seeing all its fruits). Of course, it would frustrate academics who can't base AGI on top of it, but did they really think this approach would do it anyway?
Deep learning does not seem to abstract very well. Train on a data set, then test with images that are simply upside down, and the performance hit can be significant.
Feature extraction also works much better when you throw a lot of data and processing power at it. So, a lot of the progress is simply more data and computing power rather than better approaches. Consider how poorly deep learning would work on a single 286.
> Deep learning does not seem to abstract very well. Train on a data set, then test with images that are simply upside down, and the performance hit can be significant.
But that's true of people too. How quickly can you read upside-down?
If you trained on a mixture of upside-down and right way up images, and tested on upside-down images, performance wouldn't take that much of a hit.
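To make the "mixture" idea concrete, here's a minimal sketch of that kind of augmentation. It assumes images are plain 2D lists of pixel values and that "upside down" means a 180-degree rotation; `rotate_180`, `augment_with_flips` and `flip_probability` are names made up for illustration, not from any particular library.

```python
import random


def rotate_180(image):
    """Return a copy of a 2D image (list of rows) rotated by 180 degrees."""
    return [row[::-1] for row in image[::-1]]


def augment_with_flips(images, labels, flip_probability=0.5, seed=0):
    """Build a training set where each image is turned upside down
    with probability `flip_probability`. Labels are left unchanged,
    since the object in the image is still the same object."""
    rng = random.Random(seed)  # seeded so the mixture is reproducible
    augmented = []
    for image, label in zip(images, labels):
        if rng.random() < flip_probability:
            image = rotate_180(image)
        augmented.append((image, label))
    return augmented


if __name__ == "__main__":
    images = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
    labels = ["cat", "dog"]
    for image, label in augment_with_flips(images, labels):
        print(label, image)
```

In a real setup you'd apply this per batch at training time rather than once up front, so the network sees both orientations of the same image over the course of training.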
Sure, but the problem is that we are more willing to ignore failures that are similar to how we fail. IMO, when we compare AI approach X vs. Y, we need to consider absolute performance, not just how closely the performance resembles human performance.
Deep learning, for example, gains a lot from texture detection in images. But that also makes it really easy to fool.
While I can't easily read upside-down text, I can instantly recognize not only that it is text, but also that it needs to be flipped in order to be read. That's something current "deep learning" AIs can't do reliably, if at all.
If I had to describe the root cause of this problem, it would be that humans process "problems" rather than "things", and we "learn" by building an ever-growing mental library of problem-solving algorithms. As we continue to "learn", we refine our problem-solving algorithms to be more general rather than more specific. Compare that to a deep learning AI that learns by building an ever-greater data library of things while refining algorithms to suit ever more specific use cases.
I think you're describing a level of generalization above the application at hand. We could easily train a neural network to recognize the orientation of a font, and then build an orientation-invariant "reading" app by first recognizing the rotation of the text, transforming it so it is right side up, and then recognizing it as normal.
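Roughly, that two-stage pipeline could look like the sketch below. It assumes images are 2D lists and that rotations come in quarter turns only; `predict_rotation` and `recognize` are hypothetical stand-ins for the two trained networks, and every name here is invented for illustration.

```python
def rotate_90(image, times):
    """Rotate a 2D image (list of rows) clockwise by 90 degrees, `times` times.
    Negative values rotate counter-clockwise."""
    for _ in range(times % 4):
        image = [list(row) for row in zip(*image[::-1])]
    return image


def read_any_orientation(image, predict_rotation, recognize):
    """Orientation-invariant reading built from two separate models.

    predict_rotation(image) -> number of clockwise quarter turns (0..3)
        that were applied to the upright image (hypothetical model #1).
    recognize(image) -> whatever the upright-only reader returns
        (hypothetical model #2).
    """
    quarter_turns = predict_rotation(image)
    upright = rotate_90(image, -quarter_turns)  # undo the detected rotation
    return recognize(upright)


if __name__ == "__main__":
    upright = [[1, 2], [3, 4]]
    upside_down = rotate_90(upright, 2)
    # Stand-in "models": a detector that always answers 2 quarter turns
    # and a reader that just echoes its input.
    result = read_any_orientation(upside_down, lambda img: 2, lambda img: img)
    print(result == upright)
```

The point of the design is that the reader never has to learn rotated text at all; only the small orientation network does.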
I tend to imagine our brains work similarly. It's not that you have a single "network" in your brain that recognizes text from all angles, but that your brain is a "general purpose" machine with many networks that work together. I think current deep learning techniques are great for discrete tasks, and the improvement needed is to have many networks that work together properly, with some form of intuition as to what should be done with the information at hand.