Not only are they capable of understanding images (the kind people might actually feed into such a system - photographs), but they're pretty good at it.
A modern robot would struggle to fold socks and put them in a drawer, but they're great at making cars.
I mean, with some of the recent demos, robots have got a lot better at folding stuff and putting it up. Not saying it's anywhere close to human level, but it has taken a pretty massive leap from being a joke just a few years ago.