Allowing access to the data changes the problem. It’s an interesting problem in its own right, but seems much easier to accomplish and arguably much less valuable than limiting input to images and output to mouse/keyboard. Teaching computers to interact with the world is the trick that we want to teach.
That's my thinking as well. It's much more interesting to see how the AI copes with the world, rather than getting bogged down in OCR and other previously-solved problems.