Computer program that learns to ‘imagine’ the world shows how AI can think more like us


Ali Eslami, a researcher at DeepMind, and his colleagues tested the approach in three virtual settings: a block-shaped table, a virtual robot arm, and a simple maze. The system uses two neural networks; one learns and the other generates, or “imagines”, new perspectives. The system captures aspects of a scene, including the shapes, positions, and colors of objects, using a vector representation, which makes it relatively efficient. The research appears in the journal Science today.

The work is something of a new direction for DeepMind, which has made a name for itself developing programs capable of remarkable feats, including learning to play the complex and abstract board game Go. The new project builds on other academic research that seeks to mimic human perception and intelligence using similar computational tools.

“It’s an interesting and valuable step in the right direction,” says Josh Tenenbaum, a professor who heads the Computational Cognitive Science group at MIT.

Tenenbaum says the ability to handle complex scenes in a modular fashion is impressive, but adds that the approach shows the same limitations as other machine learning methods, including the need for a huge amount of training data: “The jury is still out on how much of the problem this resolves.”

Sam Gershman, who heads the Computational Cognitive Neuroscience Lab at Harvard, says DeepMind’s work combines important insights into how human visual perception works. But he notes that, like other AI programs, it’s somewhat narrow, in that it can only answer one query: What would a scene look like from a different point of view?

“In contrast, humans can respond to an endless variety of queries about a scene,” Gershman explains. “What would a scene look like if I moved the blue circle to the left a bit, or repainted the red triangle, or smashed the yellow cube?”

Gershman says it’s not clear whether DeepMind’s approach might be adapted to answer more complex questions or whether a fundamentally different approach might be needed.


Gordon K. Morehouse