Google develops a computer program capable of learning tasks independently


Google scientists have developed the first computer program capable of learning a wide variety of tasks independently, in what has been hailed as a significant step towards true artificial intelligence.

The same program, or “agent,” as its creators call it, has learned to play 49 different retro computer games and devised its own strategies for winning. In future, the same approach could be used to power self-driving cars and smartphone personal assistants, or to conduct scientific research in fields ranging from climate change to cosmology.

The research was carried out by DeepMind, the British company bought by Google last year for £400m, whose stated aim is to build “intelligent machines”.

Demis Hassabis, the company’s founder, said: “This is the first significant rung of the ladder towards proving that a general learning system can work. It can work on a challenging task that even humans find difficult. This is the very first small step towards that more ambitious … but important goal.”

The work is seen as a fundamental departure from previous attempts at building AI, such as the Deep Blue program, which beat Garry Kasparov at chess in 1997, or IBM’s Watson, which won the quiz show Jeopardy! in 2011.

In both of these cases, the computers were pre-programmed with the rules of the game and specific strategies, and beat human opponents through sheer computing power.

“With Deep Blue, it was a team of programmers and grandmasters who distilled the knowledge into a program,” Hassabis said. “We have built algorithms that learn from scratch.”

The DeepMind agent simply receives raw input, in this case the pixels making up the display on Atari games, along with the current score.

When the agent starts playing, it just looks at the in-game footage and presses buttons at random to see what happens. “A bit like a baby opening its eyes and seeing the world for the first time,” Hassabis said.

The agent uses a method called “deep learning” to turn the basic visual input into meaningful concepts, mirroring the way the human brain takes raw sensory information and transforms it into a rich understanding of the world. The agent is programmed to work out what is significant through “reinforcement learning,” the basic notion that scoring points is good and losing them is bad.
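In broad terms, the combination of the two techniques can be sketched in code. The snippet below is a minimal, illustrative sketch in PyTorch, not DeepMind’s published system: the network shape loosely follows the kind of convolutional architecture reported for Atari play, but the layer sizes, the epsilon value, and the q_update helper are assumptions chosen for clarity.

```python
# Minimal sketch (assumed details, not DeepMind's code): a convolutional
# network maps raw Atari pixels to one predicted score per button
# ("deep learning"), and the only training signal is the reward
# ("reinforcement learning") -- no game rules are supplied anywhere.
import random
import torch
import torch.nn as nn

class AtariQNetwork(nn.Module):
    def __init__(self, num_actions: int):
        super().__init__()
        # Convolutions turn raw pixels into higher-level visual features.
        self.features = nn.Sequential(
            nn.Conv2d(4, 16, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=4, stride=2), nn.ReLU(),
            nn.Flatten(),
        )
        # A small head scores each possible button press ("Q-values").
        self.head = nn.Sequential(
            nn.Linear(32 * 9 * 9, 256), nn.ReLU(),
            nn.Linear(256, num_actions),
        )

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        return self.head(self.features(frames))

def choose_action(net: AtariQNetwork, frames: torch.Tensor, epsilon: float) -> int:
    # Epsilon-greedy: early on the agent mostly presses random buttons
    # "to see what happens"; later it picks the highest-scoring action.
    if random.random() < epsilon:
        return random.randrange(net.head[-1].out_features)
    with torch.no_grad():
        return int(net(frames).argmax(dim=1).item())

def q_update(net, optimizer, frames, action, reward, next_frames, gamma=0.99):
    # One Q-learning step: nudge the predicted value of the chosen button
    # towards (points scored now) + (discounted best value of what follows).
    with torch.no_grad():
        target = reward + gamma * net(next_frames).max(dim=1).values
    predicted = net(frames)[torch.arange(frames.shape[0]), action]
    loss = nn.functional.mse_loss(predicted, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# One illustrative decision and update on stand-in pixel data.
net = AtariQNetwork(num_actions=6)
optimizer = torch.optim.RMSprop(net.parameters(), lr=2.5e-4)
frames = torch.rand(1, 4, 84, 84)  # a stack of four 84x84 game frames
a = choose_action(net, frames, epsilon=0.9)
q_update(net, optimizer, frames,
         torch.tensor([a]), torch.tensor([1.0]), torch.rand(1, 4, 84, 84))
```

The point the article describes is visible in q_update: the only thing telling the network what matters is the change in score, which is what makes learning “from scratch” possible.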

Tim Behrens, professor of cognitive neuroscience at University College London, said: “What they’ve done is really impressive, there’s no doubt about it. They have got agents to learn concepts based on rewards and punishments alone. No one has ever done that before.”

In videos provided by DeepMind, the agent is shown making random and largely unsuccessful moves at first, but after 600 training cycles (two weeks of computer time) it has figured out what many of the games are about.

In some cases, the agent came up with winning strategies that the researchers themselves had never considered, such as tunneling through the sides of the wall in Breakout or, in an underwater game, staying deep below the surface at all times.

Vlad Mnih, one of the Google team behind the work, said: “It’s really fun to see computers discover things that you haven’t figured out yourself.”

Hassabis stops short of calling this a “creative step,” but said it shows that computers can “figure things out for themselves” in a way that is normally regarded as uniquely human. “One day machines will be capable of some form of creativity, but we’re not there yet,” he said.

Behrens said watching the agent learn leaves the impression that “there is something human” about it, probably because it relies on trial and error, one of the main ways in which humans learn.

The study, published in the journal Nature, showed that the agent performed at 75% of the level of a professional games tester or better on half of the games tested, which ranged from side-scrolling shooters to boxing and 3D car racing. On some games, such as Space Invaders, Pong, and Breakout, the algorithm significantly outperformed humans, while on others it fared less well.

The researchers said this was mainly because the algorithm, at the moment, has no real memory, meaning it is unable to pursue long-term strategies that require planning. With some games that meant the agent got stuck in a rut, where it had learned a basic way to score a few points but never really grasped the overall purpose of the game. The team is now trying to incorporate a memory component into the system and apply it to more realistic 3D computer games.

Last year, the American entrepreneur Elon Musk, one of DeepMind’s early investors, described AI as humanity’s greatest existential threat. “Unless you have direct exposure to groups like DeepMind, you have no idea how fast [AI] is growing,” he said. “The risk of something seriously dangerous happening is in the five-year timeframe. Ten years at most.”

However, the Google team played down the concerns. “We agree with him that there are risks that need to be considered, but we are decades away from any sort of technology that we need to be concerned about,” Hassabis said.


Gordon K. Morehouse