The computer program looks five minutes into the future


Image: Professor Jürgen Gall (right) and Yazan Abu Farha from the Institute of Computer Science at the University of Bonn.

Credit: © Photo: Barbara Frommann / Uni Bonn

Computer scientists at the University of Bonn have developed software that can look a few minutes into the future: the program first learns the typical sequence of actions, such as cooking, from video footage. Based on this knowledge, it can then accurately predict in new situations what the cook will do and when. The researchers will present their findings at the world’s largest computer vision and pattern recognition conference, to be held June 19-21 in Salt Lake City, United States.

The perfect butler, as every fan of British social dramas knows, has a special ability: he senses his employer’s wishes before they have even been expressed. The research group of Prof. Dr. Jürgen Gall wants to teach computers something similar: “We want to predict when and for how long activities will take place, minutes or even hours before they happen,” he explains.

A kitchen robot, for example, could then hand over the ingredients as soon as they are needed, preheat the oven at the right time, and in the meantime warn the cook if he is about to forget a preparation step. The robot vacuum cleaner, meanwhile, would know that it has no business in the kitchen at that moment and would take care of the living room instead.

We humans are very good at anticipating the actions of others. For computers, however, this discipline is still in its infancy. Researchers at the Institute of Computer Science at the University of Bonn can now report a first success: they have developed self-learning software that can estimate the timing and duration of future activities with astonishing accuracy over periods of several minutes.

Training Data: Four Hours of Salad Videos

The training data used by the scientists consisted of 40 videos in which performers prepare different salads. Each recording was about six minutes long and contained an average of 20 different actions. The videos also specified exactly when each action started and how long it took.

The computer “watched” these salad videos totaling about four hours. In this way, the algorithm learned which actions generally follow each other during this task and how long they last. This is not trivial: after all, each chef has his own approach. In addition, the sequence may vary depending on the recipe.
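To make the idea concrete, here is a minimal sketch (in Python, with invented action names and toy annotations; the authors’ actual models are described in the CVPR paper) of how annotated action sequences could be turned into simple statistics about which action tends to follow which, and how long each typically lasts:

```python
from collections import defaultdict

# Hypothetical toy annotations: each video is a list of (action, duration_in_seconds) pairs.
# The real training set consists of 40 annotated salad-preparation videos.
videos = [
    [("cut_tomato", 40), ("place_tomato", 10), ("cut_cheese", 35), ("mix_ingredients", 60)],
    [("cut_cheese", 30), ("cut_tomato", 45), ("mix_ingredients", 55), ("serve_salad", 20)],
]

transition_counts = defaultdict(lambda: defaultdict(int))  # which action tends to follow which
duration_sum = defaultdict(float)                          # total observed duration per action
duration_num = defaultdict(int)                            # number of occurrences per action

for video in videos:
    for (action, _), (next_action, _) in zip(video, video[1:]):
        transition_counts[action][next_action] += 1
    for action, duration in video:
        duration_sum[action] += duration
        duration_num[action] += 1

avg_duration = {a: duration_sum[a] / duration_num[a] for a in duration_sum}
print(avg_duration)                            # average length of each action
print(dict(transition_counts["cut_tomato"]))   # actions observed right after "cut_tomato"
```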

“Then we tested the success of the learning process,” says Gall. “To do this, we confronted the software with videos it had never seen before.” At least the new clips fit the context: they also showed the preparation of a salad. For the test, the computer was told what happens in the first 20 or 30 percent of one of the new videos. On this basis, it then had to predict what would happen during the rest of the video.
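A rough sketch of this test protocol, again using hypothetical (action, duration) segments rather than the authors’ actual data format: the fully annotated video is split into an observed prefix and the future part the model must predict.

```python
def split_video(annotation, observe_fraction=0.2):
    """Split a fully annotated video into an observed prefix and the ground-truth future.

    annotation: list of (action, duration_in_seconds) pairs covering the whole video.
    Returns (observed, future); a segment that straddles the boundary is cut in two.
    """
    total = sum(d for _, d in annotation)
    boundary = observe_fraction * total
    observed, future, elapsed = [], [], 0.0
    for action, duration in annotation:
        if elapsed + duration <= boundary:
            observed.append((action, duration))
        elif elapsed >= boundary:
            future.append((action, duration))
        else:  # segment straddles the boundary: split it
            observed.append((action, boundary - elapsed))
            future.append((action, elapsed + duration - boundary))
        elapsed += duration
    return observed, future

# Example: show the model the first 20 percent; the rest is what it must anticipate.
video = [("cut_tomato", 40), ("place_tomato", 10), ("cut_cheese", 35), ("mix_ingredients", 60)]
observed, future = split_video(video, observe_fraction=0.2)
print(observed)  # what the model is shown (or told)
print(future)    # what it has to predict: which actions, and for how long
```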

This worked astonishingly well. Gall: “The accuracy was over 40% for short forecast periods, but it declined the further the algorithm had to look into the future. For activities lying more than three minutes in the future, the computer was still right 15% of the time. However, a prediction was only considered correct if both the activity and its timing were predicted correctly.”
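The quoted numbers suggest a frame-wise scoring of the predicted future. A simplified sketch (not necessarily the paper’s exact metric) that only rewards a prediction when both the activity and its timing line up could look like this:

```python
def to_frame_labels(segments, fps=1):
    """Expand (action, duration) segments into a per-frame label sequence."""
    labels = []
    for action, duration in segments:
        labels.extend([action] * int(round(duration * fps)))
    return labels

def framewise_accuracy(predicted_segments, true_segments, fps=1):
    """Fraction of future frames whose predicted action matches the ground truth."""
    pred = to_frame_labels(predicted_segments, fps)
    true = to_frame_labels(true_segments, fps)
    n = min(len(pred), len(true))
    if n == 0:
        return 0.0
    correct = sum(p == t for p, t in zip(pred[:n], true[:n]))
    return correct / n

# Toy example: the model picks the right actions but misjudges one duration,
# so only part of the predicted span counts as correct.
true_future = [("cut_cheese", 30), ("mix_ingredients", 60)]
predicted   = [("cut_cheese", 45), ("mix_ingredients", 45)]
print(framewise_accuracy(predicted, true_future))  # ≈ 0.83 on this toy example
```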

Gall and his colleagues want the study to be understood only as a first step into the new field of activity prediction, especially since the algorithm performs noticeably worse if it has to recognize on its own what is happening in the first part of the video instead of being told. Because this analysis is never 100% correct, Gall speaks of “noisy” data. “Our method can work with it,” he says, “but unfortunately nowhere near as well.”

###

The study was carried out as part of a research group dedicated to the prediction of human behavior and was funded by the German Research Foundation (DFG).

Publication: Yazan Abu Farha, Alexander Richard and Jürgen Gall: When Will You Do What? - Anticipating Temporal Occurrences of Activities. IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2018; http://pages.iai.uni-bonn.de/gall_juergen/download/jgall_anticipation_cvpr18.pdf

Sample test videos and resulting predictions are available at https://www.youtube.com/watch?v=xMNYRcVH_oI

Contact:

Prof. Dr. Jürgen Gall

Institute of Computer Science

University of Bonn

Phone: +49 (0) 228/7369600

Email: [email protected]



