Computer System Automatically Solves Word Problems | MIT News


Researchers at MIT’s Computing and Artificial Intelligence Laboratory, working with colleagues at the University of Washington, have developed a new computer system that can automatically solve the type of word problems common in introductory lessons. to algebra.

In the short term, the work could lead to educational tools making it possible to identify students’ errors in reasoning or to assess the difficulty of word problems. But it can also point to systems capable of solving more complex problems in geometry, physics, and finance – problems whose solutions do not appear on the back cover of a teacher’s manual.

According to Nate Kushman, an MIT graduate student in electrical and computer engineering and lead author of the new paper, the new work concerns the field of “semantic analysis” or the translation of natural language into a formal language such as arithmetic or formal logic. . Most of the previous work on semantic analysis – including his own – focused on individual sentences, says Kushman. “In these algebra problems, you have to build these things out of a lot of different sentences,” he says. “The fact that you go through several sentences to generate this semantic representation is really something new. “

Kushman is joined on paper by Regina Barzilay, professor of computer science and engineering and one of his two thesis supervisors, and by Yoav Artzi and Luke Zettlemoyer of the University of Washington. The researchers will present their work at the annual meeting of the Association for Computational Linguistics in June.

Find your place

The researchers’ system makes use of two existing IT tools. One is the computer algebra system Macsyma, whose initial development at MIT in the 1960s was a Milestone in artificial intelligence research. For the purposes of Kushman and his colleagues, Macsyma provided a way to distill algebraic equations with the same general structure into a common model.

The other tool is the type of sentence analyzer used in most natural language processing research. A parser represents the parts of speech in a given sentence and their syntactic relationships in the form of a tree – a type of graphic which, like a family tree, unfolds in successive layers of depth.

For the system of researchers, understanding a word problem involves correctly mapping the elements of the analysis diagram of its constituent sentences to one of Macsyma’s models of equation. To train the system to perform this mapping and produce the equation models, the researchers used machine learning.

Kushman found a website where algebra students posted word problems they had difficulty with and where their peers could then come up with solutions. From an initial group of about 2,000 problems, he selected 500 that represented the full range of problem types found in the larger set.

In a series of experiments, researchers would randomly select 400 of the 500 problems, use them to train their system, and then test it on the remaining 100.

For training, however, they used two different approaches – or, in machine learning parlance, two different types of supervision. In the first approach, they fed the system both word problems and their translations into algebraic equations – 400 examples of each. But in the second, they fed the system only a few examples of the five most common types of word problems and their algebraic translations. The rest of the examples only included verbal problems and their number solutions.

In the first case, the system, after training, was able to solve about 70% of its test problems; in the second, this figure fell to 46%. But according to Kushman, it’s still good enough to hope that the researchers’ approach could generalize to more complex problems.

Featured Performances

To determine how to map natural language to equation models, the system examined hundreds of thousands of “features” training examples. Some of these characteristics associated specific words with types of problems: for example, the appearance of the phrase “react with” was a good indication that the problem was in chemistry. Other features looked at the placement of specific words in analysis diagrams: The appearance of the word “costs” as the primary verb largely indicated which sentence elements should be inserted into which equation models.

Other features simply analyzed the syntactic relationships between words, independent of the words themselves, while still others examined the correlations between the locations of words in different sentences. Finally, says Kushman, he included some “integrity checking” features, such as whether the solution provided by a particular equation model was a positive integer, as is almost always the case with problems with algebraic words.

“The idea of ​​what kind of supervision they have will be useful for a lot of things,” says Kevin Knight, professor of computer science at the University of Southern California. “The approach of creating a generative story of how people move from text to responses is a great idea. “

The system’s ability to perform quite well, even when trained primarily on raw digital responses, is “super encouraging,” adds Knight. “It needs a little help, but it can benefit from a lot of extra data that you haven’t labeled in detail.”


Gordon K. Morehouse