Computer program repairs old code faster than expert engineers | MIT News


Last year, computer scientists from MIT and engineers from Adobe came together to try to solve a major problem that many companies are facing: bit-rot.

A good example is Adobe’s successful Photoshop photo editor, which just celebrated its 25th anniversary. Over the years, Photoshop had accumulated heaps of code that had been optimized for what is now older hardware.

“For high-performance code used for image processing, you need to optimize the software,” says Saman Amarasinghe, a professor at MIT and a researcher at the Computer Science and Artificial Intelligence Laboratory (CSAIL). “The downside is that the code becomes much less efficient and much more difficult to understand.”

The result is what Amarasinghe describes as “a billion-dollar problem”: Companies like Adobe have to devote a massive workforce to going back into the code every few years and, by hand, testing a bunch of different strategies to try to fix it.

But what if there was a computer program that could automatically repair old code so engineers could focus on more important tasks, such as imagining new software?

Enter Helium, a CSAIL system that reorganizes and refines code without ever needing the original source, in hours or even minutes.

The team started with a simple building block of programming that is nevertheless extremely difficult to analyze: binary code stripped of debugging symbols, which is the only form of the code available for proprietary software such as Photoshop.

One type of computational kernel that is popular in such software is the “stencil kernel”, which lets you perform operations on entire regions of pixels. Stencil kernels are especially important to update because they use huge amounts of memory and computing power, and their performance degrades quickly as new hardware becomes available.
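To make the idea of a stencil concrete, here is a minimal sketch in Python of a 3x3 box blur, one of the simplest stencil kernels. It is an illustration only: the function name `box_blur_3x3` and the tiny test image are hypothetical and are not taken from Photoshop or from Helium. The point is that every output pixel is computed from the same fixed pattern of neighboring input pixels.

```python
def box_blur_3x3(image):
    """Apply a 3x3 box-blur stencil to a 2D list of numbers.

    Each interior output pixel is the mean of the 3x3 neighborhood around
    the corresponding input pixel. Border pixels are copied unchanged.
    """
    height = len(image)
    width = len(image[0])
    out = [row[:] for row in image]  # copy so borders keep their input values
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            total = 0
            # The "stencil": the same 3x3 access pattern at every pixel.
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    total += image[y + dy][x + dx]
            out[y][x] = total / 9
    return out


# Hypothetical 4x4 grayscale image with a bright 2x2 square in the middle.
image = [
    [0, 0, 0, 0],
    [0, 9, 9, 0],
    [0, 9, 9, 0],
    [0, 0, 0, 0],
]
blurred = box_blur_3x3(image)
```

A hand-optimized production version of the same computation would typically be vectorized, tiled, and unrolled for a specific processor, which is what makes it both fast on that hardware and hard to read later.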

With Helium, researchers are able to lift these kernels out of a stripped binary and restructure them as high-level representations that are readable in Halide, a CSAIL-designed programming language geared towards image processing.

According to lead author Charith Mendis, going from binaries to high-level languages was a big step that the team didn’t originally think was feasible.

“The order of operations in these optimized binaries is complicated, which means they can be difficult to disentangle,” says Mendis, a graduate student at CSAIL. “Because the stencils do the same calculation over and over again, we are able to accumulate enough data to recover the original algorithms.”

From there, the Helium system replaces the original bit-rotted components with re-optimized ones. The bottom line: Helium can improve the performance of some Photoshop filters by 75% and the performance of less optimized programs, such as the Windows image viewer IrfanView, by 400-500%.

“We found that Helium can perform updates in a day that would take human engineers over three months,” Amarasinghe explains. “A system like this can help businesses make sure the next generation of code is faster and save them the trouble of putting 100 people on these kinds of issues.”

The research was presented in a paper accepted to the Association for Computing Machinery’s SIGPLAN Conference on Programming Language Design and Implementation (PLDI 2015), held June 13-17 in Portland, Oregon.

The paper was written by Mendis, fellow graduate students Jeffrey Bosboom and Kevin Wu, researcher Shoaib Kamil, postdoctoral fellow Jonathan Ragan-Kelley PhD ’14, Amarasinghe, and researchers from Adobe and Google.

“We are in an era where computer architectures are changing at a dramatic rate, which makes it important to write code that can work on multiple platforms,” says Mary Hall, a professor in the School of Computing at the University of Utah. “Helium is an interesting approach that has the potential to facilitate high-level descriptions of stencil calculations which could then be more easily ported to future architectures.”

An unexpected by-product of the work is that it lets researchers see the various tricks programmers used on old code, like archaeologists combing through computational fossils.

“We can see the ‘bit hacks’ engineers use to optimize their algorithms,” Amarasinghe explains, “as well as better understand the larger context of how programmers approach different coding challenges.”
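For readers unfamiliar with the term, a “bit hack” is a small trick that replaces ordinary arithmetic with bitwise operations. The sketch below, in Python, shows one well-known example: averaging two 8-bit pixel values without ever forming their sum, which could overflow an 8-bit register. It is a generic illustration under that assumption; the function name `average_u8` is hypothetical, and nothing here is taken from Photoshop or from the code Helium analyzed.

```python
def average_u8(a, b):
    """Return floor((a + b) / 2) for two values in the range 0..255.

    a + b equals 2 * (a & b) + (a ^ b): the AND holds the bits both values
    share (each counted twice in the sum), and the XOR holds the bits only
    one of them has. Halving each part separately avoids the 9-bit sum.
    """
    return (a & b) + ((a ^ b) >> 1)


# Compare against the straightforward formula for two bright pixels.
bright = average_u8(200, 255)
plain = (200 + 255) // 2
```

Here `bright` and `plain` hold the same value, but the first expression never needs more than 8 bits for any intermediate result, which matters in packed, fixed-width pixel arithmetic.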


Gordon K. Morehouse