
DeepMind’s latest AI project solves programming challenges like a novice

If an AI were asked to suggest an image for this article, would it think of The Matrix?

Google’s DeepMind AI division has tackled everything from StarCraft to protein folding. So it’s perhaps no surprise that its creators eventually turned to what is undoubtedly a personal interest: computer programming. In Thursday’s edition of Science, the company describes a system it has developed that produces code in response to programming challenges typical of those used in human programming contests.

On an average challenge, the AI system could score near the top half of participants. But it had some trouble scaling, being less likely to produce a successful program on problems that typically require more code. Still, the fact that it works without having been given any structural information about algorithms or programming languages is a bit surprising.

Try the challenge

Computer programming challenges are fairly simple: people are given a task and asked to produce code that performs it. In an example given in the new paper, programmers are given two strings and asked to determine whether the shorter of the two could be produced by substituting backspaces for some of the keypresses needed to type the longer one. Submitted programs are then checked to see whether they provide a general solution to the problem or fail when additional examples are tested.
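To make the backspace example concrete, here is a minimal Python sketch of one possible solution and of the checking step. It assumes that pressing backspace in place of a character erases the most recently typed character still on screen, or does nothing if the screen is empty; the article does not spell out these rules. The names `can_type`, `is_subsequence`, `passes_all`, and `EXTRA_CASES` are illustrative and do not come from the paper.

```python
def can_type(longer: str, shorter: str) -> bool:
    """Return True if `shorter` can result from typing `longer` while
    replacing some key presses with backspace.

    Assumption: a backspace erases the last character still on screen,
    or does nothing if the screen is empty.
    """
    i = len(longer) - 1   # next key press to consider, scanning from the end
    j = len(shorter) - 1  # next target character still to be matched
    while i >= 0 and j >= 0:
        if longer[i] == shorter[j]:
            # Keep this key press; it supplies shorter[j].
            i -= 1
            j -= 1
        else:
            # Replace this key press with backspace, which also erases
            # the character typed just before it.
            i -= 2
    # Any leftover prefix of `longer` can be erased by backspacing every
    # remaining key, so success depends only on having matched all of `shorter`.
    return j < 0


def is_subsequence(longer: str, shorter: str) -> bool:
    """A plausible but wrong submission: it ignores that each backspace
    also removes the previously typed character."""
    remaining = iter(longer)
    return all(ch in remaining for ch in shorter)


def passes_all(solution, cases) -> bool:
    """Check a submission against additional (input, input, expected) cases."""
    return all(solution(a, b) == expected for a, b, expected in cases)


# Hypothetical extra test cases of the kind a contest judge might hold back.
EXTRA_CASES = [
    ("ababa", "ba", True),
    ("ababa", "bb", False),
    ("aaa", "aaaa", False),
    ("aababa", "ababa", True),
]

print(passes_all(can_type, EXTRA_CASES))
print(passes_all(is_subsequence, EXTRA_CASES))
```

Under these assumptions, the first submission passes every extra case, while the subsequence check fails on the second one. That second result is the kind of non-general answer the additional examples are meant to catch.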

Given enough examples of programs capable of solving a single problem, it would probably be possible for an AI system to infer the algorithmic structure needed to succeed. But that wouldn’t produce a general way of solving arbitrary problems; an AI trained on one class of challenge would fail when asked to tackle an unrelated one.

To make something more generalizable, the DeepMind team treated it a bit like a language problem. To an extent, the challenge description is an expression of what the algorithm should do, while the code is an expression of the same thing, just in a different language. So the AI in question was designed to have two parts: one that ingested the description and converted it into an internal representation, and a second that used the internal representation to generate working code.

Training the system was also a two-stage process. In the first stage, the system was simply asked to process a snapshot of material on GitHub, a total of over 700 GB of code. (These days, when you can fit that on a USB stick, it might not seem like much, but remember that code is just plain text, so you get a lot of lines per gigabyte.) Note that this data also includes comments, which should use natural language to explain what nearby code is doing and should therefore help with both the input and output tasks.

Once the system was trained, it went through a period of tuning. DeepMind set up its own programming contests, then fed the results into the system: problem descriptions, working code, failing code, and the test cases used to check it.
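The four kinds of material fed into the system at this second stage can be pictured as one record per problem. The sketch below is only an illustration of that grouping; the class and field names (`IoExample`, `ContestRecord`, and so on) are hypothetical and say nothing about how DeepMind actually stores its data.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IoExample:
    """One test case: the input given to a program and the output expected."""
    stdin: str
    expected_stdout: str


@dataclass
class ContestRecord:
    """Hypothetical container for everything associated with one problem."""
    description: str                                            # natural-language problem statement
    working_code: List[str] = field(default_factory=list)      # submissions that passed
    failing_code: List[str] = field(default_factory=list)      # submissions that did not
    test_cases: List[IoExample] = field(default_factory=list)  # used to check submissions


record = ContestRecord(
    description="Decide whether the shorter string can be typed using backspaces.",
    working_code=["# a passing submission would go here"],
    failing_code=["# a failing submission would go here"],
    test_cases=[IoExample(stdin="ababa\nba\n", expected_stdout="YES\n")],
)
print(len(record.test_cases))
```

Keeping failing code alongside working code in the same record reflects the article's description that both were supplied, along with the tests that tell them apart.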

Similar approaches had been tried before, but DeepMind says it was simply able to devote more resources to training. “A key driver of AlphaCode’s performance,” the paper states, “came from scaling the number of model samples to orders of magnitude greater than previous work.”
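A rough way to see why drawing many more samples can matter: if each generated program had some small, independent chance of being correct, the chance that at least one of many samples is correct grows quickly with the number of samples. The sketch below works through that arithmetic. The independence assumption and the per-sample success rate of 0.001 are made up for illustration and are not figures from the paper; real samples are not independent, and a correct program still has to be identified among the rest.

```python
def chance_of_any_success(p: float, n: int) -> float:
    """Probability that at least one of n independent samples is correct,
    given that each sample is correct with probability p."""
    return 1.0 - (1.0 - p) ** n


# Illustrative only: a hypothetical 0.1% per-sample success rate.
for n in (1, 1_000, 100_000):
    print(n, round(chance_of_any_success(0.001, n), 3))
```

With these made-up numbers, a single sample almost never succeeds, while a thousand samples contain at least one correct program about 63% of the time.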
