Thanks to the Defense Advanced Research Projects Agency, millions of people across the country now have a bullet-proof excuse ready when their bosses see them playing online games during work hours.
That is, as long as they’re playing “CircuitBot,” “Flow Jam,” “Ghost Map,” “StormBound” or “Xylem” — the five online games developed by DARPA to help the Defense Department find errors in the millions of lines of software code used in everything from intelligence systems to fighter jets.
DARPA on Dec. 4 revealed a new initiative that’s been in development since 2011 that leverages human interactions with online games to find potentially dangerous errors in software code that could lead to security vulnerabilities in major weapons systems or failures in combat.
Known as Crowd Sourced Formal Verification, the program seeks to enhance the ability of the Pentagon to verify error-free code at a time when there are only a limited number of engineering experts available to pore through millions of lines of code in a single, modern weapon system. In fact, the environment is so challenging that the average modern weapon system has more lines of code than the 1,000 formal verification engineers currently available in the U.S. could handle in an entire year.
So how can an online game verify code? Well, if you’re DARPA that’s simple. You hire a bunch of game developers to create fun, challenging mind games and map the human interactions with the games to specific mathematical and code processes you want to test. The more people you get to play the games online, the more varied the test interactions and the more likely you are to find potential problems in the code.
Drew Dean, program manager at DARPA’s Information Innovation Office, acknowledged during the formal kickoff of the program that “this was a fairly out-of-the-box concept, even for DARPA.” But the idea, at its core, was simple. Formal verification “experts are scarce and expensive. So the question is, can we replace them with the crowd?” Dean said.
Formal verification of software code is a national security priority. Every modern computer or weapon system used in the military is powered by operating systems and applications that have millions of lines of code. But there are only 1,000 known code verification experts in the U.S. and about 4,000 worldwide. The scarcity of these experts has increased the cost of verification significantly, often doubling the cost of development.
But one of the main questions driving the DARPA program is whether the current error rate in code — about five errors per 1,000 lines of code — is the result of a scientific flaw in the process or other, external forces. What DARPA discovered was that external forces, such as cost factors and market pressures, are the real reasons for an error rate that hasn’t changed in the last 20 to 30 years.
Some have looked at developing automated formal verification tools, but without success. According to Dean, those tools have shown an unacceptable false positive rate. The only answer seemed to be applying as much brainpower at the problem as possible, without requiring those taking part to be software development experts.
“We’re seeing if we can take really hard math problems and map them onto interesting, attractive puzzle games that online players will solve for fun,” Dean said. “By leveraging players’ intelligence and ingenuity on a broad scale, we hope to reduce security analysts’ workloads and fundamentally improve the availability of formal verification.”
And so was born the Verigames Web portal, which offers five different games anybody with any computer skill level can play.
How does it work?
The games translate players’ actions into program annotations and generate mathematical proofs to verify the absence of important classes of flaws in software written in the C and Java programming languages.
Working with game developers, DARPA has developed an automated process that enables the creation of new puzzles for each math problem the program seeks to review. If gameplay does reveal potentially harmful code, DARPA will implement approved notification and mitigation procedures, including notifying the organization responsible for the affected software. Because the program verifies open source software commercial, government and defense systems may use, prompt notification is essential to correct the software rapidly and mitigate risk of security breakdowns.
DARPA believes crowdsourced verification will enable verification of systems with up to 4 million lines of code, such as the F-22 fighter.
‘Ghost Map’
As part of the program, Raytheon BBN Technologies developed a game called “Ghost Map,” a puzzle game where players attempt to free the brain of a cybernetic life form from consciousness restraints. Each step a player makes toward consciousness corresponds to a step toward a software correctness proof in the hidden back-end system.
A beta release of Ghost Map is available now and has already attracted thousands of users and constructed dozens of correctness proofs for test software.
“Each of the five games relies on a different proof technique,” said Ronald Watro, lead scientist for the Proof by Games Project at Raytheon BBN Technologies.
“The back-end proof system for ‘Ghost Map’ is called Model Checking,” Watro said. ”Using the model-checking approach, a common software vulnerability is reduced to a skeleton pattern, which is matched against the software being analyzed by an automated process. Each pattern match is converted into a game level, and if all game levels are successfully completed, then we have a proof that the vulnerability is not present in the software.”
Available games on Verigames
- “CircuitBot”: Link up a team of robots to carry out a mission.
- “Flow Jam”: Analyze and adjust a cable network to maximize its flow.
- “Ghost Map”: Free your mind by finding a path through a brain network.
- “StormBound”: Unweave the windstorm into patterns of streaming symbols.
- “Xylem”: Catalog species of plants using mathematical formulas.