During a modern airplane’s flight, hundreds of millions of lines of code will run. Software is present almost everywhere in the aircraft, from mundane components like galley equipment to highly critical ones such as flight control systems.
Each line of code has to be checked for faults. Each software unit has to be tested to see how it integrates with other units, and then tested at a higher systems level. It’s a laborious process. Increasingly, cost overruns and delays in airplane development are linked to software testing. But it’s vital to get software right.
“If software fails, you can get awful incidents like the Air France flight 447 A330 crash over the Atlantic
in 2009, which killed everyone on board on the way from Brazil to Paris. It was caused by one hardware failure, but the root of the incident was the software’s inability to recognize and report the hardware fault,”
says Dylan Llewellyn, international sales manager at software company QA Systems.
Critical planning
The amount of software on aircraft is set to increase. For example, as engines evolve they are becoming more software dependent. FADEC (full authority digital engine control) engines are entirely controlled by software. Among commercial aircraft, the A380 is well known for having large amounts of code. But nothing compares to the amount of code within the F-35 jet fighter, which has been beset by long delays because of the mind-boggling complexity of its software development.
To avoid disasters such as Flight 447, and to reduce costs, software testing should be done as early as possible in a development program. Massimo Bombino, an expert in avionics software and regional manager of Southeast Europe for Vector Software, recommends planning software testing from “day zero” of a development program and forming small agile teams of software developers. These teams should plan all the software unit testing and integration testing from day zero to minimize the risk of nasty surprises and last-minute regression testing, which re-runs existing tests after a change to confirm that nothing that previously worked has been broken.
“Regression testing is a nightmare and a big issue for the whole software industry. It’s especially a challenge
with safety-critical aviation software. Everything can be running perfectly, then at the last hurdle you introduce a new element and it fails. It’s very tricky and time-consuming to solve unless you have advanced technology,” says Bombino.
Bombino believes there are two ways to avoid a last-minute regression testing crunch. The first is to test rigorously from the start. But if there are too many problems and it is too late for that, it is better to conduct change-based testing. Vector Software has technology that enables a subset of tests to be run according to just the code changes,
thereby improving the efficiency of testing.
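As a rough sketch of the idea behind change-based test selection (the file names, test names and coverage mapping below are invented for illustration and do not represent Vector Software's actual tooling), the selection step can be thought of as intersecting each test's known coverage with the set of changed files:

```python
# Illustrative sketch of change-based test selection (hypothetical names).
# Each test is mapped to the source files it exercises, and only tests that
# touch a changed file are re-run.

# Hypothetical mapping from test name to the source files it covers.
TEST_COVERAGE = {
    "test_fuel_flow_limits":  {"fadec/fuel_flow.c", "fadec/limits.c"},
    "test_thrust_schedule":   {"fadec/thrust.c"},
    "test_sensor_vote_logic": {"sensors/vote.c", "fadec/limits.c"},
}

def select_tests(changed_files, coverage=TEST_COVERAGE):
    """Return only the tests whose covered files intersect the change set."""
    changed = set(changed_files)
    return [test for test, files in coverage.items() if files & changed]

if __name__ == "__main__":
    # Only the two tests that exercise fadec/limits.c are selected for re-run.
    print(select_tests(["fadec/limits.c"]))
```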
Independent verification
According to DO-178C, the primary document by which the FAA, EASA and Transport Canada have agreed to approve all commercial software-based aerospace systems, software must be verified by parties independent of those who developed it. The verification work is therefore largely outsourced to third parties, although some aircraft OEMs use separate in-house teams. These independent parties perform verification and validation testing to find defects and confirm that the code meets its requirements.
Tucker Taft, a computer scientist and director of language research at software developer AdaCore, says that another problem is that if testers find lots of faults at the verification and validation stage, the errors can be hard to trace back to the original software developers.
“The old-fashioned way of putting software together until you think it’s worth testing doesn’t work with hundreds of millions of lines of code,” says Taft. “Smart companies today frontload their testing processes by finding faults during design and not waiting until integration testing.
“The most modern, agile companies use test-driven development, where you don’t start writing code until you’ve written the pre-imposed conditions that it must pass. The secret is to make testing a fully fledged part of the process.”
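A minimal sketch of the test-driven workflow Taft describes might look like the following, with the tests written before the function they exercise. The function, signal names and thresholds are hypothetical, chosen only to illustrate the order of work:

```python
import unittest

# In test-driven development the tests come first: they state the conditions
# the code must pass before any implementation exists.
class TestAltitudeAlert(unittest.TestCase):
    def test_alert_raised_below_threshold(self):
        self.assertTrue(altitude_alert(altitude_ft=450, threshold_ft=500))

    def test_no_alert_at_or_above_threshold(self):
        self.assertFalse(altitude_alert(altitude_ft=500, threshold_ft=500))

# Only once the tests exist is the code written to make them pass.
def altitude_alert(altitude_ft, threshold_ft):
    return altitude_ft < threshold_ft

if __name__ == "__main__":
    unittest.main()
```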
When Taft tests software for faults, he carries out static analysis, in which a tester tries to mathematically prove, without running the code, that certain classes of error cannot occur. Experience has taught him that certain types of software are hard to examine using the conventional tests written by programmers and that it can be more straightforward to obtain formal proof of a software unit’s safety using static analysis. However, the exact opposite can be the case with some other types of software units.
“For part of the system, you focus on mathematical proofs and for other parts you use a more dynamic testing strategy,” Taft says. “Customers often like a combined strategy because they get the confidence
of mathematical proofs.”
As part of more formal testing for certification, Taft prefers to define pre- and post-conditions for software. Pre-conditions state the properties the data and system state must have before a component is called; unless the software is in an appropriate state, the call is not allowed to proceed. Testing can verify that the pre-conditions are satisfied. The post-conditions then state what the caller should get back from the component when it completes.
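In certified avionics these contracts are usually expressed with features of the implementation language and discharged by formal tools, but the idea can be sketched with plain runtime assertions. The component, ranges and scaling rule below are hypothetical:

```python
def command_elevator(deflection_deg, airspeed_kt):
    """Hypothetical component illustrating pre- and post-conditions as checks."""
    # Pre-conditions: the caller must supply data in the agreed, valid ranges.
    assert -30.0 <= deflection_deg <= 30.0, "pre: deflection outside limits"
    assert airspeed_kt >= 0.0, "pre: negative airspeed"

    # Body: scale the commanded deflection down at high airspeed (illustrative only).
    scale = 1.0 if airspeed_kt < 250.0 else 0.5
    commanded = deflection_deg * scale

    # Post-condition: the value handed back stays within the actuator's range.
    assert -30.0 <= commanded <= 30.0, "post: command outside actuator range"
    return commanded
```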
“When you use pre- and post-conditions, you can do a lot more formal verification and determine whether the software all fits together,” Taft says. “It makes it easier to do integration testing, which has always been one of the greatest challenges, as each contractor builds and tests in isolation, but when you put it together something inevitably goes wrong.”
Models and fuzz
Developments in technology are helping to reduce the software testing workload. As well as automated testing, the use of mathematical modeling for software testing has evolved rapidly in recent years. It can now simulate with great precision what an airplane will do when the software is installed.
Large manufacturers, such as Boeing and Airbus, have embraced the model-based approach to testing software and have created detailed, accurate models. These models can be shared with subcontractors so that they can carry out hardware-in-the-loop tests, in which real hardware is exercised against a real-time simulation of the rest of the aircraft. “These simulations require a lot of horsepower, so it’s not trivial to create them, but they’re worth their weight in gold,” Taft says.
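Conceptually, a hardware-in-the-loop rig closes a loop between the simulated aircraft and the unit under test on every frame. The toy sketch below stubs both sides in software purely to show the data flow; the models, signals and gains are invented and bear no relation to any real rig:

```python
import math

class SimulatedAircraft:
    """Toy plant model standing in for the detailed aircraft simulation."""
    def __init__(self):
        self.pitch_deg = 0.0

    def sensors(self, t):
        # Fabricated disturbance so the controller has something to correct.
        return {"pitch_deg": self.pitch_deg + 2.0 * math.sin(t)}

    def apply(self, elevator_cmd_deg, dt):
        self.pitch_deg += -0.1 * elevator_cmd_deg * dt

class DeviceUnderTest:
    """Stand-in for the real controller hardware on the test bench."""
    def respond(self, sensor_frame):
        # Simple proportional response; a real box would run certified code.
        return {"elevator_cmd_deg": -1.5 * sensor_frame["pitch_deg"]}

def run_hil(steps=100, dt=0.02):
    plant, dut = SimulatedAircraft(), DeviceUnderTest()
    for i in range(steps):
        frame = plant.sensors(i * dt)      # simulation -> hardware
        cmd = dut.respond(frame)           # hardware -> simulation
        plant.apply(cmd["elevator_cmd_deg"], dt)
        # A real rig would log both directions and check them against expectations.
    return plant.pitch_deg

if __name__ == "__main__":
    print(f"final pitch: {run_hil():.3f} deg")
```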
Another trend is ‘fuzz testing’, which involves blasting the software with large amounts of random data, called fuzz, to try to break it.
Taft compares fuzz testing to the tricks white-hat hackers use to expose weaknesses in corporate security systems. “Fuzz testing hasn’t yet reached the practical level, but there’s a lot of research in academic circles and I think it will play a big part in future testing. You never know when a hardware failure might generate random data, or it could be a hacker trying to break in,” he says.
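A minimal fuzzing harness can be sketched as follows. The parser is a hypothetical stand-in for the code under test, with a defect planted so the fuzzer has something to find; any failure other than a clean rejection is recorded for investigation:

```python
import random

def parse_message(data: bytes):
    """Hypothetical parser standing in for the code under test (contains a planted bug)."""
    if len(data) < 2:
        raise ValueError("message too short")
    length = data[0]
    payload = data[1:1 + length]
    # Planted defect: the trailing checksum byte is read without checking that
    # the buffer is long enough, so some inputs raise IndexError instead of
    # being rejected cleanly.
    checksum = data[1 + length]
    if sum(payload) % 256 != checksum:
        raise ValueError("bad checksum")
    return payload

def fuzz(iterations=10_000, max_len=64, seed=0):
    """Blast the parser with random data and collect inputs that break it."""
    rng = random.Random(seed)
    crashes = []
    for _ in range(iterations):
        data = bytes(rng.randrange(256) for _ in range(rng.randrange(max_len)))
        try:
            parse_message(data)
        except ValueError:
            pass                      # rejected cleanly: expected behaviour
        except Exception:
            crashes.append(data)      # anything else is a finding to investigate
    return crashes

if __name__ == "__main__":
    print(f"{len(fuzz())} crashing inputs found")
```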
Despite these technical advances, most aerospace companies find software testing onerous. QA Systems’ Dylan Llewellyn was part of a team testing the power distribution software on the 777X. The code was written in blocks and tested to ensure that when it was added to the mix it didn’t have a negative effect on what was already there. “If you have bad code in legacy code it causes havoc,” he says. “That’s why when they went from the Boeing 777-300 to the 777X, they didn’t use any of the legacy code.”
Part of the process
The procedure was fairly typical, with multiple teams, each consisting of four or five engineers, working independently to develop and test the software. Meanwhile, a team of around 60 independent experts carried out the verification and validation work.
“There was a huge amount of testing done for the 777X. We had dozens of engineers testing one block of code for 10 months,” says Llewellyn.
Testing of the power distribution software started on what is called non-flyable code A. The teams continued all the way to non-flyable code W, testing various units and integrations. By the time they got to that stage they had tested everything, including at systems level. The next stage was to test the flyable Y level. The teams then installed the software on an iron bird and tested it for redundancies and multiple systems failures before it was installed in a test aircraft.
With software playing such an integral role in modern airplanes, it is perhaps not surprising that software testing is such an involved endeavor, and one that can often prove problematic. But as the complexity and amount of code on aircraft increase, aerospace companies will be forced to get to grips with software testing through both better technology and better management of the process.
This article was originally published in the September 2018 print issue of Aerospace Testing International.