Unless airlines improve their IT practices, mega-million-dollar stoppages and passengers tweeting their fury will continue, says Bill Curtis, Senior Vice President and Chief Scientist, CAST.
Nine-digit defects are glitches in IT systems that cause damages over $100 million. Ten-digit damages would be the $8.7 billion the Federal Aviation Administration estimates ground stoppages cost the airline industry every year. IT and its occasional nine-digit defects contribute heavily to these losses.
Last August, ‘technical issues’ at Delta Air Lines forced it to cancel over 2,300 flights. The delays were so expensive that Delta downgraded its profit guidance for the third quarter – a $100m revenue hit. The culprits were several hundred computers that were not connected to the backup system meant to keep service running after a primary system outage.
A month later, many thousands of British Airways passengers suffered hours of delays, some lasting overnight. Stressed ticketing staff presented passengers with handwritten boarding passes due to a software glitch in BA’s new check-in system.
Several factors make airline IT systems prone to failures like these:
- Launching aeroplanes requires seamless interaction among myriad systems: reservation, check-in, baggage handling, cargo, no-fly checks, fuel projection, flight planning and more. A single failed interaction among these systems can cause an incident, and some incidents have global implications.
- These systems have grown staggeringly complex as the airline industry has grown. No single IT professional or team can understand all the systems and their exponentially complex interactions, and too many teams lack the support of sophisticated analysis tools.
- These systems were built in different generations, forcing airlines to integrate different technologies with different architectures. In addition, many of these systems were developed by different companies at different levels of rigour.
- Mergers inject new flaws into systems that had previously worked well. Business processes, IT infrastructures, application portfolios and reams of data must be merged. Unachievable schedules are a frequent cause of software flaws because rushed developers make too many mistakes and do not have time to find them.
- Some airlines underinvest in the staffing, training and infrastructure required to ensure dependable operations. However, inadequate bandwidth, poor backup and other infrastructure shortfalls are easier to remediate than complex software problems.
All major carriers have experienced expensive IT glitches in recent years. IT problems interrupting business operations are not limited to the airline industry. Many banks and retail chains have suffered embarrassing nine-digit IT defects as well. However, airline outages fill terminals with delayed passengers tweeting their disgust while news outlets capture video. In the airlines’ defence, few industries require such intricate and intertwined logistics on a global scale to conduct basic business operations.
Yet this raises the question: “When modern aircraft are flown by some of the most sophisticated avionics software ever written, why can’t airlines build IT systems that can stay aloft?” There is no simple answer.
How can these problems be reduced, even as airline operations grow more complex? Here are three recommendations:
- Initiate a rigorous dependability assurance program immediately. Evaluate all operationally critical systems for correctness and engineering soundness before each deployment. Structural flaws in the source code of computer systems are the most frequent causes of operational incidents, which, in my own experience as a frequent flyer, occur far more often than the press reports.
Evaluating these IT applications at the system level is critical. Many disastrous incidents are caused by flawed interactions between different parts of the system. These flaws are detected only by evaluating the system end to end: from user entry points, through its processing, its querying of the database and its possible interaction with other systems, to its response back to the user – which might be another system. Rigorous dependability assurance is not cheap, but the return on investment is large compared to nine-digit damages. High-quality software takes less time to develop, is much cheaper to maintain and is more speedily enhanced at the pace of business.
- Software-intensive systems must be built in a disciplined environment that provides the time and resources to do professional work. Rushed systems always cost more because of the staggering cost of correcting mistakes, not to mention the damages resulting from serious incidents.
- Software delivered by third parties such as outsourcers, system integrators and software vendors must be thoroughly evaluated before entering operations. In addition, vendors should be evaluated before contracting to ensure their development practices are rigorous and they can retain key staff for the duration of the project.
IT systems reflect the growing complexity of airline operations. There is a level of natural and sustainable IT complexity that roughly matches the intricacy of business operations. Then there is a level of complexity that emerges through integrating systems from different generations, but this complexity can be reduced through IT modernization. Finally, there is a level of complexity that results from bad development caused by poor planning, contorted designs, rushed projects and inadequate testing. Without attacking this third form of complexity, airports will need to increase their supply of cots and blankets.