Voices for Safer Care

Insights from the Armstrong Institute

AI on AI: Lessons from Rocket Science to Inform Healthcare Artificial Intelligence

By Richard Day.

The loss of the Space Shuttle Columbia and seven astronauts in February 2003 is a tragedy that provides essential learning for organizations and industries beyond NASA and aerospace. In fact, one overarching finding from the in-depth investigation into the technical and organizational causes of the accident concluded that NASA “has not demonstrated the characteristics of a learning organization.”1 Performance as a learning organization is intertwined with the principles of high reliability.2 This and other lessons from Columbia can be applied to healthcare. The National Academy of Medicine (NAM) has promoted multiple facets of a continuously learning healthcare system for several years. One of the primary goals in the NAM Strategic Plan 2024-2028 is to “catalyze transformation toward a health system that is effective, efficient, equitable, affordable, and continuously learning.”3

Figure 1. Space Shuttle Columbia is illuminated with floodlights on Launch Pad 39A during loading tests prior to the launch of STS-3. (Credit: NASA)

What can healthcare learn from rocket science?

This article begins a series of lessons from aerospace (a.k.a. rocket science) that can be applied to advance the safety, quality, and efficiency of healthcare. With unique experience in both of these industries, I find significant commonality in their challenges and opportunities with respect to continuous learning and high reliability. These lessons are timeless and applicable to many aspects of contemporary healthcare. In particular, they provide important insights that can be tailored for the current development and operational deployment of artificial intelligence (AI) and precision medicine data science to transform healthcare in a safe, reliable, and equitable manner. Continuous learning and high reliability principles are essential components of healthcare transformation.

Space Shuttle Columbia: from heyday to disaster

Columbia was the first Space Shuttle in NASA’s eventual fleet of five spaceships. It was being prepared for its first launch just as I joined NASA after college graduation. I worked on the premier space science mission using the Shuttle – a huge 8,700 lb. pallet of scientific instrumentation to fly on the third flight of Columbia (Figure 1). Coincidentally, the pilot for our mission was the same astronaut who had awarded me a science fair blue ribbon nearly a decade earlier. There was worldwide excitement for the future of space exploration, as evidenced by the continuous flow of domestic and international dignitaries and television journalists. I was often designated to speak to these distinguished groups, including astronaut candidates and the author James Michener, who was researching his book Space.

The deliberate process of integrating and testing the complex set of one-of-a-kind scientific instrumentation, both separately and together with the even more complex Space Shuttle, culminated years of work and required months at the Kennedy Space Center prior to the dramatic launch. While supporting the tightly orchestrated mission operations from mission control in Houston, I learned the importance of contingency preparations as the seven-day mission was extended due to weather conditions at the primary landing sites in California and Florida. Landing was eventually diverted to the desert of New Mexico, the only time in the entire Shuttle program that a contingency landing occurred at White Sands (Figure 2).

Figure 2. Space Shuttle STS-3 landing in the White Sands, New Mexico desert (Credit: NASA)

I could not have imagined that Columbia and its crew would meet such a tragic end 25 missions and 21 years later. Although I was not involved with the Space Shuttle program or that particular mission, I had developed a keen sense for things that can go wrong in the risky business of space exploration, and this accident scenario was a complete surprise. My leadership role at the time had been created to enhance the rigor of program management, systems engineering, and mission assurance for robotic space and Earth science missions, following some avoidable mishaps and near misses in the context of "faster, better, cheaper" mandates. The Columbia Accident Investigation Board (CAIB) developed findings over a six-month period, followed by a two-year return-to-flight process for the Space Shuttle program. Organizational governance and knowledge management changes were required throughout NASA, and I contributed to them.

“NASA has not followed its own rules and requirements on foam shedding” (CAIB finding F6.1-1)

The CAIB determined the proximate cause of the accident was “a breach in the Thermal Protection System on the leading edge of the left wing, caused by a piece of insulating foam which separated from the . . . External Tank at 81.7 seconds after launch, and struck the wing . . .”1 Sixteen days later during re-entry, this hole in the wing’s thermal protection resulted in melting of the aluminum wing structure that led to loss of control and breakup of Columbia in the skies over the Western U.S.

A critical part of the learning from Columbia is that the risk of foam shedding was identified during the design process. According to the CAIB, Space Shuttle design requirements were documented early in the program to prevent shedding of debris that could “jeopardize the flight crew, vehicle, mission success, or would adversely impact turnaround operations.” However, none of the flights met this debris-prevention requirement, and this was the 14th mission to experience significant damage during launch. This prospective risk materialized as a real, recurring safety-of-flight issue. This serious issue further evolved into events documented as “in-family”, “within experience base”, and “accepted risk”. These terms indicate the damage had come to be considered normal and more of a nuisance for ground processing. On this final flight, foam shedding and potential damage were identified early in the mission based on launch and ascent video. However, the management team focused on the time required for repairs prior to the next flight rather than the risk to flight safety.

Avoid oversimplification

This evolution from risk to issue to normal illustrates the antithesis of “reluctance to simplify”, one of the essential high reliability principles.2 Avoiding oversimplification is critically important with respect to complex systems and a principle that I advocate strongly. However, my observation is that organizations find avoiding oversimplification to be elusive in practice. Oversimplification will be discussed further in future segments, along with the lack of attention to specific warnings of safety-of-flight issues in Columbia, Challenger, and other case studies.

Managing risks and issues in healthcare: innovations and insights

These same lessons hold true within healthcare: we need to be alert for risks, defined by a future probability and severity of occurrence, that materialize into real-life issues. Risks and issues are distinctly different, but require similarly rigorous treatment. Risks require robust mitigation to eliminate or reduce the probability and/or the severity should the risk materialize. Issues may arise from risks that were identified but insufficiently mitigated, or they may be completely unforeseen. Issues are best resolved with robust corrective actions along with effective actions to prevent recurrence. Process discipline is needed to defend against risks and issues evolving into implicitly accepted risk. Indications that such an evolution is occurring may be subtle and go unnoticed.
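
To make the distinction concrete, here is a minimal sketch of a risk register in Python. It is an illustration only, not a tool described in this article or used by NASA or any hospital: the `Risk` class, the 1-to-5 severity scale, the probability-times-severity score, the threshold of three occurrences, and all of the example entries and numbers are assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    """Hypothetical risk-register entry; every field name is illustrative."""
    name: str
    probability: float           # estimated likelihood of occurrence, 0.0 to 1.0
    severity: int                # assumed scale: 1 (minor) to 5 (catastrophic)
    occurrences: int = 0         # times the risk has materialized as an issue
    corrective_actions: int = 0  # corrective/preventive actions completed

    def score(self) -> float:
        # One simple convention (an assumption here): probability x severity.
        return self.probability * self.severity

    def is_issue(self) -> bool:
        # A risk that has occurred at least once is now an issue.
        return self.occurrences > 0

    def drifting_to_accepted(self, threshold: int = 3) -> bool:
        # Repeated occurrences with no completed corrective action: the
        # pattern of an issue quietly becoming "accepted risk".
        return self.occurrences >= threshold and self.corrective_actions == 0


# Made-up entries for illustration; these are not real data.
register = [
    Risk("wrong-patient order entry", probability=0.02, severity=5),
    Risk("infusion pump alarm silenced", probability=0.30, severity=4,
         occurrences=6),
    Risk("specimen mislabel", probability=0.05, severity=3,
         occurrences=2, corrective_actions=1),
]

# Entries that recur without any corrective action deserve escalation.
flagged = [r.name for r in register if r.drifting_to_accepted()]

# Rank all entries by score so the highest exposure is reviewed first.
ranked = sorted(register, key=Risk.score, reverse=True)

print("Drifting toward accepted risk:", flagged)
print("Review order:", [r.name for r in ranked])
```

The point of the `drifting_to_accepted` check is that it looks at history rather than at the current score alone: an issue can look tolerable each time it occurs and still show the recurring, uncorrected pattern that the foam-shedding record showed.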

Opportunities for AI to augment human cognition and decision-making

Consider how the Columbia accident might have been prevented if modern AI had been available and deployed at the time. Machine learning might have augmented human capacity to analyze the longitudinal flight data set and to understand the emergent risk. Perhaps AI solutions would have drawn attention to the evolution in event documentation and to the insufficient corrective and preventive action. With robust AI tools, the disconnect between the design requirements and actual implementation could have been addressed. With autonomous image analysis and digital modeling and simulation (i.e., a digital twin), a precise damage assessment and a projection of catastrophic failure upon re-entry might have been available immediately upon launch. Well-validated AI-based tools might have given mission managers and crew the time-limited opportunity to execute a suborbital mission abort scenario, such as a return to the Kennedy launch site or a transoceanic abort to a contingency landing site in Western Europe or Africa.

Could these AI use cases have analogs in healthcare?

It has been suggested that despite the development and maturation of patient safety, there is no sign of an overall decrease in adverse events. Worldwide, the frequency of patient harms in acute care hospitals is approximately 10%. The National Academies of Sciences, Engineering, and Medicine has stated that “the country is at a relative standstill in patient safety progress,” with an estimated 5% of all patients experiencing preventable harms. "12 million Americans are affected by diagnostic errors each year, and perhaps one-third are harmed as a result." An estimated 795,000 Americans become permanently disabled or die annually across care settings because dangerous diseases are misdiagnosed. "Of the 50 million people in the United States who have surgery each year, approximately one million develop serious complications and more than 150,000 die within 30 days." Over 1 million provider-device interactions result in medical errors each year in the United States.

These figures paint a compelling picture of the opportunity space for AI augmentation to facilitate continuous learning and high reliability in healthcare.

So, are AI use cases analogous between rocket science and healthcare?

There are, of course, significant distinctions between the delivery of health care and the delivery of astronauts to Mars. However, the complexity of systems, the margins for error, and the consequences for failure offer very similar challenges and opportunities. As two scientific fields on a common journey toward increasingly higher levels of safety and mission success, it is both appropriate and of paramount importance that we share our lessons and best practices for maximal public benefit. For example, a core aspect these fields have in common is the human element. As we well know, to err is human. Both fields will benefit from AI to share the burden of routine tasks, and to complement human patterns of behavior and thinking. AI will help focus attention on the crucial tasks at hand, and alert us to incremental deviations from the expected system performance. AI will amplify weak signals of preventable harms that lead to system failures with devastating impacts.

These hypothetical advanced technology solutions were not available twenty years ago to help prevent the Columbia accident or catastrophes in healthcare. However, we are now on the cusp of developing and deploying advanced data science and AI solutions in healthcare to translate these lessons into real-life practice. Trusted, context-aware AI solutions can transform healthcare by augmenting human capacity for the identification, analysis, and mitigation of risks and issues with respect to patient safety and quality of care. AI will enable rapid root cause analysis with precise understanding of proximate and contributing causes. It will then suggest corrective and preventive actions and help evaluate the effectiveness of their implementation. A common challenge with hazard controls, and one that AI can now assist with, is verifying that established controls remain in place and effective over time. AI tools will help avoid oversimplification and facilitate continuous learning for effective, efficient, equitable, and affordable healthcare.

Rocket science provides a wealth of lessons that can be tailored for AI adoption in healthcare.

In addition to notable mishaps, there are numerous proven best practices from the vast majority of space missions that are successful and exceed expectations. These systems engineering and mission assurance practices should be applied to assure that healthcare AI is designed, verified, validated, deployed, and maintained in a safe, reliable, and effective manner.

In future articles we will discuss other relevant findings and lessons learned from aerospace. We will explore communication and decision-making in light of specific warnings of safety-of-flight issues with Columbia and Challenger. Lessons from other robotic mission mishaps that may sound like silly, obvious mistakes based on news reports or late-night comic musings will be translated for healthcare AI. The silly-sounding one-liners belie the subtle and pervasive nature of the true root cause and how similar errors are prone to be repeated.

To be continued . . .

About the author:

Richard M. Day, MS, SES NASA (ret)

Richard Day is a former NASA senior executive with leadership roles to assure safety and mission success. He co-chaired the Steering Committee for Mission Assurance Improvement among U.S. space agencies and major aerospace corporations during his tenure as Chief of Mission Assurance at the JHU Applied Physics Laboratory. This is his 12th year with Johns Hopkins Medicine and the Armstrong Institute working to eliminate preventable harm and improve patient outcomes through the application of systems engineering and mission assurance methodology. Similar to his work at the national level in aerospace, he served for several years as an active member of the Steering Committee for the Chief Quality Officer Network of Vizient. Richard now plays a key role in the development and operational deployment of artificial intelligence and precision medicine data science to enhance the safety and efficiency of healthcare with a focus on surgery and perioperative care.

Cited Works

1. Columbia Accident Investigation Board. Report, Vol. 1. Washington, D.C.: Government Printing Office, August 2003.

2. Weick, Karl E., and Kathleen M. Sutcliffe. Managing the Unexpected. 3rd ed. United States: John Wiley & Sons, 2015. 1–3.

3. National Academy of Medicine. NAM Strategic Plan 2024-2028. 2024.


The opinions expressed here are those of the author and do not necessarily reflect those of The Johns Hopkins University.

