The Decision That Was Documented
In the development records of the Boeing 737 MAX, released under legal discovery in 2020 and subsequently reviewed by the Joint Authorities Technical Review (JATR) convened by the FAA, a striking sequence of design decisions is visible. The Manoeuvring Characteristics Augmentation System — MCAS — was a software function added to the MAX's flight control system to counteract a tendency toward pitch-up at high angles of attack introduced by the aircraft's new, larger LEAP-1B engines mounted further forward and higher on the wing. MCAS was a stability-augmentation feature; in its final production implementation, it was also, in the exact terminology of the JATR's 2019 report, "a single point of failure linked to a single angle-of-attack sensor."
The original MCAS design used inputs from both of the aircraft's two angle-of-attack (AoA) sensors. A comparison of both sensor readings would allow the system to detect if either sensor had failed — one common failure mode for AoA sensors is a physical impact from a bird or debris that causes the vane to jam at an incorrect position. If the two sensors disagreed by more than a defined threshold, MCAS would stop activating. This architecture treated the single-sensor failure mode as a credible risk and designed a detection mechanism around it.
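The cross-check described above can be sketched in a few lines of Python. This is a minimal illustration, not Boeing's logic: the function names, the 5° disagree threshold, and the 14° activation angle are all hypothetical placeholders, since the text says only that a "defined threshold" existed.

```python
# Hypothetical values for illustration only; the source gives no figures.
DISAGREE_THRESHOLD_DEG = 5.0   # maximum tolerated left/right difference
ACTIVATION_AOA_DEG = 14.0      # angle of attack at which MCAS would engage


def sensors_agree(left_aoa_deg: float, right_aoa_deg: float,
                  threshold_deg: float = DISAGREE_THRESHOLD_DEG) -> bool:
    """Return True when the two vanes are within the disagree threshold."""
    return abs(left_aoa_deg - right_aoa_deg) <= threshold_deg


def mcas_permitted(left_aoa_deg: float, right_aoa_deg: float) -> bool:
    """Allow activation only when both sensors agree AND indicate high AoA."""
    if not sensors_agree(left_aoa_deg, right_aoa_deg):
        # A jammed or miscalibrated vane shows up as a disagreement,
        # so the function is inhibited instead of trusting either reading.
        return False
    mean_aoa_deg = (left_aoa_deg + right_aoa_deg) / 2.0
    return mean_aoa_deg >= ACTIVATION_AOA_DEG


print(mcas_permitted(15.0, 14.5))  # both high, in agreement
print(mcas_permitted(25.0, 5.0))   # one vane reading 20 degrees high
```

The design choice worth noticing is that disagreement inhibits the function rather than picking a "winner": with only two sensors there is no way to tell which one has failed.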
During development, Boeing engineers expanded MCAS's maximum stabilizer trim authority from 0.6° to 2.5° per activation so that the system would be effective across a broader flight envelope. At the same time, the analysis supporting MCAS certification used a single-failure hazard classification of "major" (serious but not catastrophic), based on the assumption that pilots would recognise a runaway stabilizer condition and respond with the existing runaway stabilizer procedure. The first casualty of the design process was the dual-sensor input architecture: it was removed in the final production design to reduce complexity and to avoid the crew training requirements that a disagree annunciation would have triggered. The combination of increased authority, an unchanged failure classification, and a single sensor input created the preconditions for the 346 deaths that followed.
A Safety System With Negative SRR
The Safety Return on Redundancy (SRR) framework applied to MCAS asks a specific question: did the MCAS system, as implemented in the Boeing 737 MAX production configuration, reduce the aircraft's overall failure probability per unit cost of its inclusion? The answer is quantifiably negative — MCAS-as-implemented had an SRR below zero, meaning it increased the system's overall failure probability by introducing failure modes that exceeded the risk it was designed to address.
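One way to read "reduction in failure probability per unit cost" is as a simple ratio. The sketch below assumes that reading; the function name, the argument names, and the numbers passed to it are placeholders chosen to show the sign of the result, not estimates for any real aircraft.

```python
def safety_return_on_redundancy(p_fail_without: float,
                                p_fail_with: float,
                                cost: float) -> float:
    """Failure-probability reduction bought per unit of cost.

    Positive: the added system lowers overall failure probability.
    Negative: the added system introduces more risk than it removes.
    """
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (p_fail_without - p_fail_with) / cost


# Placeholder inputs only: a feature that cuts failure probability...
helpful = safety_return_on_redundancy(1e-6, 1e-7, cost=100.0)
# ...and one that adds a new failure mode larger than the risk it addresses.
harmful = safety_return_on_redundancy(1e-7, 1e-6, cost=100.0)

print(helpful > 0, harmful < 0)
```

Under this reading, a negative SRR does not depend on the cost figure at all: the sign is set entirely by whether the "with" probability exceeds the "without" probability.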
When Adding Safety Creates Danger
The Single-Sensor Architecture and Its Consequences
MCAS in the 737 MAX production configuration received angle-of-attack data from a single AoA sensor, selected alternately on each flight by the aircraft's flight management system (left sensor on odd flights, right sensor on even flights). The system had no cross-comparison capability — it received a sensor reading and acted on it without validation against the other sensor. A single AoA sensor producing an erroneous high-angle-of-attack reading would cause MCAS to activate repeatedly, applying nose-down stabilizer trim, regardless of actual flight conditions.
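The single-sensor path can be sketched as follows. The odd/even selection rule comes from the paragraph above; the function names and the 14° activation angle are hypothetical, and the real system's logic was considerably more involved.

```python
ACTIVATION_AOA_DEG = 14.0  # hypothetical activation angle, for illustration


def selected_sensor(flight_count: int) -> str:
    """Left sensor on odd flights, right sensor on even flights."""
    return "left" if flight_count % 2 == 1 else "right"


def single_sensor_activates(flight_count: int,
                            left_aoa_deg: float,
                            right_aoa_deg: float) -> bool:
    """Act on whichever sensor is selected, with no cross-comparison."""
    if selected_sensor(flight_count) == "left":
        reading_deg = left_aoa_deg
    else:
        reading_deg = right_aoa_deg
    # The unselected sensor is never consulted, so a bad reading on the
    # selected side is indistinguishable from a genuine high angle of attack.
    return reading_deg >= ACTIVATION_AOA_DEG


# True angle of attack is 5 degrees; the left vane reads 20 degrees high.
print(single_sensor_activates(1, left_aoa_deg=25.0, right_aoa_deg=5.0))
print(single_sensor_activates(2, left_aoa_deg=25.0, right_aoa_deg=5.0))
```

With the same faulty left vane, the outcome depends only on which flight it is: the fault is acted on when the left sensor is selected and silently ignored when the right one is.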
The Lion Air Flight 610 accident sequence, reconstructed from the flight data recorder and cockpit voice recorder recovered from the Java Sea in the months after the crash, illustrates the MCAS failure mode with painful precision. On the morning of October 29, 2018, PK-LQP (a Boeing 737 MAX 8, seven weeks old) departed Jakarta Soekarno-Hatta International Airport. The left AoA sensor, which had been incorrectly installed during maintenance the previous day, was reading approximately 20° higher than the actual angle of attack. MCAS activated, pushing the nose down with approximately 2.5° of stabilizer trim. The crew used electric trim to counteract the nose-down motion. MCAS activated again five seconds after the last stabilizer trim input, pushing the nose down again. The crew trimmed again. This cycle repeated forty-one times in the approximately twelve minutes between takeoff and the aircraft's impact with the sea. None of the 189 people aboard survived.
The critical characteristic of this failure mode was that nothing presented to the crew before the crash identified it as a flight control system anomaly. The crew received a stick shaker activation (a high-angle-of-attack warning driven by the erroneous sensor), which they were attempting to address. But the MCAS activations themselves were not annunciated: MCAS was not described in the flight crew operations manual provided to Lion Air's pilots, and the crew had not received training on it. Discovery of MCAS's existence was, for many 737 MAX operators worldwide, a consequence of the Lion Air accident report, not of Boeing's pre-delivery disclosure.
The Authority Escalation and the Hazard Reclassification That Did Not Happen
The increase in MCAS maximum trim authority from 0.6° to 2.5° per activation, made during development without a corresponding reassessment of the MCAS failure hazard classification, is the specific engineering decision that transformed MCAS from a manageable nuisance into a potentially unflyable emergency. At 0.6° per activation, an erroneous MCAS activation would be detectable as an anomalous nose-down tendency and correctable through standard stabilizer trim countermeasures. At 2.5° per activation — approximately 2.5× the pitch authority of the MAX's control column — a single MCAS activation could overpower the crew's ability to maintain level flight using the control column alone, requiring explicit electric or manual stabilizer trim to counter.
Boeing's internal safety assessment for MCAS, reviewed by the FAA's Aircraft Evaluation Group during certification, categorised the MCAS failure scenario as "major" — meaning a hazard that could result in significant reduction of safety margins or physical distress to occupants but not in catastrophic outcomes. The "catastrophic" classification threshold (loss of aircraft with fatalities) would have required MCAS activation due to erroneous sensor input to be assessed as an extremely improbable event (failure probability less than 10⁻⁹ per flight hour under FAR 25.1309). The "major" classification allowed MCAS to depend on a single sensor and a single point of failure, because a single-point failure causing a "major" consequence — rather than "catastrophic" — did not trigger the independence requirements that FAA regulations impose on catastrophic failure mode pathways.
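The way a classification drives architecture can be made concrete with a small check. In the sketch below, the 10⁻⁹ target for "catastrophic" is the figure quoted above; the targets for "major" and "hazardous" are commonly cited guidance values included as assumptions, and the sensor failure rate is a purely hypothetical number. Treating a dual-sensor failure as the square of the single-sensor rate is a crude independence assumption, not a certification method.

```python
# Maximum allowed probability per flight hour for each classification.
# Only the "catastrophic" value appears in the text; the others are assumed.
MAX_PROBABILITY_PER_FLIGHT_HOUR = {
    "major": 1e-5,
    "hazardous": 1e-7,
    "catastrophic": 1e-9,
}

# Hypothetical single-sensor failure rate, chosen only for illustration.
SENSOR_FAILURE_RATE = 1e-5


def meets_target(failure_rate_per_hour: float, classification: str) -> bool:
    """Check a failure rate against the target for a hazard classification."""
    return failure_rate_per_hour <= MAX_PROBABILITY_PER_FLIGHT_HOUR[classification]


single_channel = SENSOR_FAILURE_RATE
# Crude assumption: both independent sensors must fail in the same hour.
dual_channel = SENSOR_FAILURE_RATE ** 2

for name, rate in (("single sensor", single_channel),
                   ("dual sensor", dual_channel)):
    for classification in ("major", "catastrophic"):
        verdict = "meets" if meets_target(rate, classification) else "fails"
        print(f"{name}: {verdict} the {classification} target")
```

With these illustrative numbers, the single-sensor design passes only if the hazard is classified "major"; reclassifying the same hazard as "catastrophic" makes it fail, which is the mechanism the paragraph above describes.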
The JATR investigation, examining Boeing's certification documentation and the FAA's review process, found that the assessment of MCAS as "major" rather than "catastrophic" depended on assumptions about crew response time that proved incorrect under the actual flight conditions of the Lion Air and Ethiopian accidents. The hazard was catastrophic. The classification was not. In the SRR framework, this is a fundamental failure of the risk estimate: the failure probability assigned to the MCAS failure scenario, and therefore the SRR calculated for the single-sensor architecture versus a dual-sensor architecture, was wrong by several orders of magnitude because the consequence severity was misclassified.
The Cost of the Sensor That Was Not Added
The angle-of-attack disagree warning system, an optional avionics package on the 737 MAX, cost approximately $80,000 per aircraft at Boeing's list price. This system would have provided a cockpit annunciation when the two AoA sensors disagreed beyond a defined threshold, alerting the crew to an instrument discrepancy and allowing MCAS to be identified as a potential source of the nose-down tendency. American Airlines and Southwest Airlines had ordered this equipment as standard. Lion Air and Ethiopian Airlines had not.
After the accidents, the JATR recommended that Boeing make the AoA disagree warning standard equipment on all 737 MAX aircraft rather than optional, a recommendation Boeing implemented. The $80,000 equipment package represents roughly 0.06% of the list price of a 737 MAX 8 (approximately $121–135 million at then-current catalogue pricing). In an SRR framework, the disagree warning system would have provided a meaningful probability reduction for the MCAS-activation-on-erroneous-sensor failure scenario, at a cost representing a trivial fraction of the aircraft's value. Its SRR, evaluated honestly against the failure scenario it was designed to address, was strongly positive. Its optional status, created by a production cost optimisation and a certification hazard classification that underestimated the consequence severity, meant the SRR contribution was not captured across the customer fleet.
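The share of list price is simple arithmetic, and it is worth checking against both ends of the quoted price range. The sketch below uses only the figures given above; the function and constant names are illustrative.

```python
OPTION_PRICE_USD = 80_000
# Low and high ends of the quoted 737 MAX 8 list price range.
LIST_PRICE_RANGE_USD = (121_000_000, 135_000_000)


def share_of_list_price_percent(option_usd: float, aircraft_usd: float) -> float:
    """Option price as a percentage of the aircraft list price."""
    return 100.0 * option_usd / aircraft_usd


for aircraft_price in LIST_PRICE_RANGE_USD:
    share = share_of_list_price_percent(OPTION_PRICE_USD, aircraft_price)
    print(f"${aircraft_price:,}: {share:.3f}% of list price")
```

At either end of the range the option comes to well under a tenth of one percent of the aircraft's price, somewhere between about 0.06% and 0.07%.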
The total cost of the 737 MAX accidents and grounding (346 deaths, two aircraft hull losses with a combined insured value of approximately $250 million, a 20-month global grounding of approximately 800 aircraft, and direct costs to Boeing estimated at approximately $21 billion through 2022) was so vastly disproportionate to the $80,000 per-aircraft fix that the comparison constitutes its own argument for the SRR framework's institutional adoption. Whether this safety investment would have produced a return exceeding its cost is not a question that requires sophisticated analysis. It requires only an honest assessment of the hazard scenario being addressed, which is precisely the analysis that the certification process failed to produce. The following post examines what that honest assessment looks like when it is performed correctly, and the extraordinary safety record it has produced in environments where the cost of failure is measured in the same terms as here.






