The SLS axiomatically cannot provide good value to the US taxpayer. In that regard it has already failed, regardless of whether it eventually manages to limp to orbit with a Falcon Heavy payload or two.
The question here is whether it is allowed to inflict humiliation and tragedy on the US public, who so richly deserve an actual legitimate launch program run by and for actual technical experts.
The best time to cancel SLS was 15 years ago. The second best time is now.
Oh yeah, the disclaimer. I do not speak for my employer. This blog should not be construed as an attack on the rank and file staff who have no control, or who are ideologically motivated, or believe that it’s better than nothing. I know and respect many people who will disagree with at least some of what’s in this blog. My usual fare is more constructive, forward looking essays but occasionally one has to do the dirty work.
Unlike the mainstream of my recent blog series on popular misconceptions in space journalism, the SLS is often covered accurately, that is to say, negatively, in the mainstream press, at least in recent years. Despite that, the only thing that ever seems to change is that the schedule moves to the right and the budget overruns go up. After the recent SLS Green Run test failure I went looking for an article that dug into the architectural and organizational issues at play, and didn’t find much. This blog, therefore, serves as an annotated index documenting a huge, complex, multileveled and ongoing failure.
How hath SLS offended me – let us count the ways. It is hard to know where to start. The SLS is such a monumental, epochal failure at every possible level that at any level it’s self-similar – a fractal. This post was initially intended to be lean and concise (like the thread it’s based on) but like its subject has ballooned in size and scope beyond all reasonable limits.
Brief history – necessarily incomplete
The Saturn V was the most powerful launch vehicle ever flown, and it took 12 people to the Moon’s surface between 1969 and 1972. It was the crowning pinnacle of a decade of frenzied development under the Apollo program, a monumental achievement and in many ways the best possible rocket to solve the given problem: men on the Moon, waste anything but time.
It did not, however, have a credible path to anything like a sustainable cost, so it was summarily canceled and the search for a successor launch system began. Reusable vehicles then, as now, seemed like a good idea and, after the various stakeholders had had their say, we got the Shuttle.
This alone should be reason for pause. Flying vehicles are by their nature marginal, and if well designed they can do one thing well. As the old saying goes, a turkey is a chicken designed by committee and the Shuttle, serving too many masters, turned out to be incurably problematic.
Awe inspiring, technically reusable, enormous cargo bay for delivering and retrieving satellites and space station parts. The Shuttle promised routine, inexpensive space launch and it never quite got there. The program took more than a decade to develop and by the time it ended 30 years later, none of the initial requirements had been achieved. Launch costs were a hundred times higher than projected, launch cadence 20 times less, and safety… we lost two crews. Far from the official value of “one in a million”, rigorously estimated Shuttle risks ranged from 1 in 10 for the early flights to about 1 in 100 for the later ones. A lot of time and effort went into that improvement but the consensus is that we were lucky to lose only two.
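To put the "lucky to lose only two" claim in numbers, here is a minimal sketch of how per-flight risk compounds over a run of flights. The 1 in 10 and 1 in 100 figures come from the paragraph above; the flight counts, the function names, and the assumption that flights are independent with constant risk are mine, purely for illustration.

```python
def prob_at_least_one_loss(per_flight_risk: float, flights: int) -> float:
    """Chance of losing at least one vehicle over a run of flights.

    Assumes every flight is independent and carries the same risk,
    which is a simplification of how real risk evolves.
    """
    return 1.0 - (1.0 - per_flight_risk) ** flights


def expected_losses(per_flight_risk: float, flights: int) -> float:
    """Average number of vehicles lost over the same run of flights."""
    return per_flight_risk * flights


# Hypothetical flight counts, chosen only to illustrate the compounding.
early = prob_at_least_one_loss(1 / 10, 10)     # roughly 0.65
late = prob_at_least_one_loss(1 / 100, 100)    # roughly 0.63

print(f"10 flights at 1 in 10:   {early:.0%} chance of at least one loss")
print(f"100 flights at 1 in 100: {late:.0%} chance of at least one loss")
print(f"Expected losses over 100 flights at 1 in 100: {expected_losses(1 / 100, 100):.1f}")
```

Even at the later, improved risk level, a program that flies a hundred times should expect to lose about one vehicle, and the early flights were ten times worse per launch.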
BASE jumping is safer. Much safer. This does not make a good slogan for routine space flight.
How could this be? Does spaceflight necessarily have to be unsafe? No. But aspects of the shuttle design baked in risk at the architectural level and, as a result, that tech stack could never progress to something more reliable than a science experiment.
The Shuttle program achieved incredible things but it did not make space travel affordable or routine. And so it, too, came to an end without a viable replacement. They were too expensive to run and too expensive to lose.
Starting in the early 1990s, a variety of studies examined re-configuring Shuttle hardware to produce a more conventional launch vehicle, one with a big new upper stage and commensurately impressive launch capability. Remember, even though the Shuttle’s payload capacity was 25 T to LEO, the orbiter itself weighed about 80 T so the stack, even without an upper stage, placed 105 T in LEO, nearly as much as the Saturn V.
For Earth-launched rockets with conventional propellants, 2 or 3 stage rockets can deliver more mass to LEO than the Shuttle’s “stage and a half” design, so the addition of a second stage should see the performance increase to around 200 T to LEO.
This heavy lift launcher design formed the basis of Zubrin and Baker’s Mars Direct concept, where one launch per year could sustain a program of Mars surface exploration by 4-6 people at a time. The SLS can trace its conceptual ancestry, via the Ares V, to the Mars Direct Ares launch vehicle that grew out of the ashes of the 90 Day Report.
More info on SLS and Starship by Tim Dodd, the Everyday Astronaut.
We’ve seen how SLS was chimeraed together out of politically valuable pieces of the Shuttle program, but what mission is it designed to serve?
Not every aspect of mission confusion can be placed at the feet of the SLS program. In the early 2000s, Congress switched human space exploration priorities with every new president, leading to a “Lucy and the Football” type situation. Moon? Mars? Asteroid? How about a program that can do none of them?
The problem is that high performance systems, including things that have to fly, can in general do at most one thing well. The Saturn V was pared down, brutally, in answer to the precise requirements of the mission at hand. As Houbolt said “Do we want to go to the Moon or not?” It is not clear that his ideological descendants have been either willing or able to inject such clarity into any subsequent NASA launch vehicle development program.
SLS, together with the Orion space capsule, has no destination. Together, they lack the performance to even get into low lunar orbit and return – something that crewed Soviet rockets could do in the early 70s, though in the event the only passengers were a couple of turtles.
I have participated in several mission development projects, of varying levels of seriousness, such as the Caltech Space Challenge. Even in these hypotheticals where SLS is notionally available to use, even though no-one wanted it, it’s still remarkably hard to shoehorn into any useful human exploration architecture for the Moon, Mars or deep space. While its payload capacity to LEO is marginally higher than Falcon Heavy, its low flight rate, unproven nature, uncertain schedule, and weird political attachments make it way too sporty for anything on the critical path. Even if SLS wasn’t architecturally unsafe, poorly managed, incredibly expensive, a technological dead end, obsolete, and cursed by a low production rate, it would still have nowhere to go.
And so the SLS has been developed for two decades without a destination that could actually drive requirements. This is no way to run a program, and early managers of the SLS program under Obama said as much. In public!
I’m 100% okay with Congress cutting generous checks to politically important constituencies to develop hardware for human space exploration, but maybe, just maybe we should have worked out where we were going and what we were going to do there first? Can we now act surprised that SLS is, at best, an incandescently expensive turkey that’s not much use to anyone?
Every decision was made to reduce cost and risk and had the exact opposite effect
Reconfiguring existing Shuttle hardware appears great on paper but reality had other ideas. Take away the Shuttle orbiter and the proto-SLS, now called the Ares V as part of George W. Bush’s Constellation Program, was still hobbled by a hydrogen first stage, giant solid boosters, finicky SSME engines, and a loss of institutional expertise. Reusing Shuttle hardware should retain workforce and derisk development, but it’s important to remember that despite flying more than 100 missions, practically none of the Shuttle hardware was reliable enough, at an architectural level, to meet any kind of certification standard.
What does this mean?
If I want to modify a C-172 to fly at 300 kts that’s fine, but I won’t be able to do it without a much more slippery wing with far less forgiving flight characteristics. And if my new plane stalls dirty at more than 61 kts, the FAA simply won’t certify it for general aviation under Part 23 of the FARs. I can slap an experimental sticker on the side and cancel my life insurance, but I can never fool myself that my hotrod Cessna meets basic certification requirements.
The Shuttle hardware stack is no different. In pursuit of ultimate performance and top-down design, we have a case study in Conway’s Law with respect to the major aerospace contractors in 1970, and the tech to match. This was 50 years ago. Can we do better today? Yes, and not by a small amount. Multiple private rocket companies are shipping rocket motors today which are both more reliable and higher performance (ceteris paribus) than the SSME.
The Shuttle parts bin can never be low risk.
The SLS attempts to limit development risk by reusing bits of the Shuttle, which itself often drew from parts of the Saturn V. So the new best rocket will use tech that’s 50 years old. The Shuttle parts, however, are not exactly rock solid reliable. We lost two Shuttles but comprehensive FMEA found thousands of potentially fatal failure modes. Playing whack-a-mole with architecturally unsound design flaws, decades after the designers have retired, at a rate of one per lost mission is no way to run a program.
For example, Challenger was lost because hot gases from the solid boosters escaped through leaky O-rings and damaged the main fuel tank.
Why? Because it was too cold, engineering was overruled in favor of launch fever, and the O-rings weren’t flexible enough when cold.
Why? Because program management was incentivized to subscribe to a version of reality where the Shuttle was a reliable, dispatchable launch system, not a ticking time bomb.
Why? Because the solid boosters had to be integrated from multiple sections and sealed with rubber O-rings.
Why? Because previous launches had seen evidence of O-ring blow by and hadn’t crashed, even though they probably should have.
Why? Because humans normalize deviance.
Why? Because the design needed enormous and unstoppable solid boosters to compensate for the main engine’s low thrust at lift off. Just ask any Shuttle astronaut what they thought of the RTLS abort scenario.
Why? Because the design insisted on an SSTO-like architecture in the belief that it would reduce costs.
Why? Because the Germans had largely left the US rocket development program by this point.
Why? Because veterans of Vanguard thought it was time for Americans to lead the rocket development program.
Why? Because veterans of the Air Force X-15 program thought that space planes were superior to conventional rockets.
Why? Because the result of this cascade of design decisions landed at the feet of the execution team and no-one was sufficiently empowered to say “clearly a terrible mistake has been made, this architecture can never achieve its promised performance, let’s start again.”
So they went forth and built the impossible. This is hubris.
For example, the SSME was the first staged combustion cycle engine developed in the US. It too had a series of terrifying “teething” issues, complete with combustion instabilities, cracking turbine blades, and leaking seals. At launch they were operated at up to 109% design power, with no engine-out capability until very late in the launch profile. At the design stage, this made sense – they would be no less reliable than a modern jet engine. Unfortunately, the reliability estimate was off by a factor of a thousand or so, so each re-usable engine had to be meticulously disassembled and reassembled and test fired before every flight, at enormous expense.
Richard Feynman was part of the Rogers Commission investigating the loss of Challenger and ended up publishing a sort of minority report. Due to his prominence and contrarianism, numerous members of the NASA rank-and-file approached him with otherwise obscure or obfuscated information leading him to some insights about the nature of the problem at the managerial level, which only made it into the report’s appendix when Feynman threatened to publicly repudiate the entire process if excluded.
Even at the time of the Rogers Commission in 1986, Feynman’s report pointed out that no SSME ever built had even got close to the design qualification requirement. Ordinarily, this would trigger a ground up redesign and, if necessary, modification or cancellation of the entire program. Instead, program managers routinely shifted the goalposts over objections from engineering and without any kind of rigorous analysis. For example, instead of working out why the engines showed signs of damage after a single full duration test firing, when they were designed to last for dozens of flights between inspections, they decided that anything short of a catastrophic failure after a single flight meant that they were safely within design margin.
The broader point from Feynman’s politically contentious minority report was that he accurately perceived that the Rogers Commission began with its conclusion already decided, even after the attempted NASA cover up fell apart. Namely, that with a year or two of frantic investment in risk mitigation, the Shuttle could be “fixed” and live up to its original promise of routine affordable safe space flight. Of course this was axiomatically impossible. Feynman (among thousands of others) recognized this, and saw that aspect of the Commission for the sham it was. But anyone who has been around major development projects, public or private, knows that no Commission that reports such obvious facts is politically tenable. The reality in 1981, 1986, 2003, and today is that the most technically accurate summation of the Shuttle was that it was fundamentally ill-conceived and ill-executed, and not just irredeemably unsafe but downright dangerous.
There is precedent for testing surfacing such egregious safety issues that an entire aircraft development program is cancelled. Instead, the Rogers Commission buckled to perceived political pressure, kicked the can down the road, and sealed the fate of the Columbia 7. Then, as now, it was just a matter of time.
Not convinced? Let me quote a paragraph of Feynman’s Appendix directly, addressing the concept of “factor of safety”.
“In spite of these variations from case to case, officials behaved as if they understood it, giving apparently logical arguments to each other often depending on the “success” of previous flights. For example, in determining if flight 51-L was safe to fly in the face of ring erosion in flight 51-C, it was noted that the erosion depth was only one-third of the radius. It had been noted in an experiment cutting the ring that cutting it as deep as one radius was necessary before the ring failed. Instead of being very concerned that variations of poorly understood conditions might reasonably create a deeper erosion this time, it was asserted, there was “a safety factor of three.” This is a strange use of the engineer’s term, “safety factor.” If a bridge is built to withstand a certain load without the beams permanently deforming, cracking, or breaking, it may be designed for the materials used to actually stand up under three times the load. This “safety factor” is to allow for uncertain excesses of load, or unknown extra loads, or weaknesses in the material that might have unexpected flaws, etc. If now the expected load comes on to the new bridge and a crack appears in a beam, this is a failure of the design. There was no safety factor at all; even though the bridge did not actually collapse because the crack went only one-third of the way through the beam. The O-rings of the Solid Rocket Boosters were not designed to erode. Erosion was a clue that something was wrong. Erosion was not something from which safety can be inferred.”
I am far from a professional clipboard wielding box checker but I’ve done my share of checklist-oriented activities, including earning a PPL, skydiving, SCUBA diving, and mountaineering, and I lived to tell the tale. This description of outcome-oriented real time faking of design criteria has never failed to trigger internal screams. The Shuttle was riddled with systems whose design never met requirements, which failed in all kinds of odd and scary ways, and which were basically ignored in service of the larger goal. Namely, expediting the next round of Russian roulette. So it’s hardly a surprise to find that “flight proven” Shuttle hardware which never achieved remotely safe performance has continued to fail throughout the SLS development program, though apparently without ever triggering a now-40-year-overdue trashing of the entire tech stack.
In 1986 there were still enough Apollo veterans floating around that, had anyone involved had the courage to declare Shuttle a total loss and permanently ground it, there is every chance that a rocket in the style of the Falcon 9 could have been human rated and operational by 1990, representing a genuine path to steady improvements in reusability and cost, and a commitment to fact-based reality as the program’s guiding star.
Indeed this post isn’t intended to cheer endlessly for SpaceX but it is telling that with a tiny fraction of the time and money they, motivated by correct interrogation of the fundamental architectural questions, developed the F9 rocket and Dragon spacecraft, a launch system considerably less exotic than Shuttle and yet orders of magnitude cheaper, safer, and better performing.
Today, we have a few dozen SSMEs left in warehouses, exquisite examples of 1970s-era tech, every bit as wonderful as a Faberge Egg. Fit for a museum, not a modern rocket. Are they reliable enough? No. But are they expendable? No. But are they at least affordable, because we already have them? Also no. The contractors involved are providing them for the SLS at a cost of $150m per engine.
Let’s get this straight. We’re going to take these priceless antique reusable rocket engines and fly them once and drop them in the Atlantic Ocean. And the engines alone will cost us about the same as 10 Falcon 9 flights.
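The arithmetic behind that comparison is simple enough to sketch. The $150m per engine figure and the four-engine count are from this post; the Falcon 9 price is my own ballpark assumption for a commercial launch and is not a quoted figure, so treat the result as an order-of-magnitude check only.

```python
# Figures stated in this post.
ENGINE_COST_USD = 150e6      # per refurbished SSME supplied for SLS
ENGINES_PER_SLS = 4          # engines on one SLS core stage

# ASSUMPTION: rough commercial price of one Falcon 9 launch, for scale only.
FALCON9_PRICE_USD = 60e6

engine_bill_usd = ENGINE_COST_USD * ENGINES_PER_SLS
equivalent_falcon9_flights = engine_bill_usd / FALCON9_PRICE_USD

print(f"Engines for one SLS flight: ${engine_bill_usd / 1e6:.0f}m")
print(f"Equivalent Falcon 9 flights: {equivalent_falcon9_flights:.0f}")
```

Change the assumed Falcon 9 price and the ratio moves accordingly, but it stays in the neighborhood of ten launches for one set of engines.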
Let’s talk about hydrogen. Shuttle designers gravitated towards hydrogen as a fuel because its specific impulse is substantially higher than other fuels, and the Shuttle was not a design that could afford to leave performance on the table. Modern consensus is that once the cost, complexity, and additional mass required to deal with hydrogen are factored in, its performance is not favorable compared to other more conventional fuels, such as RP-1 or methane, especially for launching to LEO. For example, hydrogen is a hard cryogen, requiring extra insulation and mass. It easily leaks through sub-micron gaps in valves, seeps through metals causing embrittlement, and is so cold that it restricts choices on materials. Further, ISP isn’t everything. At launch, a rocket is moving slowly and doesn’t particularly need high exhaust velocity and efficiency; it needs high thrust to punch through the atmosphere. Here hydrogen is next to useless, which is why the Shuttle needed to play Russian roulette with the giant solid boosters.
By comparison, the Falcon 9 has a similar payload capacity (~22 T expendable), uses kerosene and oxygen propellant, and costs 5% what the Shuttle does on a per kg to LEO basis. Not a 5% cost improvement. A 20x cost improvement. The ISS required about 35 Shuttle flights to assemble. If the modules had been flown for Falcon 9 prices, the entire thing could have been launched for less than the cost of two Shuttle flights, or a single year of the program’s operation, and it might not have taken 20 years to build.
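As a quick check on those ratios, here is a small sketch using only the figures in the paragraph above: 5% of the Shuttle's per-kg cost and about 35 Shuttle flights for ISS assembly. It assumes the same total mass is launched either way and that per-kg cost scales linearly, which ignores real-world details such as module packaging and the Shuttle's role in assembly.

```python
# Figures stated in the paragraph above.
FALCON_COST_FRACTION = 0.05       # Falcon 9 cost per kg relative to Shuttle
SHUTTLE_FLIGHTS_FOR_ISS = 35      # approximate flights used to assemble ISS

# A 5% relative cost is a 20x improvement, not a 5% improvement.
improvement_factor = 1 / FALCON_COST_FRACTION

# Cost of launching the same mass at Falcon 9 prices, expressed in units
# of "one Shuttle flight" (assumes equal mass and linear per-kg pricing).
iss_cost_in_shuttle_flights = SHUTTLE_FLIGHTS_FOR_ISS * FALCON_COST_FRACTION

print(f"Cost improvement: {improvement_factor:.0f}x")
print(f"ISS launch cost at Falcon 9 prices: {iss_cost_in_shuttle_flights:.2f} Shuttle flights")
```

That works out to a little under two Shuttle flights' worth of money for the whole station, which is where the comparison in the text comes from.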
The SLS is about 30% bigger than Shuttle. It has 4 engines, not 3. It has bigger solid boosters, and more fuel. And yet its supposed LEO capacity is a mere 70 T, a whisker more than a well-motivated Falcon Heavy. The Shuttle could do 105 T including the mass of the orbiter. The SLS is the Shuttle stack simplified, straightened out, and increased in size, and it lost, relatively speaking, more than half of its overall performance. Adding a second stage should improve that, and the Exploration Upper Stage is planned to do just that. But engineering and building that stage has never been funded and is expected to take at least another decade (why so long? I have no idea) so the SLS cannot even be useful until 2035, almost a century after its earliest flight heritage components were first designed.
How can it be that a rocket assembled from existing flight-tested components can be the most expensive, hardest to build, have the worst performing schedule, and by far the least safe of any contemporary option? I’ve asked around and no senior engineer at any major aerospace company could really explain how this was possible. The best guess was that as engineers at MSFC hadn’t built a new rocket system in a few decades (which may as well be eternity) they had recursively put in fudge factors until more than half the rocket’s potential performance had been thrown away.
I am far from the first person to identify insurmountable structural issues with the SLS program. https://en.wikipedia.org/wiki/Space_Launch_System#Criticism
Indeed, as long ago as 2015 when the schedule had barely begun to slip, Rand Simberg pointed out that the SLS program is, at best, a reconstitution of “Apolloism”, the idea that deep space exploration is only possible with a national mandate and huge expensive rocket, because that’s how it happened the first time. Simberg’s analysis includes deep discussion of the text of the acts of Congress that brought SLS into being.
Last year, Eli Dourado published “The Space Launch System Is An Irredeemable Mistake”.
My focus here is on risk mismanagement at the architectural and organizational level – the ultimate root cause of NASA’s previous tragedies and the lesson that is yet to be learned.
Normalization of deviance
NASA has made mistakes and lost astronauts. Like the FAA and the NTSB, NASA investigates accidents, studies the root causes, and publishes the reports so that everyone can learn from them and try to prevent them from happening again.
In total this is more than 1000 pages of expert commentary on a variety of root causes.
Here, we return to Feynman’s appendix within the Rogers Commission Report, which closes with the sentence:
“For a successful technology, reality must take precedence over public relations, for nature cannot be fooled.”
It goes without saying, but if Feynman of all people is calling a deliberative process out for arrogance then there might be a real problem.
I’ve touched on aspects of Feynman’s report earlier but here my specific concern is the thread that runs through all four accidents. Not the specifics of system failure, though they are of course interesting, but the malfunctioning of the human-management systems that are intended to prevent this exact sort of problem and instead often become its root cause. Normalization of deviance. Group think. Selective re-interpretation of results. Acceptance of the ways things are done, even if they lack any justification in terms of first principles.
After Challenger, substantial effort went into redesigning parts of the solid boosters to avoid a repetition of that particular failure. In the process, dozens of other critical design flaws were discovered and rectified. Do you think they got them all? Of course not, not even close.
There are multiple ways to approach big development programs like SLS or Shuttle. One favors detailed analysis, careful design and review, and acceptance testing of the finalized design. Another favors moving fast and routinely breaking things, but treating it as a learning experience and aggressively investing in the execution capability of the underlying team.
The Shuttle started out as the former but, in fixing Challenger-derived issues, jumped to the latter. Yet we can’t afford to crash 300 Shuttle orbiters to surface the 300 most critical design flaws. And so the Shuttle program continued to operate with thousands of architecture-level design flaws, any one of which could kill everyone on board.
I am writing this blog because, in every possible way, the SLS program embodies the logical union of all of these organizational problems that have troubled NASA since its inception in 1958. This isn’t bureaucratic inefficiency at the local DMV. This is a multi-billion-dollar, multi-decade national flagship project oriented towards launching living, breathing humans into deep space, and it’s being run in a way that maximizes the odds of very public failure.
To a certain extent all organizations have areas of suboptimal performance. Building cutting edge flight systems is a known hard problem, which is precisely why any such program must jealously avoid adding marginal requirements. This is the too-many-cooks problem, warned about concisely by Kelly Johnson in his rules for operating the Skunk Works.
I don’t know anyone who has read about NASA’s major disasters and read about the SLS and has failed to join the dots. It’s not a huge leap of the imagination. It’s hard to imagine a way of making the system more likely to deliver national humiliation and tragedy.
Combine organizational hubris, political expediency, thoroughly characterized and utterly obsolete flaky hardware, questionable design methodology, piss poor program management, unaccountable contractor behavior, normalized creative accounting, and routine denial of reality. Not a recipe for success.
We have seen how Shuttle and SLS came to exist, and we have seen how utterly unjustifiable the architecture is in terms of first principles. Not only is it axiomatically unsafe, it is incredibly expensive, a technological dead end, utterly irrelevant, and of terminally low economic value.
Wasn’t this already canceled before?
We have seen how utterly unfit the SLS is at every possible level, so it may not surprise the reader to learn that it was already canceled once, resurrected, and zombified. In 2009, the Augustine Commission found that the Ares V, an SLS predecessor with slightly better performance, could not fly before the late 2020s at then-current funding levels. This turned out to be remarkably prescient.
Obama agreed and tried to axe the entire Constellation program. SLS and the spacecraft Orion (also a bloated contractor cash cow, crippled by poor design and no reference mission, ripe for immediate cancellation, but otherwise beyond the scope of this post) were dragged kicking and screaming back out of the dumpster and re-animated to restore the money laundering mechanism required to satisfy the program’s key constituents: the NASA human space flight office and the science community.
Ha, I joke. I mean, Boeing with its army of well-connected lobbyists and key congressional districts in southern states, who have made out like bandits.
In return for acceding to the prolonged torturous confabulation of a Frankenstein’s Monster Rocket by pillaging parts of the dismembered Shuttle corpse from beyond the grave, Obama was able to kick off the commercial cargo and crew programs. Despite deliberate Congressional sabotage through withheld funds, these programs have delivered on their mission basically on time and budget, at something like 50 times the value for money that SLS could have achieved under its best case scenario.
The SLS was reconstituted but at the head of a non-existent program. Where would it go? What sort of mission would it serve? LEO? Space stations? Moon? Mars? Asteroids? Don’t know, don’t care, doesn’t matter. In the end, it went to none of them.
The entire damn lunar gateway only exists because SLS is too anaemic to launch the incredibly overweight Orion anywhere useful, so perhaps we should just drop the whole thing into the Atlantic ocean and be done with it.
Are you not entertained? Let’s talk about an extensive list of fubared systems.
This section is indebted to NASAWatch for collating industry scuttlebutt over the decade or so that this zombie program has assiduously picked the public pocket and delivered negative value.
Due to the sheer quantity of major programmatic hiccups I must necessarily be brief and mention only the highlights. Of course, every project has problems. Problems cost time and money. And the more time and money are spent, the more money the cost-plus prime contractor makes. Can we really profess surprise then that SLS has cost $20b and taken more than a decade and still hasn’t flown?
The tank broke
Someone dropped an incredibly expensive tank dome, damaging it beyond repair. Of course, if this were a real production facility it wouldn’t be that hard to replace, but because everything is meticulously handcrafted by elves with nail files, this caused a major delay.
The welder broke
This one digs into the archives but NASA spent months installing a new friction stir welding machine only to discover that some subcontractor hadn’t reinforced the floor, causing the machine to break and need to be rebuilt from the ground up.
Two years later the welding machine broke again.
The engines needed extra work.
I previously mentioned that despite 40 years of flight heritage, the once reusable SSMEs are still subject to a variety of technical issues and have never gotten close to the original certification criteria.
In addition to that, Aerojet Rocketdyne (the engine contractor) has managed to charge more, per engine, for the second hand engines that are already gathering dust in some warehouse, than they cost to build in the first place.
A cheaper, expendable variant of the SSME called the RS-68 was certified 20 years ago and originally slated to be used on the Ares V, until that rocket was canceled the first time. The RS-68 flew routinely on the Delta IV, which is now defunct due to highly non-competitive launch costs, and the engine would have required over 200 design changes to meet human rating standards.
Between the fuel tank and the engine is the thrust structure plumbing. The SLS project managed to contaminate that plumbing with paraffin, and it wasn’t detected until after the plumbing had been completed. FOD in tanks and engines is a surprisingly common source of rocket crashes (such as the Antares), and for that reason everyone knows to look out for it and take precautionary measures. Well, nearly everyone.
For some reason it takes a month to attach each engine to the rocket, maybe longer if it’s done side on. By comparison a 737 spends just nine days on the assembly line.
The software, oh, the software!!
The process of developing flight software is (usually) a very serious endeavor requiring experience and deep expertise to ensure that something like Ariane 5’s first launch doesn’t happen.
Should we be worried, then, that the NASA OIG continues to report that the MSFC software team can’t get their act together and that the SLS still doesn’t have either a complete flight software stack or any kind of integrated test environment?
I think it’s safe to assume, given how confused everyone was over the TVC parameters in the Green Run test failure, that the software is still an utter shambles.
It is not well known anymore, but leading up to the first (and much delayed) Shuttle launch in 1981 the entire software stack and test system had to be rewritten from scratch. Having endured hundreds of design changes it was such a mess that tests couldn’t even be reliably started.
The booster test failed
Again, in these programs the tests are admittance tests. They are specifically designed to be boring. Nothing in this 50 year old tech stack is meant to break. And yet the mission critical, single point of failure, thrust vectoring nozzle on a closely related solid motor failed for no apparent reason. If this happened on a Shuttle or SLS flight, it is safe to assume total loss of crew, payload, and vehicle. Solids have no engine out capability. They have moderately common failure modes which are apparently still not understood and still not corrected. It is safe to assume that the “black swan” failure rates of solids are around 1 in 100, which is nowhere near good enough for a modern rocket.
A failure rate of 1 in 100 in any critical subsystem means that overall system performance will never be better than that. Almost infinite levels of engineering effort can continue to be expended on solid boosters but they will never be reliable enough for any kind of commercial flight certification, not even close. If we were talking about aircraft we’d be talking about catastrophic structural failure, and aircraft that occasionally shed wings cannot be certified.
Scream it until their ears bleed: Solid-boosted hydrogen first stages are architecturally unsafe.
The political insistence that the Shuttle and SLS use solid boosters means that no amount of time or money will ever make them safer than a 1 in 100 chance of catastrophic failure. Why spend infinite money on an obsolete system that has failure baked in?
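The "never better than 1 in 100" argument above is just series reliability: if every critical subsystem has to work, the overall odds are the product of the individual odds, so the worst subsystem sets the ceiling. Here is a minimal sketch, assuming independent failures. The 0.99 figure is the rough estimate from the text, and the other subsystem numbers are invented purely for illustration.

```python
from math import prod


def series_reliability(subsystem_reliabilities):
    """Chance that every critical subsystem works, assuming independent failures."""
    return prod(subsystem_reliabilities)


# Rough "1 in 100" estimate from the text for the solid boosters.
solid_boosters = 0.99

# Hypothetical values for the rest of the vehicle (illustration only).
everything_else = [0.999, 0.9995, 0.9999]

# Realistic case: the other subsystems are good but not perfect.
overall = series_reliability([solid_boosters, *everything_else])

# Best case: infinite engineering effort makes everything else perfect.
best_case = series_reliability([solid_boosters, 1.0, 1.0, 1.0])

print(f"overall reliability:   {overall:.4f}")
print(f"best case reliability: {best_case:.4f}")
```

However much the other numbers improve, the result never climbs above the booster figure.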
How’s that billion dollar launch tower going?
NASA spent about a billion dollars on upgrading the mobile launch tower – the big truss that sits next to the rocket on the launch pad. Unfortunately it leans and will only be used for, at most, one flight. Tweaking an existing steel truss is not exactly rocket science. For the same price, you could have 20 Falcon launches. Because it would take three years (why? I have no idea.) to upgrade the launch tower for SLS Block 1B, they’ll instead build another one for another billion dollars.
Orion’s PDU broke and needs a year of work to replace. Is this a joke?
I’m not focusing on Orion here but it’s worth mentioning that during some recent testing it was discovered that a redundant power/data unit (PDU) had failed, and what’s better, it’s an essentially permanent part of the spacecraft structure so replacing it will take the better part of a year. I guess if it takes 12 years to build a spacecraft and launch it, one of the ten thousand fancy widgets inside will exceed its shelf life and let out the magic smoke? Well and good, but don’t put stuff that might break inside the wall.
It’s also impossibly expensive. Who are the stakeholders?
Cui bono? Boeing is the prime contractor and they’ve made lots of money on screwing up SLS, way way more than they ever could have by managing it properly and delivering it on time. Obviously NASA employees and local contractors at MSFC and JSC have earned a livelihood. Boeing spends squillions every year on lobbying and other government activities, which supports a staggering number of lobbyists, fancy lunches, and political campaigns, but is also basically petty cash even in the context of the contractor performance bonuses which, somehow, Boeing has mostly received.
The picture here is that all the key constituencies with power and money are very satisfied with the SLS, because it provides the operational cover necessary to move such vast sums of public wealth into the favored hands. Indeed, actually flying the rocket is pretty risky in terms of the existing scheme, because it might fail and thus end the good times.
Former Ames director Pete Worden described this system as a self-licking ice cream cone in 1992, nearly 30 years ago.
Somehow, the human space exploration budget has been “favored” with this sort of unwinnable grift, with the result that yet another generation of idealistic engineers has aged into retirement, most of the Moon walkers have died of old age, and a basic Moon base or Mars base seems more out of reach than ever. It doesn’t have to be this way.
Every month I stare at the full Moon and it mocks us humans with our weak rockets and lame technology and cruddy organization. We should be up there playing Lunar tennis.
Well, I have an axe to grind and said so at the top. But don’t take my word for it. Read for yourselves how greed can ruin a good thing.
Endless reports by toothless government and agency watchdogs pointing out the obvious – the rocket is expensive, unsafe, and will never work properly. One of them even found that NASA hid nearly $800m in costs to avoid a Congressionally mandated budget cap by spending money earmarked for future development programs. Just try that in your annual tax filing and see how far you get. (Actually, don’t.) There’s a word for this and it rhymes with broad.
Here are some articles on how literally everyone finds the SLS management to be worse than useless and yet all the key players keep awarding themselves and each other performance bonuses. If only gaming the system were their only task – they sure act like it is.
Here are a few articles on the absurd cost of SLS. Remember, even if it actually was fast and safe to reuse Shuttle hardware, even if the program was well managed, it would still only manage an SLS flight every year or two and cost between $2b and $3b per flight. Maybe more. Actually, no-one knows for sure. Can you stop asking?
[Update March 2022: During a House Science Committee hearing NASA Inspector General Paul Martin revealed that the marginal launch cost for each of the first four Artemis SLS launches was $4.1b. This doesn’t include any development costs, which will total $93b by 2025. Incredible!]
Can you imagine showing up for your day job and telling your boss that your salary is now a secret, but at least 3x higher than the day before, and that your work product was going to be a decade late? Even the people whose only job is to know exactly how much the SLS costs apparently do not know.
The only metrics that matter for big rockets and humans in space are $/T and T/year. By an unbelievably huge margin, the SLS has mismanaged itself into the wrong end of the field on both these axes, with a rocket that costs maybe 20x more per tonne and, due to its appallingly low flight rate, delivers less mass to orbit in a year than SpaceX can in a fortnight, in 2021.
The SpaceX Starship is designed to deliver on the order of a million tonnes to orbit a year, for about $100/kg. That’s 15,000 times the stuff for 1/500th the cost. I have no doubt that the Starship development program will have its surprises and setbacks but they’ve already flown to 12.5km – roughly as high as the stack of $100 bills already spent on SLS would reach. Even if Starship comes in at 10x the design cost it will still be 50x cheaper than the competition. Would you spend $20k on a car, or $1m on the same car? It’s hard to even make meaningful comparisons here.
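For anyone who wants to check the ratios, here is a rough back-of-envelope sketch that works backwards from the Starship design goals and the two multipliers quoted above to the figures they imply for the competition. The inputs are the numbers in the text (design targets, not demonstrated performance); the function and variable names are made up for this example.

```python
# Figures quoted in the text: Starship design goals and the stated ratios.
STARSHIP_TONNES_PER_YEAR = 1_000_000
STARSHIP_USD_PER_KG = 100
MASS_RATIO = 15_000  # "15,000 times the stuff"
COST_RATIO = 500     # "1/500th the cost"


def implied_competitor(tonnes_per_year, usd_per_kg, mass_ratio, cost_ratio):
    """Work backwards from the quoted ratios to the competitor's implied figures."""
    return tonnes_per_year / mass_ratio, usd_per_kg * cost_ratio


sls_tonnes_per_year, sls_usd_per_kg = implied_competitor(
    STARSHIP_TONNES_PER_YEAR, STARSHIP_USD_PER_KG, MASS_RATIO, COST_RATIO
)

# What is left of the cost advantage if Starship misses its cost target by 10x.
OVERRUN_FACTOR = 10
remaining_advantage = sls_usd_per_kg / (STARSHIP_USD_PER_KG * OVERRUN_FACTOR)

print(f"implied competitor mass to orbit: {sls_tonnes_per_year:.0f} tonnes/year")
print(f"implied competitor cost: ${sls_usd_per_kg:,}/kg")
print(f"advantage after a 10x overrun: {remaining_advantage:.0f}x")
```

The ratios imply roughly 67 tonnes a year at $50,000/kg for the competition, and a 10x Starship overrun still leaves a 50x gap.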
You want more? Here’s some dessert. Boeing spends public money on multiple targeted advertisement campaigns (i.e. propaganda) about SLS which are factually incorrect.
Oh, but the Green Run Test
Just days before President-Elect Biden was inaugurated, NASA conducted the crucial Green Run Test for SLS. This is basically a full duration static fire intended to show that the flight hardware on the test stand is ready and able to operate for the full 8 minute launch profile. A test at this stage of development is an admittance test. It’s meant to work properly, and not to surface any unexpected issues.
Full duration static fires are a standard part of any modern rocket’s development, and the test is designed to mimic the launch profile as closely as possible so that the rocket is actually subjected to flight-relevant conditions. Test as you fly, fly as you test.
Program management at one point suggested skipping the Green Run entirely to make up for some of the endless program delays, especially as the testing program would take many months to execute. The rocket should be safe enough to launch people (presumably someone else’s parents/spouses/children), first time, they said. Fortunately, cooler heads prevailed.
The timing of the test was important to raise the profile of the Artemis program within the new Biden administration. With that in mind, it appears that several key test parameters were altered to reduce the odds of test failure. This is not the same as avoiding catastrophic failure by shutting everything down before a major explosion. The use of “precautionarily conservative test parameters” is cooking the books, plain and simple.
Nevertheless, after a mere 67 seconds the rocket cut out, providing a stream of confusing error messages that took days to sort out. Public comments mentioned something about hydraulic gimbal actuation being out of very conservative bounds, but came with very sincere assurances that in an actual flight it would have been okay. Maybe. So say the managers for whom a successful green run test might make the difference between getting another ~$100m tranche of award fees or not.
Of course, it’s impossible to say for sure since the test parameters were nothing like the actual launch. What we know is that the SLS has at least one obscure failure mode that results in the engines shutting off for no immediately apparent reason. If this happened in flight, the axiomatically unsafe boosters would keep firing for a couple of minutes before the stack could separate and abort procedures begin. Of course, analysis of the failed Ares I launch vehicle found that exploding boosters would incinerate a launch abort parachute. So, basically guaranteed total loss of mission, payload, and anyone riding it.
Now we’re on track to do the Green Run Test again, but this time maybe for just 4 minutes, because maybe the booster structure won’t like the thermal cycling of multiple tanking/detanking operations. In any normal program, an unexpected failure would generate a detailed inquiry and more comprehensive testing regime, but for the parallel reality of SLS, the previous failure justifies cutting corners on the next test.
The cadence and content of these releases mirror, almost exactly, the statements and reporting after the Starliner test failure, which was also caused by untested software.
I wrote this blog because this test failed in a way that reveals that exactly the same sorts of issues that directly caused previous catastrophes are still alive and well at NASA.
- Schedule pressure and political expediency lead to cutting corners and running fake tests of extremely limited value, just like in Apollo 1.
- Contractor-provided systems fail in obscure ways and a clear disconnect is apparent between the contractor’s engineers, NASA’s engineers, and management, just like Challenger. And Mars Climate Orbiter. And Mars Polar Lander.
- Excuses are being made for failure modes that were not anticipated, and deviance is being normalized, just like Columbia. And Challenger. And half a dozen other near misses.
Is this unique to NASA? No, but it’s on brand for Boeing
Boeing was once synonymous with innovation, elite achievement, and flawless execution. By most accounts, the 777 was a masterclass in program management, with ~280 separate sub-system teams performing decentralized design and optimization to bring the plane into production on time and on budget.
With this enviable legacy built up over decades of hard work, Boeing jealously safeguarded the institutional knowledge that they had hard won. Right?
No. They bought out McDonnell Douglas, their previous competitor whose passenger aircraft had failed in the market due to questionable accounting decisions recounted with admirable thoroughness by John Hart-Smith. https://s3.amazonaws.com/s3.documentcloud.org/documents/69746/hart-smith-on-outsourcing.pdf
So naturally they installed this failed executive team at the top of their org, who then pushed out Boeing’s existing management, moved HQ to Chicago for no reason, stovepiped the organization, pushed about 10,000 experienced engineers and technicians into early retirement, embarked on the enormously ambitious 787 project, pushed design work out to ~50 subcontractors and acted surprised when this failed egregiously, in exactly the same way as it had for McDonnell Douglas before and Douglas Aircraft before them.
After blowing $50b on a $5b program, the 787 limped into service, only to suffer a series of agonizingly embarrassing failures. In every case, their sub (sub sub) contractors had banked profit in direct proportion to how much value they failed to deliver to the project integrator, whose outsourcing had spectacularly failed to diversify risk.
Boeing learned their lesson and rehired the sort of in house talent necessary to once again vertically integrate construction of advanced aircraft. Right? Of course not. Instead, they squandered billions in lobbying, regulatory capture, and self-dealing stock buybacks. Then they fumbled development on the 737 MAX (killing 346 people), F-35 (not the prime), KC-46, Starliner, and SLS. In the case of Starliner, untested outsourced software was so abysmal that extremely public system tests failed in ways that turned out to be stupidly obvious – in this case the software could not agree which way was “up”. What sort of Prime contractor flies untested third party software on NASA’s commercial crew test flight?
Should we act surprised that Boeing even fumbled some sketchy procurement inside advice from Doug Loverro, costing him his career while still getting nowhere near the Lunar lander contract?
Boeing has lost the plot. All orgs get to the point where their internal processes and systems are decaying faster than any sort of intervention can save them. Like leprosy or necrotizing fasciitis, the patient still lives but their days are numbered.
Is it any surprise, then, that Boeing runs the SLS program, bamboozles friendly NASA program management into giving them most, if not all, of their award fees, bribes (sorry, lobbies) the crap out of Congress to keep the gravy flowing, and delivers half-baked hardware six years late that can’t even pass the sandbagged PR test firing in front of the new Presidential transition team?
America is a place where getting rich by doing good is celebrated. No-one objects to a contractor or private company making a tidy profit where they deliver exceptional service or generate and share incredible wealth. For example, compared to SLS even Microsoft is a family friendly American institution!
Where I object is when a politically-connected Prime makes bank at the expense of the ultimate customer, the US people. The SLS is not in anyone’s best interests, except the well-positioned few who are raking in the dough hand over fist. The SLS should be a machine for transporting stuff into space, not pumping public money into well-connected pockets. Shame!
The rest of NASA is not like this
At least, it tries not to be. The science programs are run with reference to the decadal survey. This provides a mechanism for programmatic discipline. Decadal surveys represent the consensus of the scientific community. They drive steady progress in key questions, at affordable prices. Generally they have a great track record.
Prominent exceptions, such as JWST, unsurprisingly bear the fingerprints of a major aerospace contractor (Northrop Grumman) who specializes in buying up smaller contractors with new government contracts and then bleeding them for all they’re worth.
Here’s a good business plan if you’re the third best major aerospace contractor, in a field of three. (Raytheon and GD don’t generally play in this space.) Get more lawyers than the government. Execute hostile takeovers of big government contracts. Shaft engineering, obstruct product development and delivery. Demand more money. Get it. Spend some of that money on political contributions and lobbying. Keep the rest. Rinse, repeat.
Jim Bridenstine attempted to cleave SLS and Orion, suggesting Orion could be flown on Falcon Heavy. Senator Shelby of Alabama, home of MSFC, demanded his resignation. With VP Pence’s support, Jim swiftly backtracked and just barely kept his job. The SLS lived to fight another day – but at what cost? Just contrast the tone in public statements made at the same time, which repeat the same tired talking points of severely questionable technical accuracy.
Human space flight should have a decadal survey. Do we wonder why it doesn’t?
Not even a jobs program
There is an argument that Space Coast senators using NASA funding as a personal piggy bank are kinda sorta legitimately investing public money in industrial development in the historically under-developed Deep South.
MSFC employs about 6000 people and runs an annual budget of $2.8b, which works out to $466k per head, quite a bit higher than salaries. While the end of the Shuttle Program otherwise implied a loss of livelihood for about 10,000 people (no wonder it was so expensive to run), keeping them all busy on some make work Potemkin project doesn’t provide job security. It just kicks the can down the road.
To illustrate this, let’s say that in 2011 at the end of the Shuttle program, MSFC employees were offered voluntary redundancy with a hefty bonus, equivalent to $100,000/year for ten years. A million dollar payout to anyone (or everyone) who wanted it. If everyone took it, this would cost about $600m a year, still less than a quarter of what was spent on the SLS. What’s more, all of these talented and well-trained people would be free to invest that money in local industrial development that actually met a market need and generated economic value.
Instead, these people are trapped in salaried positions building a nearly fake rocket so that Boeing can skim off 300% overhead and a handful of antique maintenance tools can be kept operational. Is any of the Shuttle-related tooling or processes relevant to any other modern market need? No.
Instead of every dollar being invested in actual productive industry, ~25c went to a local jobs program with no useful industrial development, and ~75c disappeared into “overhead”. This is significantly less useful than Universal Basic Income. Is it any wonder that despite a reliable river of public treasure (steered by mostly “small government” politicians no less) the South remains relatively unindustrialized?
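The arithmetic behind this section is simple enough to check in a few lines. This is a rough sketch using only the figures quoted above. The one assumption is how to read "300% overhead": here it is taken to mean overhead equal to three times the direct cost, which is what produces the 25c/75c split.

```python
# Figures quoted in the text.
MSFC_HEADCOUNT = 6_000
MSFC_ANNUAL_BUDGET_USD = 2_800_000_000
BUYOUT_USD_PER_YEAR = 100_000
BUYOUT_YEARS = 10

# Assumption: "300% overhead" means overhead = 3x the direct cost.
OVERHEAD_RATE = 3.0

# Annual budget divided across the workforce.
budget_per_head = MSFC_ANNUAL_BUDGET_USD / MSFC_HEADCOUNT

# Cost of the hypothetical buyout if every employee takes it.
buyout_per_year = MSFC_HEADCOUNT * BUYOUT_USD_PER_YEAR
buyout_total = buyout_per_year * BUYOUT_YEARS

# Split of each dollar between direct work and overhead.
direct_share = 1 / (1 + OVERHEAD_RATE)
overhead_share = 1 - direct_share

print(f"budget per head: ${budget_per_head:,.0f}")
print(f"buyout cost: ${buyout_per_year:,} per year, ${buyout_total:,} in total")
print(f"per dollar: {direct_share:.2f} direct, {overhead_share:.2f} overhead")
```

On these numbers the whole buyout comes to $6b over a decade.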
Astronauts want to risk their lives
Astronauts are adults in a risky profession and it’s their call whether to risk their lives or not, ultimately. I respect and admire that. But let’s not pretend that that somehow erases the responsibility for their deaths.
Astronauts are willing and able to take enormous personal risks, just like the brave men and women in uniform. This doesn’t provide carte blanche for their superior officers to throw their lives away. This doesn’t justify purposely building unsafe launch systems to gratify political or contractor expediency. It certainly doesn’t build a legacy of steadily more routine, cheap, and safe access to space and a culture of exploration.
There will always be a few brave souls willing to get on any rocket that points up. But we cannot regard ourselves as a serious space faring nation if we run the space program like an exercise in Patagonian BASE jumping. Which, incidentally, is safer and much cheaper.
Seventeen NASA astronauts have died in spaceflight related accidents, and a few others in training accidents. Like the four cosmonaut deaths, which occurred due to similar organizational issues, not one single death happened for a good reason.
- 27 January 1967. Apollo 1. Virgil “Gus” Grissom, Ed White, Roger B. Chaffee. Launch fever, atypical testing situation, incomplete FMEA, architectural safety issues.
- 24 April 1967. Soyuz 1. Vladimir Komarov. Launch fever to meet political anniversary, in a woefully incomplete capsule.
- 30 June 1971. Soyuz 11. Georgy Dobrovolsky, Viktor Patsayev, Vladislav Volkov. Political demand to include three cosmonauts excluded pressure suits, loss of cabin pressure during re-entry.
- 28 January 1986. Challenger. Gregory Jarvis, Christa McAuliffe, Ronald McNair, Ellison Onizuka, Judith Resnik, Michael J. Smith, Dick Scobee. Launch fever, engineering concerns overridden, safety parameters out of bounds.
- 1 February 2003. Columbia. Rick D. Husband, William C. McCool, Michael P. Anderson, David M. Brown, Kalpana Chawla, Laurel Clark, Ilan Ramon. Normalization of deviance allowed known recurring problem to continue until luck ran out.
In every case, the loss of crew and of mission was for a stupid reason, usually well understood, anticipated, forewarned, and ignored. I believe it tarnishes the legacy of these victims to foist the language of posthumous heroism upon them, because it cynically obscures the culpability of the people responsible for these catastrophes. It is not something the loved ones of these people relish hearing, but their lives were wasted and their sacrifice achieved nothing.
I am a proud proponent of insanely ambitious space exploration policy. I want to see hundreds of people on the Moon and on Mars by the end of the decade. In no way is that vision impeded by the laws of physics – only by the so called “laws” of organizational inefficiency and trained myopia. I have no doubt whatsoever that, over a long enough time scale, many many more people will die unnatural deaths in space. If we are lucky, they will die pushing the boundaries of human experience, ambition, and imagination. Not in a tin-foil rocket that everyone knew was a death trap waiting to happen.
Culture of silence
There is a common (though not universal) line of thought within NASA which is congruent to a form of learned helplessness. SLS was forced on NASA by powerful senators. The decisions are made above my pay grade. My job is to execute and implement the grand plan specified by my political masters. I can’t see the big picture. I’m just watching the clock. I’ll retire before it flies. Don’t rock the boat. This is better than nothing, or the alternative. Kick the world, hurt your foot.
I reject this point of view. It is cowardice. It is malpractice. Civil engineers are certified and often have rings supposedly made from a bridge that collapsed. The ring touches the page as we work and our work determines the safety and well being of our fellow humans.
All it takes is one brave person to take a stand and say “no”.
It could be the NASA administrator or the tech about to take off the last red “remove before flight” tag, or anyone in between. All these people are hired to apply their professional judgment and skill in the execution of a collective noble endeavor and its success will often come down to the actions of the right person at the right time making the right judgment call. It can be difficult in a culture that expressly values conformity or punishes excessive displays of critical thinking, not to mention one that disposes of widely respected senior leadership such as Doug Loverro, or Bill Gerstenmaier before him, with scarcely a second thought. But in every case of launch vehicle failure I am aware of there was always someone on the ground who knew right away what had gone wrong. It is obviously not widely advertised but that person, or any person, can stop this crazy train and prevent disaster.
All it takes is one brave person to take a stand and say “no”.
No, the SLS is not safe enough for robots or humans, and nor can it be.
No, the SLS is not good value for NASA or the US taxpayer. It’s not even bad value. It is negative value.
No, I do not accept that the laws of physics are of secondary importance to fleeting political expediency.
No, I will not build a system I know has no path to future improvements in performance and safety – a dead end.
No, I will not build a system where tragedy and national humiliation is only a matter of time.
What is the point of ten thousand engineers devoting lifetimes to developing deep insight into the workings of the universe if these, the cream of the cream who run space flight at NASA, cannot be trusted to know what’s wrong and what’s right?
What sort of program do we deserve if we let non-technical political leaders force scientifically wrong decisions for decades? Lysenko? Great Leap Forward? Mark 14 torpedo? Another Challenger? Another Columbia? Consequence-free profiteering at public expense by major aerospace contractors?
Not on my watch. Enough is enough.
Physics wins. Repudiate everything. Salt the Earth.
Hold the line. Say No.
SLS must be cancelled.