JPL and NASA News

Bill Wheaton, IPAC

1999 November

Reflections on Mars Climate Orbiter

As all of you know, Mars Climate Orbiter (MCO) suffered a truly horrible fate, burning up in the atmosphere of Mars due to a navigation error that placed it about 90 km lower than intended during its orbit-insertion burn on September 23. It was obvious from the first that such a disaster could really only be due to some dreadful human error. This makes it all the more painful to JPL people, because nobody in their right mind goes into astrodynamics or spacecraft navigation unless they are deeply committed to the value of what they are doing. Worldly fame and riches are certainly not likely to be in store for the JPL navigation team, no matter how well they do their job; only the satisfaction of doing well something that is important and difficult.

So my first response to the awful news was a sick feeling in my stomach, thinking of the pain of the key people involved, and the agonizing fear each had to be feeling: that a mistake or lapse on their part could have caused the loss of the mission. Second, surely with many others, I wondered whether the error would turn out to be due to a failure by one or a few responsible individuals, something (like the Challenger accident) traceable to a broader management breakdown, or something where one could not easily point to a clear-cut mistake at any particular point. Finally, I wondered what useful lessons we might learn that could help us in the future--and whether we would in fact learn them.

The main technical thing to understand about MCO is that, by radio tracking and ranging, it is possible to determine the distance of a spacecraft from Earth with high accuracy at any particular moment, and also the component of the velocity parallel to the line of sight. The position of Mars itself is also known with great accuracy. However, the position of the spacecraft perpendicular to the line of sight, and the corresponding components of its velocity, can only be determined accurately by tracking over an extended period of time. What one must do is observe the range and range-rate over a long enough time that the gravity of the Sun and planets causes substantial effects. Then, knowing the masses and positions of these Solar System bodies, it is possible to figure out where the spacecraft must have been, in three dimensions, in order to have its line-of-sight range and velocity behave in just the way observed. This seemingly unpromising procedure is actually perfectly well-defined, a high art developed over the past 300 years, thanks to the efforts of many of the greats of mathematics and physics. In the past 40 years it has been honed to a fine edge by the navigation and trajectory computation experts at JPL and their counterparts around the world.

Given a reasonably long stretch of typically accurate range and range-rate data, the method normally produces marvelously accurate results. Yet it is obviously the sort of thing that one cannot do on the back of an envelope, nor cross-check in five minutes. Also, everything depends on knowing the forces on the spacecraft to high accuracy as it falls along its trajectory in the grip of gravity. Here is where the infamous English-to-metric units conversion error struck. Because of radiation-pressure torques on an asymmetric solar panel, it was necessary to fire thrusters periodically to maintain the spacecraft's orientation. Unknown to the trajectory computation team, these thrusters produced an effect about 5 times larger than they realized, the thrust being expressed in pounds rather than newtons. As a result, the forces they believed were shaping the trajectory were in fact not quite correct, so the whole calculation was slightly in error.
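For readers who like to see the size of the slip, here is a minimal sketch of the units mix-up in Python. The conversion factor is the standard definition of the pound-force; the impulse value of 1.0 and the function name are purely hypothetical, chosen for illustration and not taken from any MCO data or software. The exact factor comes out near 4.45, which is the "about 5" quoted above.

```python
# One pound-force is defined as exactly 4.4482216152605 newtons.
LBF_TO_N = 4.4482216152605


def lbf_s_to_n_s(impulse_lbf_s: float) -> float:
    """Convert an impulse from pound-force seconds to newton seconds."""
    return impulse_lbf_s * LBF_TO_N


# Hypothetical thruster impulse reported in lbf*s (illustrative value only).
reported_value = 1.0

# The mistake: reading the same number as if it were already in N*s.
assumed_n_s = reported_value

# What the thruster actually delivered, in N*s.
actual_n_s = lbf_s_to_n_s(reported_value)

underestimate_factor = actual_n_s / assumed_n_s
print(f"Actual effect is {underestimate_factor:.2f} times the assumed one")
```

The point of the sketch is only that a bare number carries no units; nothing in the value 1.0 itself says whether it means pounds or newtons.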

As MCO approached Mars, the planet's gravity began to bend its trajectory, yielding a check on the solution previously obtained. As the distance closed, this check became more and more accurate--but there was also less and less time to do anything about it. Ordinarily the preparation and checking of a significant trajectory correction maneuver requires many days or weeks of work, although it can be done, under some duress, in a few days; or even, in a real emergency, in hours. It also takes a little while to incorporate new data into the global trajectory computation and obtain updated answers. Finally, the light travel time back and forth to the spacecraft, around 10 min each way, was a further barrier to any quick response to the situation as it developed on September 23. In the end it seems to have become clear that a critical situation existed only about an hour before closest approach to Mars, impossibly late for corrective action.

Yet, despite the above contributing causes, it is a fact that the trajectory and navigation methods are ordinarily extremely precise, in theory and in practice. The trajectory error really should have been apparent (one would normally expect) from an earlier failure to find any satisfactory trajectory that precisely reproduced the tracking data. Then careful follow-up investigation should have revealed the cause. It is inappropriate and unfair to speculate about the details beyond a certain point until the investigation is complete. Perhaps, for example, the effects of the miscalibrated thrusters were so small and so chaotic that no one could reasonably have noticed the problem and put the finger on it as a serious issue. Or perhaps any number of other things no one would ever think of beforehand combined to make the cause.

Given the thousands and thousands of critical issues involved in any sizable space mission, it has always astonished me that it is possible to carry them out as successfully as we do. I feel that I understand the broad outlines of the physics and engineering of spaceflight fairly well; the miracle I don't understand is how it is possible to make sure that nothing falls through the cracks. It is really incredible to me that thirteen immense Saturn V launch vehicles, every one, flew without causing a single mission failure, or that no JPL spacecraft failed after launch, from about 1965 until Mars Observer disappeared in 1993.

However, the kind of careful, across-the-board, fine-tooth-comb checking and rechecking that might possibly have detected the MCO trajectory error is not just morally admirable. Realistically, it is also expensive. It takes a great deal of time and effort by highly trained people. Those people have to eat and pay their mortgages. If one intends to pay a large team of outstanding people a reasonable wage, the total cost is inevitably going to be substantial. Long ago someone made up a budget for MCO that assumed a certain level of staffing would be sufficient for the trajectory and navigation work. That person knew that if the mission was going to go at all, it had to be done for no more than a certain amount of money.

The success of the Apollo program came at a high price. Similarly, JPL missions were starting to cost $1 billion each, and one reason there were no failures from 1965 to 1993 is probably that there were no missions launched from 1978 to 1989. If a 97% mission success-rate costs twice as much as an 80% mission success-rate, we would get more science per dollar by accepting more failures but doing more missions (by a 160/97 ratio, for these numbers I have made up) than by indulging in excessive perfection. This is the logic of NASA administrator Goldin's drive to do more missions at lower cost. But nothing comes free, and in this case the price is inevitably sometimes going to be paid in failed missions.
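The 160/97 figure can be checked with a few lines of Python. This is only a sketch of the made-up numbers above: the function name is hypothetical, costs are in arbitrary relative units, and "science per dollar" is taken, as a simplifying assumption, to be proportional to successful missions per unit cost.

```python
from fractions import Fraction


def successes_per_unit_cost(success_rate: Fraction, relative_cost: int) -> Fraction:
    """Expected successful missions per unit of money spent."""
    return success_rate / relative_cost


# Made-up numbers from the text: 97% success at twice the cost of 80% success.
careful = successes_per_unit_cost(Fraction(97, 100), 2)
cheap = successes_per_unit_cost(Fraction(80, 100), 1)

# How much more science per dollar the cheaper approach yields.
advantage = cheap / careful
print(f"Advantage of the cheaper approach: {advantage} = {float(advantage):.2f}")
```

With these inputs the cheaper approach comes out ahead by 160/97, or roughly 1.65 times as many successful missions for the same money.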

Whenever that happens, it will always be painful, the causes will typically be somewhat murky, and we will always wonder and worry whether we have gone too far one way or the other. If we don't wonder, and worry a little, we are almost certainly way off the path we want to be on. There will often be a natural urge to find the guilty and bring them to justice. We need to be sophisticated enough to realize that if there are no failures, we are doing something wrong. So let us try to learn from our mistakes, so we won't fall into that hole again. If we save the cost of a Titan IV launch vehicle by doing complex astrodynamical maneuvering, maybe after all we should have two teams do the navigation calculations independently. We should probably tell the troops to get some sleep and encourage them not to worry too much about spilled milk.

Oops

Speaking of mistakes, last month in my discussion of the skyhook, I mentioned that a uniform cable made of ordinary steel, of the kind used for making bridges, could only support itself in the Earth's gravity if it were shorter than about 90 km. Actually, I slipped a decimal and the correct number is more like 9 km. Because this factor appears in the exponent, such a cable would have to taper up in area by a factor of something like e⁵⁴⁰ instead of "merely" e⁵⁴. Embarrassing as such careless errors are, at least comparison of the two figures helps to illustrate how incredibly sensitive the practicality of such space tethers can be to the characteristics of the materials: here a factor of 10 in strength translates into a factor of e⁴⁸⁶, about 10²¹¹, in cable mass. Thus rather modest improvements in materials can make a great difference in practicality.
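The conversion from e⁴⁸⁶ to a power of ten is easy to verify. The short Python sketch below uses only the two exponents quoted above (540 and 54) and works with logarithms, since e⁵⁴⁰ itself is too large for ordinary floating point; the variable names are mine.

```python
import math

# Taper exponents quoted in the text for the two self-support lengths.
exponent_9km = 540   # cable material that supports about 9 km of itself
exponent_90km = 54   # cable material that supports about 90 km of itself

# Ratio of the two taper factors is e**(540 - 54) = e**486.
exponent_difference = exponent_9km - exponent_90km

# Convert the natural exponent to a base-10 exponent: e**x = 10**(x / ln 10).
log10_ratio = exponent_difference / math.log(10)
print(f"e^{exponent_difference} is about 10^{log10_ratio:.0f}")
```

Working in logarithms is the natural choice here: math.exp(540) is still representable as a float, but the habit of comparing exponents directly avoids overflow for anything much larger.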