Only Rational
We continue our exploration of the work of Sidney Dekker with what is the single hardest idea to grasp:
Failure is rational.
We are not talking here about the Silicon Valley cliché of failure, the failure that people point to when they tell you to "fail forward" or "fail faster." We are talking here about catastrophic failure, the kind of failure that results in loss of life and happiness.
How can this kind of failure be rational? How can it be, not a sign of institutional rot or malfeasance, but a natural outgrowth of the very things that work?
To understand this, we return again to the concept of complexity. Complex systems are those (to use Dave Snowden's definition) in which the interactions of the system's elements change the way in which those elements interact. The stock market is a good example: how people trade changes the way in which people trade. If everyone's scared, then everyone sells, which lowers stock prices, which makes everyone scared...and so on.
In this way, complex systems lurch - they move erratically and, ultimately, unpredictably. It takes very few interacting components to make the actions of a complex system completely unknowable.
One characteristic of these kinds of systems is that small inputs can create large outputs. Most people have heard of the "butterfly effect," wherein the flapping of one butterfly's wings creates erratic weather patterns halfway across the world. I'm tempted to use a "falling dominoes" metaphor here, but dominoes are too ordered; a better image would be tipping a domino over in your kitchen only to find out that your neighbor's cousin subsequently came down with pneumonia. The outcome is not predictable; that there will be an outcome is.
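If you want to see how little it takes, here is a toy sketch (my own illustration, not anything from Dekker): one simple feedback rule, run twice from starting points a millionth apart, produces two trajectories that have nothing to do with each other within fifty steps.

```python
# A toy feedback rule (the logistic map in its chaotic regime), used here as
# a stand-in for a system whose interactions keep changing the interactions.
def feedback(x):
    return 3.9 * x * (1 - x)

a, b = 0.500000, 0.500001   # two starting points, one part in a million apart

for step in range(50):
    a, b = feedback(a), feedback(b)

# After 50 steps, the microscopic difference in inputs has swamped everything.
print(f"run A: {a:.4f}   run B: {b:.4f}")
```

Neither run is random - every step follows the same deterministic rule - and yet the only honest prediction you can make about either one is "something will happen."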
It is important to have that background understanding of how complex systems work in order to bridge the gap between the theoretical and the practical. After all, it is hard for us to think at this kind of systems level. By reflex, we tend to think at the level of the individual: who made what decision, and why?
So: let's zoom in a bit. In any given organization, the people on the ground are making rational decisions. They have access to a certain amount of knowledge, to certain tools, to an understanding of the priorities of those around them. Given these resources, we all must make decisions as best we can in the moment.
Making decisions generally means making trade-offs. No organization prioritizes any one thing to the exclusion of everything else, no matter how important that thing is. Airlines don't solely prioritize safety; the safest way to fly a plane is not to fly it at all. Airlines instead make trade-offs between competing priorities, such as safety and productivity, or quality and affordability. This is not because they are evil capitalists; it is a reflection of the fact that every decision we make involves a trade-off of one kind or another. Even in the best case scenario, deciding to do one thing means deciding not to do something else. We are always forced to make a trade.
Making these trade-offs is not easy, but people do the best they can with the tools they have. In one instance, someone skips a safety check to get a plane off the runway on time (thus preventing cascading delays and costs down the line). The penalty for such a trade-off is...nothing at all. In fact, the overall outcome is likely to be positive: after all, the person on the ground had a good reason for making the decision in the first place. This adaptation to a goal conflict produces only a small deviation from the standard, incurs no immediate cost, and delivers a significant benefit...in other words, it is a success. It is a rational decision in the moment, and its success can be read as further evidence that it was a good idea.
Over time, these kinds of locally-rational decisions become normalized. If you've ever had someone tell you "that's just the way things are done," you are hearing the echoes of decisions past. Decisions are made, benefits noted; these decisions become regular and part of "how things are done"; new hires are trained on existing processes. In this way, past decisions become enshrined as "company culture." After all, all evidence points to these adaptations being a good idea.
Complexity means that identical inputs will not always produce identical outputs. Doing the same thing twice will not always produce the same result. Past performance is no guarantee of future performance, and not incurring a cost once, or even hundreds of times, is no guarantee that there is not a bill coming due.
These types of dynamics can make sense in theory, but escape us in practice. We think in terms of average performance, in terms of what happened in the past, and we extrapolate that forward. As Taylor Pearson once put it, "5 out of 6 players recommend Russian Roulette as a fun and profitable game."
The problem is that most systems which involve human beings are non-ergodic. That's a fancy term for games in which past outcomes affect future outcomes, so the average across many plays tells you little about the path any single player will actually live through.
Imagine running a marketing campaign, to use a mundane example. I guarantee you that the average return of this campaign is 3x, meaning for every dollar you put in, you will make 3. Sound good?
Of course it does! In fact, many people will throw large amounts of money at marketing campaigns based on (absolutely accurate) numbers like these. But we never actually experience the average; not in real life. In real life, timing matters.
If every single week we lose everything we put into the campaign, only to make it all back and then some in the last week of the year, we aren't going to make it. We'll run out of cash before we hit a positive return. The average return of 3x is no comfort to us now; we got knocked out of the game before we got to the end, like playing Russian Roulette with a 52-chambered gun.
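Here's a minimal sketch of that dynamic with made-up numbers: a campaign whose average return over the year really is 3x, and a player with a finite bankroll who never gets to collect it.

```python
# Hypothetical figures: a $1,000 weekly stake, a $10,000 bankroll, and a payout
# schedule built so the year's average return works out to exactly 3x.
WEEKS = 52
STAKE = 1_000
STARTING_CASH = 10_000

# 51 weeks pay nothing; the final week pays back 3x everything staked all year.
weekly_payouts = [0] * (WEEKS - 1) + [3 * STAKE * WEEKS]

average_return = sum(weekly_payouts) / (STAKE * WEEKS)
print(f"Average return for the year: {average_return:.1f}x")   # 3.0x, as promised

cash = STARTING_CASH
for week, payout in enumerate(weekly_payouts, start=1):
    if cash < STAKE:
        print(f"Bust in week {week}: ${cash} left, can't cover the stake.")
        break
    cash = cash - STAKE + payout
else:
    print(f"Survived the year with ${cash:,}.")
```

The 3x figure is perfectly accurate; it's just not available to anyone who has to survive the path that leads to it.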
Just because something went a certain way in the past does not mean it will go that way in the future. Just because a certain action has never resulted in a negative outcome does not mean it never will. Just because our individual decisions seem small and insignificant does not mean they can’t result in disaster.
And just because an organization drifts into catastrophic failure does not mean anyone did anything irrational or immoral. It is entirely possible and, in fact, likely for organizations to drift into failure when they are doing everything they can with the tools and knowledge they have.
As we continue our exploration of Dekker's work, we'll talk about what we can do about this unsettling fact. Knowing about it - recognizing drift into failure as something that can happen - can help us guard against it and analyze our own decisions more effectively.
Yours,
Dan
SOMETHING I'M READING:
A bit wonky here, but the absolute best description of how to write powerful prompts for spaced-repetition learning.