I was watching an episode of Chicago Fire on NBC this morning. The firehouse was being audited by an efficiency expert who was not the most welcome addition to the day-to-day lives of the first responders. He accompanied them on a rescue mission, and after they returned to the firehouse the ‘expert’ asked the (acting) chief why he had dispatched a second option when the first one was working. The reply was obvious to me… something along the lines of ‘in case the first option did not work.’
I had a lot on my mind so I was not paying close attention to the details, but the response from the ‘expert’ floored me. ‘Redundancy is the enemy of efficiency.’
Over the last couple of years I have spent more time teaching than I have consulting and architecting systems, but I would be surprised if a single student of mine has not heard me say that one concept that I absolutely abhor is the ‘single point of failure’ (SPoF). In my mind, a single point of failure leads to downtime that is inevitable yet avoidable (and downtime is the enemy of productivity).
While I do not teach this to all of my classes, it is not the IT consultant or technician who makes financial decisions; it is the business people. A Business Impact Analysis (BIA) is performed and a quantitative value is assigned to downtime, after which a Quantitative Risk Assessment (QRA) is performed and the Single Loss Expectancy (SLE), Annualized Rate of Occurrence (ARO), and Annual Loss Expectancy (ALE) are calculated. After that a probability graph is created which plots the probability of an event happening (X-axis) against the impact of the potential threat (Y-axis). The X-axis ranges from Very Unlikely (1) to Almost Certain (10). The Y-axis ranges from Very Low (1) to Severe (10). All of these numbers (although in truth the assessment graph is based on guesses and approximations) are considered before determining whether the cost of redundancy outweighs the possible loss of revenue, productivity, and reputation that the threat could cause.
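To make the arithmetic concrete, here is a minimal sketch of the standard risk formulas: SLE is the asset value multiplied by the exposure factor (the fraction of the value lost in one incident), and ALE is SLE multiplied by ARO. The function names and every dollar figure below are hypothetical, chosen purely for illustration; a real assessment would take its inputs from the BIA.

```python
def single_loss_expectancy(asset_value, exposure_factor):
    """SLE: expected loss from ONE incident (asset value x fraction lost)."""
    return asset_value * exposure_factor


def annual_loss_expectancy(sle, aro):
    """ALE: expected loss per year (SLE x incidents expected per year)."""
    return sle * aro


def redundancy_pays_off(ale_without, ale_with, annual_cost):
    """True when the yearly reduction in expected loss exceeds the yearly cost."""
    return (ale_without - ale_with) > annual_cost


if __name__ == "__main__":
    # Hypothetical inputs: a system worth $200,000 to the business,
    # where a single outage wipes out half of that value.
    sle = single_loss_expectancy(asset_value=200_000, exposure_factor=0.5)

    # Assumed: one outage every two years without redundancy (ARO 0.5),
    # one every twenty years with it (ARO 0.05).
    ale_without = annual_loss_expectancy(sle, aro=0.5)
    ale_with = annual_loss_expectancy(sle, aro=0.05)

    # Assumed: the redundant system costs $20,000 per year to run.
    print(f"SLE: ${sle:,.0f}")
    print(f"ALE without redundancy: ${ale_without:,.0f}")
    print(f"ALE with redundancy: ${ale_with:,.0f}")
    print("Worth it?", redundancy_pays_off(ale_without, ale_with, 20_000))
```

With these made-up numbers the redundancy pays for itself; double its annual cost and lower the outage rate, and it might not. The sketch only covers dollar losses, so reputational damage would have to be estimated and folded into the asset value or the exposure factor.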
Of course, it is much more in-depth than that. I cannot justify in a single paragraph the need to spend money on redundancies, but I hope that you get at least the gist of my point.
Redundancies matter. We are all familiar with the cost of car insurance, and that if you never get into an accident it is easy to think that you have wasted all of that money. Insurance is a complete waste of money until the one time that you need it, and then it can be a life saver. Redundancies are the insurance for your systems. They are a complete waste of money… until that one time you need them, and then they can not only save your business money, but they can actually save your business.
There is, of course, the other side of the coin. I have been telling my customers and my students for fifteen years that every business wants ‘Five Nines uptime.’ Their budget, on the other hand, is more in the neighbourhood of ‘Nine Fives uptime.’
Five Nines, or 99.999% uptime, is the equivalent of 5m16s of downtime in a calendar year… and it is possible, just not cheap. Yes, there is a cost to redundancies, and depending on the BIA and other calculations it might or might not be worth it. The Nine Fives, which is a bit of a joke but would be 55.5555555% uptime, would be much less expensive and comes out to a little over 162 days of downtime per year.
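For anyone who wants to check those figures, the conversion from an uptime percentage to yearly downtime is simple enough to sketch. The function name is hypothetical, and I am assuming an average year of 365.25 days; a flat 365-day year shifts the Five Nines figure by a fraction of a second.

```python
from datetime import timedelta


def downtime_per_year(uptime_percent, days_per_year=365.25):
    """Return the yearly downtime implied by an uptime percentage."""
    fraction_down = 1 - uptime_percent / 100
    return timedelta(days=days_per_year * fraction_down)


if __name__ == "__main__":
    targets = {
        "Five Nines": 99.999,       # roughly five and a quarter minutes
        "Three Nines": 99.9,        # a little under nine hours
        "Nine Fives": 55.5555555,   # a little over 162 days
    }
    for label, uptime in targets.items():
        print(f"{label} ({uptime}%): {downtime_per_year(uptime)}")
```

Running it for Five Nines, Three Nines and Nine Fives shows how steeply the allowed downtime shrinks with every extra nine, which is exactly why every extra nine costs so much more than the last.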
Based on the business calculations the company must decide whether the cost of redundancies is worth it… but it does not need to be an all-or-nothing decision. Five Nines would require georedundancy as a start, and a lot more to boot. Maybe your company does not need a guarantee of only about five minutes of downtime per year, but would be willing to settle for Three Nines (a little under nine hours of downtime per year). It is a lot less expensive, and if a little bit of downtime would not destroy your company then it is much more manageable.
In the television program the redundancy was not excessive. Had the chief called out every firehouse in the city as backup, then that would have been excessive. A single truck, on the other hand, can break down, so calling out a second truck ‘just in case’ was not excessive, but yes, there is a cost associated with it. If the numbers people do not like that then we have to reframe the question: ‘A single truck costs X to deploy, so two trucks cost 2X. What is the value of the life or lives that were saved? In addition to that, what would have been the reputational cost to the department had our truck broken down and nobody else responded?’
The message is that there is a cost to any redundancy, but if we evaluate the costs versus the potential loss (both dollar value and reputational) then we can decide which redundancies are worth investing in and which might be excessive. Once we know that, we can learn how to explain the cost versus value to the ‘efficiency experts’ so that they can better understand why those redundancies are important.
