Categories
Blog posts in English

Introducing STPA – a new Test Analysis Technique

At the core of innovation in IT is someone getting the idea of connecting existing services and data in new ways to create new and better services. The old wisdom behind it is this:

The Whole is Greater than the Sum of its parts
– Aristotle

There is a flipside to this type of innovation: the opposite is also true. The whole can become more problematic than the sum of all the known risks.
My experience as a tester and test manager is that projects generally manage risks in individual subsystems and components quite well.
But I have on occasion found that we have difficulty imagining and properly taking care of things that might go wrong when a new system is connected to the infrastructure, subjected to real production data and actual business processes, and exposed to the dynamics of real users and the environment.

Safety, Accidents and Software Testing

Some years ago, I came across the works of Dr. Nancy Leveson and found them very interesting. She approaches the problem of making complex systems safe in a different way than most.
Leveson is a professor of aeronautics and astronautics at MIT and the author of Safeware (1995) and Engineering a Safer World (2011).
In the 2011 book, she describes her Systems-Theoretic Accident Model and Process – STAMP. STAMP gives up the idea that accidents are causal events and instead perceives safety as an emergent property of a system.
I read the book a while ago, but have only recently managed to begin translating her ideas to software testing.
It actually took a tutorial and some conversations with both Dr. Leveson and her colleague Dr. John Thomas at the 5th European STAMP/STPA workshop in Reykjavik, Iceland in September to completely wrap my head around these ideas.
I’m now working on an actual case and an article, but have decided to write this blog as a teaser for other testers to look into Leveson’s work. There are quality resources freely available which can help testers (I list them at the end of this blog).
The part of STAMP I’m looking at is the STPA technique for hazard analysis.
According to Leveson, hazard analysis can be described as “investigating an accident before it occurs”. Hazards can be thought of as a specific type of bug, one with potentially hazardous consequences.
STPA is interesting to me as a tester for a few reasons:

  • As an analysis technique, STPA helps identify potential causes of complex problems before business, human, and societal assets are damaged.
  • One can analyze a system and figure out how individual parts need to behave for the whole system to be safe.
  • This means that we can test parts for total systems safety.
  • It works top-down and does not require access to knowledge of all implementation details.
  • Rather, it can even work on incomplete models of a system that’s in the process of being built.

To work, STPA requires a few assumptions to be made:

  • The complete system of human and automated processes can be modeled as a “control model”.
  • A control model consists of interconnected processes that issue control actions and receive feedback/input.
  • Safety is an emergent property of the actual system, including users and operators; it is not something that is “hardwired” into the system.

I’d like to talk a bit about the processes and the control model. In IT we might think of the elements in the control model as user stories consisting of descriptions of actors controlling or triggering “something” which in turn produce some kind of output. The output is fed as input either to other processes or back to the actor.
The actual implementation details should be left out initially. The control structure is mainly a model of interconnections between user stories.
Given a sufficiently developed control model, the STPA analysis itself is a two-step activity where one iterates through each user story in the control structure to figure out exactly what is required from them individually to make the whole system safe. I won’t go into details here about how it works, but I can say that it’s actually surprisingly simple – once you get the hang of it.
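To make the idea of iterating through a control structure a little more concrete, here is a minimal sketch in Python. It is not Leveson's or Thomas's tooling, and it does not reproduce the full analysis. It assumes the four ways a control action can be unsafe that are commonly listed in STPA material; they are not spelled out in this post. The actors, actions and processes are hypothetical.

```python
from dataclasses import dataclass
from itertools import product

# Four ways a control action can be unsafe, as commonly listed in STPA
# material. Assumption: this wording is paraphrased, not quoted from the post.
GUIDE_PHRASES = (
    "is not provided when needed",
    "is provided when it causes a hazard",
    "is provided too early, too late, or out of order",
    "is stopped too soon or applied too long",
)


@dataclass(frozen=True)
class ControlAction:
    controller: str  # the actor in the user story
    action: str      # what the actor controls or triggers
    process: str     # the process receiving the action


def candidate_unsafe_actions(actions):
    """Pair every control action with every guide phrase.

    The result is a list of prompts for the analyst, not a list of findings:
    a human still has to decide which combinations are actually hazardous.
    """
    return [
        f"{a.controller}: '{a.action}' to {a.process} {phrase}"
        for a, phrase in product(actions, GUIDE_PHRASES)
    ]


def as_test_idea(prompt):
    """Rephrase a prompt as a question a tester could explore."""
    return f"What happens to the whole system if {prompt}?"


# Hypothetical control structure, made up for illustration only.
actions = [
    ControlAction("Release operator", "deploy build", "production servers"),
    ControlAction("Trading algorithm", "submit order", "market gateway"),
]

prompts = candidate_unsafe_actions(actions)
for prompt in prompts:
    print(as_test_idea(prompt))
```

The sketch only generates questions. Judging which of them point to real hazards, and why they might occur, is the part that needs a thinking tester.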

Dr. John Thomas presented an inspiring tutorial on STPA at the conference.

Safety in IT

I have mentioned Knight Capital Group’s new trading algorithm on this blog before as it’s a good example of a “black swan project” (thanks to Bernie Berger for facilitating the discussion about it at the first WOTBLACK workshop).
Knight was one of the more aggressive investment companies on Wall Street. In 2012 they developed a new trading algorithm which was tested using a simulation engine. However, the deployment of the algorithm to the production environment turned out to be unsafe: Although it was only meant to be used in testing, the simulation engine was deployed and started in production, resulting in fake data being fed to the trading algorithm. After 45 minutes of running this system on the market (without any kind of monitoring), Knight Capital Group was effectively bankrupt. Although no one was harmed, the losses were massive.
Commonly, only some IT systems are considered “safety critical” because they have the potential to cause harm to someone or something. Cases like that of Knight Capital indicate to me that we need to expand this perspective and consider safety a property of all systems that are considered critical to a business, society, the environment or individuals.
Safety is relevant to consider whenever there is a risk that significant business, environmental, human, personal or societal assets can be damaged by actions performed by a system.

STAMP/STPA and the Future of Testing

So, STPA offers a way to analyze systems. Let’s get this back to testing.
Software testing relies fundamentally on testers’ critical thinking abilities to imagine scenarios and generate test ideas using systematic and exploratory approaches.
This type of testing is challenged at the moment by

  • Growing complexity of systems
  • Limited time to test
  • Problems performing in-depth, good coverage end-to-end testing

DevOps and CD (continuous delivery) attempt to address these issues, but they also amplify the challenges.
I find that we, as professional testers, more and more often find ourselves trapped in frustrating “races against the clock” because of the innovation of new and more complex designs.
Rapid Software Testing seems the only sustainable testing methodology out there that can deal with it, but we still need to get a good grip on the complexity of the systems we’re testing.
Cynefin is a set of theories which are already helping testers embrace new levels of complexity in both projects and products. I’m actively using Cynefin myself.
STAMP is another set of theories that I think is worth looking closely at. Compared to Cynefin, STAMP embraces a systems-theoretical perspective and offers processes for analyzing systems and identifying the component-level requirements that are necessary for safety. If phrased appropriately, these requirements are direct equivalents of test ideas.
STAMP/STPA has been around for more than a decade and is already in wide use in engineering. It is solid material from one of the world’s leading engineering universities.
At the Vrije Universiteit in Amsterdam, the Netherlands, they have people teaching STPA to students in software testing.
The automobile industry is adopting STPA rapidly to manage the huge complexity of interconnected systems with millions of lines of code.
And there are many other cases.
If you are curious to know more, I suggest you take a look at the resources below. If you wish to discuss this or cooperate with me on it, please write to me on Twitter @andersdinsen or by e-mail, or join me at the second WOTBLACK workshop in New York on December 3rd, where we might find good time to talk about this and other emerging ideas.

Resources

Thanks to John Thomas and Jess Ingrassellino for reviewing drafts of this blog post. Errors you may find are mine, though.

This photo shows machinery in an Icelandic geothermal power plant. Water heated to 300 deg C by the underground magma flows up and drives turbines and produces warm water for Reykjavik.

If You See Something, Say Something

I have been back in Copenhagen a few days now after Anna Royzman’s excellent fall 2016 testing conference Reinventing Testers in New York during the week of September 25th to 29th 2016. This is the first of probably a few blog posts sharing thoughts and inspiration from the conference.
I am a test consultant. Helping solve clients’ testing problems efficiently and in meaningful ways is crucial to me. Reinventing and reasserting myself as needed, and staying critical of both my own preconceived ideas and those of others, is necessary.
I need to go to testing conferences. They allow me a break from my daily social obligations so that I can better stay true to what I believe in.
It is about getting new inspiration, learning and sharing, and eventually about maintaining my performance as a tester and test manager.
During the Reinventing Testers conference, I had some very interesting discussions with James Bach about freedom. He and I agree that personal freedom is fundamental in testing.
“The human spirit should not be put under a hat,” he said at one point during the conference, and I fully agree. But freedom is also about relation.
I walked around Lower Manhattan on Wednesday, and in the window of a bank or insurance company of some sort, I saw a message on a poster: Feeling free is not worrying what your neighbors think.
The message disturbed me as I feel underlying it is a reassertion to the lonely and insecure that other people should not matter: That one is only free, alone.
This is obviously wrong.
True freedom depends on us becoming ourselves, but certainly also on relations towards other people: Shared and differing talents, perceptions, opinions, values, moral codes.
People are different, but we’re tied together in so many ways.
In technology, freedom relates to safety and quality. I started writing this blog post on the way home on an SAS Airbus A340-300 which was at the time flying more than 900 km/h through the thin air, 12 km above the North Atlantic.
The flight was good and safe, and I was free to think there.
But only because people had worked to make it safe.
And this is important: A good deal of the work needed to make systems safe involves careful testing and as testers we relate to people: Clients, users, stakeholders etc.
We help make them free.
The conference had a special nerve, I think, and I think I can label it.
“If you see something, say something,” signs say in the New York subway. New Yorkers don’t all have to like each other, but it was obvious to me that they know that they are only free, together.
And that is a pretty cool attitude to freedom, I think.
 

James Bach inspecting a piece of abstract art. Could it represent freedom? 🙂

Wanted: Test Leadership (or why verbs can be better than nouns)

Exceptional testing performance requires something from every one of us: We are performing in teams, but as individuals we need to be experts in our craft, tools and methods.
That is not all, though. We also need to be experts in being ourselves.
How can I be myself? That is a question which philosophers have discussed for thousands of years.
We have to talk and think more about our personal values to become ourselves. I find that knowing and acting on our values is key to our professional performance as testers, and even more as leaders. It requires a rich language.
The problem, I think, is that in the process of perfecting our software and technology development, and managing our projects, our language has become dull, technical, command-and-control-focused and valueless. Am I right?
Language scientists believe that in the earliest languages, there were no nouns. Verbs existed before nouns.
This indicates that our ancestors talked about their relation to the world and the people around them long before they started naming things in it.
Today we spend almost all our working hours defining and naming things. We discuss whether one noun is better than another. Which is the correct noun to use in a situation. What noun names the best practice.
I find there is far too little talk about what to do with the things named by all the nouns. And more importantly: What comes out of doing stuff, in the world around us.
Naming things isn’t leadership. It is optimization of communication. I want to see more leadership in testing.
We should somehow go back to the roots of our language and start talking about testing values using fewer nouns and more verbs.
I am running a series of protreptic workshops with my friend Karen in Copenhagen. We bring together a very diverse group of people and talk about values. The setting is informal, but Karen and I facilitate it closely.
The experience is awesome.
One participant wrote to me that it requires a great deal of “brain work”, but is rewarding: “I have met people who are different from me, and that makes the experience interesting because you start thinking: why do I have the attitudes and opinions I have?”
We don’t do psychoanalysis or discuss reactions. We don’t talk about models of the brain either. There is, in general, no cause-and-effect-thinking in the workshop. Only lots of inspiration.
And we inspire each other to talk about worth and personal values. Asking “why do I have these attitudes and opinions?” is one way of discovering them.
Leadership is about the team taking responsibility together, but responsibility is no longer monopolized by the appointed leader. Taking responsibility is on everyone’s shoulders today, even you and me, the individual team members.
“To lead others, first you have to lead yourself”, my friend Maibritt Isberg Andersen says.
And now that we are all test leaders, I really hope we will all start talking more about our attitudes and opinions, and why we have them.
It is about inspiring test leadership.

Say "Yes, but…" and remain true to your values

Testers sometimes have to make compromises. We have professional values and beliefs, but they may conflict with values in the contexts we act in.
Isn’t there always someone who matters who has an opinion about what testing is and how it should be performed?
I remember a few situations where I felt I was tasked to do something in a way that I didn’t agree with. How can I make a difference, then?
This feeling points to something which is a dilemma for every skilled professional: We can sometimes feel that our personal and professional values are challenged, but there is still a job to be done.
How can I make the compromise, accepting a challenge, while staying true to my values and beliefs?
It’s about learning and improving.
 
Social responsibility
To me, one of the most important things about being professional and a context driven tester is taking social responsibility. This is an important value to me.
Social responsibility is not about self-critique. As a professional, my personal doubts and worries are valuable.
Instead it is about trying to give customers what they need by understanding their situation and helping them get better with what we are doing together.
That requires what I call personal leadership. But foremost, it requires conversation and negotiation.
 
“Yes, but…”
In his “Improv(e) your testing” talk at Let’s Test 2016, Damian Synadinos @dsynadinos reminded me of a simple and efficient strategy for opening conversations. In improv, a golden rule is to start replies with “yes, and…”. This helps add to whatever is happening on stage.
In professional situations we sometimes have to subtract instead:

Yes, I will perform the test and report to you about it, but please explain to me how the test case and bug count metrics you ask me to produce will be useful?

The “yes“-part is about accepting the challenge. The “but” implies that I’m going to stay true to my knowledge, experience, values and beliefs and raise professional doubts about methods I’m asked to use, things I’m asked to report, processes I’m asked to follow.
I’m not asking rhetorical questions. Rhetorical questions shut windows to the world and enclose me in my own thoughts and ideas.
So I keep a thought in the back of my mind when I’m asked to do something in a certain way: “Is this really in the best interest of the people who matter: The project stakeholders?”
Replying “yes, but…” enables me to act on my personal values in contexts which have values of their own.
 
Masterclass in New York City
On September 26th, during Test Masters Academy‘s REINVENTING TESTERS WEEK in New York City, I will be doing a workshop titled: “Act on your values!” on values and personal leadership.
As testers and IT-professionals we have to quickly recognize and adapt to ever changing contexts in order to produce value for our employers, clients and various diverse customers. This can be challenging, both on the personal and the professional level. As leaders, team members and individuals we often have to lead ourselves.
The workshop will focus on how our personal and shared values can guide us. It will be based on the principles of protreptic dialogue, which is a facilitated philosophical conversation revolving around the values embedded in what we say, do and think. Protreptic dialogue was first described in ancient Greece in the fourth century BC. Professor Ole Fogh Kirkeby of Copenhagen Business School has revived it as a concept, a leadership tool, and a coaching principle with the objective to “turn us towards ourselves”.
I plan for the workshop to be a safe space for exploration and learning. Participants are expected to share opinions, thoughts and ideas, and to treat others’ opinions, thoughts and ideas in a respectful and appreciative manner. No prior knowledge of leadership, dialogue, philosophy, or protreptic dialogue is required.
Key takeaways

  • Consciousness about personal values and values of the contexts we work in
  • Strategies for dealing with the dilemmas we face as testers
  • An introduction to protreptic concepts and dialogue

Get tickets here.
 

Value centered dialogue at CPHContext

I’m beginning to get quite excited about speaking at CPHContext about ”Value Centered Dialogue in Context Driven Testing”. It’s not the first time I have spoken at a testing conference, but I am going to demonstrate a type of dialogue for which there is no firm recipe and which I can therefore only plan for mentally. And that is of course a bit exciting 🙂
To settle my nerves, I’m writing this blog to reveal something about what I’m going to tell people.
Recently, a good friend asked me: “What is leadership to you?”
My answer came quicker than I thought it would: “It is about setting people free to do their best,” I said.
We were talking about personal leadership values.
Heuristics and values
There are many ways to lead people – we could call them leadership heuristics – and while you and I can attend the same courses or read the same books and therefore learn the same leadership heuristics, our personal values shape our actions and therefore the way we apply these heuristics.
Everything I’m going to say in the session will be about basic human values and how I have found a special type of dialogue can bring new energy into context driven testing leadership.
I have my slides ready, and I hope it will be a good experience for everyone attending my session.
A protreptic dialogue
I’d like to show something about how a protreptic dialogue between me (the guide) and you would start out. I might start with a question to you:
What does it mean to be context driven?
I’ll listen carefully to your answer and depending on what you answer (there is no right or wrong here as it is about you) I might tell you something about the origins of the word context. Words are important in protreptic dialogue.
The word context is originally Latin and comes from contextus, which means joining together. The Danish word for context is sammenhæng, which means the same, so context is something we are joined to, or maybe even woven into, as the Latin origins actually indicate.
Then, what does it mean to be context driven: Can something that we are joined to or even woven into drive us? It might if there is motion in it, so if we want to understand something about how the context is driving us, we should look at the dynamics in it. But perhaps the driving could be reversed: Can our testing set the context in motion?
This question was for you, and again I’ll listen carefully to what you say. If it was me, I might answer myself like this:
Of course we can set the context in motion, and we do, as testers. After all, testers discover stuff other people have not yet discovered, we build trust, create business value, spoil illusions and other things that send motion back into the context.
This is interesting. As a guide, I’ll listen to your value laden words: discovery, trust, value, illusions. In a human value-perspective they have meanings related to the four basic human values: The Good, The Beautiful, The Just and The True.
In the ongoing protreptic dialogue, we will explore these values together, getting very close to what they really mean to you. We might talk about your work or other things in your life, but only if you want to and bring it up. This is not a therapy session.
Protreptic dialogue is meant to be a nice and respectful experience for both. There are no roles to play, we are both ”ourselves”, but we are taking a journey together to discover something about ourselves, in this case about context driven testing.

Why the dichotomy of testing versus checking is the core of our craft

Please note: This post is being updated.
In his Why the testing/checking debate is so messy – a fruit salad analogy, my good friend Joep Schuurkes posts an absurd dialogue in which two persons become confused because they cannot distinguish between apples and fruit. He claims the dialogue could still happen if apples is replaced with checking and fruit with testing.
He is trying to show that in the same way that apples are a sort of fruit, checking is a sort of testing. And that discussing testing *versus* checking is bullshit.
I think Joep is wrong, and I shall discuss why and how here.
A little “versus”
The core of the discussion is the little “versus” between testing and checking, which Bolton and Bach insist on. And I insist on it too: It introduces a dichotomy, which is not only important, it is even necessary.
And it is necessary because it shapes our thoughts about testing.
To be precise, it leads us to think on a conceptual level instead of just an activity level. Once we accept the little ”versus” between the two, accept the dichotomy, we can start thinking about our craft. We are no longer forced to only think about the activities we do.
And just as important: We can distinguish our craft from something that it is not.
It’s like the way more and more people discriminate between leadership and management. Once you accept that the two are conceptually different, something interesting happens: A whole new understanding of the act of playing ”the boss” reveals itself.
In the same way, when we start discriminating between testing and checking, the way we talk about what we do as testers changes. And we change.
A humanistic and value producing view on testing has revealed itself to us through this dichotomy:
Testing was, but is no longer…
Testing is no longer a necessary evil, only done because programmers are sloppy, don’t read requirements and make mistakes. Instead, testing has become a craft, carried out by humans. A craft that adds value to the product, the organisation and society as a whole.
We are no longer little machines working under detailed instruction. We are testers, and therefore everything we do, our job satisfaction and even the value we produce, depends on this very dichotomy.
I will not let the confusion confuse me
So why the confusion? Well, I think the confusion arises because we confuse concepts with activities when we speak our daily, ambiguous language.
As a tester, I carry out checks when I test, but when I do, the checks I am doing are elements in the testing and the whole activity is testing, not checking.
But if, on the other hand, I program a computer to run through a number of input combinations to a software program, have my program verify the results by comparing them to something ”expected”, and produce a report of boolean results on the basis of this, the whole activity of running that and distributing the report from the computer program is checking, not testing.
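As an illustration of the kind of automated run described above, here is a minimal sketch in Python. The function under test, the oracle and the input values are all hypothetical, chosen only to show the shape of the activity: run input combinations, compare with something ”expected”, and report booleans.

```python
from itertools import product


def add_with_cap(a, b, cap=10):
    """Hypothetical system under test: add two numbers, capped at `cap`."""
    return min(a + b, cap)


def expected(a, b, cap=10):
    """Hypothetical oracle: an independent statement of the 'expected' result."""
    total = a + b
    return cap if total > cap else total


def run_checks(inputs_a, inputs_b):
    """Run every input combination and record one boolean per combination."""
    return {
        (a, b): add_with_cap(a, b) == expected(a, b)
        for a, b in product(inputs_a, inputs_b)
    }


report = run_checks([0, 5, 10], [0, 5, 10])
for combo, passed in report.items():
    print(combo, "PASS" if passed else "FAIL")
```

Everything the program does here is checking. Choosing the inputs, questioning the oracle and deciding what the report means are left to a person.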
However, letting this confusion lead us to discard the difference between testing and checking would be a pity. The dichotomy is core to Bolton’s and Bach’s testing philosophy. If I reject it, I have to reject more or less everything they say about testing.
And worse: I will have to give up my profession.

On the Value of Interested, Dedicated, and Fascinated People

Some test managers and test consultants are very busy pointing out the right processes, organisational structures and methods to use in software testing.
But no methods, processes and structures can assure great testing. Great testing is created by people.
This quote by Neil Armstrong, which I came across a couple of years ago, is worth remembering whenever we lead people in testing:

“The way […] that made [the Apollo project] different from other sectors of the government to which some people are sometimes properly critical is that this was a project in which everybody involved was, (1) interested, (2) dedicated, and, (3) fascinated by the job they were doing. And whenever you have those ingredients, whether it be government or private industry or a retail store, you’re going to win.”

To me, his message is that as leaders, our aim should be to do whatever we can to make people just that: Interested, dedicated and fascinated by the job we are doing.
Source: Transcript of Neil Armstrong Interview with Stephen Ambrose and David Brinkley

Neil Armstrong, first man to walk on the moon. Photo: NASA.

Core Values in Testing

There’s something about life that you won’t find anywhere else.
– Ole Brunsbjerg, headmaster.

The Copenhagen Context Driven Testing meetups are becoming a tradition thanks to the work of Carsten Feilberg and Agniezka Loza. In June, I chaired a workshop in Ballerup near Copenhagen during one of the meetups. 16 testers shared ideas about values in software testing.
There are four or five basic human values which everyone shares: the good, the beautiful, the true and the just. Freedom relates to these four. We express and rate them differently and they are intrinsic to us, subjective, but still shared among humans.
My personal human values shape my actions, words and thoughts, and thus also the words and expressions I use in my daily language. My language can tell you about my values and therefore something about who I am.
Workshop and procrastination
In the workshop I chaired in June, I asked the participants to pick picture cards to illustrate thoughts about testing. Then they spoke about the picture and about testing. We shared our words and statements on Post-its and I collected them.
I was busy at work after the workshop, and the box with the words ended up on my desk. Summer and vacation came, and I procrastinated opening it. On one of the last days of vacation, I finally read the words on the Post-its.
Here are the words:

Knowledge; Information; Curiosity; Exploration; Investigation; Fight :-); Courage; Confidence; Balance; Collaboration; Evolvement; Surprise; Order; Performance; Discovering stuff, that others have not (yet) discovered; beautiness. Usability (easy/better ways of using stuff). Universatility. User experience design; Good (better) end user experience; user needs; user satisfaction; Sustainability. Creativity. Responsibility. Curiosity; Easing somebody else’s job; Striving; Alertness; Communication; Added communication; added collaboration; information sharing; Building bridges; (Make it) fun; Excellence; Any word / anything; Getting a kick; Covering / exploring; Contradictions / paradoxes; Building trust; Finding (new) ways; Getting to know; Helping; Revealing; Avoiding losses; Whole solutions; Support decisions; Transparency; Quality; Assessing quality; Avoid scandals; Improvement; Business needs; Filling gaps; People; To spoil illusions (own and others’); Digging for something deeper; Truth; Structure; Growth; Responsibility; Team work; Exploration; Progress; Seeing/finding possibilities; Erkendelse/Erkenntnis/realisation; Business value; Honesty.

Truth and testing
It’s interesting to note that many of the words above relate to the value ‘truth’. Testing implies curiosity, gives a kick, spoils illusions, happens through exploration etc.
I consider ‘truth’ to be the fundamental core value in testing. Truth as a term is a complex thing, but when we use words that relate to the value ‘truth’, it’s easier to see.
As a tester, I prefer things that are true and don’t accept stories that can’t be verified. I rate things that are more true higher than other things. For example, I tend to dislike reducing truth to numbers, and prefer a more nuanced understanding of subjects.
I do have beliefs, hypotheses, and test ideas, but at the end of the day, the ideas only prove their worth when they have been evaluated.
More than truth?
But look again.
Many (most?) of the words deal with things that are not related to ‘truth’: Responsibility, easing other people’s jobs, evolvement, user experiences, whole solutions, improvement, business value etc.
This reminds us that testers are not just concerned with ‘truth’, i.e. testing, but also value how testing is used and the results that the whole team or company achieves.
What does this tell me as a testing leader?
It tells me that in my leadership, I cannot only focus on testing ideas, spoiling illusions, and telling the truth if I wish to motivate and encourage our teams to work efficiently and independently doing their testing.
I have to consider how the testing contributes to achieving other goals and higher goals.
I have to consider whether cooperation with colleagues works well. That the product we somehow help with is something that makes users happy. That there are bottom line results because of our testing. That disasters are prevented.
These things are not just ‘context issues’. They are core to testing leadership.
Word play
I have played with the words on the cards and come up with a mission statement for a hypothetical testing team. The mission statement somehow expresses this.
Feel free to play with the words yourself.

We are testers. We are ready to spoil illusions, both our own and others’. We have courage to do so and generally like to be surprised. So we always dig for something deeper, a deeper understanding, a realization, an ‘erkenntnis’. We get a kick when that happens. Through testing, we seek truth, but we also feel a responsibility to make our testing useful to create user-friendly and whole solutions, support growth and improvement, and sustainability. Our testing thus aims to assist the creation of pleasing and aesthetic solutions, to serve other people’s needs and hopes, and in general to do good.

PS: The quote from my uncle Ole Brunsbjerg at the top of this article is to remind us that there is more to life than testing. Or anything else. Life is very rich and as humans, we value all of it.


Are you playing the Russian roulette? Learning from failure

I think most (if not all?) testers have witnessed situations like this: A new feature of the system is put into production, only to crash weeks, days or just hours later.
“Why didn’t anybody think of that?!”
Truth is, quite often, somebody did actually think about the problem, but the issue was not realised, communicated or accepted.
Below is the story about the space shuttle Challenger accident in 1986.
Disaster…
Twenty-nine years ago, space shuttle Challenger exploded 73 seconds into the flight, killing the seven astronauts aboard.
Theoretical physicist Richard Feynman was a member of the accident commission. During the hearings he commented that the whole decision making in the shuttle project was “a kind of Russian roulette”.
The analogy is striking. Russian roulette is only played by someone willing to take the risk of dying.
I don’t know anyone who deliberately wants to play Russian roulette, so why did they play that game?
Feynman explains: [The Shuttle] flies [with O-ring erosion] and nothing happens. Then it is suggested, therefore, that the risk is no longer so high for the next flights. We can lower our standards a little bit because we got away with it last time…. You got away with it but it shouldn’t be done over and over again like that.
The problem that caused the explosion was traced to leaking seals in one of the booster rockets. On this particular launch ambient temperatures were lower than usual and for that reason the seals all failed. The failed seals allowed very hot exhaust gases to leak out of the rocket combustion chamber, and eventually, these hot gases ignited the many thousand litres of highly explosive rocket fuel.
Challenger blew up in a split second. The seven astronauts probably didn’t realise they were dying before their bodies were torn to pieces.
It was a horrible tragedy.
Chapter 6 of the official investigation report is titled: “An accident rooted in history.”
The accident was made possible because of consistent misjudgements and systematically ignored issues, poor post-flight investigations, and ignored technical reports. The accident was caused because three seals failed on this particular launch, but the problem was known and the failure was made possible because it was systematically ignored.
The tester’s fundamental responsibilities
As a tester, I have three fundamental responsibilities:

  1. Perform the best possible testing in the context.
  2. Do the best possible evaluation of what I’ve found and learnt during testing. Identify and qualify bugs and product risks.
  3. Do my best to communicate and advocate these bugs and product risks in the organisation.

The Challenger accident was not caused by a single individual who failed to detect or report a problem.
The accident was made possible by systemic factors, i.e. factors outside the control of any individual in the programme. Eventually, everyone fell into the trap of relying on what seemed to be “good experience”. The facts should have been taken seriously.
A root cause analysis should never identify only individual and concrete factors, but also the systemic factors which enabled the problem to survive into production.
Chapter 6 of the Challenger report reminds me that, when something goes wrong in production, performing a root cause analysis is a bigger task than just finding out the chain of events that led to the problem.
Many thanks to Chi Lieu @SomnaRev for taking time to comment on early drafts of this post.

Photo of the space shuttle Challenger accident Jan. 28, 1986. Photo credit: NASA


The Context is Copenhagen #CPHContext

#CPHContext 2015 opened today at the Tivoli Hotel & Congress Center. I couldn’t make it today, but judging from tweets, it has been good. I’ll be there tomorrow morning at 8.00 for the main day of the conference.
I’m really looking forward to it!
Frankly, I’ve had very little time to study and prepare myself for the programme, but I know I’ll be meeting some of the world’s greatest testers there. I expect to listen, share and learn.
Kudos to Morten from PrettyGoodTesting for setting up this conference – great job done!
PS: I’ll bring my camera, so expect photos on my Flickr photo stream after the event.