Safe At Work?

Make Safety a Prerequisite

I worked in a big project, and while doing some testing I noticed that a bug that had previously been fixed had reappeared. Since I was the one who had originally corrected the bug, it was quite easy for me to spot the regression. I reported my findings to the project manager, and after some investigation it became clear that several bug fixes were gone. Someone had forgotten to merge the last fixes from the x.0 release to the x.1 branch.

Even after the missing merge was done, the project manager still didn’t trust that everything was fixed. There was a consensus among the developers that the missing merge was the cause and that everything should be fine, but the project manager still felt unsure of the state of the system. To mitigate this, a big regression test run was ordered. In good shoot-the-messenger fashion, I got the honor of doing the testing. I had found the problem in the first place, so people thought it was logical that I should do it. Since the situation was seen as a risk, it was critical to get this done fast.

I spent a little more than two weeks working insane hours, mostly alone in the lab, executing scripted manual regression tests. I had to repeat the same tests over and over again on different but similar equipment, which made an already boring task soul-destroying. The result was as expected: I didn’t find any regression related to the initial problem. Everything had been fixed by doing the missing merge.

The absence of continuous integration was the reason this fault could slip through in the first place. Lacking the technical safety that continuous integration and test automation would have brought, the project made decisions based on feelings rather than facts. This uncertainty prevented people from trusting each other. What at first was a technical problem started to affect psychological safety. Needless to say, I wasn’t that keen on bringing up any inconvenient findings for a while.

Workshop: Safety Scan

One of the four principles in Modern Agile is “Make safety a prerequisite”. Safety is framed as a prerequisite rather than a priority because priorities may change, but safety must always be there.

Modern Agile Wheel
If we take a look at Maslow’s hierarchy of needs, we can see that safety is the second step towards self-actualization. If we want high-performance teams creating software in a way that makes people awesome, we need to make safety happen first. Only people who are safe will have the guts to do the experiments needed to create products that really stand out. We must have people who are safe enough to handle a failure, learn from it and then conduct a new experiment based on the new knowledge gained. If we want people to do their best, they must be able to bring their whole selves to work without fear of being punished in any way.

In order to put this topic on the agenda we have created a workshop called “Safety Scan”. In the workshop we look at safety from several perspectives (psychological safety, technical safety and fail safety) related to your team’s daily work. Together we will discuss what safety means in your context, and you will also have the opportunity to rate the current level of safety in your team by doing a Safety radar. We will not suggest any actions in the workshop. It is your team that must come up with and own the actions that need to be performed. If needed, we can of course guide you or facilitate exercises that will increase the safety in your team. When performing the workshop, Vegas rules apply (“What happens in Safety Scan, stays in Safety Scan”). You can be absolutely sure that the content of the session will not be shared with people outside your team.

A made-up example of a Safety radar (Vegas rules, remember… 🙂)

If you would like us to perform a Safety Scan with your team, contact me or Leif Ershag.

Story point rant

I’m no fan of story points. I have tried them when using Scrum but didn’t really see the benefits. Instead we have used gut feeling to decide how much to pull into a sprint. That has served us really well. The few times we failed to reach the sprint goal and didn’t deliver what we had promised, we held an RCA (root cause analysis) to find the real cause of the problem. After putting countermeasures in place we always came out a little bit stronger. Even without counting story points, we have been able to increase our velocity as the team continuously improves.

I think what annoys me the most is that I have noticed teams getting caught in a dirty race to always improve the score no matter what. This kind of measurement should be kept inside the team, where it can motivate continuous improvement. Somehow it seems that a lot of teams get measured on their velocity by outsiders. That makes it really tempting to start gaming the system.

I have seen teams that make advanced formulas to estimate how many points of their unfinished work they should count in this sprint and how many points are left for the next sprint. The natural thing would really be to only count finished stuff and do some analysis to understand why we failed to deliver what we promised. Some teams even count story points for fixing bugs they created in the last sprint. Sad.

Kanban is always my first choice. It creates a flow, and we don’t have to adjust the work to fit the end of a sprint. When using measurements, I like to choose ones that balance each other, to avoid gaming of the result. I have found a combination of throughput, lead time and defects helpful.
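
As a rough illustration of how these three measurements can be read side by side, here is a minimal Python sketch. The WorkItem record and its fields are hypothetical and not taken from any particular tool, and lead time is assumed to run from the day an item is started to the day it is finished.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean


@dataclass
class WorkItem:
    # Hypothetical record of one finished item on the board.
    started: date
    finished: date
    is_defect: bool = False


def flow_metrics(items, period_start, period_end):
    """Summarize throughput, average lead time and defects for one period."""
    done = [i for i in items if period_start <= i.finished <= period_end]
    lead_times = [(i.finished - i.started).days for i in done]
    return {
        "throughput": len(done),
        "avg_lead_time_days": mean(lead_times) if lead_times else None,
        "defects": sum(1 for i in done if i.is_defect),
    }


items = [
    WorkItem(date(2024, 1, 2), date(2024, 1, 4)),
    WorkItem(date(2024, 1, 3), date(2024, 1, 9)),
    WorkItem(date(2024, 1, 8), date(2024, 1, 10), is_defect=True),
]
metrics = flow_metrics(items, date(2024, 1, 1), date(2024, 1, 14))
print(metrics)
```

Reporting the three numbers together is the point: pushing throughput up by cutting corners should show up as more defects, and splitting work into meaningless slices should show up in the lead time of the things customers actually asked for.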

Why it is a bad idea to have test coverage targets

A lot has previously been written on this topic, but it seems like there is a need for another post in this area. Probably saying the same thing as the other ones.

Test coverage is a negative metric. While low test coverage is bad, it is not certain that high test coverage is a good thing. It really depends on the tests: what we are testing and how well the tests are implemented. 100% test coverage with all tests green is not very useful if the tests, for instance, don’t have any assertions. Test coverage tells you whether there are parts of the system that are lacking tests, not how good your test suite is. A high coverage score might also create a false sense of security that prevents a team from doing what’s needed to create a product with good quality.
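
To make the point about missing assertions concrete, here is a small made-up Python example. The function apply_discount and both checks are invented for illustration. Both checks execute every line of the function, so a coverage tool would report the same 100% for either one, but only the second can detect the bug.

```python
def apply_discount(price, percent):
    # Deliberate bug: the discount is added instead of subtracted.
    return price + price * percent / 100


def check_without_assertion():
    # Executes every line of apply_discount, so line coverage is 100%,
    # but nothing is verified and a wrong result goes unnoticed.
    apply_discount(200, 10)


def check_with_assertion():
    # Same coverage, but this check states the expected result.
    assert apply_discount(200, 10) == 180


def passes(check):
    """Run one check function and report whether it passed."""
    try:
        check()
        return True
    except AssertionError:
        return False


print(passes(check_without_assertion))
print(passes(check_with_assertion))
```

The first check is green and the second is red, although the coverage number is identical in both cases.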

For a team under pressure it might be tempting to start gaming the system to reach a test coverage target that has been forced upon them: testing things that are easy to test instead of writing tests that are useful but hard to implement.

So when do we have enough test coverage? According to Martin Fowler, it is when we have reached the following state:

  • We have a low fault slip through to production.
  • We are not afraid of changing the code.

Sadly the feeling in your stomach while delivering is not as easy to share with upper management as your test coverage result would be.

Read some of Martin Fowler’s thoughts on this topic here.

Make an impact!

Do you want to know how your work fits in the big picture? Do you want to know how your efforts contribute to making the place you work at successful? Do you want to avoid doing stuff without real business value? If yes, then Impact mapping might be something for you and your team!

Impact mapping is a lightweight conversation tool created by Gojko Adzic that is used to connect deliverables to business goals. In his workshop that I attended, he used the underpants-stealing gnomes from South Park to illustrate where impact mapping fits.

When the gnomes are asked why they are collecting underpants they say that “Collecting underpants is Phase 1”. As soon as the kids ask them about Phase 2, the gnomes answer that Phase 3 is profit. They haven’t really figured out how to turn their mountain of underwear into money. The gnomes’ business case, described with GIFs.

Impact mapping is used to connect Phase 3 (a goal) with Phase 1 (a deliverable) with the help of a Phase 2 containing actors and their behaviors.

To create an impact map we ask ourselves four questions: Why? Who? How? What? The result is visualized as a mind map.

Why?
Why are we doing this? We state a business goal that shall be Specific, Measurable, Achievable, Relevant and Time-bound.

Who?
Which actors can help us to achieve the goal?

How?
How shall our actors’ behavior change so that we can achieve our business goals? Behavior changes when actors start or stop doing things, when they do more of some things and less of others, or when they, for instance, do things faster or slower.

What?
What shall we as an organization or delivery team do to support the necessary impacts? What deliverables shall we produce?

A simple example
Let’s say that we are working for the Swedish Transport Administration. Our goal is that zero persons shall be killed or severely injured in traffic each year. We are going to make an impact map that shall help us figure out how to achieve this modest goal. I gather a team of people who can help us answer the questions. Then we have a conversation in order to create the map.

I have used the tool Mindmup to draw the impact map.

Why?
Zero persons shall be killed or severely injured in traffic each year.

Who?
The actors we have identified to help us reach our goal are Pedestrians and Drivers.

How?
Drivers change their behavior by starting to obey the speed limits.

Pedestrians change their behavior by starting to wear reflectors, stopping jaywalking and crossing roads faster.

What?
What can we do as an organization to support the change of behaviors? Using speed cameras that measure the average speed should really help the drivers to slow down.

Every time we deliver something we shall evaluate whether the actors’ behavior has changed and by how much. For instance, if the ad campaign for wearing reflectors is enough to change the behavior so that pedestrians always wear reflectors when it is dark, then we don’t have to give away free reflectors and write a theme song. That will save us a lot of money! Creating deliverables that are not needed is the number one source of waste!

Impact mapping can also be used to reverse engineer long wish lists. Then we start from the other direction and connect what to how to who to why. Features that don’t support a change to a new desired behavior and connect to a goal shall be thrown away.

Impact maps can be translated to user stories. “In order to start wearing reflectors, as a pedestrian I want free reflectors” “In order to cross the road faster, as a pedestrian I want to go to a fitness camp!” (OK, not all stories from the map sound really good, but that is due to my crappy example. You see the point, right? 🙂)
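
The translation from map to stories is mechanical enough to sketch in a few lines of Python. The nested dictionary below is only one assumed way of writing down the why, who, how and what levels of the road safety example; it is not a format prescribed by the impact mapping book or by Mindmup.

```python
# Hypothetical impact map for the road safety example: why -> who -> how -> what.
impact_map = {
    "goal": "Zero persons killed or severely injured in traffic each year",
    "actors": {
        "driver": {
            "start obeying the speed limits": ["average speed cameras"],
        },
        "pedestrian": {
            "start wearing reflectors": ["free reflectors"],
            "cross the road faster": ["a fitness camp"],
        },
    },
}


def user_stories(impact_map):
    """Yield one user story per deliverable in the map."""
    for actor, impacts in impact_map["actors"].items():
        for impact, deliverables in impacts.items():
            for deliverable in deliverables:
                yield f"In order to {impact}, as a {actor} I want {deliverable}"


stories = list(user_stories(impact_map))
for story in stories:
    print(story)
```

Walking the same structure from the leaves upwards is the reverse engineering case: a deliverable that cannot be placed under any impact and actor has no path to the goal.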

If you are doing hypothesis driven development then the impact map could be used to specify the assumptions you will validate or falsify with your experiments.

Some reading

The source:

https://www.amazon.com/Impact-Mapping-Software-Products-Projects/dp/0955683645/ref=sr_1_1?ie=UTF8&qid=1477343290&sr=8-1&keywords=impact+mapping

Great book that among other things shows impact mapping in a Lean Startup context.

https://www.amazon.com/Lean-Enterprise-Performance-Organizations-Innovate/dp/1449368425/ref=sr_1_1?ie=UTF8&qid=1477343315&sr=8-1&keywords=lean+enterprise

Do the right thing

I have been a fan of Lean Startup ever since I first read the book of the same name. It is about working with hypotheses in really short iterations, fueled by lean principles, in order to gain validated learning. Replacing big investments with small experiments is what it is all about. Theory is one thing though; implementing it in an organization is something else. Luckily I had the opportunity, together with my teammates in team Kafka and our friends in team Firefly at Tieto, to attend the Crisp course Lean Team. We (product owners, developers and support consultants) left this course with new knowledge, insights and tools that will help us not only to do things right but, even more important, to do the right things.

The course kicked off with a case study on how a system for handling examinations at Stockholm University was implemented. The major learning here was that until validated, we should consider everything in the backlog as speculation. Short iterations and frequent releases that were tested on real users created knowledge that could never have been anticipated in advance.

Courses held by Crisp often (or always?) use the Training from the Back of the Room strategy, which lets the learners drive the learning by participating in a lot of activating and inspirational exercises. This creates a course that is pure fun and makes the new knowledge stick. The first exercise we did was a Lego game showing the importance of team members having T-shaped skills in order to deliver great results.

Next up was one of my biggest takeaways from the course: learning about Impact mapping. This is a lightweight tool for describing the connection between business goals and features, and how we want to change user behavior in order to reach the goals. Another benefit is that you get a nicely visualized road map. This is how I hope we can handle the backlog in the future. Ordering the Impact Mapping book by Gojko Adzic to learn more is now on my to-do list.

We learned a lot of techniques for gaining more knowledge with just a small investment of time.

Creating proto-personas
We create personas to be able to see things from a customer-centric point of view.

Storyboarding
We describe problems and solutions by drawing storyboards.

Hypothesis creation
Why we believe in an idea and how we can validate it.

Design studio
A workshop for collaborative idea generation.

Creating an MVP by building a prototype
An MVP is the smallest thing we can do to validate an idea.

Goobing
How to plan and execute validation of ideas on real people.

For every step we did we used an experimentation kata to validate the work. It is a generic technique that can be applied in almost all situations.

I really enjoyed this class and learned a lot of things that we can implement in our daily work. Taking the class together with the whole team made it even better. Getting advice and feedback from experts and hearing examples from their work life made it a great time.

I recommend everyone that has the possibility to take this class to do it! Bring the whole team!

A big thank you to everyone making our participation possible and to everyone attending for making the class great!

Breaking the waves

We were in a painful position, with a lot of reported bugs and a lot of unhappiness among, well, everybody. I was reading the book Implementing Lean Software Development (Poppendieck & Poppendieck, 2006) and got a lot of ideas and inspiration for implementing a better way of working.

The outcome? The support staff felt safe knowing that they could report an issue and that it would be handled quickly. The developers were no longer disturbed as often, which reduced multitasking. They also got a better understanding of the consequences of their actions. The busy product owner found a way to share a little bit of the workload.


This is how we implemented the ideas from the book.

Delete all old bugs
To get a fresh start, delete the existing bug list. Don’t worry if you delete one bug too many. If it is important enough, someone will report it again. We created a limit on the bug list: if the number of bugs exceeds 30, something must be removed from the list. If a bug list grows too big it will no longer be reviewable and will probably contain duplicates and such. The lean principle that errors hide in piles of inventory also applies here.

It shall only be possible to report bugs at one place
We are using VSTS and created a bug inbox that all stakeholders can reach. No more bugs lost in emails.

Create a good bug description template
We want to avoid the situation where information is missing when a developer finally starts working on the bug and the description has to be completed by someone who no longer remembers the details (churn). We therefore want a well-written bug report. Besides reproduction steps, our reports also contain customer impact so that we clearly understand how the fault affects the user. We also include expected and actual results to avoid any misunderstanding of what should be fixed.
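
A template like this can also be checked automatically before a report reaches the inbox. The following Python sketch is only an illustration: the BugReport class and its field names are assumptions that mirror the sections mentioned above, not the actual VSTS template we used.

```python
from dataclasses import dataclass, fields


@dataclass
class BugReport:
    # Field names are illustrative; they mirror the sections named above.
    title: str = ""
    reproduction_steps: str = ""
    customer_impact: str = ""
    expected_result: str = ""
    actual_result: str = ""

    def missing_sections(self):
        """Return the names of sections that were left empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]


report = BugReport(
    title="Export button does nothing",
    reproduction_steps="1. Open a report. 2. Click Export.",
    actual_result="Nothing happens.",
)
print(report.missing_sections())
```

A report with missing sections can then be returned to the submitter right away, while the details are still fresh.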

Create a short feedback loop
We created a daily meeting called Daily Triage to prevent a bug from being left unattended for too long. To make it a snappy and effective meeting we time boxed it to 15 minutes. We have visualized the number of bugs in the triage inbox on a screen, so if the inbox is empty everybody will know that the meeting is canceled. If something really critical is happening, it is OK to call for an extra triage. We also made an agenda for the meeting to avoid a lot of unrelated chitchat. For each bug in the inbox we ask:

1) Is the bug correctly reported?
If not, send it back to the submitter with feedback on how to improve the report.

2) Is it a bug?
Some people tend to use bug reports to introduce more functionality. That sucks. If it is a request for new functionality it should be prioritized against the rest of the backlog and handled the way we handle other customer requests.

3) What is the severity?
Setting a severity helps us discuss the priority and ROI.

4) Should we fix it?
Is the ROI high enough? If yes, put it on the bug list. If no, report to the customer / submitter that we will not fix it. It has sometimes happened that we said no to fixing a bug but later reconsidered that decision when more people had reported the same thing and the ROI of a fix had increased.
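
The four questions form a small decision procedure, which can be sketched in Python as below. This is a simplified model and not tooling we actually had: the IncomingBug fields stand for the answers the people in the meeting give, and the ROI judgment is reduced to a yes/no flag.

```python
from dataclasses import dataclass

MAX_OPEN_BUGS = 30  # the limit on the bug list described earlier


@dataclass
class IncomingBug:
    # Hypothetical answers recorded by the triage meeting for one inbox item.
    title: str
    correctly_reported: bool
    is_bug: bool
    severity: str
    worth_fixing: bool  # the meeting's judgment of the ROI


def triage(bug, bug_list):
    """Walk through the four triage questions and return the decision."""
    if not bug.correctly_reported:
        return "return to submitter with feedback"
    if not bug.is_bug:
        return "move to backlog as a feature request"
    if not bug.worth_fixing:
        return "tell submitter we will not fix"
    if len(bug_list) >= MAX_OPEN_BUGS:
        return "bug list is full: remove something first"
    bug_list.append(bug)
    return f"added to bug list (severity: {bug.severity})"


bug_list = []
first = triage(IncomingBug("Crash on save", True, True, "high", True), bug_list)
second = triage(IncomingBug("Add dark mode", True, False, "low", False), bug_list)
print(first)
print(second)
```

The order of the checks follows the agenda: an unclear report is sent back before anyone spends time discussing severity or ROI.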

Make the right people attend the meeting
In order to empower the meeting to make decisions we have people from development, support and a product owner attending the meeting.

Making this process a part of a Kanban flow created a really efficient bug handling shop: triage an item in the morning, fix it and test it during the day, and then deliver it around 04:30 the next morning. Compared to delivering once a month, it was a huge improvement. (Making small deliveries often is one of the best quality investments you can make. It is a lot easier to build quality in when working in small batches.) The delight of customers getting their issue handled that quickly often exceeded the annoyance of having to report the bug in the first place.

One of the other important things we did was to give one of the teams working on the product the mission of handling customer failure demand quickly. It functioned as a wave breaker, so that other teams could spend time doing new development and paying off bigger chunks of the technical debt. So, two years later, how did it go? Well, last time I checked we had three open bugs. Quite manageable!

The March of the Musicians

In the book The March of the Musicians (Musikanternas uttåg, 1978), P.O. Enquist tells the story of the spread of the labour movement in the villages around Bureå, just south of my home town Skellefteå. Instead of being greeted as liberators, the socialist agitators met really hostile resistance. In other places the hostility came from the owners of the sawmills. In this area the hostility and violence surprisingly came from the workers themselves.

So what were the new ideas that the workers opposed? Basic things that we take for granted: eight-hour work days, fair and decent pay, and safe working conditions. Fear of change, fear of the impact of new ideas, and the conflict with hardcore Lutheranism and the Lutheran work ethic were the basis of the resistance. Today it feels unbelievable that the workers themselves could violently resist things that would obviously improve their lives.

Workers demanding eight hours of work, eight hours of freedom and eight hours of rest. Source

So what if resisting lean and agile is the same kind of mistake as the one the workers in Bureå made? Saying no to things that could improve lives. What if people, instead of being victims of the system, started to control their own destiny by improving the system that mistreated them? Working in empowered teams guided by clear boundaries instead of micromanagement from above. Making decisions in the teams, the place where we have the most accurate information. Having fun doing experiments in safe-to-fail environments. Taking the leap to become a creative worker instead of a knowledge worker.

An energizer epiphany

Good energizers can make all the difference in the world at for instance a liftoff or a retrospective. Building up the right amount of energy and creating a good atmosphere. I have been low on inspiration for a while and ended up doing other things as meeting starters. Not bad stuff, but still not the same amount of fun.
My daughter is turning 5 and we had a birthday party for her and her friends. Watching the kids play, I realised that organized games at children’s parties are pretty much the same thing as what we do as energizers. Googling “Childrens party games” has given me an endless stream of energizer ideas. Pure gold!

Failing towards excellence

“Think big, act small, fail fast; learn rapidly” is the slogan for Lean Software Development. In order to make that happen we need to create as many short feedback loops as possible. One way to do this is to implement a stop and fix process in your team. A properly designed stop and fix process will transform a failure into continuous improvement. Discovering improvement possibilities shouldn’t be limited to retrospectives or to when managers are yelling due to some crisis. Instead, let us build quality in: fixing errors where and when they happen, preventing them from happening again and not propagating them further down the line. It is also important that we lower the threshold for when to stop and fix. If we only stop the line when there is a severe crisis, we will not gain the benefits of this relentless strategy for improving quality and process.

And the alternative? Well, if we don’t fix problems right away they will pile up, and sooner or later you will have a crisis on your hands. You will get an uneven workflow that will lead to even more defects. People will be overloaded, and overloaded people do not make the best decisions. People who don’t make the best decisions create defects that… I have seen this first hand and it is not pretty.

What are the prerequisites for doing stop and fix in a successful way?

  • Work in small batches. Having a lot of work in process makes it hard to find anomalies. Defects hide in piles of work. In order to stop and fix we need to be able to spot the failures. That’s one of the reasons that limiting work in process is so important and rewarding.
  • An established process for the day-to-day work is needed. It is hard to stop and fix in a controlled way if the normal situation is chaos or laissez-faire.
  • Define a process for when to stop the line. It should never be a discussion whether we have met the criteria. Create a checklist; if all the boxes are ticked you can pull the cord and stop the line.
  • A defined process for what to do when the line is stopped.
  • A culture where it is OK to make mistakes. We must make clear that failure is OK as long as we are learning from the failures. Making the same mistake twice, though, is kind of unnecessary and stupid.
  • There will probably be people who ask you to skip this process just to get some short-term gain, just this time… People who are more focused on output than outcome don’t value uncompromising quality work. You must have enough discipline to resist that. It is better to deliver fewer things with good quality than a lot of things with bad quality.
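
The checklist idea from the list above can be expressed as a tiny Python sketch. The three criteria are made up for the example; every team has to define its own, and the function only shows the rule that the cord is pulled when all boxes are ticked.

```python
# Hypothetical stop-the-line criteria; each team has to define its own.
STOP_THE_LINE_CHECKLIST = (
    "fault is reproducible",
    "fault affects customers",
    "fault was introduced by our own process",
)


def should_stop_the_line(ticked):
    """Return True only when every box on the checklist is ticked."""
    return all(item in ticked for item in STOP_THE_LINE_CHECKLIST)


observed = {"fault is reproducible", "fault affects customers"}
print(should_stop_the_line(observed))
print(should_stop_the_line(set(STOP_THE_LINE_CHECKLIST)))
```

Because the criteria are written down in advance, the answer is a lookup and not a negotiation, which is what keeps the decision out of the hands of whoever is most stressed at the moment.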

So what should we do when we have entered stop and fix mode?

First of all, everything does not have to stop. Toyota (the creators of TPS, aka Lean) does not stop the whole factory every time something goes wrong. Instead, involve the people needed to handle the situation. It is called stop and fix, not stop and repair or even stop and patch, so there are some mandatory steps to create the desired outcome.

  1. Fix the fault. We shall fix the fault and deliver the solution as fast as possible. If possible, also implement poka-yokes like unit tests or automated GUI tests to prevent the fault from reappearing undetected.
  2. Fix the faulty process that made it possible to make this mistake. We who think that Lean is a great way of running a business and product development believe that the right process will produce the right result. We also believe that we shouldn’t blame mistakes on individuals, since mistakes happen due to faulty processes that allow people to fail. That is why we need to find the root cause of the issue. What on the surface looks like a technical problem will, after some digging, be revealed as a system problem. My favorite way of doing root cause analysis is to conduct a session of 5-whys. Eric Ries has written a good description of how to use this method. (http://www.startuplessonslearned.com/2009/07/how-to-conduct-five-whys-root-cause.html ) Just make sure that everybody involved is present while doing the root cause analysis. Otherwise it might easily become a blame fest (Blamestorming?). As we implement the process improvements, we get a little better step by step and come a bit closer to excellence.
  3. Share the failures and the learnings. Wouldn’t it be great if we worked in an environment where we talked openly about mistakes and what we learned from them? An open climate like this could save us all from repeating failures. Write a post on the blog, create a wall of fail or talk about it at fika.
