Kanban experiments from the trenches
This article is gonna be too long again, so no introduction, right on… oh yeah, hi!
Process
Our team is made up of 3+ software engineers, 1 requirements analyst (or PO [Proxy]), 1/2 of a team leader, in Paris, and 2 QA engineers in India… I know, please don’t insist. Until this project, all agile projects in our company were managed with a purely iterative, Scrum-like approach. With 5 people in Paris and 2 in India, we couldn’t apply these principles anymore. It was the perfect reason for me to try lean formally. I was already very interested in lean; all I needed was a good excuse. And it allowed me to confirm a few problems I had with agile.
Once upon a time, we mapped our value stream, i.e. our development process, onto a Kanban board. As we are a distributed team, we use an electronic tool. Our company already uses VersionOne, which more or less supports kanban, so VersionOne is our reference kanban board.
Steps
The steps of our stream are:
- (None): The prioritized backlog.
- Study: The goal of this step is to make sure everyone agrees on what needs to be done. We study and estimate the story, split it if it is too big, split it into tasks, and define tests. I’ll come back to estimates later, as they are not supposed to be part of kanban. We defined this step to make sure everyone is involved in defining the story and agreeing on it.
- Ready for dev: A queue.
- Dev: We do the thing right, and we verify tests will pass.
- Ready for test: Another queue.
- Test: We verify the right thing was done. Stories can be blocked here, go back to Ready for dev, or we can open tickets (issues, defects, whatever). I won’t get into detail here, because we are still experimenting. But I have a point of view, and we can talk further about this step and issues if you’re interested.
- Validation: Strange step, but an interesting one. Everyone loves Reviews, i.e. demonstrations of what we did. Therefore, we formally kept this in the stream, by adding a step that is a mix of a queue (items waiting for review) and something that we do (the review). But since the review is a near-instant event compared to the cycle time, treating this step as a queue is OK.
- Accepted: Done done.
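To make the stream concrete, here is a minimal sketch of the steps as an ordered structure. The names `Step`, `QUEUES` and `next_step` are mine, for illustration only; they have nothing to do with VersionOne. Counting the backlog and Validation as queues is my reading of the list above.

```python
from enum import Enum


class Step(Enum):
    """Steps of the stream, in board order (left to right)."""
    BACKLOG = "(None)"
    STUDY = "Study"
    READY_FOR_DEV = "Ready for dev"
    DEV = "Dev"
    READY_FOR_TEST = "Ready for test"
    TEST = "Test"
    VALIDATION = "Validation"
    ACCEPTED = "Accepted"


STREAM = list(Step)  # Enum iteration preserves definition order

# Assumption: steps where stories wait rather than being worked on.
QUEUES = {Step.BACKLOG, Step.READY_FOR_DEV, Step.READY_FOR_TEST, Step.VALIDATION}


def next_step(step: Step) -> Step:
    """Return the step that follows `step` on the board."""
    position = STREAM.index(step)
    if position == len(STREAM) - 1:
        raise ValueError("Accepted is the end of the stream")
    return STREAM[position + 1]


if __name__ == "__main__":
    print(next_step(Step.DEV).value)  # the queue that follows Dev
```

Keeping the queues as explicit steps is what lets the board show where stories wait rather than where they are worked on.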
Definitions of Done
Study done, or ready for dev
- The story is estimated, and is no more than 8 story points.
- Acceptance criteria are defined, understood, and agreed upon by everyone.
- Tasks are defined and estimated.
- A coarse-grained test plan is defined.
- Everybody agrees to switch it to ready for dev.
Dev done, or ready for tests
- All tasks are finished.
- The CI is green.
- Developers verified that the test plan should pass and that the development shouldn’t cause regressions.
- Refactoring to be done is done.
- “Enough” unit tests are written
- Developers agree that the code is ok.
Tests done, or ready for validation
- The test plan passed and is green.
- Exploratory tests were done.
- “Enough” tests were automated.
- Everybody agrees that the story is ok for validation.
- If an issue is found, but is quick to fix, the story remains in tests until a fix is provided. If only non-blocking issues are found, defects are opened and prioritized, and the story is accepted. Otherwise (i.e. blocking issues that can’t be fixed quickly were found), the story goes back to ready for dev.
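The last rule of the test step can be written down as a small decision function. This is only a sketch: the `Issue` type and the outcome strings are hypothetical, and the precedence between the three cases (blocking first, then quick fixes, then non-blocking defects) is my assumption about how they combine when several issues are found at once.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Issue:
    blocking: bool
    quick_to_fix: bool


def triage(issues: List[Issue]) -> str:
    """Decide what happens to a story in the Test step, given the issues found."""
    # Blocking issues that can't be fixed quickly send the story back.
    if any(i.blocking and not i.quick_to_fix for i in issues):
        return "back to Ready for dev"
    # Anything quick to fix is fixed while the story stays in Test.
    if any(i.quick_to_fix for i in issues):
        return "stay in Test until fixed"
    # Only non-blocking issues remain: open defects and let the story through.
    if issues:
        return "open defects, story passes Test"
    return "story passes Test"


if __name__ == "__main__":
    print(triage([Issue(blocking=False, quick_to_fix=False)]))
```

Writing the rule this way forces the question the prose leaves open: what to do when a quick fix and a non-blocking defect show up on the same story.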
Validation done, or ready for business
- The story was demonstrated during a Review (same as in Scrum).
- Everybody attending the presentation agrees. If people disagree, then we discuss in order to see if the story should be put back somewhere in the stream, or if another backlog item should be created and prioritized.
WIP limits
Explanations and comments
We focused on trying to make sure everyone is on the same page: QA is always involved, we foster communication at every step, we try to make sure we all agree at every step… And that’s what we actually do. We use phone, IM, and email a lot. Every day, we run a daily stand-up meeting remotely, on the phone. We spend as much time as necessary to ask or answer questions. It is very costly, but it’s the price to pay.
What we want to avoid at all costs is stories going back up the stream, e.g. from test to dev. But we only avoid it when it makes sense. If a story actually has issues that we consider blocking, we do put it back to dev. We want WIP limits to mean something. Therefore, when engineers actually work on an item, we want one of their slots to be actually occupied. And we don’t cheat on what a blocking issue is. If an acceptance criterion is not met, or if we just don’t agree, then by default the story is not accepted, and it goes back to dev.
We kept many things from iterative, like story points and tasks. There are several reasons for this choice:
- We can still estimate and plan to some extent, based on our previous experience and metrics. We had a velocity and so on.
- We can easily fall back to past approaches if kanban doesn’t fit, or if conditions change.
- It’s not very expensive to maintain.
- It might still be relevant (see below).
I hope it’s just gonna be transient. We should be able to get a lot of metrics from our kanban board very soon. In particular, I’d like to see lead times, throughputs, times within steps, variabilities, and roundtrips, broken down by story type, size, and so on… I hope to be able to see which figures influence cycle times and throughputs, and get rid of the others (hoping dropped figures don’t become meaningful when we get rid of some bottlenecks). From this analysis, I hope to confirm, for example, that estimates don’t really matter for the cycle time or the throughput, given that story sizes are within the same order of magnitude (i.e., in our project, from 1 to 8, which is already huge from my point of view).
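As a rough idea of what those first figures could look like, here is a minimal sketch that computes cycle times and throughput from the dates at which stories entered Study and reached Accepted. The story names and dates are made up, and the function names are hypothetical; nothing here is an actual VersionOne export.

```python
from datetime import date
from statistics import mean

# Hypothetical data: story -> (date it entered Study, date it was Accepted)
stories = {
    "S1": (date(2024, 1, 2), date(2024, 1, 12)),
    "S2": (date(2024, 1, 4), date(2024, 1, 18)),
    "S3": (date(2024, 1, 10), date(2024, 1, 22)),
}


def cycle_times(stories):
    """Cycle time in calendar days for each story."""
    return {name: (done - start).days for name, (start, done) in stories.items()}


def throughput_per_week(stories, period_start, period_end):
    """Stories accepted per week over the period (both bounds inclusive)."""
    accepted = sum(1 for _, done in stories.values() if period_start <= done <= period_end)
    weeks = ((period_end - period_start).days + 1) / 7
    return accepted / weeks


if __name__ == "__main__":
    times = cycle_times(stories)
    print(times)
    print(mean(times.values()))
    print(throughput_per_week(stories, date(2024, 1, 2), date(2024, 1, 22)))
```

Times within steps and roundtrips would need the full transition history of each story, not just its two end dates.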
What needs to be improved
First, we need figures: cycle time and throughput. This is a priority if we want to be able to estimate. The goal of estimating is to be able to say: when is this item supposed to be done? “This item”, of course, can be anywhere in the backlog. The estimate should be something like:
estimate = (backlog size / throughput) + cycle time
where backlog size is the quantity of backlog that needs to enter the stream before “this item”. So we desperately need info about cycle time and throughput.
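Here is the formula as a tiny function, with made-up numbers. The only assumptions are about units: backlog size and throughput must use the same unit (stories here), and throughput and cycle time the same time unit (days here). The function name is hypothetical.

```python
def estimate_days(backlog_ahead: float, throughput_per_day: float, cycle_time_days: float) -> float:
    """estimate = (backlog size / throughput) + cycle time"""
    if throughput_per_day <= 0:
        raise ValueError("throughput must be positive")
    # Time for the items ahead to enter the stream, plus time for this item to cross it.
    return backlog_ahead / throughput_per_day + cycle_time_days


if __name__ == "__main__":
    # 6 stories ahead, 0.5 story accepted per day, 12 days of cycle time
    print(estimate_days(6, 0.5, 12))  # 24.0
```

With these numbers, an item sitting behind 6 stories would be done in about 24 days: 12 days of waiting to enter the stream, then 12 days to cross it.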
It’s no surprise, but organizing remote retrospectives is complicated, not to say impossible. For now, we introspect informally, by talking. We insist that everyone should propose ideas, and we listen carefully to every point of view. Regarding continuous improvement, that’s about it. But we know that we need to formally provoke introspection. We are counting on trips, in either direction, to trigger this, but that hasn’t happened yet. Finally, we might use A3 as well when the time comes.
For the moment, we have issues feeding queues, because study takes a lot of time. We already made some trade-offs in our process, but we still need to improve this. Once again, it’s not a surprise: making sure we understand each other from different hemispheres takes a lot of time (touch AND lead time).
Test automation was not a deep part of our culture, especially in pure QA teams like the one we have in India. Flex automation is a pain in the ear (censored). And, of course, short-term business is the short-term priority. In conclusion, functional test automation is still to be done.
Improve, again and again. We must always have this in mind. Our process must never be stable, because there’s always room for improvement, and conditions always change. The best process is the one that improves continually.
What we learned
Lean confirmed 3 problems I had with iterations, because it solved them. All of these issues are related to the fact that everything must always have the same pace in iterations.
For example, if we “pre-plan” (or present the stories to the team, or whatever) 3 days or a week before the actual iteration planning, we have no more than that time to think about all the stories we’ll take into the iteration. But on big or only partly clear projects, questions will arise. If stories are not clear by planning time, they can’t be taken into the iteration, or they must be replaced by a spike. The causes are many: technical issues, functional or technical dependencies between stories or teams, functional questions, the customer can’t be reached in time, the PO is not available enough at that moment, QA or developers think of cases the PO had never thought about, etc. If everything were clear the first time, we wouldn’t need to include everyone. So having a fixed schedule for understanding a story has been a problem on every iterative project I have worked on.
Another example? Stories need to be planned for the whole iteration. But that’s sometimes difficult, because a 2-week iteration, for example, doesn’t fit all cases. For some sets of stories it’s not enough, for some it’s too much. The first case is only sub-optimal; it’s not necessarily a problem. The second can be a real problem. Let me explain. In an iteration, I have several stories related to a common theme, because we want our reviews to be as epic as possible. We plan and estimate them, and the iteration starts. Generally, the first story contains some tasks that prepare the ground for all the stories, because they relate to the same part of the code, and a bit of architecture and tooling is needed. While we do this, we discover a lot of things. Sometimes good ideas, often difficulties. They can be technical or functional, by the way. So once we’ve done this first story, we often realize we should re-plan all the other stories. But we never do, and the velocity never gets predictable enough to be trustworthy.
Kanban solves this by getting rid of sets of stories to prepare. We prepare stories just in time, so that we can wait for what we discover in the first story before studying the others.
Another problem is that some necessary steps of the iteration are not formally defined in iterative approaches, because they don’t belong to the iteration. I’m specifically referring to pre-planning, but on big projects there might be other steps, like integration testing. We can add specific ceremonies for these steps, but I always had issues with them. The team is focused on its iteration, and it’s too often difficult to interrupt it.
In my past projects, the pre-planning was too often postponed, if it was run at all. And the team could not think about the stories of the upcoming iteration anyway, because they had to finish the current one. As a result, planning was always improvisation, and estimates were crap.
When I worked on projects where integration testing was needed, we had a dedicated team for this, which worked in iterations with a 2- or 3-day offset compared to the other teams. And when issues were found during integration testing, it was too late for the dev teams, because their iteration was over and the stories already accepted.
With kanban, as you’ve guessed, we solved our problems by adding the study step. This step is fully part of the story development, so it actually always gets done. And it takes the time it must take. If we had multiple teams with integration testing, we could use a multi-tier board that includes integration testing. It doesn’t change the problem that much, but at least you see it explicitly.
The other solution brought by kanban is removing iterations. While I agree iterations can be great for motivating the team, I’ve personally never seen their benefit so far. Committing to a scope for the iteration is too often a fantasy, and we quickly know that our commitment is not a real one. We discover too many things during the iteration, which is great, but which changes the rules of the game too soon. And a pace driven by iterations is just as regular as following a stream, so it doesn’t really wake us up once 2 or 3 iterations are done.
Conclusion
So far, everybody is happy about kanban. We mapped the reality of our work. The team has great visibility. We’re gonna get great metrics for management very soon. I’m sure we’ll be able to make release plans and timely estimates. Everybody clearly knows what they need to do just by following the board.
As you can see, I’m not preaching for any church: we haven’t applied kanban by the book, we are recording many more figures than we should, and we will check which ones need to be kept. Hopefully, I will reach the same conclusions as David J. Anderson. If there are differences, I will be able to explain objectively why we changed things. I’m really excited about it!
Many things need to be improved at the moment, but I’m sure we are using the right approach for our project.
And to conclude this conclusion, I realize this article is really from the trenches: muddy, too long, deadly. If you reached this point, you can be proud, soldier. We are fighting for a good cause, my friend.
Who wants to criticize this, suggest improvements, or just discuss it? I’m excited to hear your point of view.