The great pyramid

This article is a translation of the original French post on Arolla’s blog.

Pluto Krath was approaching his office, almost running in anger. He didn’t mind the indistinct voice.

— K.I.T.T., remind me to buy some milk when we’re back at F.L.A.G.

He entered the room, put his laptop on his desk:

— What don’t they get about the test pyramid? Even a kid could get it.

He turned around and froze, his eyes screaming, his mouth professionally shut.

— What the frog are you doing here, David?

Colonel Shepard was again missing from his Sharknado The 4th Awakens poster. David Hasselhoff was standing near the window. He raised his watch to his mouth.

— Forget about the milk, K.I.T.T. This is gonna take a long time.
— Got it, Michael.
— Now you’re the Knight Rider, fabulous. Nonsense down to your fingertips.
— So, the test pyramid, huh?

Pluto sat on his chair.

— Fine, let’s pretend. Yeah, the devs don’t get it. Our automatic test campaign lasts forever and unit tests don’t catch any bugs. I’m trying to explain to them how to do unit, service, and UI tests, and all they tell me is “it won’t work here”. How many times have I heard that?

David had a look out of the window.

— The test pyramid, that was a great metaphor. Many unit tests, few end-to-end tests. You’re right, everyone understands it. You must be doing something wrong. How did you help them get it?
— I told them that in order to have a quicker campaign, they should turn more integration tests into unit tests. That would even allow them to test more edge cases, thus making their testing more robust.
— Were integration tests part of the testing pyramid?
— No, the pyramid only talked about unit, service, and UI tests. But you see my point. You said it yourself: many unit tests, few end-to-end tests.

Copyright Mountain Goat Software and Mike Cohn

David winked mischievously at Pluto.

— Oops, I’m using the wrong terms. Probably because I never really understood the pyramid myself. And you didn’t get that I didn’t get it. So I’m wondering what you got that they didn’t get. Words are important when you explain something. Dev is all about detail.
— Oh come on. Everybody knows what a unit test is. You test a method or a function, so that it doesn’t do any I/O. It makes your test quick, and you can run tens of thousands of them in seconds.
— And service tests? Integration tests? System tests? End-to-end tests? UI tests? I can point you to several pyramid alternatives that differentiate those. Because the pyramid needed amendments for different contexts. Still, none of these pyramids has reached any consensus. I can also guarantee you that unit test advocates don’t agree on what it means, for that matter.
— Sure, you can split hairs. Still, the test pyramid has never failed me. In all the teams I worked with, we separated 2 or 3 categories of tests, and we always had a solid test campaign. I’m trying to help the team with my experience, here.
— Here are a few other common features of your past products. They were all small textbook monoliths, with very few dependencies on other teams. One was a textbook Spring/React shopping cart, the other one was a textbook nightly file aggregation batch. And last but not least, none was the product you’re working on now. Context is everything. How is your current product different?

Pluto leaned forward. He suddenly got enthusiastic.

— It’s a set of 17 micro-services, developed in several languages, sometimes by people outside of the team. Its features sometimes depend on services provided by other teams. It’s distributed, running non-stop, and services are deployed independently.
— And the micro-services are more or less close to the OS resources, more or less critical from a performance point of view, developed at various times, have various levels of maturity, depend more or less on directory trees or scheduling of shell scripts. They sometimes get split, their contracts evolve, as the team learns about what it must and can do.
— What’s the link with a unit or a service test?
— Well, what do you call the service to test? The user service at the product boundary, or the micro-service HTTP interface? What do you call a method or function to unit test in C? In Haskell? In shell? How do you talk with another team to test the service globally?
— Again, that’s nitpicking. We’ll just make our definitions of Mike Cohn’s words explicit, and we’ll be fine.
— First congratulations. You just said you’d make definitions explicit, and that’s a move worth making. People arrive with their culture, with different definitions of the same words. If it’s as simple as you think, it won’t take much time. Second, so you want to keep the original three words.
— Yes the magic number.
— Remember your Spring project, where you bundled everything that was neither unit nor end-to-end as integration tests. You could feel the opportunities you lost by trying to shoehorn each of those tests into a common framework. That felt heavy, didn’t it?

Pluto looked back through his memories.

— Oh yes it did. There were quite some fights about it. So let’s just name a few categories in our context, and use them everywhere.
— Exactly! And don’t forget to keep this list alive. Now you need to find what the context is. Do you have a single context? What about the department? Does technology matter? Domain? Culture?
— Do you mean we should have different sets of categories for different services? Or sets of services? Or that we should share some vocabulary with other teams?
— All, none, I don’t know. Talk about it. Keep whatever makes sense. Make things explicit and relevant.
— So the only remaining thing about the pyramid is the triangle.

David sat on the sofa and stared at Pluto.

— …
— What? Tests are either quick and reliable or they represent reality, and are fatter, more flaky. Aren’t they?
— Hmm, that’s interesting.
— Are you doing the shrimp?
— Am I doing the shrimp?
(sigh)
— You’re making your point by yourself. Tests have several desired properties, and you want to maximize all of them. Among those properties, you can find:

  • Speed, to run loads of tests.
  • Expressiveness, in code and in diagnostics. When some test fails, you want to know why.
  • Isolation, of runtime environment and of tested features.
  • Reliability, i.e. not flakiness. Tests that embed much stuff are the most flaky.
  • Representativeness. For example, is your test representative of real life if it does not include scheduled scripts that move input files to your input folder? Is the production environment representative if you use a test flag for the incoming request?
  • Coverage. Imagine the combinatorial explosion of edge cases you need to test.
  • And many more. Don’t ask me for a mnemonic acronym.

You can see that most of them conflict with each other. For each context, you’ll find several toolboxes with a fair compromise among these properties. In some contexts, they’ll be tightly ordered as a pyramid. In some others, you’ll be able to maximize most of the properties in a single category. The properties don’t balance each other on a linear gradient. It depends.
— It’s more a set of multi-dimensional bubbles than a pyramid.
— Coughnerdcough.
— But “Testing shows the presence, not the absence of bugs”, doesn’t it?
— I never said the opposite. I talked about fair compromises. You’ll identify the right bubbles with your sensibility. It’s about trust and confidence, it’s very subjective. Oh, and of course, we’re only talking about automatic stuff, not about actual testing.
— Finally, nothing is left from the pyramid. The Knight Rider beats Mike Cohn.
— Not at all, are you crazy! The metaphor is great, I told you. When I discovered it, it was an epiphany. We’re talking about your precise context, not a generic frame of mind. The pyramid is just a model, so it’s wrong, that’s all. This one is too simplistic to survive contact with the field, but it’s still very useful as a basis for thinking.

Pluto leaned back in his chair, rubbing his nose, and stretched his legs.

— Real life is more nuanced than the illustration, I get it. You did it again, I’ll have to get back to the team.

When Pluto opened his eyes, David Hasselhoff was back in the poster.

— I really need to sleep more.

Not a decision

This article is a translation of the original French post on Arolla’s blog.

Risky Business Insurance, IT silo, Friday 9pm, meeting room Bob Ross

— Pluto Krath: We won’t come back to this, we already decided we wouldn’t use a database.
— Nadia: We had no idea we would need to navigate data that doesn’t fit in memory. If we don’t have a database, we will need to re-implement everything we need to do that from the disk: indexing, relational queries, execution plans, serialization… If we hadn’t told you we would use a database, you wouldn’t even have noticed.
— Pluto Krath: No, we decided we would only use the memory and the disk, and that’s what we’re gonna do. Implement everything you need. And don’t forget I need the minutes for this meeting by Monday morning.

Pluto Krath’s office, 9:45pm

In the last illuminated office, Pluto was sending the most urgent emails before the weekend. He suddenly realized David Hasselhoff was sitting in front of him. He looked at his Sharknado The 4th Awakens poster, missing one character, then back at David.

— Shut your mouth, blink.
— But how…
— Don’t change the subject. So you just brushed Nadia off. Are you proud, at least?
— It’s not that simple. We needed to take a decision, and that’s what I did.
— Oh the dirty word. A decision. Don’t use vocabulary for grown-ups. What options did you have?
— What do you mean? Oh, well, use a database or re-implement everything from scratch.
— That’s it? Wasn’t it you, plastering and quoting Virginia Satir’s rule of three: to have one choice is no choice, to have two choices is a dilemma, and to have three choices offers new possibilities. Where’s the third choice? I can easily think of a dozen.
— OK, I can find more. But what for? We can’t use anything we wouldn’t build ourselves.
— You’re right, let’s tackle your assumptions. What makes you think you should build everything by yourself?
— Because we can’t afford a license for an Oracle database. If we go there, I’ll have to deal with Mickey and his team of DBAs, and he really wants me out of his career.
— Well that was something. Unverified assumptions, not even shared with your team. Starting with your first assumption, once again: you only have two options. Aren’t there free databases? Is the relational model the best option for your context? Can’t you find indexing libraries that could prevent you from reinventing some square wheel? Would these solutions force you to deal with Mickey?
— I got it, there were other options, and we didn’t explore all of them. Whatever. We don’t have the time, the decision is taken, we’re moving forward.
— Oh yeah, I almost forgot the master of all options: waiting. Why take that decision now?
— That’s the way it works! We move forward with decisions, action plans, concrete stuff. How do you want to build a house without foundations?
— And you call these foundations. You’re doing meetings! You haven’t invested or implemented anything. You have nothing to modify. That’s what thinking is good for: as long as it’s ideas, it doesn’t cost a cent to change your mind. “Foundations”… You work at a company managing risk. Can’t you find someone who can explain the value of an option to you? There is nothing more valuable for your product right now. You should cherish your options. Not kill them without reason.
— So you want our specs to be bundles of dead ends. How do you want us to synchronize with the rest of the company if you don’t freeze anything?
— I don’t want anything. Let me remind you I’m only in your mind. You might even be sleeping. You’d better spend time with your family. Instead, you’re wasting your youth arguing about specs. Another way of saying you want to freeze reality inside your fantasies. For now you only need to generate options and explore them, and you’re doing the exact opposite. You can be sure the best solution is not among the ones you see today. And it’s definitely not to cut other options today.
— Should we code disposable prototypes? Throw code away?
— Here you are, you’re considering options. You’re even thinking about ways to limit the cost and blast radius of experiments. We’re getting somewhere. Of course it’s time to cope with real constraints. Have you ever seen a product developed in meeting rooms? When I see you try to look like pros in stock photos, you look like your kid playing at tea parties.
— But how will we know which solution to choose?
— And here is the light. Congratulations Pluto! You’re wondering what all this is good for. You’re on the right track. Some paths will be dead ends, others will spawn new ideas or new problems to solve. However, the solution you will adopt will probably be none of the ones you’re considering today. It will be a mix of what you know now and the knowledge you’ll gather along the way. You already made the main progress you needed, by wondering why. Once you understand the problem, you’ll be able to consider conditions of success or failure for your experiments. Did you feel the click? You might have aches between your ears tomorrow morning. It happens when you use new muscles.
— Fine, we’ll talk about that again.

Pluto was so impatient to send his mail to Nadia, he didn’t even notice his poster had got all its characters back, and his neighbor was gone.

Random comments

Let’s show empathy. If people like Pluto fantasize about Capital Decisions, it’s probably because they’re not equipped for uncertainty. We never learned models that help us be comfortable with uncertainty. We always had to know. And a good way to predict the future is to force it, even in arbitrary ways. Thus the need for decisions. Decisions allow us to take further decisions. Therefore, the best way to get out of that spiral would be to help everyone understand slightly less deterministic models, like Cynefin.

Many teams suggest maintaining a decision log. Because of a bad case of laziness, I never had the occasion to practice this, but I think it can really help organizations, by offering an objective knowledge base to work around cognitive biases. You need to find a common format for your decisions, for example by stating the problem to solve, options, assumptions, reasons to choose, expected results, anticipated risks, etc. A decision log allows evaluating past decisions, and improving decision-making. The internet is full of templates and articles. Go on and discover the topic, like I would discover it if I started implementing it.
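As a minimal sketch of such a common format, here is one possible shape for a log entry. The class name, the field names, and the helper method are assumptions made for illustration, not a standard template; the example entry reuses Pluto's storage decision from the dialogue above, with a placeholder date.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DecisionRecord:
    """One entry of a decision log (hypothetical field names)."""
    title: str
    decided_on: date
    problem: str
    options: list[str]
    assumptions: list[str]
    chosen_option: str
    reasons: list[str]
    expected_results: list[str]
    anticipated_risks: list[str] = field(default_factory=list)

    def is_a_real_choice(self) -> bool:
        # Satir's rule of three: one option is no choice, two are a dilemma.
        return len(set(self.options)) >= 3


storage = DecisionRecord(
    title="Handling data that does not fit in memory",
    decided_on=date(2024, 1, 5),  # placeholder date
    problem="We need to navigate data that does not fit in memory",
    options=["use a database", "re-implement everything from scratch"],
    assumptions=["we cannot afford an Oracle license"],
    chosen_option="re-implement everything from scratch",
    reasons=["we already decided not to use a database"],
    expected_results=["no dependency on the DBA team"],
)

print(storage.is_a_real_choice())  # False: only two options were listed
```

Writing the assumptions next to the options is what makes them reviewable later, which is the whole point of the log.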

Judging a decision is independent of its results. You can take a great decision that leads to disastrous results, or a stupid one that leads to great results. Every decision is made in an evolving context.

There are several tactics for taking decisions, like:

  • HiPPO (Highest Paid Person’s Opinion), where the boss decides. You’d better have a boss who is smart all day long, and who has all the information.
  • Consensus, where decisions are not taken until everyone agrees. It requires more time and energy, and decisions tend to be more, well, consensual.
  • Consent, where you only need nobody to object to the decision. In this mode, you’d better limit the blast radius of your decisions.
  • Anarchy, where everyone does what they want. To keep some coherence, you need the system to be clear enough to align everyone.
  • These methods can be applied by representative minorities. Representatives should really be representative, and information must flow well and massively in all directions.
  • I almost forgot voting. Voting engages all participants, and frustrates losers. By merging reasons to vote against a decision, a pure vote hides those reasons. You can manipulate elections by choosing a voting method.

I prefer consent by default. Other people have other preferences. Anyway, just be aware that your preference, or your organization’s, is not the only option. And of course, it depends. To go deeper, have a look at David Marquet’s delegation scale, Management 3.0 by Jurgen Appelo, or holacracy.

Take a step back and look at your decisions, you won’t regret it. What decisions did you take recently? What decisions were taken for you? What were the options? Couldn’t they be postponed? Do that exercise, and you’ll discover possibilities you weren’t aware of.

My cheapest estimate

This article is a translation from the original French version available at Arolla’s blog.

Predictions are hard, especially about the future. Still, everybody wants some. In a coherent world, we would only need to predict the release of a very few features. However, as we are often forced into estimating everything, the quicker we do it, the better. Life is too short to waste your time on divination.

At a previous job, we could make predictions in an efficient and effective way. It took 2 or 3 hours to predict several months of releases for each team. And the results were pretty good, compared to the organizations I had seen before. In other organizations, predictions were only derived from cost estimates. In this organization, we relied as much as we could on what we had done in the past.

Context

Let’s start with a disclaimer: I don’t have the numbers anymore, so “more or less” will have to do. Now for the context.

We were 3 or 4 teams, releasing 2 or 3 products on the shelf, more or less related to each other.

Dependencies, between teams or internal to teams, were extremely limited. We’ll come back to that later.

Backlog items were small. By that I mean they were done in 2 to 3 days, tops. I usually need to work on that issue first. In that organisation, it was already anchored in people’s minds.

Every delivered item was attached to a bigger item. Even bugs (I still don’t see the difference between bugs and the rest, I only see things to do, gaps towards an ideal). Using Jira, we called big items epics. It took time for the teams to realize epics needed to have an end in order to be useful. But in this article, let’s consider epics were done in the release.

An epic was made up of 10 to 50 smaller items. We’ll call these smaller items tickets. Jira or not, I refuse to call them user stories: user stories are stories, about users, period.

At the end of a release, we could quickly know what the team had foreseen at the beginning of a release.

Recap. At the end of a release, we knew:

  • What epics the team had prophesied.
  • What epics were done.
  • What tickets were done in each epic.

From there, we could announce the content of the next release:

  • Estimate each epic to do.
  • Compute each team’s capacity, in tickets.
  • Compute the proportion of that capacity we can use for planned and unplanned items.
  • Thus, predict the release content.

Epics estimate

Considering we knew how many tickets past epics were made up of, we sorted them into big buckets. The following buckets were enough: 1, 3, 10, 30, 100. In my career, I quickly realized the Fibonacci sequence was too precise. For example, 1, 2 and 3 could be merged into a common 2. And when you predict the future, you don’t want to let people think you know what you’re doing. In the word estimate, there is estimate, never forget that.

Then, we took future epics, from highest to lowest priority, and put each one in the corresponding bucket, by comparing them to the done ones.

We avoided at all costs estimating the number of tickets it would take to finish the epic. That is the mistake you don’t want to make: predicting the future.

We agreed by consensus. A simple rule to settle conflicts: if we thought an epic didn’t fit a given bucket, it went to a bigger one.
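The bucketing of past epics can be sketched in a few lines. This is an illustration under one assumption the article does not state explicitly: a past epic goes into the smallest bucket that can hold its ticket count, in line with the "when in doubt, go bigger" rule. The function name is hypothetical.

```python
BUCKETS = (1, 3, 10, 30, 100)


def bucket_for(ticket_count: int) -> int:
    """Return the smallest bucket able to hold a done epic (rounding up)."""
    if ticket_count < 1:
        raise ValueError("an epic has at least one ticket")
    for bucket in BUCKETS:
        if ticket_count <= bucket:
            return bucket
    raise ValueError("bigger than the biggest bucket: split the epic")


# A done epic of 12 tickets lands in the 30 bucket, not the 10 one.
print(bucket_for(12))  # 30
```

Future epics are never counted this way: they are placed in a bucket by comparison with the done epics already sitting there.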

This exercise took approximately 1h, with the whole team.

Some people told me I should talk about t-shirt sizing, when they saw the 1/3/10/30/100 scale. A few comments about that:

  • T-shirt sizing doesn’t have an arithmetic. You can’t add t-shirt sizes.
  • That being said, it’s also its main quality. And choosing buckets for epics while ignoring how many tickets will compose them, is very close to t-shirt sizing. We briefly tried that, but participants were a lot faster with numeric cues. Culture, it depends on teams.
  • When I compare kids playing in the kindergarten and the seriousness of estimation meetings, I realize we could also use animals: fly, dog, elephant, diplodocus. The main advantage of this scale is that diplodocuses don’t exist, and we would always prefer that big backlog items didn’t exist.

Team capacity

That’s pretty simple. If you did 200 tickets in the last release, you can predict your team to do 200 in the next one.

You could scale values proportionally when release durations vary. But be wary of such computations: releases always have more or less exploratory periods, and they don’t spread linearly.

Applying linear computations for number of people, availability of people, or holidays, is even more dangerous. I prefer considering a team as a whole. Waiting times explain more of the durations than the team’s capacity to parallelize tasks correctly.

Actually, I think that the function (working time -> team capacity) is not computable or predictable. It is not increasing, not even continuous, and certainly not linear. Don’t evaluate team capacity from working time, it’s simpler.

This exercise took about 15 minutes for the scrum master and the PO.

The unexpected

By comparing what had been planned and what had been done, we got an unplanned rate. Between the start and the delivery of a release, we change our minds, reprioritize, discover functional holes, technical debt, and so on. Well, there is no reason for that to change. Therefore, we considered that unplanned rates would remain identical between releases.

We only considered planned and unplanned epics. We didn’t need to categorize tickets of an epic.

Our unplanned rates were approximately 65 to 75%. That is to say, 65 to 75% of what we did in a release had not been foreseen at the beginning of the release. That’s the way it is. Just take reality as it is, don’t try to distort it. Neither should you do hope-based planning, by firmly affirming you won’t change your mind next time.

Taking new information into account is good news. If you have little unplanned work, don’t take that as good news without digging into it. There is a great chance that someone is burying their head in the sand, or, like the three monkeys, is shutting every door, yelling lalalala, prostrate in the corner of a meeting room.

This exercise took approximately 1 hour for the PO, depending on the difficulty of archeological excavations.

Prediction

We had team capacity, and an unplanned rate. From these, we knew how many tickets we could plan. For example, if we had done 400 tickets in the previous release, including 100 planned tickets, then the unplanned rate was 75% and we could plan 100 tickets for the next release. Having estimated epics in number of tickets, we knew which epics we could announce for the next release.
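The arithmetic fits in a few lines. The sketch below assumes, as the article does, that capacity and unplanned rate stay the same from one release to the next. The function names, the epic names, and the rule of stopping at the first epic that no longer fits are illustrative assumptions.

```python
def plannable_tickets(done_last_release: int, planned_done_last_release: int) -> int:
    """Tickets we can announce, assuming stable capacity and unplanned rate."""
    capacity = done_last_release
    unplanned_rate = (done_last_release - planned_done_last_release) / done_last_release
    return round(capacity * (1 - unplanned_rate))


def announce(epics: list[tuple[str, int]], budget: int) -> list[str]:
    """Take epics in priority order until the next one exceeds the budget."""
    announced = []
    for name, bucket in epics:
        if bucket > budget:
            break  # assumption: we do not skip ahead to smaller, lower-priority epics
        announced.append(name)
        budget -= bucket
    return announced


budget = plannable_tickets(done_last_release=400, planned_done_last_release=100)
print(budget)  # 100

# Hypothetical epics, already bucketed, from highest to lowest priority.
backlog = [("search", 30), ("export", 30), ("billing", 30), ("audit", 30), ("sso", 10)]
print(announce(backlog, budget))  # ['search', 'export', 'billing']
```

Whether to stop at the first epic that does not fit, or to fill the remaining budget with smaller epics, is a product decision; the sketch only shows the strictest reading of the priority order.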

Turtles all the way

What we did for a release could be adapted to other levels of granularity. We used a variation of this method for 2-week iterations, by counting tickets instead of epics.

The only thing you need to know is, unplanned rate must be evaluated independently for each level of granularity. Knowing you have 75% unplanned work at the release level won’t help you evaluate uncertainty at the iteration level, and vice versa. It can be more, it can be less.

Explanation

Why did this method work?

First, we found the right level of granularity. Like all nested systems, you have one or more stable (i.e. predictable) levels. In our case, stable levels were epics for the release, and tickets for the iteration. We were lucky to have a system with stable levels, and to identify them the first time.

Then, as we already mentioned, dependencies were very few. Workload thus explained most of delivery times. This is rarely the case. In general, items spend most of their time in queues. In that case, estimating workload is useless. Measure lead times instead, and start from there.

Finally, the law of large numbers and having many small tickets per epic helped smooth out disparities. At the epic scale, and considering the error inherent in divination, we could consider all tickets as identical.
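A rough way to see the smoothing effect, under an idealized assumption the method itself never needed: suppose the durations X_1, ..., X_n of the n tickets of an epic are independent, with the same mean μ and the same standard deviation σ. Then:

```latex
\[
\mathbb{E}\Big[\sum_{i=1}^{n} X_i\Big] = n\mu,
\qquad
\operatorname{sd}\Big(\sum_{i=1}^{n} X_i\Big) = \sigma\sqrt{n},
\qquad
\frac{\operatorname{sd}\big(\sum_{i=1}^{n} X_i\big)}{\mathbb{E}\big[\sum_{i=1}^{n} X_i\big]}
  = \frac{\sigma}{\mu\sqrt{n}}
\]
```

The relative spread shrinks like 1/√n: for an epic of 25 tickets, it is a fifth of the relative spread of a single ticket. Real tickets are neither independent nor identically distributed, so take this as an intuition, not a guarantee.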

Conclusion

It’s up to you to find the stable elements of your system. This method was the result of several iterations. Stable elements will help you predict the future. By measuring these stable elements, and projecting them into the future, you have a higher chance of coming up with realistic forecasts. Avoid estimating what’s going to happen, at all costs. Too many biases will pollute your calculations.

I didn’t invent anything, it is more or less the #NoEstimate approach. I thought of proposing a few links to this approach here, but the internet already took care of that. It’s up to you now. Happy exploration.

Frugal conclusion

Just in case. These words, and all their cousins, predict disasters.

We add features, code, processes, just in case. I prefer adaptability to new requirements, rather than anticipating every detail.

We create laws that paralyze whole sectors, to protect ourselves from limited harm caused by one black sheep. I prefer detecting problems, rather than preventing their hypothetical causes.

We create rigid processes to make sure of what they create. It may be useful in Cynefin’s simple or complicated domains, not in the complex one. I prefer allowing good surprises, rather than avoiding bad ones.

We monitor proxy indicators because it’s easy. Sir Tony Hoare, the creator of null, who named it his billion-dollar mistake, added it because it was easy to do so. I prefer adding something because the need and the solution were validated, rather than because it’s easy.

What a relief when you remove some process, a 300-line method, or one third of the backlog. You can instantly feel your cognitive load decreasing, the weight of blood lightening in your brain. Afterwards, you can’t even remember why they were there in the first place. Let’s anticipate. Verify that what you add is useful. Remove what is not. Slim it down.


Frugal process

After backlog and code, let’s go on with the process. In this article, I’ll refer to the procedures, the methodology, the method, the practices, and not to the program doing stuff in a computer.

A process tells us what to do in a given context. It states pre-conditions for applicability, post-conditions of success, variations, attention points. A process always exists, be it implicit or explicit. A process has multiple nested levels of granularity, to produce, understand production, update the process, manage conflicts, communicate…

Company culture is everything that mandates or forbids, encourages or discourages, people’s reactions in a given context. We can thus consider a company culture as a process. Like behaviors, we can list it and modify it (more or less easily if you want the change to stick).

The process is huge. We don’t want to make it more complex by adding arbitrary clauses to it. We need to factorize it, make it as small as possible. If the process becomes too complex, we need some additional process to interpret it. We don’t need that.

A process can accelerate things by:

  • Guiding people, so that they don’t need to search for the right way to go.
  • Sparing people from wondering about the same things again and again.
  • Helping people do things right the first time.

A process can also ruin your life when:

  • It complicates more than necessary what needs to be done.
  • It prevents you from doing what needs to be done, unless you work around it.

An important quality of a process is its adaptability. Situations vary, and evolve. So a process must remain plastic. So we’re back to the same considerations as code: simple things are easier to adapt.

  • When the process is a 200-page document, nobody will get back to it to adapt it to the new context.
  • When the process requires 200 people to synchronize in order to update it, nobody will make the effort.
  • When the process is implicit, nobody can discuss it.

In addition, the process has to be more or less prescriptive, depending on the context. The Cynefin model, the most useful one I know, helps us understand this, and choose a strategy to handle a situation, depending on its nature:

  • In the obvious domain, where you can easily predict consequences from the context, define and follow a list.
  • In the complicated domain, you can predict consequences given some analysis. Ask experts to tell you what to do.
  • In the complex domain, the system has too many relations, too dynamic, to be predictable, and they change when you touch them. State hypotheses, and validate them through experimentation. This domain is the most frequent in software. In the complex domain, you need a more abstract process. This process will give you clues to design experiments, gather feedback, verify you didn’t overlook some perspective.
  • In the chaotic domain, it’s fire. Get out of there as quickly as you can.
  • To which we add the cliff, from obvious to chaotic, where you violently lose your illusions. You thought your situation was comfortable, and competition shows you a very different reality. Falling into chaos is hard, and you need to get out while you’re still on Kodak’s board.
  • And the disorder, where you don’t know where you stand. It could deserve a whole series of its own.

The process must take all these domains into account, in a relevant way. Follow IKEA instructions in the complex domain, and you will make too many useless errors. Experiment on methods to assemble a piece of IKEA furniture, and you’ll waste more than the traditional weekend of family chaos.

Note that lean proposes a very useful tool to handle this: the standard. The standard is the best way we know today to do something. It is the support for continuous improvement, because it documents our current knowledge, and constantly evolves from there. We get back to it as soon as we observe a gap with the objective (i.e. often), in order to understand what can be improved. It documents pre and post conditions, variations, attention points. Standards can apply to anything, provided their level of abstraction matches the task at hand.

In short

  • Take the time to understand Cynefin. My current level of understanding took me a few minutes of epiphany, and several years of deeper study. Every second was worth it.
  • Adapt your procedures to the context.
  • Make your procedures explicit.
  • Don’t add useless procedures.

Finish it

Frugal code

I feel the most like a good dev when I delete stuff. But attachment to code is the most painful challenge to overcome when I try to help my colleagues adopt an engineering culture, i.e. an experimentation culture. For reasons I don’t understand, devs cling to their code, each line of it, from the first minute of its existence (at least for reasons I don’t understand *anymore*, as I probably had the same attachment to code in a life I don’t remember). And the more code devs write, the more they are attached to it.

Quality

Organizations need to adapt to needs. Societies evolve. People grow up, their needs change. Our comprehension of these needs, and of our work environment, gets more relevant. We can not freeze the world during the months or years product development requires.

Our job is to modify code. We thus need code that is plastic enough to adapt to these changes. In a problem-solving activity like dev, this is what I call quality: the capacity to keep a satisfying pace over a desired period. To be agile, this capacity is not negotiable. As devs, it is our duty to always enforce this capacity, without asking anyone.

Design

We find code plasticity in patterns, modularity, high cohesion and low coupling, and so on. But you need to be aware that, each time you choose a pattern, you pick a compromise: you make one axis more rigid to loosen another one.

Take the DRY principle, for example (Don’t Repeat Yourself). It is universally accepted as a base principle. I hardly ever hear it discussed. Still, it contributes to creating dependencies, a source of evil. This is a compromise to understand. In addition, two pieces of code looking alike are not the same code. Think semantics before factoring code. Martin Fowler popularized the rule of three, suggesting you should write code at least three times before extracting the common parts. Neal Ford teaches us that “The more reusable something is, the less usable it is”. I recently discovered the acronym WET, Write Everything Twice. Let’s not remain stuck on DRY, and think about compromising.
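
To make “two pieces of code looking alike are not the same code” concrete, here is a minimal Python sketch. The function names and the threshold of 50 are hypothetical, invented for the illustration only.

```python
# Hypothetical example: two rules that look identical today
# but carry different meanings and have different owners.

def qualifies_for_free_shipping(order_total: float) -> bool:
    # Logistics rule: changes when carrier prices change.
    return order_total >= 50


def qualifies_for_loyalty_points(order_total: float) -> bool:
    # Marketing rule: changes when the loyalty program changes.
    return order_total >= 50


# A "DRY" refactoring would merge both into one helper.
# The duplication is gone, but the two rules are now coupled.
def is_large_order(order_total: float) -> bool:
    return order_total >= 50
```

The day marketing lowers its threshold, the shared helper has to be split again or parameterized. Keeping the two functions apart costs one repeated line and keeps each rule free to move on its own.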

Design patterns are also compromises to understand. They favor plasticity along one dimension while sacrificing another. For example (provocation on purpose), inheritance allows multiplying embodiments of a given class of behavior. But it makes modification of this class of behavior harder.

Even if you don’t see which axis you’re making less plastic, you need to be aware that a pattern is an indirection. Indirections get in the way of seeing detail, while making high-level understanding easier. With indirections, you scan principles better and details worse. This is a compromise, it has good and bad sides. You need to be aware of it.

Frugality

To make code more adaptable, my first approach is not to write it. The easiest code to modify is no code. I always try to add as little code as I need, to add indirections only if they bring value, and to delete code when it brings less value than complexity.

Note that we have tools to help us limit the quantity and complexity of code:

  • With TDD, you only write the code needed to pass one test, no more.
  • With DDD, you split the problem into bounded contexts. Each of these contexts is thus smaller. As you don’t put code in common between bounded contexts, you limit the number of dependencies, and the overall complexity.
  • With BDD, you assess the problem boundaries better. You avoid useless hypotheses.

What a delight to refactor no code. We call it greenfield, and it makes the eyes of every nerd shine. This is an extreme. It gives an idea of what you get when tending towards that ideal: serenity, happiness, smiles.

If you want to approach that ideal, limit the quantity of code to modify:

  • Don’t write code “just in case”.
  • Don’t add patterns before code complexity requires it.
  • Don’t anticipate too much code flexibility. Wait to know which axis needs freedom of movement.
  • To support all this, learn refactoring and emergent design.

And now let’s jump to process

Frugal production

Let’s think about this enterprise philosophy hit song:

  1. I want to satisfy the needs of users, buyers, managers, stakeholders.
  2. Therefore I want features.
  3. Therefore I want to optimize the production of features.
  4. In other words I want to maximize the production of features.

Imagine that big bank. It wants to solve problems for its advisors, or even its customers. With some help from specialists at every step, the bank formalized requirements in a backlog, organized RFIs and RFPs, selected a software package, started a project, set up constraints to make sure everything is delivered on time and budget, assembled and disbanded every corresponding team.

At every step, the bank relied on the result of previous investments, results of which were validated, and thus considered right. If everything done before is right, we only have one thing to measure when developing: the speed of production according to specifications. It may be user stories, man-days, lines of code, green tests in a campaign, whatever.

We maximize production of features because we don’t want to come back to what was already validated, and because we don’t know how to do all the steps at the same time. In other words, because it is easy. As a consequence, we measure the accumulation of things on top of accumulated things. This logic is the cause of many a failed project.

Jeff Patton teaches us that the goal is to optimize outcome (i.e. changing behavior) to improve impact (i.e. consequences for our organization), while minimizing output (i.e. production).

How to do that concretely? As said earlier, we maximize output because it is easy. Therefore solutions will be hard. This is good news: it is an occasion to take some advantage over your competitors.

Validation

We must move quietly. Take the time to verify that what we throw out the door is useful. Consolidate foundations before adding new floors.

As Jim Benson says, “Software being ‘Done’ is like lawn being ‘Mowed’”. Software gets interesting when it gets into users’ hands. It is finished when it is decommissioned. Make this official, by adding a validation/understanding/learning/feedback step at the end of the value stream.

A feature to release is not code to produce. It is an outcome hypothesis. John Cutler insists on talking about bets. Once the code is released, we need to evaluate its consequences, and decide where to go from there:

  • It’s perfect, we can stop iterating.
  • We should try modifying this or that.
  • We need more info.
  • Let’s deactivate or remove it.
  • And so on.

By the way, if you doubt the usefulness of features, as you should, then limit the number of experiments you run in parallel. An experiment takes time to reveal its secrets. This delay is actually an interesting topic to think about. Among other things, it helps to understand why our so-called experiments are not scientific. Cynefin rather talks about probes.

Pull system

Limit your WIP (Work In Progress, i.e. the number of things being worked on), in all steps of the stream, including studying/prioritizing. By doing so, you will avoid preparing items when the next step is not ready to take them. Your backlog will thus remain under a reasonable limit.

  • The first step of the stream is a prioritized list of problems to solve, ideas, wishes, unverified assumptions. You study these topics by priority, when you have the capacity to take them. That is to say, when it’s useful to think about them.
  • Then you can think further about them: “what for”, split, make the goal and constraints explicit, share understanding…
  • Then development if needed, test, deployment, and so on.
  • And then, validate the need, gather feedback to iterate.

So, step by step, you pull features from the last step: impacting the world.

Of course, it only works if items are small. How small? If you’re beginning, it’s never small enough. If you have more experience about flow, and you know why it’s too small, then improve your process to decrease the transaction cost, and then make items smaller.
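
The pull mechanism described in this section can be sketched in a few lines of Python. This is a toy model under my own assumptions: the step names, the WIP limits and the `Step` and `pull` names are all hypothetical, and a real stream has many more subtleties.

```python
from collections import deque


class Step:
    """One step of the value stream, with a WIP limit (hypothetical model)."""

    def __init__(self, name, wip_limit):
        self.name = name
        self.wip_limit = wip_limit
        self.items = deque()

    def has_capacity(self):
        return len(self.items) < self.wip_limit


def pull(stream):
    """Walk the stream from the last step back to the first.

    Each step with spare capacity pulls the oldest item from the step
    just before it. Nothing is ever pushed into a full step.
    """
    moved = []
    for i in range(len(stream) - 1, 0, -1):
        downstream, upstream = stream[i], stream[i - 1]
        if downstream.has_capacity() and upstream.items:
            item = upstream.items.popleft()
            downstream.items.append(item)
            moved.append((item, upstream.name, downstream.name))
    return moved


stream = [Step("ideas", 5), Step("study", 2), Step("develop", 1), Step("validate", 1)]
stream[0].items.extend(["A", "B", "C"])

for _ in range(3):
    pull(stream)
```

After these three rounds, every downstream step holds exactly one item, and a fourth call to `pull` moves nothing: the stream stays blocked until validation of the first item is finished. That blockage is the signal to go and finish something rather than start something new.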

Wrapping up

Let’s fix the logic described above. Instead of:

  • Users have needs.
  • So we must produce features.
  • So we must optimize feature production.
  • I.e. we must maximize feature production.

Let’s try:

  • Users have needs.
  • We might satisfy those needs with features.
  • So we must optimize feature production and impact.
  • I.e. we must minimize feature production while maximizing impact.

When you use a product, you are delighted when you can see right away the feature you need. It’s a nice surprise to use that feature fluidly, the way you expected. It’s a change, compared to those products crawling under menus of sub-menus to propose all potential options, endless forms to support every possibility, just in case.

Propose the product you love using:

  • Verify the usefulness of features.
  • Do fewer things in parallel, and finish them.
  • Focus on users.

It’s never too late to do things properly. It’s always time to validate the hypotheses induced by upfront investments, however huge they are.

Now let’s go with code.

Enough is enough

This series of articles is translated from French articles on my employer’s blog, Arolla. The French articles may or may not be released at the time you’re reading this.

We consume too much. We eat, throw away, heat, send e-mails, spend, earn, too much. We need to learn how to do more with less.

We have limitless backlogs. We look for ways to produce more, faster. Production is our main indicator. Like a company or a country measuring its income growth to invest more to grow more, we measure our production to release more to have more features to earn more. Simple, isn’t it?

We miss two parameters here:

  1. Complexity doesn’t grow linearly with size. It tends to grow in a chaotic and explosive way. Sometimes independently of growth. And surely in an unpredictable way. The worst news is, complexity doesn’t have a maximum. It is not capped by your capacity, anyway.
  2. You can’t predict how the system you’re creating will evolve. It is a kid, living its growth in a chaotic way. You can’t predict consequences of the evolution of your system, as soon as it gets a little bit complex.

I can only see one way of keeping that under control: move slowly, carefully, checking how the system evolves while you touch it. In other words, evolve frugally.

Because frugality deserves loads of ink, and I’m paid by the article, I’ll try exploring this topic in 4 steps:

  1. Let’s start with backlog.
  2. What about code?
  3. Did you forget the process?
  4. Let’s get back to serious work.

People are better in the unknown

We are currently discussing adding a team in another country to our product (single colocated team in Paris). This is the kind of discussion we’re having:

Program Manager: “We need to deliver Feature A in 6 months, and I don’t want to allocate more than 40% of our throughput to it. 40% need to remain for the expedite, and 20% for the support. Feature A alone would take about 80% of your current throughput, so we need a plan to double your throughput.”

Dev Director: “We have issues hiring here. And even if we can, it takes about 6 months from opening the job offer to having the dev in the office in Paris. So we think we’d rather open the job offers in Romania where it takes about 2 months instead of 6.”

The first thing that strikes me in this kind of conversation is that we consider people as resources (I didn’t use the term in the discussion, to avoid offending these virtual personas). When I say resource, I mean a linear resource, with a predictable and somehow linear behavior. I’m not offended being compared to a keyboard or a chair when I’m referred to as a resource. I’m just always surprised when arguing with someone with experience and responsibility, who never realized that 1+1 is not 2 when talking about people’s throughput. 1+1 might be 1, it might be 3, or 5, or sometimes 2, it depends.

A team has a complex behavior, and it’s very hard to predict it. If you add X% people to a team, you won’t add X% throughput. In general, you’ll even lose Y% (Y not being necessarily <100) for a given time. Even if you consider interactions, you won’t be able to come up with an equation that can predict a team’s output.
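
To see how fragile the linear reasoning is, here is a small Python sketch. The first function reproduces the Program Manager’s arithmetic from the conversation above. The second is a deliberately naive toy model of my own, with hypothetical names and made-up numbers: it predicts nothing, it only shows how much the plan depends on a loss nobody can know in advance.

```python
def required_multiplier(feature_share_today, feature_share_target):
    """Linear reasoning: the feature needs `feature_share_today` of today's
    throughput but may only use `feature_share_target` of tomorrow's, so
    throughput must be multiplied by their ratio."""
    return feature_share_today / feature_share_target


def throughput_after_hiring(current, added_fraction, loss_fraction):
    """Hypothetical toy model: hiring adds `added_fraction` of throughput on
    paper, minus a `loss_fraction` paid in onboarding and coordination.
    Both numbers are guesses, which is the point."""
    return current * (1 + added_fraction - loss_fraction)


# The plan: 80% of today's throughput must fit in 40% of tomorrow's.
print(required_multiplier(0.8, 0.4))

# On paper, doubling the team doubles the throughput...
print(throughput_after_hiring(100, added_fraction=1.0, loss_fraction=0.0))

# ...but if the loss exceeds what was added, the team ends up slower than before.
print(throughput_after_hiring(100, added_fraction=1.0, loss_fraction=1.2))
```

The whole plan rests on `loss_fraction` being zero, and that is precisely the number no equation can give us.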

While I was thinking about it, I bumped into two awesome [series of] articles: people are resilience creators not resources by Johanna Rothman, and the principles not rules series, by Iain McCowatt. They helped me put words on my point of view.

People don’t have a linear behavior. They learn, they socialize, they create bonds, interact, and create together. We are not predictable. Let’s just realize it and deal with it. And you know what? That’s what we are great at! We are great in the unknown, at adapting collectively.

We won’t be more predictable or efficient by following a process or a precise plan. Or at least not very often. Actually, only in Cynefin’s simple domain. And when we are well adapted to bounded processes and predictability, we’re good candidates for being replaced by a machine, which would do better than us. Think about automated tests being great, for example, but mostly for known knowns. The organization’s interest is not to get predictable behaviors out of people. At best, you may get somehow predictable throughputs out of stable teams, but you don’t want to have more.

Back to our original discussion. Whatever the plan, we don’t know what will happen. Given the time frame, the business context, and the code base we’re working on, we are quite sure creating a team in a different country, in a different language, will have a negative effect for the release. But we’re not sure about it. We think creating a team in Romania will have a more negative impact than growing the team in Paris, but we don’t know about it. We think it might have a positive effect on throughput after some ramp up period… I could go on for pages.

The thing is, the system doesn’t want to be sure about any of these assumptions. It’s not the system’s interest. If the system could predict these assumptions, then people would have predictable behaviors, and it would be a bad thing for the organization.

So let’s start with a belief (e.g. “we can’t hire in Paris”), a value (e.g. “face to face collaboration”), a hypothesis (e.g. “a remote team could improve the throughput of this project”), a strategy (e.g. “scale that team”), and experiment/iterate on it.

When should I…

Most agile posts I see are about finding the right trade-off on this question:

When should I gather requirements/specify/test/merge to main line/document/integrate/deploy/communicate progress to customers/<insert any phase gate you’re used to and concerned about when you think about iterative/continuous software development>?

The answer is: always!

If it hurts, do it often. If there is risk, do it often.

I won’t go through every possible question, you’ll find them in every consultant’s blog. There are two short answers to all of them:

  • It depends: from your situation to the ideal one, there is a gap. You must deal with it and find the right trade-off.
  • Do everything more often. Every time you delay something, like working on a branch, you don’t decrease risk: by delaying risk, you increase risk.

These answers come together. It’s an and, not a xor.

The ideal situation is: every character you change in your code works, brings value, is crystal clear to developers and users, and instantly available to your stakeholders. This is ideal. But we can be ambitious: it is the goal. Everything we do must tend towards this goal.