Fear is an emotion that slows teams down. It makes us more cautious. It makes us over-invest in prevention. It makes us less willing to trust, to communicate openly, and - most painfully - to take risks. It is the dominant reason I see teams fall back on "best practices" which may not be effective, but are at least reassuring. Unfortunately, these actions generally work to increase the batch size of our work, which magnifies the consequences of failure and therefore leads to more fear. Reducing fear is a heuristic we can use to judge process improvements. Anything that reduces fear is likely to speed up the fundamental feedback loop.
The interesting thing about fear is that reducing it requires two contradictory impulses. First, we can reduce fear by mitigating the consequences of failure. If we construct areas where experimentation is less costly, we can feel safer and therefore try new things. On the other hand, the second main way to reduce fear is to engage in the feared activity more often. By pushing the envelope, we can challenge our assumptions about consequences and get better at what we fear at the same time. Thus, it is sometimes a good idea to reduce fear by slowing down, and sometimes a good idea to reduce fear by speeding up.
To illustrate this point, I want to excerpt a large part of a recent blog post by Owen Rogers, who organized my recent trip to Vancouver. I spent some time with his company before the conference and discussed ways to get started with continuous deployment, including my experience introducing it at IMVU. He summarized that conversation well, so rather than re-tread that material, I'll quote it here:
One thing that I was surprised to learn was that IMVU started out with continuous deployment. They were deploying to production with every commit before they had an automated build server or extensive automated test coverage in place. Intuitively this seemed completely backwards to me - surely it would be better to start with CI, build up the test coverage until it reached an acceptable level and then work on deploying continuously. In retrospect and with a better understanding of their context, their approach makes perfect sense. Moreover, approaching the problem from the direction that I had intuitively is a recipe for never reaching a point where continuous deployment is feasible.
Initially, IMVU sought to quickly build a product that would prove out the soundness of their ideas and test the validity of their business model. Their initial users were super early adopters who were willing to trade quality for access to new features. Getting features and fixes into hands of users was the greatest priority - a test environment would just get in the way and slow down the validation coming from having code running in production. As the product matured, they were able to ratchet up the quality to prevent regression on features that had been truly embraced by their customers.
Second, leveraging a dynamic scripting language (like PHP) for building web applications made it easy to quickly set up a simple, non-disruptive deployment process. There’s no compilation or packaging steps which would generally be performed by an automated build server - just copy and change the symlink.
Third, they evolved ways to selectively expose functionality to sets of users. As Eric said, “at IMVU, ‘release’ is a marketing term”. New functionality could be living in production for days or weeks before being released to the majority of users. They could test, get feedback and refine a new feature with a subset of users until it was ready for wider consumption. Users were not just an extension of the testing team - they were an extension of the product design team.
Understanding these three factors makes it clear as to why continuous deployment was a starting point for IMVU. In contrast, at most organizations - especially those with mature products - high quality is the starting point. It is assumed that users will not tolerate any decrease in quality. Users should only see new functionality once it is ready, fully implemented and thoroughly tested, lest they get a bad impression of the product that could adversely affect the company’s brand. They would rather build the wrong product well than risk this kind of exposure. In this context, the automated test coverage would need to be so good as to render continuous deployment infeasible for most systems. Starting instead from a position where feedback cycle time is the priority and allowing quality to ratchet up as the product matures provides a more natural lead in to continuous deployment.
The rest of the post, which you can read here, discusses the application of these principles to other contexts. I recommend you take a look.
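For readers who haven't seen the "copy and change the symlink" style of deployment Owen mentions, here is a minimal sketch in Python. The directory layout (a `releases/` folder plus a `current` symlink), the `deploy` function, and its parameters are all my assumptions for illustration; this is not a description of IMVU's actual scripts, and it assumes a POSIX filesystem where renaming over an existing symlink is atomic.

```python
import os
import shutil
from pathlib import Path


def deploy(source_dir: str, deploy_root: str, release_name: str) -> Path:
    """Copy a build into its own release directory, then repoint `current`.

    Hypothetical layout (an assumption, not from the original post):
        deploy_root/releases/<release_name>/   one directory per deploy
        deploy_root/current                    symlink the web server serves
    """
    root = Path(deploy_root)
    release = root / "releases" / release_name

    # Step 1: copy. Raises FileExistsError if this release name was already used,
    # so an old release is never silently overwritten.
    shutil.copytree(source_dir, release)

    # Step 2: change the symlink. Build the new link under a temporary name,
    # then rename it over `current` so readers never see a missing link.
    tmp_link = root / f"current.tmp.{os.getpid()}"
    if tmp_link.is_symlink():
        tmp_link.unlink()
    os.symlink(release.resolve(), tmp_link, target_is_directory=True)
    os.replace(tmp_link, root / "current")
    return release
```

Because old release directories are left in place, rolling back in this sketch is just pointing `current` at the previous release again, which is part of what keeps the consequences of a bad deploy small.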
Returning to the topic at hand, I think this example illustrates the tension required to reduce fear. In order to do continuous deployment at IMVU, we had to handle fear two ways:
- Reduce consequences - by emphasizing the small number of customers we had, we were able to convince ourselves that exposing them to a half-baked product was not very risky. Although it was painful, we focused our attention on the even bigger risks we were mitigating: the risk that nobody would use our product, the risk that customers wouldn't pay for virtual goods, and the risk that we'd spend years of our lives building something that didn't matter - again.
- Fear early, fear often - by actually doing continuous deployment before we were really "ready" for it, we got used to the real benefits and consequences of acting at that pace. On the negative side, we got a visceral feel for the kinds of changes that could really harm customers, like commits that take the whole site down. But on the plus side, we got to see just how powerful it is to be able to ship changes to the product at any hour of the day, to get rapid feedback on new ideas, and to not have to wait for the next "release train" to put your ideas in action. On the whole, it made it easier for us to decide to invest in preventive maintenance (i.e., the Cluster Immune System) rather than just slow down and accept a larger batch size.
Making this fear-reduction strategy work required more than just the core team getting used to continuous deployment. We eventually discovered (via five whys) that we also had to get each new employee acculturated to a fearless way of thinking. For people we hired from larger companies especially, this was challenging. To get them over that hurdle, we once again turned to the "reduce consequences" and "face your fears" duality.
When a new engineer started at IMVU, I had a simple rule: they had to ship code to production on their first day. It wasn't an absolute rule; if it had to be the second day, that was OK. But if it slipped to the third day, I started to worry. Generally, we'd let them pick their own bug to fix, or, if necessary, assign them something small. As we got better at this, we realized the smaller the better. Either way, it had to be a real bug and it had to be fixed live, in production. For some, this was an absolutely terrifying experience. "What if I take the site down?!" was a common refrain. I tried to make sure we always gave the same answer: "If you manage to take the site down, that's our fault for making it too easy. Either way, we'll learn something interesting."
Because this was such a big cultural change for most new employees, we didn't leave them to sink or swim on their own. We always assigned them a "code mentor" from the ranks of the more established engineers. The idea was that these two people would operate as a unit, with the mentor's job performance during this period evaluated by the performance of the new person. As we continued to find bugs in production caused by new engineers who weren't properly trained, we'd do root cause analysis, and keep making proportional investments in improving the process. As a result, we had a pretty decent curriculum for each mentor to follow to ensure the new employee got up to speed on the most important topics quickly.
These two practices worked together well. For one, they required us to keep our developer sandbox setup procedure simple and automated. Anyone who had served as a code mentor would instinctively be bothered if someone else made a change to the sandbox environment that required special manual setup. Such changes inevitably waste a lot of time, since we generally build a lot more developer sandboxes than we realize. Most importantly, we immediately thrust our new employees into a mindset of reduced fear. We had them imagine the riskiest thing they could possibly do - pushing code to production too soon - and then do it.
Here's the key point. I won't pretend that this worked smoothly every time. Some engineers, especially in the early days, did indeed take the site down on their first day. And that was not a lot of fun. But it still turned out OK. We didn't have that many customers, after all. And continuous deployment meant we could react fast and fix the problem quickly. Most importantly, new employees realized that they weren't going to be fired for making a mistake. We'd immediately involve them in the postmortem analysis, and in a lot of cases it was the newcomers themselves (with the help of their mentor) who would build the prophylactic systems required to prevent the next new person from tripping over that same issue.
Fear slows teams of all sizes down. Even if you have a large team, could you create a sandboxed environment where anyone can make changes that affect a small number of customers? Even as we grew the team at IMVU, we always maintained a rule that anyone could run a split-test without excess approvals as long as the total number of customers affected was below a critical threshold. Could you create a separate release process for small or low-risk commits, so that work that happens in small batches is released faster? My prediction in such a situation is that, over time, an increasing proportion of your commits will become eligible for the fast-track procedure.
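To make the split-test rule above concrete, here is a rough sketch in Python of how a "no approval needed below a threshold" policy might be expressed. Everything named here (`in_experiment`, `needs_approval`, the hash-based bucketing, the specific numbers) is my own illustrative assumption, not the system we actually ran at IMVU.

```python
import hashlib

BUCKETS = 10_000  # resolution of the exposure fraction (0.01% steps); an assumption


def in_experiment(user_id: str, experiment: str, fraction: float) -> bool:
    """Deterministically place a user in or out of an experiment.

    Hashing the experiment name together with the user id means the same
    user always gets the same answer for a given experiment, and different
    experiments get independent slices of the user base.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:8], "big") % BUCKETS
    return bucket < fraction * BUCKETS


def needs_approval(fraction: float, total_customers: int,
                   max_unapproved_customers: int) -> bool:
    """Return True if the expected number of affected customers exceeds the threshold."""
    return fraction * total_customers > max_unapproved_customers


# Usage: a 1% test on 10,000 customers affects about 100 people, which is
# under a hypothetical threshold of 500, so anyone could launch it.
assert needs_approval(0.01, 10_000, 500) is False
assert needs_approval(0.10, 10_000, 500) is True
```

The point of a policy like this is that the safe path is also the easy path: keeping an experiment small is what earns you the right to skip the approval queue.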
Whatever fear-reducing tactics you try, share your results in the comments. Or, if fear's got you paralyzed, share that too. We'll do our best to help.
(Image source: www1.american.edu/TED)