So we want to deliver value faster, but how do we do it? The good news is that there are lots of ways to achieve it. The bad news is that it’s hard to pick the right means. What capabilities and approaches are the ones that matter to us as tech people?
Building software isn’t that different from building machines. This article explores a few of the similarities.
This is the second part of a two-part article series on accelerating IT. Looking at ways to deliver value faster, I’ll try to give an idea of the skills and capabilities we as tech people need to drive acceleration. To do so, I’ll take a holistic perspective that includes technical, organisational, and cultural competencies.
Part 1 makes an attempt to look beyond the “Accelerate” buzzword and its “faster = better” narrative by identifying common principles and motivations behind acceleration in the context of different business environments.
Short on time? Here’s an ultra-short abstract of part 2:
- The idea of delivering value faster by reducing lead time and keeping batch sizes small was first applied in the car manufacturing industry and was later refined to better fit software businesses.
- The DORA metrics are the most popular set of metrics for measuring software delivery performance; however, they don’t cover the whole picture. The DORA research program identifies four pillars of capabilities that broaden the view and shift the focus to the entire organisation.
- Team structures have a large impact on software delivery speed and should be designed just as carefully as a software system’s architecture.
- Cloud service providers help to increase delivery speed and reduce complexity as they provide the backbone for the lower service layer of a software-driven product.
Lead time and batch size: Toyota’s pioneering steps
In the 1980s, Toyota came up with a set of principles that were eventually summarized by the term lean manufacturing. Although these ideas were initially used to help establish continuous improvement and production flow in the car manufacturing industry, many of these concepts evolved over the years and were widely adopted in other fields. Two notable beliefs that emerged from the lean movement are especially interesting for software people (see , pg. 409):
- the time to convert raw materials into finished goods, referred to as lead time, should be kept low
- working in small batches is a good indicator for achieving a low lead time
Why are these two beliefs particularly interesting? According to the lean movement, lead time is the best predictor of overall quality, customer satisfaction, and employee happiness. Batch size in turn is the best predictor of lead time. Toyota’s finding was that, rather than producing large sets of identical car parts in big batches, producing and finishing single instances of cars allowed for quicker learning and improvement, less rework, and a shorter overall production time. At the time, most other car manufacturers focused on large batch sizes to drive down production cost (see: , pg. 185ff). Interestingly, this sounds quite similar to the fast/ultra-fast fashion example from part 1 of this series: another case where an “economy of speed” style of production allowed for quicker learning and adaptation, which eventually resulted in Toyota outperforming its competitors.
From car manufacturing to IT
Eventually, lean principles were adopted in other industries. Following Eric Ries’s lean startup methodology, Toyota’s principles were translated into concepts applicable to software-driven businesses (see an example here), and numerous other works (, ) refined these terms to better fit software teams:
In the context of software-driven businesses, lead time can be defined as “the time it takes from a customer making a request to the request being satisfied”. It spans the entire software production process, from collecting initial requirements for a new feature until this feature becomes usable for end users. This process consists of creative work as well as work that is repetitive and can be automated. The automatable part is bringing the feature into the target environment. The time to achieve this is defined as Delivery Lead Time: “The time it takes to go from code committed to code successfully running in production”. (see: , pg. 14-15)
Batches are the work items needed to create a shippable product. In the context of software development, these are the source code changes that developers put into version control. Keeping a minimal batch size in this context means testing and deploying every code change put into version control, with the aim of bringing these small changes to production as quickly as possible. In contrast, a large batch size corresponds to long-lived feature branches that are brought to production only after weeks or months of work.
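As a rough illustration (the data model and numbers are made up), batch size can be proxied by the number of lines changed per change set pushed to version control:

```python
# A rough batch-size proxy: lines changed per change set pushed to version
# control. Data model and numbers are hypothetical, for illustration only.

def mean_batch_size(lines_per_changeset):
    """Average lines changed per change set."""
    return sum(lines_per_changeset) / len(lines_per_changeset)

# Trunk-based style: many small changes merged continuously.
small_batches = [40, 15, 80, 25]
# Long-lived feature branch: one large merge after weeks of work.
large_batch = [2400]

print(mean_batch_size(small_batches))  # 40.0
print(mean_batch_size(large_batch))    # 2400.0
```

In practice you’d derive such numbers from version-control history (e.g. diff stats per merge); the point is simply that the two working styles produce very different distributions.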
Keep batches small and lead times low
Although the respective definitions vary depending on whether we look at car or software manufacturing, the key idea remains the same: Focusing on the entire product development lifecycle, batch sizes should be kept small to allow for a low lead time. These two indicators provide a good abstraction for measuring delivery performance. Rather than focusing on maximum efficiency (as proposed by the “economies of scale” mindset), we aim for the “economies of speed” side of the spectrum: Being able to learn, adapt, and implement quickly in order to reduce the time to deliver value to the customer.
DORA and the DevOps movement
Inspired by the metrics developed by the lean manufacturing movement, the DevOps Research and Assessment team (DORA, now part of Google) further refined the lead time/batch size set of measures and tried to identify metrics associated with high software delivery performance. The research team identified four key metrics:
- Lead time for changes (The amount of time it takes a commit to get into production)
- Time to restore services (How long it takes an organisation to recover from a failure in production)
- Deployment frequency (How often an organisation successfully releases to production)
- Change failure rate (The percentage of deployments causing a failure in production)
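Given a record of deployments, the four metrics are straightforward to compute. Here is a minimal sketch; the record format and field names are hypothetical, not part of any DORA tooling:

```python
from datetime import datetime
from statistics import median

# Hypothetical deployment records -- field names are made up for illustration.
deployments = [
    {"committed": datetime(2023, 5, 1, 9), "deployed": datetime(2023, 5, 1, 11),
     "failed": False, "restored": None},
    {"committed": datetime(2023, 5, 2, 10), "deployed": datetime(2023, 5, 2, 15),
     "failed": True, "restored": datetime(2023, 5, 2, 16)},
]

def lead_time_for_changes(deps):
    """Median time from commit to code running in production."""
    return median(d["deployed"] - d["committed"] for d in deps)

def deployment_frequency(deps, period_days):
    """Deployments per day over the observed period."""
    return len(deps) / period_days

def change_failure_rate(deps):
    """Share of deployments that caused a failure in production."""
    return sum(d["failed"] for d in deps) / len(deps)

def time_to_restore(deps):
    """Median time from a failed deployment to service restoration."""
    restores = [d["restored"] - d["deployed"] for d in deps if d["failed"]]
    return median(restores) if restores else None
```

Real measurements would aggregate over weeks or months of data and typically use medians or percentiles, but the mechanics stay this simple — which is part of why these four metrics became so popular.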
These metrics make it fairly simple to get an idea of an organisation’s software delivery performance. The popularity of the annual State of DevOps report shows that the findings are relevant and widely accepted in the industry, and they are a good starting point for improving and accelerating software delivery. While I find these metrics generally helpful, I’d argue that they should be taken with a grain of salt, as they don’t tell the whole story.
What DORA doesn’t measure
Metrics should be used with caution because they can create misleading incentives. Maybe the product we shipped has little value for the customer, or we wasted six months arguing about our organisation’s goals. These things can happen while DORA assures us that we’re in the elite group of performers.
We may introduce automated CI/CD pipelines to get from code committed to running in production in a few minutes, but might have failed elsewhere.
Instead of focusing on just a handful of metrics, I propose a more holistic view of the software creation process. The speed at which we can create an artifact of value is determined by the whole organisation building it, not just our CI/CD pipelines. However, measuring our performance at the org level is challenging, because everything that happens before we type “git push” into our console is hard to boil down to a number. Building things end-to-end is ultimately a highly creative process involving a lot of human brain power. Coming up with feasible organisation goals, creating meaningful value propositions, defining requirements, and building an environment that allows for quick learning cycles is nothing machines can do. This is where the DORA metrics fall short: they focus strongly on what happens after committing to version control and don’t account for the highly creative, hard-to-automate work that comes before. And many of the decisions made before the first line of code is checked into version control greatly influence our overall lead time, not just the delivery lead time. Don’t get me wrong: I believe that using DORA metrics to measure our software delivery performance is helpful and valuable. We should just be aware that they don’t cover the whole picture.
Four pillars of capabilities: The lesser-known aspects of DORA
While the DORA metrics don’t cover the whole picture, the DORA research program, greatly influenced by Dr. Nicole Forsgren and the research conducted by Ron Westrum, provides a broader view on acceleration and identifies capabilities that directly or indirectly increase an organisation’s software delivery performance.
Sadly, though, I feel that these less technical aspects don’t get the attention they deserve. The team at DORA identified four pillars of capabilities, of which technical capabilities are just one cornerstone:
Let’s look at each of the pillars briefly:
- technical: Techniques primarily associated with DevOps, such as continuous integration, deployment automation, trunk-based development, and cloud infrastructure, fall into this category. This is the dimension of DORA with the highest visibility, and it is the easiest to measure.
- measurement: This dimension focuses on monitoring, observability, and failure remediation, such as using solutions for proactive failure notifications or gathering application logs and traces. Measures from this category also have mostly technical implications.
- process: This category aggregates techniques such as working in small batches, making the flow of work visible, and reducing change approval wait times. While these factors can be aided by tools, their implementation requires a company and developer culture that supports and facilitates them.
- cultural: This dimension focuses on the cultural aspects of an organisation, such as growing a company culture of trust and continuous learning. It is highly influenced by Ron Westrum’s research on information processing in company contexts. Westrum identified that generative organisations (as opposed to pathological or bureaucratic orgs) fare better when confronted with problems or change (and are healthier places to work, too). Performance in this category isn’t easy to measure and significant management effort is needed to introduce change.
Addressing the cultural dimensions of software delivery is not something that can be achieved by improving any of the primary DORA metrics. While DevOps principles are relatively easy to adopt technologically, the organisational implications are a lot harder to visualize and measure.
Don’t mess with Melvin Conway
Team structures (and their boundaries and interactions) play an important role in how information can flow across an organisation and thus greatly influence software delivery speed. Conway’s law remains very relevant when building products with software. Traditional, org-chart driven organisations need to take this factor into consideration when seeking to reduce their lead time. Matthew Skelton and Manuel Pais found many obstacles to fast delivery flow to be team related and propose forming teams along domains or value streams (see ) instead of relying on subsystem-focused teams to reduce cognitive load and increase flow:
Organisation design and software design are, in practice, two sides of the same coin, and both need to be undertaken by the same informed group of people.
While the organisational aspects of software delivery aren’t the focus of this article series, I can’t stress this point enough. In my experience, this is an often neglected aspect of discussions about getting better at building software-driven products. Organizing teams in functional silos (“frontend team”, “database team”, “testing team”) will likely slow down our software delivery process. My proposal is that we shouldn’t lose sight of the broader goals: delivering an artifact of value end-to-end, and shaping our organisation and teams so that humans have a better and more successful time collaborating. Docker’s recent effort to restructure its teams and organisation serves as a good practical example of this.
The role of the cloud
An important building block promoted by the DevOps movement to increase developer productivity and to reduce lead times is to make use of cloud platforms (which falls into DORA’s technical dimension). According to DORA research, leveraging cloud platforms improves software delivery performance by:
1) enabling developer teams to provision infrastructure resources as needed with little to no wait times,
2) allowing for quick scaling of compute resources, and
3) providing measures to monitor systems performance and cost.
Customisation vs. commoditisation
Apart from the DORA perspective, there is another interesting motivation for using cloud technologies, one that divides IT into two layers (see , pg. 120):
- A lower service layer that provides standardized resources and services that are unlikely to have a competitive impact
- An upper business-facing layer that consists of in-house developed software that provides direct business value and is likely to have a competitive impact
According to this interpretation, resources we’d associate with the lower service layer (data centers, network infrastructure, databases, Kubernetes clusters, etc.) are something we should spend little time on, as we can easily buy them off the shelf from cloud service providers. Wardley maps are a great way to visualize this concept by mapping components according to their visibility to the customer and their degree of customization. Imagining some kind of custom-built software product (our core business), we can reduce complexity by buying or renting the required supporting services. Cloud platforms provide exactly this: commoditized services that previously resided in on-site data centers.
By pushing as many infrastructure components as possible out into the commodity layer, we can reduce overall complexity during product development and gain speed compared to managing these components ourselves.
Summing up: How to deliver value faster
In this article, we looked at concepts associated with accelerating software delivery and tried to broaden our view of it. The idea of reducing lead time and keeping batch sizes small was first applied in the car manufacturing industry and was later refined to better fit software businesses. The DORA research program comes with a set of metrics and competencies that facilitate speed; however, these are often viewed as purely technical capabilities.
There are lesser-known concepts promoted by the DORA research program that emphasize the value of an organisational culture facilitating speed, constant improvement, and employee satisfaction. Team setups also play an important role in achieving flow and speed. We learned that the value of using cloud environments can be explained with Wardley maps: if we push the components and services that are unlikely to have a competitive impact out to the commodity layer, we gain speed because we can focus on the parts of a product that provide business value.
Famous last words
I’ll admit it: this article barely scratches the surface of the many aspects hovering around in the software delivery word cloud. My background is technological and my primary area of expertise is solving problems with software, so I’m by no means an expert on organisation management or product management. I hope that I’ve nonetheless been able to get my message across: helping companies successfully build valuable software involves a broad set of skills, many of them non-technical.
Let’s not forget about this and shape our teams and our organisations so that we can be successful building useful things and have fun doing it.
I hope that you found this article series useful. Thanks for reading!