This past Tuesday, the annual State of DevOps Report for 2017 was released. The report is one of the most respected in the industry and attempts to measure and comment on the direction a broad range of organizations are travelling on their DevOps journey. It also usually offers some interesting comparisons and trends to analyze against the previous year’s results.
The report is a joint effort between Puppet and DevOps Research and Assessment (DORA) and is derived from survey responses from practitioners in the industry. This year there were 3,200 participants, and many of the insights come from comparing high-performing teams to low-performing teams.
In this blog post, I’m going to comment on a couple of insights and trends that interested me in the report findings. I don’t intend to cover everything raised by the report, so I do encourage you to read it for yourself and find what observations most apply to your situation. You can download the full report for free here.
The survey found that the best indicator of a high-performing team was how little manual work it was doing. High performers are doing less manual work than they were in the 2016 survey. Amongst high performers in 2017 versus 2016:
- Use of configuration management solutions (i.e. Puppet, Chef, Ansible, DSC, etc.) is up 33%.
- Use of automated testing is up 27%.
- Automated environment deployments are up 30%.
- Automated change approval is up 27%.
These are all quite significant increases across the board, and they indicate that these high-performing teams see automation as one of the cornerstones of their success and are happy to invest in it further. It’s pleasing to see that all the major applications of automation in the software development lifecycle are increasing by broadly similar percentages. It’s clear that the most successful teams take a holistic approach to automation and don’t, for example, use it only for testing.
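To make the configuration management point concrete, tools like Puppet, Chef and Ansible let a team declare the desired state of a server rather than configuring machines by hand. The Ansible tasks below are a minimal sketch of the idea, not an excerpt from any surveyed team’s code:

```yaml
# Illustrative Ansible tasks: declare the desired state once, and the
# tool converges every server to it repeatably, with no manual steps.
- name: Ensure nginx is installed
  apt:
    name: nginx
    state: present

- name: Ensure nginx is running and enabled at boot
  service:
    name: nginx
    state: started
    enabled: true
```

Because the same declaration is applied to every environment, drift between servers (a classic source of unplanned work) largely disappears.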
- Use of test automation is the biggest indicator of success with Continuous Delivery.
- High performers spend 21% less time on unplanned work and 44% more time on new work.
- Between high performers and low performers, the gap in deployment rates has narrowed since 2016, but the gaps in failure rate and time-to-recover have widened.
- High performers are doing 46x more deployments than low performers. Down from 200x in 2016.
- Failure rate: 0-15% (high performers), 31-45% (low performers).
- High performers recover from failure 96x faster than low performers. Up from 24x in 2016.
- Ability to take an experimental approach to development is highly correlated with a Continuous Delivery approach.
The link between good test automation and continuous delivery is unsurprising. CD encourages the gradual delivery of smaller units of change over infrequent, big-bang releases, and these smaller units of change are easier to validate with automation: if a new defect occurs, the haystack through which you search for the needle is much smaller. If you can detect a bug as soon as possible after it is introduced, and the unit of change that introduced it was small, you can generally resolve it with minimum cost. Pairing CD with effective test automation is the way to get there.
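As a small, hypothetical illustration of that feedback loop, here is the kind of automated check a CD pipeline would run on every change set; the function and test are invented for illustration, not taken from the report:

```python
# Minimal sketch of an automated test run on every small change set.
# When the change set is small, a failing assertion points straight
# at the code that introduced the defect.

def apply_discount(price: float, percent: float) -> float:
    """Return price reduced by the given percentage."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)


def test_apply_discount():
    assert apply_discount(100.0, 20) == 80.0
    assert apply_discount(19.99, 0) == 19.99


test_apply_discount()
```

Run on every commit, a suite of checks like this catches the defect while the haystack is still one commit wide.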
Quality is a difficult thing to define clearly, never mind measure. Many teams measure the number of reported defects versus the size of a code base to get a defect density metric, but that’s a very narrow view of quality. The State of DevOps Report measures quality as the percentage of time a team spends on unplanned work and rework versus new work. It’s interesting that the survey focuses more on how reality matches up with a team’s plan as a measure of quality than on the size of a backlog.
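The report’s measure can be sketched as a simple ratio. The function below is my illustrative reading of that metric, with invented field names; the report itself does not prescribe a formula:

```python
# Hedged sketch of the report's quality proxy: the share of total effort
# consumed by unplanned work and rework rather than new work.

def unplanned_work_ratio(hours_unplanned: float, hours_rework: float,
                         hours_new: float) -> float:
    """Fraction of total effort spent on unplanned work and rework."""
    total = hours_unplanned + hours_rework + hours_new
    if total == 0:
        raise ValueError("no hours recorded")
    return (hours_unplanned + hours_rework) / total


# e.g. a week with 10h of unplanned work, 5h of rework, 25h of new work:
ratio = unplanned_work_ratio(10, 5, 25)  # 0.375
```

A lower ratio means reality matched the plan more closely, which is exactly the framing the survey uses.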
It is clear that low-performing teams are performing more deployments in 2017 than they were in 2016, and the gap in deployment rate versus high-performing teams has greatly narrowed. However, the failure rate of those deployments and the time to recover from those failures have increased over the same period. It appears that teams beginning to adopt CD practices are seeking speed without first building quality into the deployment pipeline, so these teams would do better to invest further in automated testing and recovery procedures before increasing their deployment rate. When CD is adopted successfully, failure rate and time-to-recover should decrease as deployment rate increases, so these may be better indicators of success than deployment rate alone.
- High performing teams are loosely coupled and have high autonomy.
- Less likely to need an integration environment.
- Ability to innovate is highly restricted if teams cannot change specifications without external approval.
- Innovation is important for predicting market performance.
- Teams should be empowered to take decisions on the work they do, but need to be highly transparent to external stakeholders about everything they do and every decision they make.
- Autonomous teams should decide which tools they use to improve outcomes, making choices based on how they work, rather than leaving those decisions to a central control group.
- Approaches to a loosely-coupled architecture:
- Bounded contexts and APIs.
- Ability to use mocks and virtualization to test components in isolation.
- In reality, many service-oriented architectures do not permit service isolation.
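The mocking point above can be shown concretely. In the hypothetical sketch below, a component’s downstream dependency is replaced with a mock so its logic can be tested without any deployed integration environment; all names are invented for illustration:

```python
# Illustrative sketch: testing a component in isolation by mocking
# its downstream service dependency.
from unittest.mock import Mock


class InventoryClient:
    """In production this would call a remote inventory service."""
    def stock_level(self, sku: str) -> int:
        raise NotImplementedError("network call in production")


def can_fulfil(order_qty: int, sku: str, inventory: InventoryClient) -> bool:
    """Business logic under test: do we have enough stock?"""
    return inventory.stock_level(sku) >= order_qty


# The remote service is replaced with a mock, so the logic is exercised
# quickly and deterministically, with no integration environment needed.
fake_inventory = Mock(spec=InventoryClient)
fake_inventory.stock_level.return_value = 7

assert can_fulfil(5, "SKU-1", fake_inventory)
assert not can_fulfil(8, "SKU-1", fake_inventory)
```

When a service’s dependencies cannot be swapped out like this, the architecture is effectively monolithic no matter what it is called, which is the report’s point about service isolation.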
I found it interesting that the report contained as many observations on the architecture of organizations as on the architecture of software systems. Perhaps that should be unsurprising, though, as Conway’s Law states that the two are closely linked. The report makes an interesting point on the balance of power between those on a development team and external stakeholders, e.g. product managers, the management hierarchy and even customers. Ideally, these external stakeholders are represented by a Product Owner who sits on the development team practicing Agile methodologies alongside the engineers, but that is not always the case. My experience is that it is quite common for larger organizations to try to adapt the likes of Scrum to fit their traditional management structure, and this usually means trying to do it without a Product Owner. These teams are unlikely to have the level of autonomy needed to maximize innovation.
On the technical side, the report observes that many systems that claim to have service-oriented architectures are often monolithic, in reality. Many are unable to support the easy testing of individual services in isolation, which is often the first stage of testing in a continuous delivery pipeline and the quickest/most valuable feedback loop to the actual development team.
The report makes valuable reading for anybody involved in a DevOps transformation. There is an entire section of the report dedicated to transformational leadership and many other observations that I have not commented on, so do make the time to read it for yourself and highlight the findings that most fit with the experience gained through your own transformation.
What findings do you think are most enlightening for your situation?