Say you’re working in some software development shop and you find yourself concerned with code quality metrics. Reverse engineering your team’s path to this point isn’t terribly hard because, in all likelihood, one of two things happened.
First, it could be that the team underwhelmed someone in some business sense: too many defects, serially missed deadlines, that sort of thing. In response, leadership introduced a code quality initiative. And you can’t improve what you can’t measure. For that reason, you found yourself googling “cyclomatic complexity” to see why the code you just wrote suddenly throws a warning.
The second option is internal motivation. The team introduced the metrics of its own accord. In this capacity, they serve as rumble strips on the side of your metaphorical road. Doze off at the wheel a little, get a jolt, and correct course.
In either case, an odd sort of gulf emerges between the developers and the business. And I think of this gulf as inherently noisy.
Code Quality Metrics for Developers
I spend a lot of time consulting with software shops. And shops hiring consultants like me generally have code quality improvement initiatives underway. As you can imagine, I see an awful lot of code metrics.
Here are some of the code quality metrics that I see tracked most commonly. This list isn’t exhaustive.
- Lines of code. (This is an interesting one because, in aggregate, it’s often used to track progress. But normalized over smaller granularities, like types and methods, people correlate it negatively with code quality — “that method is too big.”)
- Cyclomatic complexity: the number of independent execution paths that exist through a given unit of code. Less is more.
- Unit test coverage: the percentage of your code (typically measured in lines or branches) executed by your unit test suite. More is more.
- Static analysis tool/lint tool violations count: run a tool that provides automated code checks and then count the number of issues.
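To make one of these concrete, here’s a minimal sketch of how a tool might approximate cyclomatic complexity: start at one and add one for every decision point. The function name, the set of node types I count, and the sample code are all my own assumptions for illustration. Real analyzers differ in the details, so treat this as a rough picture rather than how any particular tool does it.

```python
import ast

# Node types treated as decision points. This set is an assumption of the
# sketch; real tools make somewhat different choices.
_DECISION_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                   ast.IfExp, ast.comprehension)


def cyclomatic_complexity(source):
    """Approximate cyclomatic complexity: 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _DECISION_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` has three operands, so two extra branches.
            complexity += len(node.values) - 1
    return complexity


# Hypothetical method to measure.
SAMPLE = '''
def classify(order):
    if order.total > 100 and order.is_member:
        return "discount"
    for item in order.items:
        if item.backordered:
            return "delayed"
    return "standard"
'''

print(cyclomatic_complexity(SAMPLE))  # 5: base 1 + if + and + for + if
```

Even a small method like this one scores a 5, which gives you a feel for how quickly the number climbs as branching piles up.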
As software developers, we can easily understand these concepts and internalize them. But to explain to the business why these matter requires either a good bit of effort or a “just trust us.” After all, the business won’t understand these concepts as more than vague generalities. There’s more testing coverage, or something…that sounds good, right?
As a result, these metrics carry noise: it becomes unclear how much they actually matter for business outcomes.
Code Quality Metrics for the Business
When you look at things from the perspective of the business, you tend to favor different ways of evaluating code quality. The business, which isn’t versed in the intricacies of code, naturally favors outcomes. So they value things like the following:
- Story cards/features delivered
- Counts of production defects
- Adherence to schedule
- Cost of changes
- Feature completion/defect resolution cycle time
We can draw some interesting distinctions here. Whereas the code quality metrics of interest to developers focus on properties of the source code, the things that the business values are focused on results of the source code. In other words, the business doesn’t know anything about the properties of the code, per se. Instead, it measures things that happen as and after the team deploys the code to production. Did we deploy on time and budget? Have things gone wrong after deployment? Can we quickly and easily respond to future needs for change?
But this side of the gulf suffers from potential noisiness as well. All of these are lagging indicators. They’re easy to measure but hard to trace definitively back to a cause, and thus hard to improve. After all, plenty of factors beyond code quality can affect things like on-time feature delivery.
Code Quality Metrics: Meet in the Middle
People often talk about signal-to-noise ratio as a way to measure how much of a total volume of information actually matters. In pursuit of a favorable signal-to-noise ratio, you need to eliminate irrelevancies and superfluous information. So how do we do that with code metrics? Well, we look to find the intersection of source code properties and business outcomes.
But that presents a hard problem. How do you find properties of source code that trace neatly to business outcomes? And why don’t the ones we’ve looked at so far?
Well, all of those get to business concerns only indirectly. Test coverage serves as a proxy for measuring the quality of a unit test suite, which serves as a proxy for representing code quality. The same kind of reasoning applies to cyclomatic complexity and violation counts as well.
We can get there more directly by starting with the types of outcomes that the business measures and working backward.
A Different Set of Code Quality Metrics
With that in mind, let’s think again about what matters to the business. The business wants predictability in terms of budget, deadlines, and application behavior. They want to minimize the cost of change and the risk associated with it. And they want to keep the cost of new functionality flat and to a minimum. So let’s think about properties of code that tie directly to those things, as opposed to what we’re historically used to measuring.
Here are some alternate metrics to consider:
- Type rank. Based on Google’s PageRank, this identifies the importance of a piece of code to the codebase (in terms of dependencies). Highly ranked code is generally riskier to change.
- Dependency ratio (edges versus nodes) in the dependency graph. How interconnected is the codebase? More interdependence and snarl correlate with feature slowdown over the course of time.
- Abstractness vs instability. Do heavily depended-upon parts of the code have abstraction layers? Or are they highly concrete? If they’re highly concrete, change becomes extremely labor intensive.
- Lack of cohesion of methods (LCOM). Do elements of the code exhibit non-cohesion, indicating poorly organized design? In codebases with lack of cohesion, higher potential exists for regressions and strange production problems.
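The first two of these are easier to picture with a toy example. Below is a rough sketch that runs a plain PageRank power iteration over a dependency graph and computes the edges-to-nodes ratio. The graph, the type names, and the damping factor of 0.85 are hypothetical, and actual tools may scale or normalize type rank differently.

```python
# Hypothetical dependency graph: each type maps to the types it depends on.
DEPENDS_ON = {
    "OrderController": ["OrderService", "Logger"],
    "OrderService": ["OrderRepository", "Logger"],
    "OrderRepository": ["Logger"],
    "ReportJob": ["OrderRepository", "Logger"],
    "Logger": [],
}


def type_rank(graph, damping=0.85, iterations=50):
    """PageRank by power iteration; rank flows from a type to what it uses."""
    n = len(graph)
    rank = {t: 1.0 / n for t in graph}
    for _ in range(iterations):
        # Types with no outgoing dependencies spread their rank evenly.
        dangling = sum(rank[t] for t, deps in graph.items() if not deps)
        new_rank = {t: (1 - damping) / n + damping * dangling / n
                    for t in graph}
        for t, deps in graph.items():
            for dep in deps:
                new_rank[dep] += damping * rank[t] / len(deps)
        rank = new_rank
    return rank


def dependency_ratio(graph):
    """Edges divided by nodes in the dependency graph."""
    edges = sum(len(deps) for deps in graph.values())
    return edges / len(graph)


ranks = type_rank(DEPENDS_ON)
# Print types from most to least depended upon.
for name in sorted(ranks, key=ranks.get, reverse=True):
    print(f"{name}: {ranks[name]:.3f}")
print("dependency ratio:", dependency_ratio(DEPENDS_ON))  # 7 edges / 5 nodes = 1.4
```

In this made-up graph, Logger comes out on top because everything depends on it, which matches the intuition that a change there ripples the furthest.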
Notice that, like the popular metrics cyclomatic complexity and test coverage, these involve highly technical concepts. In fact, from a nuts-and-bolts perspective, these composite metrics are harder for developers to grok. But also notice that these metrics speak directly to properties of code that interest the business, making them less noisy code quality metrics.
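To take some of the mystery out of one of them, here’s a small sketch of LCOM. Several formulations exist, and I’m assuming one common variant: one minus the average share of methods that touch each field. The function name and both example classes are hypothetical.

```python
def lcom(method_fields, fields):
    """LCOM = 1 - sum(MF) / (M * F)

    M  = number of methods
    F  = number of fields
    MF = for a given field, the number of methods that use it
    0.0 means every method uses every field; values approaching 1.0
    suggest the class is doing several unrelated jobs.
    """
    m, f = len(method_fields), len(fields)
    if m == 0 or f == 0:
        return 0.0
    touches = sum(
        sum(1 for used in method_fields.values() if field in used)
        for field in fields
    )
    return 1 - touches / (m * f)


# Hypothetical classes: method name -> fields that method reads or writes.
ACCOUNT = {
    "deposit": {"balance"},
    "withdraw": {"balance"},
    "statement": {"balance"},
}
UTILITIES = {
    "send_email": {"smtp_host"},
    "parse_csv": {"delimiter"},
    "resize_image": {"width"},
}

print(lcom(ACCOUNT, {"balance"}))  # 0.0
print(round(lcom(UTILITIES, {"smtp_host", "delimiter", "width"}), 2))  # 0.67
```

The grab-bag utilities class scores much higher than the account class, and that’s the kind of class where unrelated changes tend to collide.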
Visualization Is Key
One last component to meeting in the middle with code quality metrics involves visualization. People often gravitate to test coverage because of how well it lends itself to visualization. It’s a percentage that teams want to increase, so you can plot a nice graph over time and everyone can keep track. That’s powerful.
So to really drive home this collaboration between the dev teams and the business, we need to make these metrics I’ve proposed powerful as well. Take a look at this heat map of type rank, for instance.
Anyone looking can see, in bright red, where high-risk spots lie in the codebase. And you don’t need to be highly technical to glance at this, see lots of red, and know that you’re looking at risky and expensive change when you touch the codebase.
Or take a look at the abstractness versus instability graph.
This makes an excellent conversation piece among both team members and the business. A lot of dots in the red means slow pace of change and relative riskiness.
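If you’re curious where the dots on a graph like this come from, here’s a hedged sketch. I’m assuming Robert C. Martin’s definitions of abstractness, instability, and distance from the “main sequence,” and I’m assuming that “red” roughly means a large distance. The module names and counts are invented for illustration.

```python
def abstractness(abstract_types, total_types):
    """A = abstract types / total types (0 = all concrete, 1 = all abstract)."""
    return abstract_types / total_types if total_types else 0.0


def instability(afferent, efferent):
    """I = Ce / (Ca + Ce); 0 = heavily depended upon, 1 = depends on others."""
    total = afferent + efferent
    return efferent / total if total else 0.0


def distance_from_main_sequence(a, i):
    """D = |A + I - 1|; 0 sits on the ideal line, 1 is as far off as it gets."""
    return abs(a + i - 1)


def score_modules(modules):
    """Return {name: (A, I, D)} for each module."""
    scores = {}
    for name, (abstract_types, total_types, ca, ce) in modules.items():
        a = abstractness(abstract_types, total_types)
        i = instability(ca, ce)
        scores[name] = (a, i, distance_from_main_sequence(a, i))
    return scores


# Hypothetical modules:
# (abstract types, total types, incoming deps Ca, outgoing deps Ce)
MODULES = {
    "Core.Domain": (0, 40, 25, 0),     # concrete and heavily depended upon
    "Core.Contracts": (12, 12, 30, 0),  # abstract and heavily depended upon
    "Web.UI": (0, 20, 0, 8),            # concrete, nothing depends on it
}

for name, (a, i, d) in score_modules(MODULES).items():
    print(f"{name}: A={a:.2f} I={i:.2f} D={d:.2f}")
```

Here Core.Domain lands at a distance of 1.00. It’s concrete and everything leans on it, so it’s exactly the sort of dot you’d expect to see in the red.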
Developers and business folks alike gravitate toward measures that they understand. So the key lies in finding metrics that matter and making both parties understand them. And there’s no better way to work toward understanding than with visual aids.
There’s nothing wrong with using the metrics that both parties have historically used. By all means, continue to track them if you’re so inclined. But with code quality metrics, separating the signal from the noise is critical. So do your best to find, use, and visualize metrics that speak directly to important outcomes.