Wander the halls of an enterprise software outfit looking to improve, and you’ll hear certain things. First and foremost, you’ll probably hear about unit test coverage. But, beyond that, you’ll hear discussion of a smattering of other metrics, including cyclomatic complexity.
It’s actually sort of funny. I mean, I understand why this happens, but hearing middle managers say “test coverage” and “cyclomatic complexity” has the same jarring effect as hearing developers spout business-meeting-speak. It’s just not what you’d naturally expect.
And you wouldn’t expect it for good reason. As I’ve argued in the past, code coverage shouldn’t be a management concern. Nor should cyclomatic complexity. These are shop-heavy specifics about particular code properties. If management needs to micromanage at this level of granularity, you have a systemic problem. You should worry about these properties of your code so that no one else has to.
With that in mind, I’d like to focus specifically on cyclomatic complexity today. You’ve probably heard this term before. You may even be able to rattle off a definition. But let’s take a look in great detail to avoid misconceptions and clear up any hazy areas.
Defining Cyclomatic Complexity
First of all, let’s get a specific working definition. This is actually surprisingly difficult because not all sources agree on the exact method for computing it.
How can that be? Well, the term was dreamed up by a man named Thomas McCabe back in 1976. He wanted a way to measure “the number of linearly independent paths through a program’s source code.” But beyond that, he didn’t specify the mechanics exactly, leaving that instead to implementers of the metric.
He did, however, give it an intimidating-sounding name. I mean, complexity makes sense, but what does “cyclomatic” mean, exactly? Well, “cyclomatic number” serves as an alias for something more commonly called circuit rank. Circuit rank measures the number of independent cycles within a cyclic graph. So I suppose he coined the neologism “cyclomatic complexity” by borrowing a relatively obscure discrete math concept for path independence and applying it to code complexity.
Well then. Now we have cyclomatic complexity, demystified as a term. Let’s get our hands dirty with examples and implications.
Cyclomatic Complexity by Example
To move away from the abstract, I’ll start with the simplest sort of example. Recall that we want the number of linearly independent paths through a given piece of source code. What could be easier than reasoning about that for a one line method?
    public double Divide(int x, int y)
    {
        return x/y;
    }
This method has a cyclomatic complexity of one. That shouldn’t surprise. Only one single path through this method exists (set aside, momentarily, the possibility of a runtime exception — I’ll talk about that a little later).
Now, let’s spice things up a little by adding a mathematically questionable bit of logic to guard against exceptions.
    public double Divide(int x, int y)
    {
        if (y == 0)
            return 0;

        return x / y;
    }
This method will now return zero in cases that would have generated a division by zero exception. And by doing that, it has upped its cyclomatic complexity from one to two. Why? Well, because two independent paths through this code now exist. In most cases, the if condition evaluates to false and the method returns x divided by y. But, in cases where callers pass zero for y, the method will execute along a different path, returning zero after the if condition evaluates to true.
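In control-flow-graph terms, this count is usually written as a formula over the graph’s edges (E), nodes (N), and connected components (P). Here is a rough worked check for the guarded method. The four-node graph is my own simplification of the code, an assumption for illustration rather than the output of any tool.

```latex
% Cyclomatic number of a control-flow graph
M = E - N + 2P

% Guarded Divide, simplified:
%   nodes: if, return 0, return x / y, exit                  => N = 4
%   edges: if -> return 0, if -> return x / y,
%          return 0 -> exit, return x / y -> exit            => E = 4
%   one method, so one connected component                   => P = 1
M = 4 - 4 + 2 \cdot 1 = 2
```

That matches the two paths described above. Removing the guard leaves a straight line of nodes, where E is always N - 1 and the result drops back to one.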
Expanding on this Reasoning
With a simple example in the books, let’s generalize a bit. Introducing the conditional created additional complexity by adding another path through the code. But as you can deduce, we have other means of adding complexity.
NDepend measures cyclomatic complexity for you and documents how it computes the figure. Along with the if keyword, you can acquire additional complexity by use of looping constructs (while, for, foreach), switch blocks (case/default), jumps (continue, goto), exceptions (catch), and compound conditional enablers (&&, ||, ternary operator).
And this makes sense since each of those introduces a new path through the code. Your code may or may not meet the initial conditions for a loop. You have N ways to move through a switch statement. Code may or may not trigger an exception. You get the idea.
So a method’s cyclomatic complexity score winds up as one plus the number of these constructs that you encounter. And a type/namespace/project complexity is the additive complexity of its methods and properties. At least, according to NDepend, it is. (Again, more on this later, when I address the non-standardization of the metric.)
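To make the counting rule concrete, here’s a minimal sketch with the tally written in comments. The class and method names are hypothetical, and the total simply applies the one-plus-constructs rule described above. An actual tool may well report a different number.

```csharp
using System.Collections.Generic;

public static class ComplexityExample
{
    // Hypothetical method, annotated with the constructs that add a path.
    public static int CountLargeEvens(IEnumerable<int> values, int threshold)
    {
        int count = 0;                                // base path: 1

        foreach (int value in values)                 // foreach: +1
        {
            if (value % 2 == 0 && value > threshold)  // if: +1, &&: +1
            {
                count++;
            }
        }

        return count;                                 // total by this rule: 4
    }
}
```

Splitting the compound condition into two nested if statements would leave the tally unchanged under this rule, since the && and the extra if each count once.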
What’s the Big Deal? Who Cares About Cyclomatic Complexity?
Let’s look at why this matters to anyone. Why would developers care, let alone middle managers? (Perhaps now you understand my bemusement at non-technical middle managers concerning themselves with cyclomatic complexity — it’s a deeply code-arcane construct.)
Cyclomatic complexity matters mainly because it serves as a way to quantify complexity in your code. And that matters because complexity translates directly to risk, a concern of interest both to the business and to developers. Complexity of code elements, both in terms of size and paths through the code, correlates with defects.
As developers, we don’t want our code to generate defects, obviously. So we’ve come to regard high cyclomatic complexity as something indicative of higher likelihood of defects, thus our interest in measuring it. But the same thing has come to apply for non-technical stakeholders as well. They concern themselves with cyclomatic complexity precisely because they, too, care about defect likelihood.
But cyclomatic complexity also has significant ramifications because of its impact on unit tests. Specifically, the difficulty of testing grows with cyclomatic complexity. If a method has a cyclomatic complexity of 25, you would need to write at least 25 tests (or at least invoke the method 25 times) before you covered each independent path through the code.
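Here’s what that looks like at the small end of the scale: a sketch that exercises the guarded Divide method once per path. The Calculator and DivideChecks classes and the Expect helper are hypothetical names of mine, and I’ve used plain exceptions rather than assuming any particular test framework.

```csharp
using System;

public static class Calculator
{
    // The guarded method from earlier, made static so the sketch is self-contained.
    public static double Divide(int x, int y)
    {
        if (y == 0)
            return 0;

        return x / y;
    }
}

public static class DivideChecks
{
    // Hypothetical helper: one invocation per independent path.
    public static void CheckBothPaths()
    {
        // Path 1: the guard is false, so the integer division runs.
        Expect(Calculator.Divide(6, 3), 2);

        // Path 2: the guard is true, so the method returns zero early.
        Expect(Calculator.Divide(6, 0), 0);
    }

    private static void Expect(double actual, double expected)
    {
        if (actual != expected)
            throw new InvalidOperationException($"Expected {expected} but got {actual}.");
    }
}
```

A complexity of two means two invocations. Each additional decision point in the method adds at least one more call to a list like this.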
The Perils of Cyclomatic Complexity
First and foremost, you might start to wonder if this isn’t all a bit reductionist. Well, yes, it is. Code metrics are inherently reductionist. I have no doubt that you could conceive of a method with a high cyclomatic complexity score that’s also relatively easy to understand, test, and maintain. Reducing a method to its decision points provides interesting but incomplete data about that method.
Secondly, let’s modify one of our existing examples a bit to prove a point. Instead of a guard condition, let’s trap an exception.
    public double Divide(int x, int y)
    {
        try
        {
            return x/y;
        }
        catch(DivideByZeroException)
        {
            return 0;
        }
    }
According to NDepend’s calculation scheme, we now have a method with cyclomatic complexity of two. But Visual Studio has its own cyclomatic complexity calculator. And that calculator scores this method’s cyclomatic complexity as one.
Throughout this post, I’ve alluded to the idea that the programming world has not yet standardized a cyclomatic complexity calculation algorithm for a given programming language, let alone across languages. So you’ll see disagreement on C# cyclomatic complexity alone, to say nothing of applying it to other .NET languages or even complexity of the IL code.
But It Still Matters
Admittedly, these shortcomings of cyclomatic complexity present real challenges to its use. But you should still pay attention.
After all, any metric is inherently reductionist, and many metrics exist in spite of methodological disagreements. Polls calculate approval ratings for politicians, but “approval” is an incredibly vague construct. And the methodology by which this polling is conducted varies. Still, these polls matter. So it goes with code metrics and cyclomatic complexity.
Where the idea of cyclomatic complexity really shines is in aggregate. Are you going to have methods with a high complexity score that aren’t really so bad? Probably. Likewise, will you have awful methods with a low cyclomatic complexity score? Probably. But if you pull away from individual methods and start evaluating your codebase as a whole, average cyclomatic complexity becomes both more accurate and more telling.
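If you already have per-method scores from whatever tool you use, rolling them up takes only a few lines. This is a sketch under assumptions: the ComplexitySummary class is hypothetical, and it expects you to supply a dictionary of method names and scores yourself rather than reading them from any real tool’s API.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class ComplexitySummary
{
    // Additive figure for a type or project: the sum of its method scores.
    public static int Total(IReadOnlyDictionary<string, int> scores) =>
        scores.Values.Sum();

    // Average score per method. Throws InvalidOperationException if scores is empty.
    public static double Average(IReadOnlyDictionary<string, int> scores) =>
        scores.Values.Average();

    // Names of methods whose score exceeds the threshold, sorted alphabetically.
    public static List<string> Above(IReadOnlyDictionary<string, int> scores, int threshold) =>
        scores.Where(pair => pair.Value > threshold)
              .Select(pair => pair.Key)
              .OrderBy(name => name)
              .ToList();
}
```

With made-up scores of 2, 4, 1, and 25, the total is 32 and the average is 8. The average describes the codebase, while the Above list points you at the one method worth reading first.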
So pay attention to your codebase’s cyclomatic complexity. Use it as a means for evaluating your code and communicating its properties to outsiders. Because if you don’t, sooner or later a wave of management or consultants will.