NDepend

Improve your .NET code quality with NDepend

With Code Metrics, Trends are King

Here’s a scene that’s familiar to any software developer.  You sit down to work with the source code of a new team or project for the first time, pull the code from source control, build it, and then notice that there are literally thousands of compiler warnings.  You shudder a little and ask someone on the team about it, and he gives a shrug that is equal parts guilty and “whatcha gonna do?”  You shake your head and vow to get the warning situation under control.

If you’re not a software developer, what’s going on here isn’t terribly hard to understand.  The compiler is the thing that turns source code into a program, and the compiler warning is the compiler’s way of saying, “you’ve done something icky here, but not icky enough to be a show-stopping error.”  If the team’s code has thousands of compiler warnings, there’s a strong likelihood that all is not well with the code base.  But getting that figure down to zero warnings is going to be a serious effort.

As I’ve mentioned before on this blog, I consult on different kinds of software projects, many of which are legacy rescue efforts.  So sitting down to a new (to me) code base and seeing thousands of warnings is commonplace for me.  When I point the runaway warnings out to the team, the observation is generally met with apathetic resignation, and when I point it out to management, the observation is generally met with some degree of shock.  “Well, let’s get it fixed, and why is it like this?!”  (Usually, they’re not shocked by the idea that there are warts, which they already know from the software’s performance and defect counts, but by the idea that such a concrete, easily measured metric exists and is being ignored.)


Let’s Build a Metric: Global and Local Scope

Last time in this series, I began an exploration of how a method might be impacted by the scope of the variables that it uses.  The idea is that it’s easier to comprehend a method that uses more narrowly scoped variables.  If method A uses nothing but local variables, method B uses class level fields, and method C uses global state, time to comprehend will increase from A to B to C.  Let’s introduce this hypothesis into the running experiment for a time to comprehend metric.

But before we can do that, let’s take a look at how those concepts are represented in NDepend.  Let’s simplify the query we used last time and use that as a starting point.  This is a query that simply looks at the fields used somewhere in a method.

Recall the Board constructor we were looking at last time.

The constructor uses two class-level fields: _boardSize and _pieces. If we run this query in the query editor, we see that there are two fields. But, here’s another cool feature of NDepend — you can actually drill into that field to see what the fields in question are. Here’s what that looks like.

[Image: NDependFieldBreakdown]

That’s good progress. We can address the “class field” part of the equation when we start adding this consideration to our metric. But what about global state? Well, let’s see what happens to FieldsUsed if we add a global field. I added a little global variable repository to the code base and then amended the constructor to use it.

Now when I run NDepend’s analysis, I see that there are 3 fields, and it lists the field from GlobalDumpingGround. This is good news, but not unexpected. After all, there’s nothing in the naming of “FieldsUsed” to suggest that it’s only talking about fields for that method’s class. So now we have a small task of figuring out how to distinguish local fields from global variables.

This might seem very simple, but there is a bit of subtlety to it. It isn’t just a matter of “fields from this class versus from others.” After all, I could have an instance field from some other class that’s visible to my method. You don’t see this a lot because standard practice is to have fields be private, but it is possible. This means that we need to put qualifiers on both types of field for our experimental purposes here. Let’s take a look at that.

For local fields, we’re going to look at fields that share a parent type with the method, meaning they belong to the class in question. We also want them not to be publicly visible, to capture the general usage practice of encapsulation. For global fields, we want them to be publicly visible and static, since this is essentially the definition of global scope.
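To make those two qualifiers concrete, here is a minimal sketch in Python rather than CQLinq. It is not the query used in this post; the `Field` record, the `classify` helper, and the field name `SomeGlobalVariable` are all hypothetical stand-ins, and only `_boardSize`, `_pieces`, and `GlobalDumpingGround` come from the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """Hypothetical stand-in for what a code model knows about a field."""
    name: str
    parent_type: str
    is_public: bool
    is_static: bool


def classify(method_parent_type, fields_used):
    """Split the fields a method touches into local and global buckets."""
    # Local: declared on the method's own class and not publicly visible.
    local_fields = [f for f in fields_used
                    if f.parent_type == method_parent_type and not f.is_public]
    # Global: publicly visible and static, whatever class declares it.
    global_fields = [f for f in fields_used if f.is_public and f.is_static]
    return local_fields, global_fields


# Fields touched by the Board constructor (the global's name is made up).
fields_used_by_constructor = [
    Field("_boardSize", "Board", is_public=False, is_static=False),
    Field("_pieces", "Board", is_public=False, is_static=False),
    Field("SomeGlobalVariable", "GlobalDumpingGround",
          is_public=True, is_static=True),
]

local_fields, global_fields = classify("Board", fields_used_by_constructor)
print(len(local_fields), "local,", len(global_fields), "global")
```

Note that a public instance field on some other class lands in neither bucket, which is exactly the subtlety described above.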

When we apply this to the board constructor, the results are what we expect: 2 local fields and 1 global.

[Image: GlobalLocalFields]

That’s perfectly fine and good, but what about properties? Working with a local or global property is probably no different when it comes to time to comprehend, and this query will fail to take properties into account. Let’s add those and see what things look like.

Here’s the revised code:

The constructor now worries about a global variable, a global property, 2 local fields, and a local property. So let’s make an update to the query we’re running.

And, finally, let’s see what NDepend tells us.

[Image: GlobalLocalProperties]

We still have two local fields and one global field. But we now have a local property being accessed here. That’s not a surprise, since we’re initializing the Pieces array with the local property Size. But why are there 2 global properties when the only one we’re dealing with is AGlobalProperty? Well, that’s a tricky bit, and it’s occurring because even though we reason about AGlobalProperty as a single variable, it’s actually two methods as far as the IL is concerned: a getter and a setter. And, we’re using both of them.

And that brings me to an explanation of the CQLinq here. I’m searching for methods, because that’s what properties are under the hood. Specifically, I’m searching for locals as getters and setters that share a parent type with the calling method. And for globals, I’m just looking for getter/setter methods that are publicly visible and static. I don’t care what the parent type is — a global is a global.
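If the "one property, two methods" idea seems odd, a loose analogy in Python may help. This is only an illustration of the concept, not .NET IL: the `Settings` class and `a_property` name are hypothetical, and the `calls` list exists purely to record which accessor ran.

```python
calls = []  # records which accessor ran, for illustration only


class Settings:
    def __init__(self):
        self._value = 0

    @property
    def a_property(self):
        calls.append("getter")
        return self._value

    @a_property.setter
    def a_property(self, value):
        calls.append("setter")
        self._value = value


settings = Settings()
# One line of code that reads and writes the "single variable"...
settings.a_property = settings.a_property + 1

# ...but two distinct functions actually ran.
print(calls)
print(Settings.a_property.fget is Settings.a_property.fset)
```

A query that counts accessor methods, rather than properties, would therefore report two hits for a line like the read-and-write above.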

There are, of course, a lot of refinements and different ways we could spin these queries. When reasoning about a problematic global state, we might say that we don’t care about public statics that are only readable, because the mutability is what creates the problem. When applying this sort of querying to your code base, you may do just that.

But for the purposes of the experiment, what I’m interested in is mainly the effect of these scopes on readability. Now that you’ve seen a primer on the way the concepts can be quantified, stay tuned for another round of experimentation.

Be Careful with Software Metaphors

[Image: skyscraper, software metaphors]

Over the years, there have been any number of popular software metaphors that help people radically misunderstand the realities of software development.  Probably the most famous and persistent one is the idea that making software is similar to building a skyscraper (or to building construction in general).

This led us, as an industry, to approach software by starting with a knowledge worker “architect” who would draw grand schematics to plot every last detail of the software construction.  This was done so that the manual laborers (junior developers) tasked with actual construction could just do repetitive tasks by rote, deferring to a foreman (team lead) should the need for serious thinking arise.  It was important to lay a good foundation with database and framework selection, because once you started there could be no turning back.  Ever.  Should even minor plan changes arise during the course of the project, that would mean a change request, delaying delivery by months.

Software is just like construction, provided you’re terrible at building software.

This metaphor is so prevalent that it transcended conscious thought and crept its way into our subconscious, as evidenced by the “architect” title.  Given the prevalence of agile (or at least iterative) software development, I think you’d be hard pressed to find people that still thought software construction was a great model for building software.  I don’t think you see a lot of thinly sliced buildings, starting with an operational kitchen only and building out from there.

But there are other, more subtle, parallels that pervade the industry and lead to misunderstandings between “the business,” managers, and software developers.

Component Assembly

One such misunderstanding that I see frequently is to equate software components with physical components.  Consider a small application that consists of a login screen, a profile screen, and a screen that allows users to browse and make purchases.  There’s a natural tendency for people not involved with the code — particularly non-technical people — to view these as three isolated components.

The mental model thus becomes one of assembly.  This small project is like setting up a bedroom with disassembled furniture.  There’s a bed, a dresser, and a night stand, so what do you do when you have a tight timeline and plenty of labor?  Naturally, you create a bed team, a dresser team, and a night stand team and task them with working in isolation.  Once the individual components are ready, they can be integrated by positioning them appropriately within the room.  Right?

It’s the perfect plan, but it seems like the assembly teams can’t figure it out.  They keep talking about things you don’t care about, like databases, session management, and something called “common,” whatever that means.  So you tell them to have more meetings and figure it out.  But then they come back and talk about how it isn’t a good idea for two different teams to implement “leg.”  You patiently explain that a bed leg is different than a night stand leg or a dresser leg, and tell them to each make their own, and that you don’t know or care what “DRY” means.

Building isolated, pluggable components is a good idea, it makes business sense, and it allows you to pipeline labor.  Good developers will figure out how to make that sound plan a reality.  Right?

Not so much.  As it turns out, software components and physical components have some key differences.  Replicating a physical component involves construction, fabrication, or 3-D printing, whereas replication of a piece of software involves flipping a bunch of bits on a disk.  Reusing a physical component means hacking it off of a nightstand and taping it to a bed.  Reusing a software component is not destructive this way.  The differences go on, but the point is the same.  Asking a software team to operate as if it were building physical components is a recipe for friction between your mental model and theirs.

[Image: financial debt as a good software metaphor]

Financial Debt

This is liable to raise some eyebrows, because the concept of “technical debt,” is, perhaps, one of the best tools when it comes to facilitating a conversation between developers and people making budget decisions.  Technical debt (more or less) refers to a situation in which developers take a shortcut to get something to market  in a hurry, knowing that they’ll later have to undo their current work and “do it right.”  In other words, they’re paying a premium for short term liquidity, the way someone who incurs financial debt does.  They’ll later have to ‘repay the interest’ by spending more total time getting to the right solution.

Unlike the software metaphor of building construction, which I would argue has been largely damaging to the industry, financial debt as a metaphor has proved quite valuable.  But it has limits, and carrying the metaphor too far can lead to misaligned expectations.

When you take a loan from the bank, there are clear terms to payoff.  Generally this means that you’re constantly paying a small percentage of what you owe as a premium for the outstanding balance of the loan, and that percentage either doesn’t change, or it changes predictably.  With software?  Not so much.

Even assuming that it were straightforward to quantify the “less effort now for more effort later” trade off, you wouldn’t get a constant rate or even a predictably changing rate.  Software is a lot more volatile than that, and the “return rate” will depend on a lot more than what you’ve “borrowed.”

To put it in a way that’s perhaps easier to sink your teeth into, consider the software construct of “global state.”  If you don’t know what this is, think of it as a super power that software developers use to rip holes in the space time continuum, at least as it pertains to the world of software.  Let’s say that your software is a city, and there’s a traffic jam preventing you from shipping.  “No problem,” your developer says, “I can take care of that if you don’t mind some technical debt.”  She then proceeds to rip a wormhole into the middle of a busy street, diverting all traffic into what looks like some kind of desert somewhere.  Traffic problem solved.  Ship it.

You’ve taken out one single loan in this universe to get traffic to a manageable level.  Granted, that particular road is ruined by the wormhole, so you’ll need to build another one at some point, and that’s the interest you’ve agreed to pay.  That’s what you’re planning on doing when you get around to it.  And that all seems fine until you start getting weird reports of camels blocking traffic miles away, and scorpions stinging people going to work on traffic lights.

It turns out that ripping holes in the very fabric of your application has weird, unpredictable consequences that require repayment of debts you never planned for (and perhaps don’t understand the source of).  The lesson is that a decision to let (or encourage) developers to take shortcuts and make hacks can put your code in a degenerative state that neither of you is really prepared for.  If you aren’t careful, pressure to get them to ship will be more like navigating a minefield than shopping around for a mortgage.

Be Careful with Software Metaphors

It’s legitimately hard to mentally model software, particularly if you’ve never been a developer.  We live in a very tactile world and we use vivid mental models as mnemonics to aid our understanding.  Software is very abstract, conceptual, and precise in nature, and this makes it inordinately difficult to model in the way to which we’re accustomed.  We’re bad at bridging these worlds, and it’s perfectly understandable that we’re bad at it.

Frankly, the only way to have a good mental model of software is through practice, and the realization that any analogies we use are transitory and incomplete at best.  Holding too close to any particular software metaphor is liable to trip you up in your decision making, so be very wary in conceiving of software development as being like anything other than… software development.

The Most Important Code Metrics You’ve Never Heard Of

Oh, how I hope you don’t measure developer productivity by lines of code. As Bill Gates once ably put it, “measuring software productivity by lines of code is like measuring progress on an airplane by how much it weighs.”  No doubt, you have other, better reasoned code metrics that you capture for visible progress and quality barometers.  Automated test coverage is popular (though be careful with that one).  Counts of or trends in defect reduction are another one.  And of course, in our modern, agile world, sprint velocity is ubiquitous.

But today, I’d like to venture off the beaten path a bit and take you through some metrics that might be unfamiliar to you, particularly if you’re no longer technical (or weren’t ever).  But don’t leave if that describes you — I’ll help you understand the significance of these metrics, even if you won’t necessarily understand all of the nitty-gritty details.

Perhaps the most significant factor here is that the code metrics I’ll go through can be tied, relatively easily, to stakeholder value in projects.  In other words, I won’t just tell you the significance of the metrics in terms of what they say about the code.  I’ll also describe what they mean for people invested in the project’s outcome.

Type Rank

It’s possible that you’ve heard of the concept of Page Rank.  If you haven’t, page rank was, for a long time, the method by which Google determined which sites on the internet were most important.  This should make intuitive sense on some level.  Amazon has a high page rank: if it went down, millions of lives would be disrupted, stocks would plummet, and all sorts of chaos would ensue.  The blog you created that one time and totally meant to add to over the years has a low page rank: no one, yourself included, would notice if it stopped working.

It turns out that you can actually reason about pieces of code in a very similar way.  Some bits of code in the code base are extremely important to the system, with inbound and outbound dependencies.  Others exist at the very periphery or are even completely useless (see the section below on dead code).  Not all code is created equal.  This scheme for ranking code by importance is called “Type Rank” (at least at the level of type granularity; methods can also be ranked).

You can use Type Rank to create a release riskiness score.  All you’d really need to do is have a build that tabulated which types had been modified and what their type rank was, and this would create a composite index of release riskiness.  Each time you were gearing up for deployment, you could look at the score.  If it were higher than normal, you’d want to budget extra time and money for additional testing efforts and issue remediation strategies.
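As a rough sketch of what such a build step might compute, here is one possible composite index in Python: simply the sum of the ranks of the modified types. The `release_risk` function, the type names, and the rank values are all made up for illustration, and a real score could weight things differently.

```python
def release_risk(type_ranks, modified_types):
    """Sum the ranks of the types touched in this release.

    type_ranks: mapping of type name -> rank (assumed to be computed elsewhere).
    modified_types: names of the types changed since the last release.
    """
    return sum(type_ranks[name] for name in modified_types)


# Hypothetical ranks: a central type scores high, a peripheral one scores low.
type_ranks = {"Board": 4.0, "Piece": 2.5, "AboutDialog": 0.5}

quiet_release = release_risk(type_ranks, ["AboutDialog"])
risky_release = release_risk(type_ranks, ["Board", "Piece"])
print(quiet_release, risky_release)
```

Tracking this number release over release is what makes it useful: a score well above the usual range is the cue to budget extra testing.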

Cohesion

Cohesion of modules in a code base can loosely be described as “how well is the code base organized?”  To put it a bit more concretely, cohesion is the idea that things with common interest are grouped together while unrelated things are not.  A cohesive house would have specialized rooms for certain purposes: food preparation, food consumption, family time, sleeping, etc.  A non-cohesive house would have elements of all of those things strewn about all over the house, resulting in a scenario where a broken refrigerator fan might mean you couldn’t sleep or work at your desk due to noise.

Keeping track of the aggregate cohesiveness score of a codebase will give you insight into how likely your team is to look ridiculous in the face of an issue.  Code bases with low cohesion are ones in which unrelated functionality is bolted together inappropriately, and this sort of thing results in really, really odd looking bugs that can erode your credibility.

Imagine speaking on your team’s behalf and explaining a bug that resulted in a significant amount of client data being clobbered.  When pressed for the root cause, you had to look the person asking directly in the eye and say, “well, that happened because we changed the font of the labels on the login page.”

You would sound ridiculous.  You’d know it.  The person you were talking to would know it.  And you’d find your credibility quickly evaporating.  Keeping track of cohesion lets you keep track of the likelihood of something like that.

Dependency Cycles

So far, I’ve talked about managing risk as it pertains to defects: the risk of encountering them on release, and the risk of encountering weird or embarrassing ones.  I’m going to switch gears, now, and talk about the risk of being caught flat-footed, unable to respond to a changing environment or a critical business need.

Dependency cycles in your code base represent a form of inappropriate coupling.  These are situations where two or more things are mutually dependent in an architectural world where it is far better for dependencies to flow one way.  As a silly but memorable example, consider the situation of charging your phone, where your phone depends on your house’s electrical system to be charged.  Would you hire an electrician to come in and create a situation where your house’s electricity depended on the presence of your charging phone?

All too often, we do this in code, and it creates situations as ludicrous as the phone-electrical example would.  When the business asks, “how hard would it be to use a different logging framework,” you don’t want the answer to be, “we’d basically have to rewrite everything from scratch.”  That makes as much sense as not being able to take your phone with you anywhere because your appliances would stop working.

So, keep an eye out for dependency cycles.  These are the early warning light indicators that you’re heading for something like this.
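For the curious, spotting a cycle is a standard graph problem. The sketch below is a plain depth-first search in Python over a hand-written dependency map; it is not how any particular tool does it, and `find_cycle` plus the `Phone` and `HouseWiring` names are hypothetical, borrowed from the analogy above.

```python
def find_cycle(dependencies):
    """Return one dependency cycle as a list of nodes, or None if there is none.

    dependencies: mapping of component -> list of components it depends on.
    """
    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = {node: UNVISITED for node in dependencies}
    path = []  # components on the current depth-first path

    def visit(node):
        state[node] = IN_PROGRESS
        path.append(node)
        for dep in dependencies.get(node, ()):
            dep_state = state.get(dep, UNVISITED)
            if dep_state == IN_PROGRESS:
                # We walked back into our own path: that is a cycle.
                return path[path.index(dep):] + [dep]
            if dep_state == UNVISITED:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        state[node] = DONE
        return None

    for node in list(dependencies):
        if state[node] == UNVISITED:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


sane_house = {"Phone": ["HouseWiring"], "HouseWiring": []}
silly_house = {"Phone": ["HouseWiring"], "HouseWiring": ["Phone"]}

print(find_cycle(sane_house))
print(find_cycle(silly_house))
```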

Dead Code

One last thing to keep an eye out for is dead code.  Dead code is code that can never possibly be called during the running application’s lifecycle.  It just sits in your codebase taking up space to no good end.

That may sound benign, but every line of code in your code base carries a small, cognitive maintenance weight.  The more code there is, the more results come back in text searches of the code base, the more files there are to lose and confuse developers, and the more general friction is encountered when working with the system.  This has a very real cost in the labor required to maintain the code.
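One simple way to think about finding dead code is reachability: start from the entry points, follow every call, and whatever you never reach is a candidate. Here is a minimal Python sketch of that idea. The `find_dead_code` helper and the method names are hypothetical, and the result is only an approximation, since calls made through reflection or external callers would not show up in a static call graph.

```python
def find_dead_code(call_graph, entry_points):
    """Return the methods that cannot be reached from any entry point.

    call_graph: mapping of method -> list of methods it calls.
    entry_points: methods the outside world can invoke (e.g. Main).
    """
    reachable = set()
    stack = list(entry_points)
    while stack:
        method = stack.pop()
        if method in reachable:
            continue
        reachable.add(method)
        stack.extend(call_graph.get(method, ()))
    return set(call_graph) - reachable


# Hypothetical call graph: nothing reachable from Main ever calls the old exporter.
call_graph = {
    "Main": ["LoadBoard", "Render"],
    "LoadBoard": ["ParseFile"],
    "ParseFile": [],
    "Render": [],
    "OldExporter": ["OldExporterHelper"],
    "OldExporterHelper": [],
}

dead = find_dead_code(call_graph, ["Main"])
print(sorted(dead))
```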

Use Code Metrics Wisely

These are metrics about which fewer people know, so the industry isn’t rife with stories about people gaming them, the way it is with something like unit test coverage.  But that doesn’t mean they can’t be gamed.  For instance, it’s possible to have a nightmarish code base without any actual dead code — perversely, dead code could be eliminated by finding everything useless in the code base and implementing calls to it.

The code metrics I’ve outlined today, if you make them big and visible to all, should serve as a conversation starter.  Why did we introduce a dependency cycle?  Should we be concerned about the lack of cohesion in modules?  Use them in this fashion, and your group can save real money and produce better output.  Use them in the wrong fashion, and they’ll be just another ineffective management bludgeon straight out of a Dilbert comic.

NDepend updated to Version 6.2

NDepend version 6.2 has just been released. We have fixed over 20 bugs, including a blocking one involving the Visual Studio 2015 Update 1 Git controls.

More specifically, the new Visual Studio 2015 Update 1 Git controls in the Visual Studio status bar were interacting with the NDepend Visual Studio extension’s status bar control, which caused the VS UI to freeze. Fortunately, the Visual Studio team warned partners (VSIP) a few weeks ago that they were adding controls to the status bar. The issue came from synchronous use of the WPF dispatcher to implement the NDepend progress & status circle. Invoking the dispatcher asynchronously fixed the issue.

[Image: GitStatusBar]

We also stumbled on an unusual issue caused by an unfixed Windows bug. When working with a DataGridView with many rows (1000+), we can hit an unmanaged StackOverflowException that crashes the process. The Windows bug is explained here http://stackoverflow.com/a/14716720/27194 and, as far as we know, it has not been fixed. The problem occurs only when the Windows process TabTip.exe (“Touch Keyboard and Handwriting Panel Service”) is running, and the Stack Overflow answer explains that the only way to prevent it is to disable this touch keyboard service. We took the blunt approach: when NDepend starts, it now tries to kill this process. Most of the time this works, even if the Windows user is not an administrator. If this rough fix causes you any inconvenience, please let us know.

Apart from these two fixes, many other bugs were fixed and some improvements were added (see the complete list here). The bugs fixed also include some incorrect results that occurred because the way Roslyn emits IL has changed significantly in some situations, and NDepend relies heavily on IL code analysis.

Enjoy!

Let’s Build a Metric: Using CQLinq to Reason about Application State

I’ve been letting the experiments run for a bit before posting results so as to give all participants enough time to submit, if they so choose.  So, I’ll refresh everyone’s memory a bit here.  Last time, I published a study of how long it took, in self-reported seconds, for readers to comprehend a series of methods that varied by lines of code.  (Gist here).  The result was that comprehension time appears to vary roughly quadratically with the number of logical lines of code.  The results of the next study are now ready, and they’re interesting!

Off the cuff, I fully expected cyclomatic complexity to drive up comprehension time faster than the number of lines of code.  It turns out, however, that this isn’t the case.  Here is a graph of the results of people’s time to comprehend code that varied only by cyclomatic complexity.  (Gist here).

[Image: SecondsVsCyclomaticComplexity]

If you look at the shape of this graph, the increase is slightly more aggressive than linear, but not nearly as aggressive as the increase that comes with an increase in lines of code.  When you account for the fact that a control flow statement is also a line of code, it actually appears that conditionals are easier to comprehend than the mathematical statements from the first experiment.

Because of this finding, I’m going to ignore cyclomatic complexity for the time being in our rough cut time to comprehend metrics.  I’ll assume that control flow statements impact time to comprehend as lines of code more than as conditional branching scenarios.  Perhaps this makes sense, too, since understanding all of the branching of a method is probably an easier task than testing all paths through it.

As an aside, one of the things I love about NDepend is that it lets me be relatively scientific about the approach to code.  I constantly have questions about the character and makeup of code, and NDepend provides a great framework for getting answers quickly.  I’ve actually parlayed this into a nice component of my consulting work — doing professional assessments of code bases and looking for gaps that can be assessed.

Going back to our in-progress metric, it’s going to be important to start reasoning about other factors that pertain to methods.  Here are a couple of the original hypotheses from earlier in the series that we could explore next.

  • Understanding methods that refer to class fields takes longer than understanding purely functional methods.
  • Time to comprehend is dramatically increased by reference to global variables/state.

If I turn a critical eye to these predictions, there are two key components: scope and popularity.  By scope, I mean, “how close to the method is this thing defined?”  Is it a local variable, defined right there in the method?  Is it a class field that I have to scroll up to find a definition of?  Is it defined in some other file somewhere (or even some other assembly)?  One would assume that having to pause reading the method, navigate to some other file, open it, and read to find the definition of a variable would mean a sharp spike in time to comprehend versus an integer declared on the first line of the method.

And, by popularity, I mean, how hard is it to reason about the state of the member in question?  If you have a class with a field and two methods that use it, it’s pretty easy to understand the relationship and what the field’s value is likely to be.  If we’re talking about a global variable, then it quickly becomes almost unknowable what the thing might be and when.  You have to suck the entirety of the application’s behavior into your head to understand all the things that might happen in your method.

I’m not going to boil that ocean here, but I am going to introduce a few lesser known bits of awesomeness that come along for the ride in CQLinq.  Take a look at the following CQLinq.

If your reaction is anything like mine the first time I encountered this, you’re probably thinking, “you can do THAT?!” Yep, you sure can. Here’s what it looks like against a specific method in my Chess TDD code base.

[Image: MethodFieldsAndParametersResults]

The constructor highlighted above is shown here:

[Image: BoardConstructor]

As you can see, it has one parameter, uses two fields, and assigns both of those fields.
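To give a feel for what counting these things involves, here is a small Python sketch that derives the same three numbers for a Python stand-in of the constructor, using the standard `ast` module. It is an analogy only, not CQLinq and not the Chess TDD code: the `SOURCE` snippet and the `summarize_method` helper are hypothetical.

```python
import ast

# A hypothetical Python stand-in for the constructor discussed above.
SOURCE = '''
class Board:
    def __init__(self, board_size):
        self._board_size = board_size
        self._pieces = [[None] * board_size for _ in range(board_size)]
'''


def summarize_method(func):
    """Return (parameter count, fields used, fields assigned) for a method."""
    params = [a.arg for a in func.args.args if a.arg != "self"]
    used, assigned = set(), set()
    for node in ast.walk(func):
        # A "field" here means any attribute accessed through self.
        if (isinstance(node, ast.Attribute)
                and isinstance(node.value, ast.Name)
                and node.value.id == "self"):
            used.add(node.attr)
            if isinstance(node.ctx, ast.Store):
                assigned.add(node.attr)
    return len(params), sorted(used), sorted(assigned)


tree = ast.parse(SOURCE)
constructor = tree.body[0].body[0]  # first method of the first class
summary = summarize_method(constructor)
print(summary)
```

As with the real constructor, the stand-in has one parameter, uses two fields, and assigns both of them.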

When you simply browse through the out of the box metrics that come with NDepend, these are not the kind of things you notice immediately.  The things toward which most people gravitate are obvious metrics, like method size, cyclomatic complexity, and test coverage.  But, under the hood, in the world of CQLinq, there are so many more questions that you can answer about a code base.

Stay tuned for next time, as we start exploring them in more detail and looking at how we can validate potential hypotheses about impact on time to comprehend.

And if you want to take part in this ongoing experiment, click below to sign up.




Join the Experiment