NDepend

Improve your .NET code quality with NDepend

Let’s Build a Metric 3: Compositeness

Last time, I talked through a little bit of housekeeping on the way to creating a metric that would be, uniquely, ours.  Never mind the fact that, under the hood, the metric is still lines of code.  It now has a promising name and is expressed in the units we want.  And, I think that’s okay.  There is a lot of functionality and ground to cover in NDepend, so a steady, measurable pace makes sense.

It’s now time to start thinking about the nature of the metric to be created here, which is essentially a measure of time.  That’s pretty ambitious because it contains two components: defining a composite metric (recall that this is a mathematical transform on measured properties) and then tying it to an observed outcome via experimentation.  In this series, I’m not assuming that anyone reading has much advanced knowledge about static analysis and metrics, so let’s get you to the point where you grok a composite metric.  We’ll tackle the experimentation a little later.

A Look at a Composite Metric

I could walk you through creating a second query under the “My Metrics” group that we created, but I also want this to be an opportunity to explore NDepend’s rich feature set.  So instead of that, navigate to NDepend->Metric->Code Quality->Types with Poor Cohesion.

[Screenshot: Metric3]

When you do that, you’re going to see a metric much more complicated than the one we defined in the “Queries and Rules Edit” window.  Here’s the code for it, comments and all.

There’s a good bit to process here.  The CQLinq code here is inspecting Types and providing data on them, along with a warning if anything matches.  “Type” here means any class or struct in your code base (well, okay, in my code base).  And, what does matching mean?  Well, looking at the compound conditional statement, a type matches if it has “LCOM” greater than 0.8 or “LCOMHS” greater than 0.95, and it also has more than 10 fields and more than 10 methods.  So, to recap, poor cohesion means that there are a good number of fields, a good number of methods, and… something… for these acronyms.
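The query listing itself is not reproduced here, so the following is only a rough stand-in for the matching condition described above, written as ordinary C# LINQ rather than CQLinq.  The `TypeStats` class, its property names, and the sample data are all hypothetical; none of this is NDepend's API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for per-type data; not an NDepend type.
public sealed class TypeStats
{
    public string Name { get; set; } = "";
    public double Lcom { get; set; }
    public double LcomHs { get; set; }
    public int FieldCount { get; set; }
    public int MethodCount { get; set; }
}

public static class PoorCohesion
{
    // Mirrors the described condition: a high LCOM or LCOM HS score,
    // combined with more than 10 fields and more than 10 methods.
    public static bool IsPoorlyCohesive(TypeStats t)
    {
        return (t.Lcom > 0.8 || t.LcomHs > 0.95)
            && t.FieldCount > 10
            && t.MethodCount > 10;
    }

    // Names of the matching types, worst LCOM first.
    public static List<string> FindMatches(IEnumerable<TypeStats> types)
    {
        return (from t in types
                where IsPoorlyCohesive(t)
                orderby t.Lcom descending
                select t.Name).ToList();
    }

    public static void Main()
    {
        // Made-up numbers, purely for illustration.
        var types = new[]
        {
            new TypeStats { Name = "BoardRenderer", Lcom = 0.85, LcomHs = 0.90, FieldCount = 14, MethodCount = 22 },
            new TypeStats { Name = "GameState", Lcom = 0.60, LcomHs = 0.97, FieldCount = 12, MethodCount = 15 },
            new TypeStats { Name = "Recipe", Lcom = 0.0, LcomHs = 0.0, FieldCount = 1, MethodCount = 3 },
            new TypeStats { Name = "Settings", Lcom = 0.90, LcomHs = 0.99, FieldCount = 10, MethodCount = 30 }
        };

        foreach (var name in FindMatches(types))
        {
            Console.WriteLine(name);
        }
    }
}
```

Note that "Settings" does not match even with high scores, because it has exactly 10 fields rather than more than 10.  The size thresholds keep small types from being flagged no matter how they score.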

LCOM stands for “Lack [of] Cohesion of Methods.”  If you look up cohesion in the dictionary, you’ll find the second definition particularly suited for this conversation: “cohering or tending to cohere; well-integrated; unified.”  We’d say that a type is “cohesive” if it is unified in purpose or, to borrow from the annals of clean code, if it conforms to the single responsibility principle.  To get a little more concrete, consider an extremely cohesive class.

This class is extremely cohesive.  It has one field and three methods, and every method in the class operates on the field.  Type cohesion might be described as “how close do you get to every method operating on every field?”
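The class listing is not reproduced here.  As a minimal sketch of a class with that shape (one field, three methods, every method touching the field), something like the following would fit.  The member names are illustrative guesses, not the original code.

```csharp
using System.Collections.Generic;

// Hypothetical example of a highly cohesive class: one field, and every method uses it.
public class Recipe
{
    private readonly List<string> _ingredients = new List<string>();

    // Writes to the field.
    public void AddIngredient(string ingredient)
    {
        _ingredients.Add(ingredient);
    }

    // Resets the field.
    public void StartOver()
    {
        _ingredients.Clear();
    }

    // Reads the field.
    public IReadOnlyList<string> GetIngredients()
    {
        return _ingredients.AsReadOnly();
    }
}
```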

Now, here’s the crux of our challenge in defining a composite metric: how do you take that anecdotal, qualitative description, and put a number to it?  How do you get from “wow, Recipe is pretty cohesive” to 0.0?

Well, this is where the mathematical transform part comes in.  Here is how NDepend calculates Lack of Cohesion of Methods (LCOM).

  • LCOM = 1 – (SUM(MF)/(M*F))

Where:

  • M is the number of methods in the class (both static and instance methods are counted, including constructors, property getters/setters, and event add/remove methods).
  • F is the number of instance fields in the class.
  • MF is the number of methods of the class accessing a particular instance field.
  • Sum(MF) is the sum of MF over all instance fields of the class.
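To make the definitions above concrete, here is a small sketch that applies the formula to a hand-built table of which method accesses which field.  The `Cohesion` class and the access matrix are assumptions made for illustration; NDepend derives this information from the compiled code, and how it treats a class with no methods or no fields is not covered here (this sketch simply throws).

```csharp
using System;

public static class Cohesion
{
    // accesses[m, f] is true when method m accesses instance field f.
    // Rows are methods (M of them), columns are fields (F of them).
    public static double Lcom(bool[,] accesses)
    {
        int methodCount = accesses.GetLength(0);
        int fieldCount = accesses.GetLength(1);
        if (methodCount == 0 || fieldCount == 0)
        {
            // Assumption of this sketch: treat the empty case as an error.
            throw new ArgumentException("LCOM needs at least one method and one field.");
        }

        // Sum(MF): every 'true' cell is one method accessing one field.
        int sumMf = 0;
        foreach (bool access in accesses)
        {
            if (access)
            {
                sumMf++;
            }
        }

        return 1.0 - (double)sumMf / (methodCount * fieldCount);
    }

    public static void Main()
    {
        // Three methods, one field, every method touches it: LCOM = 1 - 3/(3*1) = 0.
        var cohesive = new bool[,] { { true }, { true }, { true } };

        // Two methods, two fields, each method touches only one field: LCOM = 1 - 2/(2*2) = 0.5.
        var split = new bool[,] { { true, false }, { false, true } };

        Console.WriteLine(Lcom(cohesive));
        Console.WriteLine(Lcom(split));
    }
}
```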

Quantifying the Qualitative

Whoa.  Okay, let’s walk before we run.  It’ll be helpful to work backward from an already proposed formula.  What do we know by looking at this?

Well, at the very highest level, we’re talking about fields and methods in the class, and how they interact.  It’s easy enough to count the number of methods and fields; so far, so good.  MF is, for a given field, the number of methods in the class that access that field, which means that Sum(MF) is the aggregate for all fields.  Since no field can be accessed by more than M methods, Sum(MF) will be less than or equal to M*F.  They’re only equal in the case where every method accesses every field.

Thus the term SUM(MF)/(M*F) will range from 0 to 1, which means that the value of this metric ranges from 1 to 0.  1 is thus a perfectly non-cohesive class and 0 is a perfectly cohesive class.  Notice that I described Recipe as “0.0”?  If you run this metric on that class, you’ll see that it scores 0 for a “perfect cohesion score.”  And so, the goal here becomes obvious.  The creator of this metric wanted to come up with a way to describe cohesion normalized between 0 and 1 with a concept of bounded, “perfect” endpoints.
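The bounds argument above can be written out in one line, assuming a class with at least one method and one field.  The numbers in the second line belong to a made-up class (4 methods, 2 fields, with the fields accessed by 3 and 2 methods respectively), not to any class from the code base.

```latex
0 \le \sum MF \le M \cdot F
\;\Longrightarrow\;
0 \le \frac{\sum MF}{M \cdot F} \le 1
\;\Longrightarrow\;
0 \le \mathrm{LCOM} = 1 - \frac{\sum MF}{M \cdot F} \le 1

% Hypothetical class: M = 4, F = 2, MF values 3 and 2
\mathrm{LCOM} = 1 - \frac{3 + 2}{4 \cdot 2} = 1 - \frac{5}{8} = 0.375
```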

And this is the essence of compositeness, though to build such a metric, you create the transform rather than working backward to deduce the reasoning.  You start out with a qualitative evaluation and then think about a hypothesis for how you want to represent that data.  Is there a minimum?  A maximum?  Should the curve between them be linear?  Exponential?  Logarithmic?

It’s not a trivial line of thinking, by any stretch.  As it turns out, there isn’t even agreement, per se, on the best way to describe type cohesion.  You’ll notice that NDepend supports a secondary metric (that I intentionally omitted from the formula definition above for simplicity) for cohesion called LCOM HS, which stands for the “Henderson-Sellers Lack of Cohesion of Methods.”  This is a slightly different algorithm for computing cohesion.  And man, if experts in the field can’t agree on the ideal metric, you can see that this is a tall order.  But hey, I didn’t say it’d be easy, just fun.

So, having seen a little bit more of NDepend and established a foundation for understanding a bit of the theory behind composite code metrics, I’ll leave off until next time, when I’ll dig a bit into how we can start reasoning about our own composite metric for “time to understand a method.”  Stay tuned because that will get interesting — I’ll go through a little bit more of NDepend and even get into Newtonian Mechanics a little.  But not too far.  I promise.


Let’s Build a Metric 2: Getting the Units Right

Last time in this series, there was definitely some housekeeping to accomplish.  I laid the groundwork for the series by talking a bit of theory — what are code metrics and what is static analysis?  I also knocked out some more practical logistics such as taking you through how to get the code base I’m using as well as how to create and attach an NDepend project to it.  The culmination was the creation of our very own, tiny, and relatively useless code metric (useless since NDepend already provides the metric we ‘created’).

So, it seems like a good next step is to make the metric useful, but there may be an interim step in there of at least making it original.  Let’s do that before we get to “useful.”  But even before that, I’m going to do a bit of housekeeping and tell you about the NDepend feature of managing the queries and rules in the Queries and Rules Explorer.

Group Housekeeping

Last time around, I created the “My Rules” group under “Samples of Custom rules.”  This was nice for illustrative purposes, but it’s not where I actually want it.  I want to bring it up on the same level as that, right after “Defining JustMyCode”.  To do that, you can simply drag it.  Dragging it will result in a black bar indicating where it will wind up:

[Screenshot: Metrics2 (1)]

Once I’ve finished, it’ll be parked where I want it, as shown below.  In the future, you can use this to move your groups around as you please.  It’s pretty intuitive.

[Screenshot: Metrics2 (2)]

Now, have you noticed that this is called “My Rules”?  That’s a little awkward, since we’re actually defining a query.  We’re ultimately going to be trying to figure out how long it takes to comprehend a method, and that’s a query.  A rule would be “it should take less than 30 seconds to comprehend a method.”  We may play with making rules around this metric later, but for now, let’s give the group a more appropriate name.  NDepend is integrated with the expected keyboard shortcuts, so you can fire away with the “rename” shortcut, which is F2.  Do that while “My Rules” is highlighted, and you’ll get the standard rename behavior.

[Screenshot: Metrics2 (3)]

Let’s call it “My Metrics.”

[Screenshot: Metrics2 (4)]

Getting Units Right for Our Metric

Now, with various bits of housekeeping out of the way, let’s change the metric “Lines of Code” to be something else: something original.  Open it in the Queries and Rules Editor, and change its name by editing the “Name” tag in the XML doc comment above the query.  Let’s call it “Seconds to Comprehend a Method.”  It’s important in these queries to specify the units in the name of the metric so that it’s clear upon inspection what, exactly, is being measured.  The concept of units does not exist, per se, in NDepend, so queries out of the box have titles like “# Source Files” and “# IL Instructions” to indicate that the unit is a count.

[Screenshot: Metrics2 (5)]

Now the title is looking good, but there’s a problem in the actual display of the metric on the side.  Take a look:

[Screenshot: Metrics2 (6)]

It seems like a metric called “seconds to comprehend a method” probably shouldn’t internally report in terms of “# of lines of code (LOC)”.  We probably want this to be “seconds.”  You can accomplish this by relying on the fact that, under the covers, the query uses the C# construct of an “anonymous type.”

In C#, an anonymous type is one that is defined even as it is instantiated, having its property names, types, and ordering declared in situ.  In an actual C# method, we might do something like:
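The listing that belonged here is missing, so what follows is a minimal sketch of an anonymous type in a regular C# method.  The class name, property names, and values are made up for illustration.

```csharp
using System;

public static class AnonymousTypeDemo
{
    public static string Describe()
    {
        // The type has no name of its own; its properties (Name, LinesOfCode)
        // and their types are declared right here, at the point of instantiation.
        var method = new { Name = "MovePiece", LinesOfCode = 12 };

        // The compiler still knows the property names and types.
        return method.Name + ": " + method.LinesOfCode;
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```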

That’s not how things work in NDepend, however.  For these queries and rules, we aren’t supplying declarations, per se, but rather returning an expression that the engine knows how to turn into graphical information at the bottom.  In our example, currently, we’re returning an anonymous type consisting of the method itself as the first property, followed by its name and then its number of lines of code.  This is translated in graphical format to what you see underneath the query editor automatically, with the values displayed as data and the property names as the headers.  So, to change the header, we need to change the property name.  Change this:

to this:

and observe the new display results.
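The before and after snippets did not survive here.  As a loose analogy in plain C# (this is not how NDepend's engine is implemented, and the property names `NbLinesOfCode` and `Seconds` are assumptions), the sketch below shows that the property names of an anonymous type are ordinary metadata that a tool can read and use as column headers, so renaming a property renames the header.

```csharp
using System;
using System.Linq;

public static class HeaderDemo
{
    // Reads the public property names of any object via reflection.
    // Sorted so the result does not depend on reflection ordering.
    public static string[] PropertyNames(object row)
    {
        return row.GetType()
                  .GetProperties()
                  .Select(p => p.Name)
                  .OrderBy(n => n, StringComparer.Ordinal)
                  .ToArray();
    }

    public static void Main()
    {
        // Same data, but the second property carries a different name.
        var before = new { Method = "MovePiece", NbLinesOfCode = 12 };
        var after = new { Method = "MovePiece", Seconds = 12 };

        Console.WriteLine(string.Join(", ", PropertyNames(before)));
        Console.WriteLine(string.Join(", ", PropertyNames(after)));
    }
}
```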

[Screenshot: Metrics2 (7)]

Fluent versus Query Linq

Finally, there’s one more thing that I’m going to do here before we wrap up this post.  My strong preference over the years has been to use the fluent Linq syntax, rather than the query syntax, and based on what I see of the code written by others, I’m not alone.  So, let’s do a conversion here before we get too much further.

This code can be expressed, equivalently, as
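Neither version of the query is reproduced here, so the sketch below illustrates the same kind of conversion on ordinary in-memory objects instead.  `MethodStats` and its sample values are hypothetical stand-ins; the point is only that the query syntax and the fluent syntax produce the same result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a method and its line count; not an NDepend type.
public sealed class MethodStats
{
    public string Name { get; set; } = "";
    public int LinesOfCode { get; set; }
}

public static class LinqSyntaxDemo
{
    // Query syntax: reads like a declarative statement.
    public static List<string> WithQuerySyntax(IEnumerable<MethodStats> methods)
    {
        var rows = from m in methods
                   select new { m.Name, Seconds = m.LinesOfCode };
        return rows.Select(r => r.Name + "=" + r.Seconds).ToList();
    }

    // Fluent syntax: the same projection as a chained method call.
    public static List<string> WithFluentSyntax(IEnumerable<MethodStats> methods)
    {
        var rows = methods.Select(m => new { m.Name, Seconds = m.LinesOfCode });
        return rows.Select(r => r.Name + "=" + r.Seconds).ToList();
    }

    public static void Main()
    {
        var methods = new[]
        {
            new MethodStats { Name = "MovePiece", LinesOfCode = 12 },
            new MethodStats { Name = "IsLegalMove", LinesOfCode = 7 }
        };

        Console.WriteLine(string.Join("; ", WithQuerySyntax(methods)));
        Console.WriteLine(string.Join("; ", WithFluentSyntax(methods)));
    }
}
```

The compiler translates the query form into the same `Select` call, so the choice between the two is a matter of readability.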

Wrapping Up

So far, we’re taking it a little slow on making progress toward our killer, composite metric, but that’s somewhat by design.  These posts split purpose between building a cool metric, and walking you through NDepend in some detail.  In terms of the former goal, we went from having a metric that was just a repeat of one already supplied out of the box to one that is our own — at least at a casual inspection.  Next time around, we’ll actually start doing something that doesn’t just assume it takes one second to comprehend each line of a method.

From a “getting to know NDepend” point of view, we took a walk through creating, renaming, and moving query/rule groups around, and we started to get into CQLinq a little bit in terms of practical application and theory.  That’s a topic that will be visited more throughout the “Let’s Build a Metric” series, and one that we’ll pick right up with in the next post.

 


Let’s Build a Metric 1: What’s in a Metric?

A while back, I made a post on my blog announcing the commencement of a series on building a better, composite code metric.  Well, it’s time to get started!

But before I dive in headfirst, I think it’s important to set the scene with some logistics and some supporting theory.  Logistically, let’s get specific about the tools and code that I’m going to be using in this series.  To follow along, get yourself a copy of NDepend v6, which you can download here.  You can try to follow along if you have an older version of the tool, but caveat emptor, as I’ll be using the latest bits.  Secondly, I’m going to use a codebase of mine as a guinea pig for this development.  This codebase is Chess TDD on GitHub and it’s what I use for my Chess TDD series on my blog and YouTube.  This gives us a controlled codebase, but one that is both active and non-trivial.

What are Static Analysis and Code Metrics, Anyway?

Now, onto the supporting theory.  Before we can build meaningful code metrics, it’s important to understand what static analysis is and what code metrics are.  Static analysis comes in many shapes and sizes.  When you simply inspect your code and reason about what it will do, you are performing static analysis.  When you submit your code to a peer to have her review, she does the same thing.  

Like you and your peer, compilers perform static analysis on your code, though, obviously, they do so in an automated fashion.  They check the code for syntax errors or linking errors that would guarantee failures, and they will also provide warnings about potential problems such as unreachable code or an assignment where a comparison was probably intended.  Products also exist that will check your source code for certain characteristics and stylistic guideline conformance rather than worrying about what happens at runtime.  In managed languages, products exist that will analyze your compiled IL or byte code and check for certain characteristics.  The common thread here is that all of these examples of static analysis involve analyzing your code without actually executing it.

The byproduct of this sort of analysis is generally some code metrics.  MSDN defines code metrics as, “a set of software measures that provide developers better insight into the code they are developing.”  I would add to that definition that code metrics are objective, observable properties of code.  The simplest example on that page is “lines of code” which is just, literally, how many lines of code there are in a given method or class.  Automated analysis tools are ideally suited for providing metrics quickly to developers.  NDepend provides a lot of them, as you can see here.

It’s important to make one last distinction before we move on: simple versus composite metrics.  An example of a simple metric is the aforementioned lines of code.  Look at a method, count the lines of code, and you’ve got the metric’s value.  A composite metric, on the other hand, is a higher level metric that you obtain by performing some kind of mathematical transform on one or more simple metrics.  As a straightforward example, let’s say that instead of just counting lines of code, you defined a metric that was “lines of code per method parameter.”  This would be a metric about methods in your code base that was computed by taking the number of lines of code and dividing them by the number of parameters to the method.
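As a minimal sketch of that made-up composite metric, the transform is just a division of one simple metric by another.  The class and method names are hypothetical, and returning `null` when a method has no parameters is only one possible way to handle the undefined case.

```csharp
using System;

public static class CompositeMetric
{
    // "Lines of code per method parameter": a transform on two simple metrics.
    // Returns null when the method takes no parameters, since the ratio is undefined there.
    public static double? LinesPerParameter(int linesOfCode, int parameterCount)
    {
        if (linesOfCode < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(linesOfCode));
        }
        if (parameterCount < 0)
        {
            throw new ArgumentOutOfRangeException(nameof(parameterCount));
        }
        if (parameterCount == 0)
        {
            return null;
        }

        return (double)linesOfCode / parameterCount;
    }

    public static void Main()
    {
        // 12 lines spread over 3 parameters.
        Console.WriteLine(LinesPerParameter(12, 3));

        // No parameters: the metric has no value.
        Console.WriteLine(LinesPerParameter(5, 0).HasValue);
    }
}
```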

Is this metric valuable?  I honestly don’t know — I just made it up.  It’s interesting (though you’d have to do something about the 0 parameter case), but if I told you that it was important, the onus would be on me to prove it.  I mention this because it’s important to understand that composite metrics like Microsoft’s “maintainability index” are, at their core, just mathematical transforms on observable properties that the creator of the metric asserts you should care about.  There is often study and experimentation behind them, but they’re not unassailable measures of quality.

Let’s Look at NDepend Metrics

With the theory and back-story out of the way, let’s roll up our sleeves and get down to business.  You’re going to need NDepend installed now, so if you haven’t done that yet, check out the getting started guide.  Installing NDepend is beyond the scope of this series.

If you’re going to follow along with me, clone the Chess TDD codebase and open that up in Visual Studio.  The first thing we’re going to need to do is attach an NDepend project.  With the code base open, that will be your first option in the NDepend window.  I attached a project, named it “Chess Analysis” and saved the NDepend project file in the root folder of the project, alongside the .sln file.  This allows you to optionally source control it pretty easily and to keep track of it at a high level.

[Screenshot: attach]

Once you’ve created and attached the project, run an analysis.  You can do this by going to the NDepend menu, then selecting Analyze->Run Analysis.

[Screenshot: analyze]

Now, we’re going to take a look in the queries and rules explorer.  NDepend has a lot of cool features and out of the box functionality around metrics, but let’s dive into something specific right now, for this post.  We’ll get to a lot of the other stuff later in this series.  Navigate in the NDepend menu to Rule->View Explorer Panel.  This will open the queries and rules explorer.  Click the “Create Group” button and create a rule group called “My Rules.”

[Screenshot: Create Rule]

Now, right click on “My Rules” and select “Create Child Query,” which will bring up the queries and rules editor window.  There’s a bit of comment XML at the top, which is what will control the name of the rule as it appears in the explorer window.  Let’s change that to “Lines of Code.”  And, for the actual substance of the query, type:

It should look like this:

[Screenshot: Create Rule]

Congratulations! You’ve just created your first code metric. Never mind the fact that lines of code is a metric almost as old as code itself and the fact that you didn’t actually create it. Kidding aside, there is a victory to be celebrated here. You have now successfully created a code metric and started capturing it.

Next time, we’ll start building an actual, new metric that NDepend isn’t already providing you out of the box. Stay tuned!

 
