NDepend Blog


Let’s Build a Metric: Wrapping Up

April 7, 2016 3 minutes read

In the penultimate post in the metrics series, I explained the reasoning for winding down this series. I also talked about performing one last experiment. Well, that last experiment is in the books, and I’m going to share the results here, today. The idea was to see how the formula we’ve built so far holds up when applied to real code bases.

Before we get to the results, let’s recap the (admittedly quite incomplete) experimental formula that we’ve arrived at thus far.

[Image: the UpdatedTimeToComprehend formula]

T is the time in seconds, and it is a function of p, the number of parameters; n, the number of logical lines of code; f, the number of class fields; and g, the number of globals. Let’s see how this stacks up to reality.

I picked the methods to look at basically just by poking around in random C# projects on GitHub. I tried to look for methods that were not overly jargon-intensive and that minimized the factors this formula does not account for.

The First Method

First up was the following method, taken from this file.

Let’s take stock. We’ve got no parameters, so that’s easy. The method has 12 logical lines of code, 4 fields, and no globals. Plugging that into the calculation, we get an expected time to comprehend of 298.3404 seconds.

So, what was the experimental verdict? Well, the readers who took part in the experiment came in with an average of 131 seconds. So, not particularly close.

The Second Method

Now, method number 2, taken from this file.

Now to get our figures for this method. This time, we’ve got 3 parameters. The method has 21 logical lines of code, but refers to no fields and no globals. It leans heavily on static state through file I/O, but none of that registers in our formula. Plugging these figures into the calculation, we get an expected time to comprehend of 776.8937 seconds.

Unfortunately, the experiment takers logged an average time of only 117.5 seconds.  That’s even more divergent than the last one.  Let’s see if method 3 yields better results.

The Third Method

The third method was taken from this file.

This healthy method has 2 parameters and weighs in at 34 logical lines of code. It features 1 field and 1 global as well. According to our formula, time to comprehend should thus be 2125.0758 seconds.  Wow, that’s a long time.  It looks like we’re way off the rails here.

And, indeed, the experimental value was 85.5 seconds.
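To put the three comparisons side by side, here is a small Python sketch that tabulates the predicted and measured times reported above and computes how far off each prediction was. The figures come straight from this post; the names `Sample` and `overestimate_factor` are made up for illustration and are not part of the experiment itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    name: str
    predicted_s: float  # formula output reported in this post
    observed_s: float   # average reader time reported in this post

    @property
    def overestimate_factor(self) -> float:
        """How many times larger the prediction is than the measurement."""
        return self.predicted_s / self.observed_s


# The three methods from the experiment, in the order discussed above.
SAMPLES = [
    Sample("method 1", 298.3404, 131.0),
    Sample("method 2", 776.8937, 117.5),
    Sample("method 3", 2125.0758, 85.5),
]

if __name__ == "__main__":
    for s in SAMPLES:
        print(f"{s.name}: predicted {s.predicted_s:.1f}s, "
              f"observed {s.observed_s:.1f}s, "
              f"off by {s.overestimate_factor:.1f}x")
```

Run as a script, this shows the prediction overshooting by roughly 2.3x, 6.6x, and 24.9x, so the gap widens with each method rather than staying constant.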

Wrapping Up

So, what went wrong, exactly? Well, in compiling these figures, I’d say it looks like the number of lines of code was distorting things substantially. Perhaps the relationship is not actually quadratic, or perhaps it is only up to some maximum cap. There might be other factors at play as well. The formula was really at too early a stage for us to expect particularly accurate results.
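One quick way to see that distortion is to line up the logical line counts against both sets of times. The sketch below uses only the three data points from this post and checks which direction each series moves as the methods get longer. With a sample this small, and with parameters, fields, and globals also varying between the methods, it illustrates the mismatch and proves nothing about the general relationship. The helper name `is_strictly_increasing` is hypothetical.

```python
# (logical lines of code, predicted seconds, observed seconds) per method,
# copied from the three results reported in this post.
RESULTS = [
    (12, 298.3404, 131.0),
    (21, 776.8937, 117.5),
    (34, 2125.0758, 85.5),
]


def is_strictly_increasing(values):
    """Return True if each value is larger than the one before it."""
    return all(a < b for a, b in zip(values, values[1:]))


# Sort by line count so both series are read from shortest to longest method.
ordered = sorted(RESULTS)
predicted = [p for _, p, _ in ordered]
observed = [o for _, _, o in ordered]

# The prediction should rise with line count; the measurements, read in
# reverse, are increasing exactly when the original series is decreasing.
predicted_rises = is_strictly_increasing(predicted)
observed_falls = is_strictly_increasing(observed[::-1])

if __name__ == "__main__":
    print("predicted time rises with line count:", predicted_rises)
    print("observed time falls with line count:", observed_falls)
```

Both checks come out true for these three methods: the formula’s estimate climbs as the line count grows, while the measured time actually drops.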

The lesson here is that it’s really, really hard to arrive at this figure, though I do not think it’s practically impossible to come up with a good, predictive framework. I honestly do think it’d be a reasonable project with some serious investment and effort, but I, personally, lack the time to put that in with what I’ve got going on. Perhaps I’ll take it up someday on my own and maybe even run a Kickstarter for it or something.

For those who have stuck with the series, my thanks. I’ve enjoyed this, and I think it’s an important and interesting problem; perhaps at some point, the industry will come into broader agreement on it. Until then, happy static analyzing to those reading.