NDepend

Improve your .NET code quality with NDepend

Lack of Cohesion of Methods: What Is This And Why Should You Care?

Lack of cohesion of methods (sometimes abbreviated LCOM) is one of those things that occurs fairly high up on the software hierarchy of needs.  What’s the “software hierarchy of needs?”  It’s a thing that I just made up, shamelessly copying Maslow’s Hierarchy of Needs.  (Though in a quick search to see how effectively I’ve planted this flag, I see that Scott Hanselman had the idea before me.  But mine looks a little different than his because I’m talking about the software, rather than the humans writing it.)

Anyway, here’s the hierarchy.

  • The cost of change stays relatively flat because the software is highly maintainable.
  • You can change it over time in response to evolving requirements and market realities.
  • It performs and behaves acceptably in production.
  • Technically, it fulfills the functional requirements and user value proposition.
  • It simply exists in production.

The software hierarchy of needs.

When it comes to something like lack of cohesion of methods, you’re talking about the top two categories.  Early in your career, you’ll tend to worry about just wheezing past the finish line with a feature and getting your code to do what you want it to do.  With practice and time, you then start worrying about non-functional concerns and maintainability.  And when you start worrying about that, you should start paying attention to lack of cohesion of methods (among other things).

So for the rest of this post, I’ll focus on this one idea: lack of cohesion of methods.  What is it, how do you measure it, and why should you care?

Continue reading Lack of Cohesion of Methods: What Is This And Why Should You Care?

A Guide to Code Coverage Tools for C#

I promise that I’ll get to a treatment of code coverage tools in short order.  But first, I want to offer a quick disclaimer and warning.

If you’re here because you’re looking for a way to let management track the team’s code coverage progress, please stop and reconsider your actions.  I’ve written in the past about how code coverage should not be a management concern, and that holds as true now as ever.  Code coverage is a tool that developers should use — not something that should be done to them.  Full stop.

What Is Code Coverage and Why Do It?

Okay, with that out of the way, let’s talk extremely briefly about what code coverage is.  And take note that this is going to be a very simple explanation, glossing over the fact that there are actually (slightly) different ways to measure coverage.  But the short form is this.  Code coverage measurements tell you which lines of code your test suite executes and which lines it doesn’t.

Let’s look at the simplest imaginable example.  Here’s a pretty dubious implementation of division:
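A minimal sketch of the idea (the exact code here is illustrative; what matters is that the method has three executable lines):

    public int Divide(int x, int y)
    {
        if (y != 0)
            return x / y;
        throw new ArgumentException("y must not be zero", nameof(y));
    }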

Let’s say this was the only method in our codebase.  Let’s also say that we wrote a single unit test, from which we invoked Divide(2, 1) and asserted that we should get back 2.  We would then have 67% code coverage.  Why?  Well, this test would cause the runtime to test the conditional and then to execute the return x/y statement, thus executing two-thirds of the method.  Absent any other tests, no automated test ever executes that last line.

So in the grossest of terms, that’s the “what.”  As for the “why,” developers can use code coverage analysis to see holes in their testing efforts.  For instance, code coverage analysis would tell us here, “Hey, you need to test the case where you pass a 0 to the method for y.”

What Are Code Coverage Tools?

What, then, are code coverage tools?  Well, as you can imagine, computing code coverage the way I just did would get pretty labor-intensive.

So developers have done what developers do.  They’ve automated code coverage detection.  In just about any language and tech stack imaginable, you have ways to detect how thoroughly the unit test suite covers your codebase.  (Don’t confuse code coverage with an assessment of the quality of your tests, though.  It’s just a measure of whether or not the tests execute the code.)

The result is an array of tools to choose from that help you see how well your test suite covers your codebase.  And that can become somewhat confusing.  So today, I’ll take you through some of your options in the .NET ecosystem.

Continue reading A Guide to Code Coverage Tools for C#

C# 8.0 Features: A Glimpse of the Future

It’s been almost 20 years since Microsoft released the first version of the C# language. From its inception—when some unjustly deemed it a mere Java copycat—until now, C# has had a remarkable evolution.

Nowadays, it’s frequently featured in both the most-used and most-loved programming language lists. You can use it to develop desktop, web, and mobile apps, and you can write code that will run on all the major operating systems. Or you can jump right onto the IoT bandwagon and write code to “smarten” your house. These are interesting times to be a C# developer indeed.

If the present is already exciting, what about the future? Would it be possible for us to get a glimpse of what lies ahead for the language?

Of course it is. Microsoft has developed C# “in the open” for quite a while now. You can just take a look at the GitHub repo to read the discussions and proposals (and even participate—why not?).

Today, we’ve selected three feature proposals for C# 8.0 to talk about here: extension everything, default implementations on interfaces, and nullable reference types.
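To give a flavor of two of those proposals, here’s a quick syntax sketch based on the proposal discussions. Treat it as illustrative rather than final, since the details were still in flux at the time of writing:

    // Nullable reference types: ordinary reference types become non-nullable,
    // and a trailing "?" marks the ones that may legitimately hold null.
    public class Greeter
    {
        public int NicknameLength(string name, string? nickname)
        {
            // The compiler nudges you to check "string?" values before use.
            return nickname?.Length ?? name.Length;
        }
    }

    // Default implementations on interfaces: interface members may carry
    // bodies, so implementers get a sensible default for free.
    public interface ILogger
    {
        void Log(string message);
        void LogError(string message) => Log("ERROR: " + message);
    }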

Continue reading C# 8.0 Features: A Glimpse of the Future

Unit Testing Doesn’t Affect Codebases the Way You Would Think

I’ve just wrapped up another study.  (The last one was about singletons, if you’re interested.) This time, I looked at unit testing and the impact it has on codebases.

It didn’t turn out the way I expected.

I’m willing to bet that the results aren’t what you’d expect, either.  Unit testing had a profound effect on certain unexpected codebase properties while having minimal effect on some that seem like no-brainers.  But I’ll get to the specifics of that shortly.

First, let me explain a bit about expectations and methodology.

Unit Testing: The Expected Effect on Code

Let’s be clear for a moment here.  I’m not talking about the expected effect of unit tests on outcomes. You might say, “We believe in unit testing because it reduces defects” or, “We unit test to document our design intentions,” and I didn’t (and probably can’t) measure those things.

When I talk about the effect of unit testing, I’m talking about the effect it has on the codebase itself.  Let’s consider a concrete example to make it clear what I mean.  You can’t (without a lot of chicanery) unit test private methods, and you can’t easily unit test internal methods.  This creates a natural incentive to make more methods public, so we might expect heavily unit-tested codebases to feature more public methods.
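(For what it’s worth, you can test internal methods by granting your test assembly access with InternalsVisibleTo, but that’s exactly the kind of extra friction that nudges methods toward public.  The test assembly name below is hypothetical.)

    // In the production assembly: grants the named test assembly access to internals.
    [assembly: System.Runtime.CompilerServices.InternalsVisibleTo("MyApp.Tests")]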

That hypothesis actually turns out to be true.

The rate of public methods increases with increased prevalence of unit testing.

I’ll talk more about the axes later in the post.  But for now, check out the plot and the trend line.  More unit testing means a higher percentage of public methods.  Score one for that hypothesis!

With a win there, let’s think of some other hypotheses that seem plausible.  Methods with more “going on” tend to be harder to test, so you’d expect relatively simple methods in a highly tested codebase: shorter ones, with fewer branches and less happening in each.

I also had some thoughts about the impact on types:

  • More interfaces (this makes testing easier).
  • Less inheritance (inheritance makes testing harder).
  • More cohesion.
  • Fewer lines of code.
  • Fewer comments.

Continue reading Unit Testing Doesn’t Affect Codebases the Way You Would Think

5 Tips to Help You Visualize Code

Source code doesn’t have any physical weight — at least not until you print it out on paper.  But it carries a lot of cognitive weight.  It starts off simply enough. But before long, you have files upon files, folders upon folders, and more lines of code than you can ever keep straight.  This is where the quest to visualize code comes in.

The solution file and namespace organization make for pretty unhelpful visualization aids.  But that’s nothing against those tools; it’s just not what they’re for.  Nevertheless, if the only way you attempt to visualize code involves staring at hierarchical folders, you’re gonna have a bad time.

How do most people handle this?  Well, they turn to whiteboards, formal design documents, architecture diagrams, and the like.  This represents a much more powerful visual aid, and it tends to serve as table stakes of meaningful software development.

But it’s a siren song.  It’s a trap.

Why?  Well, as I’ve discussed previously, those visualization aids just represent someone’s cartoon of what they think the code will look like when complete.  You draw up a nice layer-cake architecture, and you wind up with something that looks more like six tumbleweeds glued to a barbed wire fence.  Those visual aids are great…for visualizing what everyone wishes your code looked like.

What I want to talk about today are strategies to visualize code — your actual code, as it exists.

Continue reading 5 Tips to Help You Visualize Code

A problem with extension methods

We like extension methods. When named appropriately, they can both make the calling code clearer and isolate static methods from the classes on which they operate.

But extension methods come with a risk of breaking changes, and this risk is very concrete: it just happened to us.

Since 2012, NDepend.API has offered a generic Append() extension method:
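Its shape is roughly this (the implementation below is an illustrative sketch, and the containing class name is approximate):

    namespace NDepend.Helpers
    {
        using System.Collections.Generic;

        public static class ExtensionMethodsEnumerable
        {
            // Yields every item of the sequence, then the extra element.
            public static IEnumerable<T> Append<T>(this IEnumerable<T> seq, T element)
            {
                foreach (var item in seq) { yield return item; }
                yield return element;
            }
        }
    }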

Two default rules use this extension method: “Avoid namespaces dependency cycles” and “Avoid types initialization cycles.”

Last month, on Oct 17th, 2017, Microsoft released .NET Framework v4.7.1, which implements .NET Standard 2.0. Around 200 .NET Standard 2.0 APIs were missing from .NET Framework v4.6.1, and one of those missing APIs is:
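The signature, as declared in System.Linq (shown here without its body):

    namespace System.Linq
    {
        public static partial class Enumerable
        {
            public static IEnumerable<TSource> Append<TSource>(
                this IEnumerable<TSource> source, TSource element);
        }
    }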

Within NDepend, rules, quality gates, trend metrics… basically everything is a C# LINQ query, stored as text and compiled and executed on the fly. Since the compilation environment uses both the NDepend.Helpers and System.Linq namespaces, when running NDepend on top of .NET Framework v4.7.1, both Append() extension methods are visible. As a consequence, for each query calling the Append() method, the compiler fails with:
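The error is an ambiguous-call complaint along these lines (the exact member names shown are illustrative):

    error CS0121: The call is ambiguous between the following methods or properties:
    'NDepend.Helpers.ExtensionMethodsEnumerable.Append<T>(IEnumerable<T>, T)' and
    'System.Linq.Enumerable.Append<TSource>(IEnumerable<TSource>, TSource)'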

Fortunately, a user notified us of this problem, which we hadn’t yet caught ourselves, and we just released NDepend v2017.3.2 to fix it. Only one clean fix is possible to keep compatibility with all .NET Framework versions: refactor all calls to the Append() extension method into classic static method invocations, with an explanatory comment:
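The refactored call sites look something like this (variable names are hypothetical):

    // Calling Append() as a classic static method pins the NDepend.Helpers
    // version and avoids ambiguity with System.Linq.Enumerable.Append(),
    // which appeared in .NET Framework v4.7.1.
    var cycle = ExtensionMethodsEnumerable.Append(path, firstParticipant);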

We expect support requests about this in the coming weeks and months, as more and more users run .NET Framework v4.7.1 without changing their rule sets. There is no grand lesson learned here: this situation can happen, but it happens rarely, and it shouldn’t prevent you from declaring and calling extension methods. The more mature the frameworks you rely on, the less likely it is to happen.

CRAP Metric Is a Thing And It Tells You About Risk in Your Code

I won’t lie.  As I thought about writing this post, I took special glee in contemplating the title.  How should I talk about the CRAP metric?  *Snicker*

I guess that just goes to show you that some people, like me, never truly grow up.  Of course, I’m in good company with this, since the original authors of the metric had the same thought.  They wanted to put some quantification behind the common, subjective declaration, “This code is crap!”

To understand that quantification, let’s first consider that CRAP is actually an acronym: C.R.A.P.  It stands for change risk anti-patterns, so it addresses the risk of changing a bit of code.  In other words, methods with high CRAP scores are risky to change.  So the CRAP metric is all about assessing risk.

When you get a firm grasp on this metric, you get a nice way to assess risk in your codebase.

The CRAP Metric: Getting Specific

Okay, so how does one quantify risk of change?  After all, there are a lot of ways that one could do this.  Well, let’s take a look at the formula first.  The CRAP score is a function of methods, so we’ll call it CRAP(m), mathematically speaking.  (And yes, typing CRAP(m) made me snicker all over again.)

Let CC(m) = the cyclomatic complexity of method m, and let U(m) = the proportion of the method not covered by unit tests, expressed as a value between 0 and 1.

CRAP(m) = CC(m)^2 * U(m)^3 + CC(m).

Alright, let’s unpack this a bit.  To arrive at a CRAP score, we need a method’s cyclomatic complexity and its code coverage (or really lack thereof).  With those figures, we multiply the square of a method’s complexity by the cube of its rate of uncovered code.  We then add its cyclomatic complexity to that.  I’ll discuss the why of that a little later, but first let’s look at some examples.

First, consider the simplest, least CRAP-y method imaginable: a method completely covered by tests and with no control flow logic.  That method has a cyclomatic complexity of 1 and an uncovered proportion of 0.  That means that CRAP(m) = 1^2 * 0^3 + 1 = 1.  So the minimum CRAP metric score is 1.  What about a gnarly method with no test coverage and cyclomatic complexity of 6?  CRAP(m) = 6^2 * 1^3 + 6 = 42.
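To make the arithmetic easy to play with, here’s a minimal sketch of the formula as code, where uncovered is the 0-to-1 proportion defined above:

    using System;

    static class CrapCalculator
    {
        // CRAP(m) = CC(m)^2 * U(m)^3 + CC(m)
        public static double Crap(int complexity, double uncovered) =>
            Math.Pow(complexity, 2) * Math.Pow(uncovered, 3) + complexity;
    }

    // CrapCalculator.Crap(1, 0.0) == 1   (trivial, fully covered method)
    // CrapCalculator.Crap(6, 1.0) == 42  (complexity 6, completely uncovered)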

The authors of this metric define a CRAP-y method as one with a CRAP score greater than 30, so that last method would qualify.

Continue reading CRAP Metric Is a Thing And It Tells You About Risk in Your Code

Code Reuse is Not a Good Goal

Wait, wait, wait.  Put down the pitchforks and listen for a minute.  You’re probably thinking that I’m about to tout the “virtues” of copy/paste programming or something.  But I assure you I’m not going to do that.  Instead, I’m going to speak to a similar but subtly different topic: code reuse as a first-class goal.

If you’ve read The Pragmatic Programmer, you know about the DRY principle.  You’ll also know that the underlying evil comes from duplication of knowledge in your system.  This creates inconsistencies and maintenance headaches.  So, since you duplicate code at your peril, isn’t code reuse a good thing?  Isn’t it the opposite of code duplication?

No, not exactly.  You can write “hello world” without any duplication.  But you can also write it without reusing any code whatsoever.

Code Reuse as a First-Class Goal

So what, then, do I mean when I talk about code reuse as a first-class goal?  I’m talking about a philosophy that I see frequently in my consulting, especially in the enterprise.

The idea seems to come from a deep fear of rework and waste.  If we think of the history of the corporation, the Industrial Revolution saw an explosion in manufacturing driven by global efficiency.  The world scrambled to make things cheaper and faster, and waste or rework in the process impeded that.

Today and in our line of work, we don’t manufacture widgets.  Instead, we produce software.  But we still seem to have this atavistic terror that someone, somewhere, might already have written a data access layer component that you’re about to write.  And you writing it would be a waste.

In response to this fear, organizations come up with plans that can get fairly elaborate.  They create “centers of excellence” that monitor teams for code reuse opportunities, looking to stamp out waste.  Or they create sophisticated code sharing platforms and internal app stores.  Whatever the details, they devote significant organizational resources to avoiding this waste.

And that’s actually great, in some respects.  I mean, you don’t want to waste time reinventing wheels by writing your own source control tools and logging frameworks.  But things go sideways when the goal becomes not one of avoiding reinvented wheels but instead one of seeing how much of your own code you can reuse.

Let’s take a look at some of the problems that you see when organizations really get on the “code reuse” horse.

Continue reading Code Reuse is Not a Good Goal

The Singleton Design Pattern: Impact Quantified

This post has been about a month in the offing.  Back in August, I wrote about what the singleton pattern costs you.  This prompted a good bit of discussion, most of which was (as it always is) anecdotal.  So a month ago, I conceived of an experiment that I called the singleton challenge.  Well, the results are in.  I’m going to quantify the impact of the singleton design pattern on codebases.

I would like to offer an up-front caveat.  I’ve been listening lately to a fascinating audiobook called “How to Measure Anything,” and it has some wisdom for this situation.  Measurement is primarily about reducing uncertainty.  And one of the driving lessons of the book is that you can measure things — reduce uncertainty — without getting published in a scientific journal.

I mention that because it’s what I’ve done here.  I’ll get into my methodology momentarily, but I’ll start by conceding the fact that I didn’t (and couldn’t) control for all variables.  I looked for correlation as a starting point because going for causation might prove prohibitive.  But I think I took a much bigger bite out of trying to quantify this than anyone has so far.  If they have, I’ve never seen it.

A Quick Overview of the Methodology

As I’ve mentioned in the past on this blog, I earn a decent chunk of my consulting income doing application portfolio assessments.  I live and breathe static code analysis.  So over the years, I’ve developed an arsenal of techniques and intellectual property.

This IP includes an extensive codebase assessor that makes use of the NDepend API to analyze codebases en masse, store the results, and report on them.  So I took this thing and pointed it at GitHub.  I then stored information about a lot of codebases.

But let’s get specific.  Here’s a series of quick-hitter bullets about the experiment that I ran:

  • I found this page with links to tons of C# projects on GitHub, so I used that as a “random” selection of codebases that I could analyze.
  • I gave my mass analyzer an ordered list of the codebase URLs and turned it loose.
  • Anything that didn’t download properly, decompress properly, or compile properly (migrating to .NET Core where necessary, restoring NuGet packages, and building from the command line) I discarded.  This probably actually creates a bias toward better codebases.
  • Minus problematic codebases, I built all solutions in the directory structure and made use of all compiled, non-third-party DLLs for analysis.
  • I stored the results in my database and queried the same for the results in the rest of the post.

I should also note that, while I invited anyone to run analysis on their own code, nobody took me up on it.  (By all means, still do it, if you like.)

Singleton Design Pattern: The Results in Broad Strokes

First, let’s look at the scope of the experiment in terms of the code I crunched.  I analyzed:

  • 100 codebases
  • 986 assemblies
  • 5,086 namespaces
  • 72,615 types
  • 501,257 methods
  • 1,495,003 lines of code

From there, I filtered down raw numbers a bit.  I won’t go into all of the details because that would make this an immensely long post.  But suffice it to say that I discounted certain pieces of code, such as compiler-generated methods, default constructors, etc.  I adjusted this so we’d look exclusively at code that developers on these projects wrote.

Now, let’s look at some statistics regarding the singleton design pattern in these codebases.  NDepend has functionality for detecting singletons, which I used.  I also used more of its functionality to distinguish between stateless singleton implementations and ones containing mutable state.
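NDepend’s detection logic is its own, but as a rough illustration of the shape such heuristics look for, a classic singleton exposes no public constructors and holds a static field of its own type.  Here’s a reflection-based sketch of my own, not NDepend’s implementation:

    using System;
    using System.Linq;
    using System.Reflection;

    static class SingletonHeuristic
    {
        // Crude approximation of the classic singleton shape: no public
        // constructors, plus a static field typed as the class itself.
        public static bool LooksLikeSingleton(Type t) =>
            t.IsClass
            && t.GetConstructors(BindingFlags.Instance | BindingFlags.Public).Length == 0
            && t.GetFields(BindingFlags.Static | BindingFlags.Public | BindingFlags.NonPublic)
                .Any(f => f.FieldType == t);
    }

Here’s how that breaks down: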

Continue reading The Singleton Design Pattern: Impact Quantified

You Have No Excuse for Dead Code

In darker times, software management would measure productivity as a function of lines of code.  More code means more done, right?  Facepalm.  When I work with IT management in my capacity as a consultant, I encourage them to view code differently.  I encourage them to view code as a liability, like inventory.  And when useful code is a liability, think of what a boat anchor dead code is.

I once wrote a fun post about the fate of dead code in your codebase.  And while I enjoyed writing that, it had a serious underlying message.  Dead code costs you time, money, and maintenance headaches.  And it has absolutely no upside.

A Working Definition for Dead Code

Okay. If I’m going to make a blog post out of disparaging dead code, I should probably offer a definition.  Let’s do that here.

Some people will draw a distinction between code that can’t be executed (unreachable) and executed code whose effects don’t matter (dead).  I acknowledge this definition but won’t use it here.  For the sake of simplicity and clarity of message, let’s create a single category of dead code: any code in your codebase that has no bearing on your application’s behavior is, for our purposes here, dead.
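For illustration, here’s a contrived sketch containing both flavors (all names are hypothetical):

    using System;
    using System.Linq;

    public static class Calculator
    {
        public static int ComputeTotal(int[] values)
        {
            // Executes on every call, but nothing ever uses the result: "dead."
            var sorted = values.OrderBy(v => v).ToArray();
            return values.Sum();
            // Sits after the return, so it can never execute: "unreachable."
            Console.WriteLine("done");
        }
    }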

The Problems with Dead Code

Having defined it, what’s the problem?  If it has no bearing on your application’s behavior, what’s the harm?  How does it cost time and money, as I claimed a moment ago?

Well, simply put, your code does not live in a shrink-wrapped vacuum.  As your application evolves, developers have to change the code.  When you have only code that matters in your codebase, they can do this with the most efficiency.  If, on the other hand, you have thousands of lines of useless code, these developers will spend hundreds of hours maintaining that useless code.

Think of having dead code as being reminiscent of running your heat in the winter while keeping all of your windows open.  It’s self-defeating and wasteful.

But even worse, it’s a totally solvable problem.  Let’s take a look at different types of dead code that you encounter and what you can do about it.

Continue reading You Have No Excuse for Dead Code