NDepend

Improve your .NET code quality with NDepend

Moq: A Detailed Look at Its Code Quality

In case you haven’t seen it, I’ve been doing a series of research-oriented posts for this blog.  This is going to be in the same vein but focused on the Moq codebase instead of focusing on hundreds of codebases.  Why Moq?  Well, I’ll get to that in a moment.

I started this by making a set of observations relating unit test prevalence to properties of clean code.  That generated considerable buzz, so I did some more studies in that vein, refining the methodology and adding codebases as we went.  By the end of the series, we’d grown the sample size to 500 codebases and started doing actual regression analysis.

Since then, we’ve enlisted the help of someone who specializes in data analysis to do some PCA with the data, which far outstrips my background studying data.  We brought this to bear in studying the effects of functional-style programming on codebases and also on categorizing codebases according to simple vs. complex and monolithic vs. decoupled, in addition to functional vs. OOP.  Doing this across more than 500 C# codebases has produced a wealth of information.

Okay, But How Does Moq Fit in?

I give you all this backstory in case you want to read about it but also to explain that I’ve looked at these codebases en masse.  With hundreds of codebases and millions of lines of code, I’m not going in and poking around to see if the code looks clean.  I’m using NDepend to perform large-scale robo-analysis.

And, while that’s been great, I started to want to see just how these categorizations stacked up.  So I started scrolling through the summary data and Moq jumped out at me.

First of all, I know Moq pretty well from using it over the years, and secondly, it had stood out when looking at the rate of unit test methods in the codebase.  Nearly half of its methods are test methods, which seems reasonable for a tool designed to help you write unit tests.  Combine that with these stats in the PCA:

So as a quick interpretation, Moq counts as reasonably non-monolithic, very simple, and very functional.  Add to that the high degree of unit testing, and I figured we’d have a codebase that was a joy to look at.  So I popped open the source code (as it was at the beginning of the year when I was grabbing codebases en masse) and analyzed it with NDepend, fresh off my excitement about using the new dark theme. I wanted to see if it was as much of a joy to look at as all of this data and statistical rigor indicated it would be.  And spoiler alert, it was the kind of codebase I’d feel right at home in.  Let’s take a tour.

NDepend Analysis, Test Coverage, and First Look

The first thing I wanted to look at was code coverage.  I wanted a quick test of my assumption that a high rate of test methods would correspond to high coverage.  And it did.  Here’s a quick look at what happened after I imported coverage data and then ran analysis on the project.

Now I could see with Visual Studio’s coverage tool itself that Moq was sporting roughly 90% test coverage.  But by importing the coverage data into NDepend, you can paint a much more compelling picture.

The heat map dominating most of the screen shows squares corresponding to methods in the codebase.  Larger squares are larger methods.  And the coloring indicates test coverage.  Over on the right, among the 2,152 methods in question, you can actually scroll through them in order of percent coverage and navigate to them if you want to take a look.

Taking a Look at the Dashboard and Technical Debt

So far, all systems go.  Most of the stats on the codebase looked good, coming in from the broad aggregates I have in a spreadsheet.  And then, following the same trend, Moq looks great in the IDE from a testing perspective.  But I looked at the dashboard and saw this:

Basic stats about the codebase lined up, and there’s the test coverage, hovering around 90%.  But a C for its tech debt rating?  10 critical rules violated?  This surprised me, given how rosy everything looked in the statistical analysis.  I drilled in to take a look.

And, sure enough, there be some dragons.  Huge types and overly complex methods are a problem.  Also in there are mutually dependent namespaces, which create coupling that hurts you as a codebase grows.  And you’ve got some hiding of base class methods and global state.  To get a sense for what I was looking at, I created another heat map. In this case, we’re looking at types.  The bigger the square, the more lines of code, and the closer to red, the more methods in that type.

This explains why the averages looked good in my spreadsheet but why NDepend has some objections and critical rules violated.  By and large, you have a lot of little green types, which is what you’d hope to see.  But there are some pretty hefty types in the mix, both in terms of lines of code per type and number of methods per type.

Some of these are gigantic unit test classes, while others are the API.  For those of you familiar with Moq, this should make sense to you.  Think of how many static methods you invoke on the Mock class. This illustrates the classic tradeoff between following clean code guidelines about things like methods per type and providing the API you want to furnish.

Drilling Into the Sources of Debt

NDepend has a view where you can drill into the technical debt per type or per method, and sort it accordingly.  I did that per type, and here’s what I saw:

No big surprise there.  The technical debt was coming disproportionately from these large types.  So I took advantage of another view to see where the debt was coming from, by rule violation. Here’s what that looked like:

Topping that list is a series of things that I would make it a priority to address in my own codebases, time permitting: types that are too big, types that have too many methods, namespace dependency cycles, and so on.  There is, however, one exception to what I would worry about as a top priority issue—visibility of nested types as a design choice.  That’s based on a Microsoft guideline and I don’t personally favor an approach where I use this a lot.  But it doesn’t bother me, either.

I see that the creators of Moq did that 395 times, so clearly they view it as a useful design choice.  This made me curious about what would happen to the tech debt grade if I disabled that particular rule.  So I did that, and the result was a somewhat greener and more pleasing grade of B:

What’s the Verdict With Moq?

I also spent some time scrolling through various classes and methods.  I didn’t want the entirety of the experience to be just a matter of data gathering since that’s what I’ve been doing for months.  And the verdict I have is that this particular data point fits in nicely with the aggregate.

Moq has excellent stats for most of what I’ve been looking at.  And, indeed, there are a lot of simple, functional methods in the codebase, almost all of which are thoroughly tested.  I would happily work in this codebase.

But NDepend is calling out real and important opportunities for improvement.  If I were working on this codebase, I’d make an effort to break some of those gigantic unit test classes into smaller ones that are more cohesive over their context.  I’d also take a hard look at the mutually dependent namespaces and either merge them or rework the dependency direction a bit.  And I’d even give some idle thought to how I might segment the large Mock class somehow into smaller chunks if that wound up making sense.

So the whole thing winds up being an interesting microcosm to me.  Moq is, as the stats would indicate, a pretty nice codebase.  But, as with just about any codebase, there’s plenty of room for improvement.  And having a tool to show you where to improve quickly is invaluable.

On the Superiority of the Visual Studio Dark Theme

When I downloaded the newest version of NDepend, something wonderful awaited me.  Was it support for the latest .NET Core version?  The addition of checks for ubiquitous language for DDD projects?  Any of the various rule additions or bug fixes?

Nope.  I’m a lot more superficial than that, apparently.  I saw this and was incredibly excited:

Dark Theme

In case you’re scratching your head and wondering, “what?” instead of sharing my delight, I’ll be more explicit.  NDepend added support for the Visual Studio dark theme.  And I absolutely love it.

Asserting the Superiority of the Visual Studio Dark Theme

Why so excited?  Well, as a connoisseur of the Visual Studio dark theme, this makes my experience with the tool just that much better.

Don’t get me wrong.  I didn’t mind the interface up until this release, per se.  I’ve happily used NDepend for years and never really thought about its color scheme.  But this version is the equivalent of someone going into my house where I already like the bathroom and installing a radiant floor.  I never knew I was missing it until I had it.

Everything in Visual Studio is better in the dark.

Oh, I know there’s evidence to the contrary.  People are, apparently, 26% more accurate when reading dark text on a light background than vice-versa. And it’s easier to focus on and remember text in a light theme.  I understand all of that.  And I don’t care.

The dark theme is still better.

Functional C# Improves Your Design Without Making Your Code Cleaner, Exactly

Today I offer another one of the code research posts we’ve been doing.  If you want more backstory on the series, check out the last post in the series, where I give a brief history.  You should also read it if you want to understand what I mean by functional C# and to see details about its impact on codebases at the method level.

Quick editorial note: a couple of people have commented/sent notes asking about p-values.  I’ve been eliding those to keep the posts more narrative.  But as we’ve expanded the set of variables we capture, we’ve been looking only at dramatically lower p-values.  Those cited in this post, for instance, range between 0 and 0.04, with most being less than 0.01.

I’ll summarize the last functional study here, briefly.  Last time, I studied about 500 codebases to see what functional-style programming did to methods and types.  And the answer was that it made them less object-oriented, but it had surprisingly little influence on clean code statistics, like

  • Lines of code per method or type.
  • Cyclomatic complexity.
  • Parameters per method.
  • Method nesting depth.
  • Methods per type.

I expected that functional codebases would correlate with a reduction in all of those things.  In other words, I figured that functional-style programming would lead to smaller, clearer, more focused, and less complex methods.  It didn’t.

Undaunted, I vowed to take a broader look at the effect of functional programming on a wider array of concerns. And I did just that, with the help of my partner who runs the statistical regressions.

Your Guide to Winning Arguments About Code

The whole “tabs versus spaces” thing occupies sort of an iconic position in the programmer world.  It represents the impossibility of winning arguments that are unwinnable by their very nature.  These are so-called religious wars — our techie version of The Butter Battle Book but without the Cold War overtones.

But this and arguments like it rarely actually play out in team rooms and offices.  At least, that’s always been my experience.  I’ve only ever witnessed live “tabs versus spaces” arguments happen ironically.

Actual Arguments in the Programmer World

So here’s how it really goes down.

You take a job with a company, excited about the new office, the shorter commute, and the bump in pay.  You’re riding high.  But then the onboarding starts with the codebase.

Greg, the most senior member of the group, walks you through the codebase with a slightly smug affect of obvious pride.  The codebase has everything you could ever want, according to Greg.  To your mounting horror, this includes

  • Inheritance hierarchies that are inscrutable, dark, and deep (with miles to go before you sleep).
  • A generous portion of global state.
  • Liberal use of reflection, often for no discernible reason.
  • And, of course, an extensive, homegrown “framework.”

Greg finishes with a flourish: “So, anytime you need to add a new feature, you just open GodClass.cs, scroll down to line 12,423, add another method, and get started!”

You’re new, and you’re not entirely sure that this isn’t an elaborate prank, so you swallow and say, “Oh…great!” while already planning the lengthy suggestions document that you’re going to put together.  And so the stage is set for what will become a never-ending string of arguments of variable politeness about the codebase.

Swap my hypothetical specifics for yours, but the formula is the same.  Someone is asking you to exist in a codebase that you have philosophical reservations about, which will force you to write code you don’t like.

Winning Arguments: How Does One Define This?

I’ve now set the stage, but what does it actually mean to “win” an argument?  This is pretty hard to define in a lot of contexts, such as people arguing on Facebook about politics or fighting at dinner.  Is it the one who got the last word in?  The one who was louder?  The one who didn’t simply give up?

Luckily, in your quest to de-Greg your new company’s codebase, “winning” is easier to define (if admittedly something of a loaded term).  You win if, by mutual consent, however grudging, the thing you think should happen winds up immortalized in the team’s source control.  And the mutual consent part matters.  Just slamming something into source control and being forced to revert later by an angry Greg doesn’t count.

You win when your argument carries the day and results in concrete action.  So let’s look at some techniques for making that more likely.

Functional Programming Makes Your Code Not OO…And That’s It

Over the course of the fall and winter, I’ve been gaining momentum with code research posts.  Today, I bring that momentum to bear on the subject of functional programming.  Or at least, on functional style of programming in C# codebases.

But before I do that, let me provide a little background in case you haven’t caught the previous posts in the series.  It started with me doing automated static analysis on 100 codebases to see how singletons impact those codebases.  Next up, I used that data to look at how unit tests affect codebases.  That post generated a lot of buzz, so I enlisted a partner to help with statistical analysis and then boosted the codebase sample size up to 500.

At the end of that last post, I suggested some future topics of study.  Well, now I’ve picked one: functional programming.

What Is Functional Programming?

The idea with this post is mostly to report on findings, but I’d be remiss if I didn’t provide at least some background so that anyone reading has some context.  So first, let’s cover the topic of functional programming briefly.

Functional programming is one of the major programming paradigms.  Specifically, its calling card is that it disallows side effects.  In other words, it models the rules of math, in which the result of the function (or method) is purely a deterministic function of its inputs.

So, in pseudo-code, it looks like this:

This is a functional method.  But if you do something like this

or like this

then you’re out of the functional realm because you’re adding side effects.  These two modified versions of Add() each concern themselves with the world beyond processing the inputs to add.  (As an aside, you could “fix” this by passing the global variable or the _databasePlopper dependency to the method as a parameter.)
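As a minimal C# sketch of the three variants described above: one pure Add() and two that reach beyond their inputs.  The class names, the GlobalState.Offset field, and the list standing in for the _databasePlopper dependency are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class PureMath
{
    // Functional: the result depends only on the inputs, and nothing else changes.
    public static int Add(int x, int y)
    {
        return x + y;
    }
}

public static class GlobalState
{
    // Hypothetical global variable that any code can read or modify.
    public static int Offset = 0;
}

public class ImpureCalculator
{
    // Hypothetical stand-in for a _databasePlopper dependency: it only records writes.
    private readonly List<int> _databasePlopper = new List<int>();

    public IReadOnlyList<int> Plopped => _databasePlopper;

    // Not functional: the result depends on state that is not among the inputs.
    public int AddWithGlobal(int x, int y)
    {
        return x + y + GlobalState.Offset;
    }

    // Not functional: returns the sum, but also writes to the outside world.
    public int AddAndStore(int x, int y)
    {
        int sum = x + y;
        _databasePlopper.Add(sum);
        return sum;
    }
}
```

Calling PureMath.Add(2, 3) gives the same answer no matter what else has happened in the program.  The other two methods cannot promise that, because their behavior is tied to GlobalState.Offset or to the contents of the store.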

Now, take note of something because this matters to the rest of the post.  While C# (or any other object-oriented language) is not a functional language, per se, you can write functional methods in C#.

Unit Tests Correlate With Desirable Codebase Properties

Today, I give you the third post in a series about how unit tests affect codebases.

The first one wound up getting a lot of attention, which was fun.  In it, I presented some analysis I’d done of about 100 codebases.  I had formed hypotheses about how I thought unit tests would affect codebases, and then I tested those hypotheses.

In the second post, I incorporated a lot of the feedback that I had requested in the first post.  Specifically, I partnered with someone to do more rigorous statistical analysis on the raw data that I’d found.  The result was much more clarity about not only the correlations among code properties but also how much confidence we could have in those relationships.  Some had strong relationships while others were likely spurious.

In this post, though, I’m incorporating the single biggest piece of feedback.  I’m analyzing more codebases.

Analysis of 500 (ish) C# Codebases

Performing static analysis on and recording information about 500 codebases isn’t especially easy.  To facilitate this, I’ve done significant work automating ingestion of codebases:

  • Enabling autonomous batch operation.
  • Logging which codebases fail and why.
  • Building in redundancy against accidentally analyzing the same codebase twice.
  • Executing not just builds but also NuGet package restores and other build steps.
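To make the list above a little more concrete, here is a rough sketch of the general shape such a batch loop could take.  This is not the actual tooling, which lives in a private repo.  The IngestionBatch class, its members, and the analyze callback (standing in for package restore, build, and static analysis) are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

public sealed class IngestionBatch
{
    // Redundancy guard: remembers which repositories were already handled.
    private readonly HashSet<string> _seen =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase);

    public List<string> Analyzed { get; } = new List<string>();

    // Maps each failed codebase to the reason it failed.
    public Dictionary<string, string> Failures { get; } =
        new Dictionary<string, string>();

    // analyze is a hypothetical callback for restore + build + analysis.
    public void Run(IEnumerable<string> repoUrls, Action<string> analyze)
    {
        foreach (var url in repoUrls)
        {
            // Normalize so trivially different URLs count as the same codebase.
            var key = url.Trim().TrimEnd('/');
            if (!_seen.Add(key))
            {
                continue; // already analyzed (or attempted) this codebase
            }

            try
            {
                analyze(key);
                Analyzed.Add(key);
            }
            catch (Exception ex)
            {
                // Record the failure and keep the batch running unattended.
                Failures[key] = ex.Message;
            }
        }
    }
}
```

The point of the design is that one broken codebase never stops the run: it gets logged with a reason, and the loop moves on to the next one.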

That’s been a big help, but there’s still the matter of finding these codebases.  To do that, I mined a handful of “awesome codebase” lists, like this one.  I pointed the analysis tool at something like 750 codebases, and it naturally filters out any that don’t compile or otherwise have trouble in the automated process.

This left me with 503 valid codebases.  That number came down to 495 once adjusted for codebases that, for whatever reason, didn’t have any (non-third-party) methods or types or that were otherwise somehow trivial.

So the results here are the results of using NDepend for static analysis on 495 C# codebases.

Following the Software Architecture Career Path

I can recall a certain day in my career with remarkable clarity.  I say remarkable because this happened well over a decade ago, when I was a relatively fresh-faced software engineer.  My manager had called me in for a chat — quarterly review or some such. He said something that stopped me in my tracks.

“Do you want to follow the technical track or the management track in your career?”

Yikes!  I remember panicking.  On an otherwise unremarkable morning, I had unexpectedly come to a crossroads in my career.  Did I want the organizational clout and higher paychecks of management?  Or would I stick with the technical stuff that I so loved?

Of course, this turned out to be a comical overreaction on my part.  My answer didn’t, in any way, bind me for life.  And the whole thing was something of a false dichotomy anyway.  But it did get me thinking about what I would later regard as the software architecture career path.

The Software Architecture Career Path

I certainly wasn’t alone in my confusion over what becomes of programmers as they advance in their careers.  Some continue programming indefinitely, while others, eagerly or reluctantly, become managers and climb the corporate ladder.  But the software architecture career path splits the difference in a confusing variety of ways.

I challenge you to find a job title with as much variance as “software architect.”  The title itself has many different flavors:

  • Software architect
  • Application architect
  • Technical architect
  • Solutions architect
  • Enterprise architect

You get the idea.  But beyond the title variance, you also see wide diversity in responsibilities.  Some architects are literally just programmers with developmental job titles.  Others are highly technical, mentoring developers and approving solutions.  Still others are more like project managers or business analysts.  It really varies.

And in this variance lies opportunity.  You can take the opportunity to find a role that suits you well in terms of responsibilities while also advancing your career.

For the rest of this post, I’m going to talk about how to take advantage of this opportunity.  What skills do you need, and how do you find success?  Bear in mind, I’m also going to describe the different flavors of architects in somewhat broad strokes.  The industry doesn’t have a great consensus on what each of these roles really means, so I’m basing this on my own experience.  Whatever you call the different flavors of architect is less important than what they do in their roles and whether the role might suit you.

The Unit Test Effect Study, Refined

About a month ago, I wrote a post about how unit tests affect (and apparently don’t affect) codebases.  That post turned out to be quite popular, which is exciting.  You folks gave a lot of great feedback about where we might next take the study.  I’ve incorporated some of that feedback and have a follow-up on the unit test effect on codebases.

Summarized briefly, here are the high points of this second installment and refinement of the study:

  • Eliminating the “buckets” from the last time.
  • Introducing more statistical rigor.
  • Qualifying and refining conclusions from last time.

Also, for the purposes of this post, please keep in mind that non-incorporation of feedback is not a rejection of that feedback.  I plan to continue refinement but also to keep posting about progress.

Addressing Some of the Easier Questions and Requests

Before getting started, I’ll answer a few of the quicker-to-answer items that arose out of the comments.

Did your analysis count unit test methods when assessing cyclomatic complexity, etc.?

Yes.  It might be interesting to discount unit test methods and re-run analysis, and I may do that at some point.

Can you show the code you’re using?  Which codebases did you use?

The scraping/analysis tooling I’ve built using the NDepend API is something that I use in my consulting practice and is in a private repo.  As for the list of specific codebases, I’m thinking I’ll publish that following the larger sample size study.  In the most general terms, I’m going through pages like this that list (mostly) C# repos and using their links.

What about different/better categorization of unit test quality (test coverage, bolted on later vs. written throughout vs. demonstrably test driven)?  

This is definitely something I want to address, but the main barrier here is how non-trivial this is to assess from a data-gathering perspective.  So I will do this, but it will also take time.

Think of even just the anecdotally “easy” problem of determining TDD vs. non-TDD.  I approximated this by positing that test-driving will create a certain ratio of test methods to production methods since any production method will be preceded by a test method (notwithstanding future extract method refactorings).  We could, perhaps, do better by auditing source control history and looking for a certain commit cadence (modification to equal numbers of test/production classes, for instance).  But that’s hard, and it doesn’t account for situations with large batch commits, for instance.

The upshot is that it’s going to take some doing, but I think we collectively can figure it out.
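The ratio-based approximation described above can be sketched in a few lines.  Treat this as an illustration only: the TddHeuristic class is hypothetical, and the default cutoff of one test method per production method is an assumption, not the threshold used in the study.

```csharp
using System;

public static class TddHeuristic
{
    // Ratio of test methods to production methods; 0 when there is no production code.
    public static double TestToProductionRatio(int testMethods, int productionMethods)
    {
        if (productionMethods <= 0)
        {
            return 0.0;
        }
        return (double)testMethods / productionMethods;
    }

    // Hypothetical cutoff: flag a codebase as "possibly test-driven" when it has
    // at least `threshold` test methods per production method.
    public static bool LooksTestDriven(int testMethods, int productionMethods, double threshold = 1.0)
    {
        return TestToProductionRatio(testMethods, productionMethods) >= threshold;
    }
}
```

A heuristic like this is cheap to compute from static analysis alone, which is its appeal.  It says nothing about when the tests were written, which is exactly the weakness the commit-history idea tries to address.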

Lack of Cohesion of Methods: What Is This And Why Should You Care?

Lack of cohesion of methods (sometimes abbreviated LCOM) is one of those things that occurs fairly high up on the software hierarchy of needs.  What’s the “software hierarchy of needs?”  It’s a thing that I just made up, shamelessly copying Maslow’s Hierarchy of Needs.  (Though in a quick search to see how effectively I’ve planted this flag, I see that Scott Hanselman had the idea before me.  But mine looks a little different than his because I’m talking about the software, rather than the humans writing it.)

Anyway, here’s the hierarchy.

  • The cost of change stays relatively flat because the software is highly maintainable.
  • You can change it over time in response to evolving requirements and market realities.
  • It performs and behaves acceptably in production.
  • Technically, it fulfills the functional requirements and user value proposition.
  • It simply exists in production.

The software hierarchy of needs.

When it comes to something like lack of cohesion of methods, you’re talking about the top two categories.  Early in your career, you’ll tend to have worries like just wheezing past the finish line with a feature or getting your code to do what you want it to.  With practice and time, you then start worrying about non-functional concerns and maintainability.  And when you start worrying about that, you should start paying attention to lack of cohesion of methods (among other things).

So for the rest of this post, I’ll focus on this one idea: lack of cohesion of methods.  What is it, how do you measure it, and why should you care?
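As a preview of the "how do you measure it" part, here is a minimal sketch of one common LCOM formulation, which, as far as I understand it, matches the one NDepend documents: LCOM = 1 - sum(MF) / (M * F), where M is the number of methods in a type, F is the number of instance fields, and MF is the number of methods that access a given field.  The Cohesion class and its input shape are assumptions for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Cohesion
{
    // methodCount:     M, the number of methods in the type.
    // methodsPerField: one entry per instance field, holding how many methods access it (MF).
    // Returns 0 when every method touches every field; values near 1 mean low cohesion.
    public static double Lcom(int methodCount, IReadOnlyCollection<int> methodsPerField)
    {
        int fieldCount = methodsPerField.Count;
        if (methodCount == 0 || fieldCount == 0)
        {
            return 0.0; // nothing to relate, so treat the type as cohesive
        }

        double sumMf = methodsPerField.Sum();
        return 1.0 - sumMf / (methodCount * (double)fieldCount);
    }
}
```

For a type with two methods that both use both of its fields, the result is 0.  For a type with four methods where each of its two fields is touched by only one method, the result is 1 - 2/8, or 0.75.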

A Guide to Code Coverage Tools for C#

I promise that I’ll get to a treatment of code coverage tools in short order.  But first, I want to offer a quick disclaimer and warning.

If you’re here because you’re looking for a way to let management track the team’s code coverage progress, please stop and reconsider your actions.  I’ve written in the past about how code coverage should not be a management concern, and that holds as true now as ever.  Code coverage is a tool that developers should use — not something that should be done to them.  Full stop.

What Is Code Coverage and Why Do It?

Okay, with that out of the way, let’s talk extremely briefly about what code coverage is.  And take note that this is going to be a very simple explanation, glossing over the fact that there are actually (slightly) different ways to measure coverage.  But the short form is this.  Code coverage measurements tell you which lines of code your test suite executes and which lines it doesn’t.

Let’s look at the simplest imaginable example.  Here’s a pretty dubious implementation of division:

Let’s say this was the only method in our codebase.  Let’s also say that we wrote a single unit test, from which we invoked Divide(2, 1) and asserted that we should get back 2.  We would then have 67% code coverage.  Why?  Well, this test would cause the runtime to test the conditional and then to execute the return x/y statement, thus executing two-thirds of the method.  Absent any other tests, no automated test ever executes that last line.
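For reference, a minimal sketch of a method that fits this description.  The Calculator class name and the fallback return value are assumptions; any three-line shape with one untested branch produces the same two-out-of-three arithmetic.

```csharp
using System;

public static class Calculator
{
    // A deliberately dubious take on division: it hides the divide-by-zero case
    // behind a made-up fallback value instead of reporting it.
    public static int Divide(int x, int y)
    {
        if (y != 0)          // executed by Divide(2, 1)
            return x / y;    // executed by Divide(2, 1)
        return 0;            // never executed by that single test (fallback value is an assumption)
    }
}
```

With only the Divide(2, 1) test in place, two of the three lines run, which is where the 67% comes from.  Adding a test that passes 0 for y would exercise the last line and bring coverage to 100%.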

So in the grossest of terms, that’s the “what.”  As for the “why,” developers can use code coverage analysis to see holes in their testing efforts.  For instance, code coverage analysis would tell us here, “Hey, you need to test the case where you pass a 0 to the method for y.”

What Are Code Coverage Tools?

What, then, are code coverage tools?  Well, as you can imagine, computing code coverage the way I just did would get pretty labor intensive.

So developers have done what developers do.  They’ve automated code coverage detection.  In just about any language and tech stack imaginable, you have ways to detect how thoroughly the unit test suite covers your codebase.  (Don’t confuse code coverage with an assessment of the quality of your tests, though.  It’s just a measure of whether or not the tests execute the code.)

The result is an array of tools to choose from that help you see how well your test suite covers your codebase.  And that can become somewhat confusing.  So today, I’ll take you through some of your options in the .NET ecosystem.
