NDepend

Improve your .NET code quality with NDepend

The Singleton Design Pattern: Impact Quantified

This post has been about a month in the making.  Back in August, I wrote about what the singleton pattern costs you.  This prompted a good bit of discussion, most of which was (as it always is) anecdotal.  So a month ago, I conceived of an experiment that I called the singleton challenge.  Well, the results are in.  I’m going to quantify the impact of the singleton design pattern on codebases.

I would like to offer an up-front caveat.  I’ve been listening lately to a fascinating audiobook called “How to Measure Anything,” and it has some wisdom for this situation.  Measurement is primarily about reducing uncertainty.  And one of the driving lessons of the book is that you can measure things — reduce uncertainty — without getting published in a scientific journal.

I mention that because it’s what I’ve done here.  I’ll get into my methodology momentarily, but I’ll start by conceding that I didn’t (and couldn’t) control for all variables.  I looked for correlation as a starting point because going for causation might prove prohibitive.  But I think I took a much bigger bite out of quantifying this than anyone has so far.  If anyone has done more, I’ve never seen it.

A Quick Overview of the Methodology

As I’ve mentioned in the past on this blog, I earn a decent chunk of my consulting income doing application portfolio assessments.  I live and breathe static code analysis.  So over the years, I’ve developed an arsenal of techniques and intellectual property.

This IP includes an extensive codebase assessor that makes use of the NDepend API to analyze codebases en masse, store the results, and report on them.  So I took this thing and pointed it at GitHub.  I then stored information about a lot of codebases.

But let’s get specific.  Here’s a series of quick-hitter bullets about the experiment that I ran:

  • I found this page with links to tons of C# projects on GitHub, so I used that as a “random” selection of codebases that I could analyze.
  • I gave my mass analyzer an ordered list of the codebase URLs and turned it loose.
  • Anything that didn’t download, decompress, or compile properly (migrating to Core, restoring NuGet packages, and building from the command line), I discarded.  If anything, this probably creates a bias toward better codebases.
  • Minus problematic codebases, I built all solutions in the directory structure and made use of all compiled, non-third-party DLLs for analysis.
  • I stored the results in my database and queried the same for the results in the rest of the post.

I should also note that, while I invited anyone to run analysis on their own code, nobody took me up on it.  (By all means, still do it, if you like.)

Singleton Design Pattern: The Results in Broad Strokes

First, let’s look at the scope of the experiment in terms of the code I crunched.  I analyzed

  • 100 codebases
  • 986 assemblies
  • 5,086 namespaces
  • 72,615 types
  • 501,257 methods
  • 1,495,003 lines of code

From there, I filtered down raw numbers a bit.  I won’t go into all of the details because that would make this an immensely long post.  But suffice it to say that I discounted certain pieces of code, such as compiler-generated methods, default constructors, etc.  I adjusted this so we’d look exclusively at code that developers on these projects wrote.
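
To give a flavor of that filtering, here’s the kind of CQLinq query involved.  Consider this a sketch rather than my exact production query; it assumes the standard CQLinq method properties (IsGeneratedByCompiler, IsConstructor, NbLinesOfCode) behave as documented.

```csharp
// Sketch of the filtering idea (not the exact query used for this post).
from m in Application.Methods
where !m.IsGeneratedByCompiler                    // drop compiler-generated methods
   && !(m.IsConstructor && m.NbLinesOfCode == 0)  // drop default constructors
select m
```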

Now, let’s look at some statistics regarding the singleton design pattern in these codebases.  NDepend has functionality for detecting singletons, which I used.  I also used more of its functionality to distinguish between stateless singleton implementations and ones containing mutable state.
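
To make that distinction concrete, here’s a minimal sketch of both flavors (illustrative code, not drawn from any of the analyzed projects):

```csharp
// Stateless singleton: a single instance, but no shared mutable data.
public sealed class MathHelper
{
    public static readonly MathHelper Instance = new MathHelper();
    private MathHelper() { }

    public double Square(double x) => x * x;  // depends only on its inputs
}

// Singleton with mutable state: a single instance AND shared, changeable data.
public sealed class AppSettings
{
    public static readonly AppSettings Instance = new AppSettings();
    private AppSettings() { }

    public string ConnectionString { get; set; }  // global mutable state
}
```

Here’s how that breaks down: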

Continue reading The Singleton Design Pattern: Impact Quantified

Static analysis of .NET Core 2.0 applications

NDepend v2017.3 has just been released with major improvements. One of the most requested features, now available, is support for analyzing .NET Core 2.0 and .NET Standard 2.0 projects. .NET Core and its main flavor, ASP.NET Core, represent a major evolution for the .NET platform. Let’s have a look at how NDepend analyzes .NET Core code.

Resolving .NET Core third party assemblies

In this post I’ll analyze the OSS application ASP.NET Core / EntityFramework MusicStore hosted on GitHub. From the Visual Studio solution file, NDepend resolves the application assembly MusicStore.dll and also two test assemblies that we won’t analyze here. In the screenshot below, we can see that:

  • NDepend recognizes the .NET profile, .NET Core 2.0, for this application.
  • It resolves several folders on the machine that are related to .NET Core, especially NuGet package folders.
  • It resolves all 77 third-party assemblies referenced by MusicStore.dll. This is important since many code rules and other NDepend features take into account what the application code is using.

It is worth noting that the .NET Core platform assemblies are highly granular. A simple website like MusicStore references no fewer than 77 assemblies. This is because the .NET Core framework is implemented through a few NuGet packages that each contain many assemblies. The idea is to ship the application with only the assemblies it needs, in order to reduce the memory footprint.

.NET Core 2.0 third party assemblies granularity

NDepend v2017.3 has a new heuristic to resolve .NET Core assemblies. This heuristic is based on .deps.json files that contain the names of the NuGet packages referenced. Here we can see that 3 NuGet packages are referenced by MusicStore. From these package names, the heuristic will resolve third-party assemblies (in the NuGet store) referenced by the application assemblies (MusicStore.dll in our case).

NuGet packages referenced in .deps.json file
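
For readers who haven’t looked inside one of these files, here’s a trimmed sketch of the shape the heuristic parses. The package names and versions below are placeholders for illustration, not the actual MusicStore manifest:

```json
{
  "targets": {
    ".NETCoreApp,Version=v2.0": {
      "MusicStore/1.0.0": {
        "dependencies": {
          "Microsoft.AspNetCore.All": "2.0.0",
          "Microsoft.EntityFrameworkCore": "2.0.0"
        },
        "runtime": { "MusicStore.dll": {} }
      }
    }
  },
  "libraries": {
    "Microsoft.AspNetCore.All/2.0.0": { "type": "package" }
  }
}
```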

Analyzing .NET Standard assemblies

Let’s be clear that NDepend v2017.3 can also analyze .NET Standard assemblies. Interestingly enough, since .NET Standard 2.0, .NET Standard assemblies reference a single assembly named netstandard.dll, found in C:\Users\[user]\.nuget\packages\NETStandard.Library\2.0.0\build\netstandard2.0\ref\netstandard.dll.

By decompiling this assembly, we can see that it doesn’t contain any implementation, but it does declare all the types that are part of .NET Standard 2.0. This makes sense if we remember that .NET Standard is not an implementation but a set of APIs implemented by various .NET profiles, including .NET Core 2.0, .NET Framework 4.6.1, Mono 5.4, and more.
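
As a rough sketch (not the literal decompiler output), the reference assembly declares the full API surface with placeholder bodies:

```csharp
// Illustrative shape of a decompiled reference assembly such as netstandard.dll:
// complete type and member signatures, but no implementation behind them.
namespace System
{
    public sealed class String
    {
        public int Length { get { throw null; } }
        public static bool IsNullOrEmpty(String value) { throw null; }
        // ...and so on for every member of every .NET Standard 2.0 type
    }
}
```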

Browsing how the application is using .NET Core

Let’s come back to the MusicStore application that references 77 assemblies. This assembly granularity makes it impractical to browse dependencies with the dependency graph, since the graph would contain dozens of items. We can see that NDepend suggests viewing this graph as a dependency matrix instead.

NDepend Dependency Graph on an ASP.NET Core 2.0 project

The NDepend dependency matrix scales seamlessly to a large number of items. The numbers in the cells also provide a good hint about the represented coupling. For example, here we can see that 22 members of the assembly Microsoft.EntityFrameworkCore.dll are used by 32 methods of the assembly MusicStore.dll, and a menu lets us dig into this coupling.

NDepend Dependency Matrix on an ASP.NET Core 2.0 project

Clicking the menu item Open this dependency shows a new dependency matrix where only the members involved are kept (the 32 elements in the columns use the 22 elements in the rows). This way, you can easily dig into which part of the application is using what.

NDepend Dependency Matrix on an ASP.NET Core 2.0 project

All NDepend features now work when analyzing .NET Core

We saw how to browse the structure of a .NET Core application, but let’s emphasize that all NDepend features now work when analyzing .NET Core applications. On the Dashboard, we can see code quality metrics related to Quality Gates, Code Rules, Issues, and Technical Debt.

NDepend Dashboard on an ASP.NET Core 2.0 project

Also, most of the default code rules have been improved to avoid reporting false positives on .NET Core projects.

NDepend code rules on an ASP.NET Core 2.0 project

We hope you’ll enjoy using all your favorite NDepend features on your .NET Core projects!

Understanding Cyclomatic Complexity

Wander the halls of an enterprise software outfit looking to improve, and you’ll hear certain things.  First and foremost, you’ll probably hear about unit test coverage.  But, beyond that, you’ll hear discussion of a smattering of other metrics, including cyclomatic complexity.

It’s actually sort of funny.  I mean, I understand why this happens, but hearing middle managers say “test coverage” and “cyclomatic complexity” has the same jarring effect as hearing developers spout business-meeting-speak.  It’s just not what you’d naturally expect.

And you wouldn’t expect it for good reason.  As I’ve argued in the past, code coverage shouldn’t be a management concern.  Nor should cyclomatic complexity.  These are shop-heavy specifics about particular code properties.  If management needs to micromanage at this level of granularity, you have a systemic problem.  You should worry about these properties of your code so that no one else has to.

With that in mind, I’d like to focus specifically on cyclomatic complexity today.  You’ve probably heard this term before.  You may even be able to rattle off a definition.  But let’s take a look in great detail to avoid misconceptions and clear up any hazy areas.

Defining Cyclomatic Complexity

First of all, let’s get a specific working definition.  This is actually surprisingly difficult because not all sources agree on the exact method for computing it.

How can that be?  Well, the term was dreamed up by a man named Thomas McCabe back in 1976.  He wanted a way to measure “the number of linearly independent paths through a program’s source code.”  But beyond that, he didn’t specify the mechanics exactly, leaving that instead to implementers of the metric.
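
In practice, most implementations boil down to counting a method’s decision points and adding one for the default path.  Here’s a quick, hypothetical C# illustration:

```csharp
// Three linearly independent paths, so a cyclomatic complexity of 3:
// one baseline path plus one for each decision point.
public static string ClassifyAge(int age)
{
    if (age < 0)       // decision point: +1
        return "invalid";
    if (age < 18)      // decision point: +1
        return "minor";
    return "adult";    // baseline path: 1
}
```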

He did, however, give it an intimidating-sounding name.  I mean, complexity makes sense, but what does “cyclomatic” mean, exactly?  Well, “cyclomatic number” serves as an alias for something more commonly called circuit rank.  Circuit rank measures the number of independent cycles within a cyclic graph.  So I suppose he coined the neologism “cyclomatic complexity” by borrowing a relatively obscure discrete math concept for path independence and applying it to code complexity.

Well then.  Now we have cyclomatic complexity, demystified as a term.  Let’s get our hands dirty with examples and implications.

Continue reading Understanding Cyclomatic Complexity

Understanding the Difference Between Static And Dynamic Code Analysis

I’m going to cover some relative basics today.  At least, they’re basics when it comes to differentiating between static and dynamic code analysis.  If you’re new to the software development world, you may have no idea what I’m talking about.  Of course, you might be a software development veteran and still not have a great idea.

So I’ll start from basic principles and not assume you’re familiar with the distinction.  But don’t worry if you already know a bit.  I’ll do my best to keep things lively for all reading.

Static and Dynamic Code Analysis: an Allegory

So as not to bore anyone, bear with me as I plant my tongue in cheek a bit and offer an “allegory” that neither personifies intangible ideas nor has any real literary value.  Really, I’m just trying to make the subject of static and dynamic code analysis the slightest bit fun on its face.

So pull your fingers off the keyboard and let’s head down to the kitchen.  We’re going to do some cooking.  And in order to do that, we’re going to need a recipe for, say, chili.

We all know how recipes work in the general life sense.  But let’s break the cooking activity into two basic components.  First, you have the part where you read and synthesize the recipe, prepping your materials and understanding how things will work.  And then you have the execution portion of the activity, wherein you do the actual cooking — and then, if all goes well, the eating.

Static and Dynamic Recipe Analysis

Having conceived of preparing the recipe in two lights, think in a bit more detail about each activity.  What defines them?

First, the recipe synthesis.  Sure, you read through it to get an overview from a procedural perspective, rehearsing what you might do.  But you also make inferences about the eventual results.  If you’ve never actually had chili as a dish, you might contemplate the ingredients and what they’d taste like together.  Beef, tomato sauce, beans, spicy additives…an idea of the flavor forms in your head.

You can also recognize the potential for trouble.  The recipe calls for cilantro, but you have a dinner guest allergic to cilantro.  Yikes!  Reading through the recipe, you anticipate that following it verbatim will create a disastrous result, so you tweak it a little.  You omit the cilantro and double check against other allergies and dining preferences.

But then you have the actual execution portion of preparing a recipe.  However imaginative you might be, picturing the flavor makes a poor substitute for experiencing it.  As you prepare the food, you sample it for yourself so that you can make adjustments as you go.  You observe the meat to make sure it really does brown after a few minutes on high heat, and then you check on the onions to make sure they caramelize.  You observe, inspect, and adapt based on what’s happening around you.

Then you celebrate success by throwing cheese on the result and eating until you’re uncomfortably full.

Continue reading Understanding the Difference Between Static And Dynamic Code Analysis

The Role of Static Analysis in Testing

“What do you do?”

In the United States, people ask this almost immediately upon meeting one another for the first time.  These days, I answer the question by saying that I do IT management consulting.  That always feels kind of weird rolling off the tongue, but it accurately describes how I’ve earned a living.

If you’re wondering what this means, basically I advise leadership in IT organizations.  I help managers, directors, and executives better understand how to manage and relate to the software developers in their groups.  So you might (but hopefully won’t) hear me say things like, “You should stop giving out pay raises on the basis of who commits the most lines of code.”

In this line of work, I get some interesting questions.  Often, these questions orient around how to do more with less.  “How can we keep the business happy when we’re understaffed?”  “What do we do to get away from this tech debt?”  “How should we prioritize our work?”  That sort of thing.

Sometimes, they get specific.  And weird.  “If we do this dependency injection thing, do we really need to deploy as often?”  Or “If we implement static analysis, do we still need to do QA?”

I’d like to focus on the latter question today — but not because it’s a particularly good or thought-provoking one.  People want to do more with less, which I get. But while that particular question is a bit of a non sequitur, it does raise an interesting discussion topic: what is the role of static analysis in testing?

Static Analysis in Testing: An Improbable (But Real) Relationship

If you examine it on the surface, you won’t notice much overlap between testing and static analysis.  Static analysis involves analyzing code without executing it, whereas QA involves executing the code without analyzing it (among other things).

A more generous interpretation, however, starts to show a relationship.  For instance, one could argue that both activities relate deeply to code quality.  Static analysis speaks to properties of the code and can give you early warnings about potential problems.  QA takes a black box approach to examining the code’s behavior, but it can confirm the problems about which you’ve received warnings.

But let’s dive even a bit deeper than that.  The fact that they have some purview overlap doesn’t speak to strategy.  I’d like to talk about how you can leverage static analysis as part of your testing strategy — directly using static analysis in testing.

Continue reading The Role of Static Analysis in Testing

How Has Static Code Analysis Changed Through the Years?

Years ago, I found myself staring at about 10 lines of source code.  This code had me stymied, and I could feel the onset of a headache as I read and re-read it.  What did this brain-bending code do, you might wonder?  It sorted elements in an array.

Now you probably assume that I was learning to code at the time.  Interestingly, no.  I had worked gainfully as a software developer for years and was also completing a master’s degree in computer science.  In fact, the pursuit of this degree had brought me to this moment of frustration, rubbing my temples and staring tiredly at a simple bubble sort in C.

Neither inexperience nor the difficulty of the code had brought me to that point.  Rather, I struggled to formally prove the correctness of this tiny program, in the mathematical sense.  I had enrolled in a class called “Formal Methods of Software Development” that taught the math underpinning our lives as programmers.

This innocuous, simple bubble sort had seven execution paths.  Here was number five, from a piece of homework I kept in my digital files.

Code analysis ranges from the academic to the pragmatic.

Hopefully I now seem less like an incompetent programmer and more like a student grappling with some weighty concepts.  But why should a simple bubble sort prove so difficult?  Well, the short answer is that actually proving things about programs with any complexity is really, surprisingly hard.  The longer answer lies in the history of static code analysis, so let’s take a look at that.

Continue reading How Has Static Code Analysis Changed Through the Years?

Is Your Team Wrong About Your Codebase? Prove It. Visually.

I don’t think I’ll shock anyone by pointing out that you can find plenty of disagreements among software developers.  Are singletons evil?  Is TDD a good idea (or dead)?  What’s the best IDE?  You can see this dynamic writ large across the internet.

But you can also see it writ small among teammates in software groups.  You’ve seen this before.  Individuals or small camps form around certain competing ideas, like how best to lay out the unit test suite or whether or not to use a certain design pattern. In healthy groups, these disagreements take the form of friendly banter or good-natured ribbing.  In less healthy groups, they create an us vs. them kind of dynamic and actual resentment.

I’ve experienced both flavors of this dynamic in my career.  Having to make concessions about how you do things is never fun, but group work requires it.  And so you live with the give-and-take of this in healthy groups.  But in an unhealthy group, frustration mounts with no benefit of positive collaboration to mitigate it.  This holds doubly true when one of the two sides has the decision-making authority or perhaps just writes a lot of the code and claims a form of squatter’s rights.

Status Quo Preservation

Division into camps can, of course, take many forms.  But I think the one you see most commonly happens when you have a group of developers or architects who have laid the ground rules for the codebase and then a disparate group of relative newcomers that want to change the status quo.

I once coined a term for a certain archetype in the world of software development: the expert beginner.  Expert beginners wind up in decision-making positions by default and then refuse to be swayed in the face of mounting evidence, third party opinions, or, well, really anything.  They dig in and convince themselves that they’re right about all matters relating to the codebase, and they express no interest in hearing dissenting opinions.  This commonly creates a toxic, adversarial dynamic, and it leaves the rest of the group feeling helpless and frustrated.

Of course, this cuts the other way as well.  Sometimes the longest-tenured decision makers of the group earned their position for good reason and acquit themselves well in defense of their positions.  Perhaps you shouldn’t adopt every passing fad and trend that comes along.  And these folks might find it tiresome to relitigate past architectural decisions ad nauseam every time a new developer hires on.  It probably doesn’t help when newbies throw around pejorative terms like “legacy code” and “the old way,” either.

Continue reading Is Your Team Wrong About Your Codebase? Prove It. Visually.

Code Quality Metrics: Separating the Signal from the Noise

Say you’re working in some software development shop and you find yourself concerned with code quality metrics.  Reverse engineering your team’s path to this point isn’t terribly hard because, in all likelihood, one of two things happened.

First, it could be that the team underwhelmed someone, in some business sense — too many defects, serially missed deadlines, that sort of thing.  In response to that, leadership introduced a code quality initiative.  And you can’t improve what you can’t measure.  For that reason, you found yourself googling “cyclomatic complexity” to see why the code you just wrote suddenly throws a warning.

The second option is internal motivation.  The team introduced the metrics of its own accord.  In this capacity, they serve as rumble strips on the side of your metaphorical road.  Doze off at the wheel a little, get a jolt, and correct course.

In either case, an odd sort of gulf emerges between the developers and the business.  And I think of this gulf as inherently noisy.

Code Quality Metrics for Developers

I spend a lot of time consulting with software shops.  And shops hiring consultants like me generally have code quality improvement initiatives underway.  As you can imagine, I see an awful lot of code metrics.

Here are some of the code quality metrics I see tracked most commonly.  I don’t mean for this list to be exhaustive.

  • Lines of code.  (This is an interesting one because, in aggregate, it’s often used to track progress.  But at smaller granularities, like types and methods, people correlate it negatively with code quality — “that method is too big.”)
  • Cyclomatic complexity: the number of execution paths that exist through a given unit of code.  Less is more.
  • Unit test coverage: the percentage of paths through your code executed by your unit test suite.  More is more.
  • Static analysis tool/lint tool violations count: run a tool that provides automated code checks and then count the number of issues.
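
To ground the first two of those in something concrete, here’s a small, hypothetical method annotated with its values (exact counts vary by tool and convention):

```csharp
// Lines of code: 5 logical statements in the method body (conventions vary).
// Cyclomatic complexity: 4 (baseline 1, plus two ifs, plus one && short-circuit).
public static decimal ApplyDiscount(decimal price, bool isMember, int itemCount)
{
    if (price < 0m)
        throw new System.ArgumentOutOfRangeException(nameof(price));
    if (isMember && itemCount > 10)
        return price * 0.8m;   // 20% discount for members buying in bulk
    return price;
}
```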

As software developers, we can easily understand these concepts and internalize them.  But to explain to the business why these matter requires either a good bit of effort or a “just trust us.”  After all, the business won’t understand these concepts as more than vague generalities.  There’s more testing coverage, or something…that sounds good, right?

These metrics can therefore carry noise, in the sense that how much they matter for business outcomes remains unclear.

Continue reading Code Quality Metrics: Separating the Signal from the Noise

What is Static Analysis? An Explanation for Everyone

Static analysis, as a concept, seems to earn itself a certain reputation.  The general population may regard programming as a technocratic, geeky pursuit.  But inside the world of programmers, static analysis has that equivalent rap.  It’s a geeky subject even among geeks.

I suspect this arises from the academic flavor to static analysis. You hear terms like “halting problem,” “satisfiability,” and “correctness proofs,” and you find yourself transported back to some 400-level discrete math course from your undergrad.  And that’s assuming you did a CS undergrad.  If not, your eyes might glaze over.  Oh, and googling “static analysis” only to see things like this probably doesn’t help:

A static analysis screenshot that scares anyone looking at it

I have two CS degrees, concentrated heavily on the math side of things, and I specialize in static analysis. And that featured image makes my eyes glaze over.  So let’s hit the reset button here.  Let’s make the subject at least approachable and maybe, just maybe, even interesting.

Defining Static Analysis Dead Simply

Whether you’re a grizzled programming veteran, fresh out of a bootcamp, or can’t program a lick, you can understand the concept.  I’ll use an analogy first, to ease into things.

When you write software, you write instructions in a format that you and other programmers understand.  A program called the compiler (in many cases) then translates these into terms that computers understand and eventually into automation output.  So think of programming as writing a grocery list for a personal shopper.  You write down what you want, in easily understood terms.  The personal shopper then maps this list to his knowledge of the grocery store’s layout and eventually produces output in the form of food that he brings you.

What, then, is static analysis in this world?  Well, it’s analyzing the written grocery list itself and using it to speak to what the grocery shopping and groceries will be like.  For instance, you might say, “Wow, 140 watermelons, huh?  We’re going to need to rent a truck, so that’s going to cost you extra.”

When it comes to writing code, people usually reason about it by running it and seeing what happens.  In our world, that means the shopper simply takes the list, goes on the shopping trip, and sees how things go.  “Wow, this is a lot of watermelon,” he says as he fills the 15th cart full of the things.  Only then does he start to understand the ramifications of this.

Static analysis capitalizes on the fact that you can understand things about the upcoming grocery run without actually executing it.
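
To make that less abstract, here’s a tiny, hypothetical C# method containing the sort of problem a static analyzer reports without executing anything:

```csharp
using System.Collections.Generic;

public static class Inventory
{
    // A static analyzer can flag this without running the code: 'items' gets
    // checked for null, then dereferenced unconditionally anyway.
    public static int CountItems(List<string> items)
    {
        if (items == null)
            System.Console.WriteLine("Received a null list");
        return items.Count;  // possible NullReferenceException on the null path
    }
}
```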

Continue reading What is Static Analysis? An Explanation for Everyone

How to Evaluate Your Static Analysis Process

I often get inquiries from clients and prospects about setting up and operationalizing static analysis.  This makes sense.  After all, we live in a world short on time and with software developers in great demand.  These clients always seem to have more to do than bandwidth allows.  And static analysis effectively automates subtle but important considerations in software development.

Specifically, it automates peer review to a certain extent.  The static analyzer acts as a non-judging, mute reviewer of sorts.  It also stands in for a tiny bit of QA’s job, calling attention to possible issues before they leave the team’s environment.  And, finally, it helps you out by acting as architect.  Team members can learn from the tool’s guidance.
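
With NDepend in particular, that reviewer-and-architect guidance takes the form of editable CQLinq rules.  A rule in the following style (a sketch modeled on the stock rules, not quoted verbatim from them) flags overly complex methods before any human reviewer sees them:

```csharp
// Warn whenever any method matches the query below.
warnif count > 0
from m in JustMyCode.Methods
where m.CyclomaticComplexity > 20   // the threshold is a team choice
orderby m.CyclomaticComplexity descending
select new { m, m.CyclomaticComplexity }
```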

So, as I’ve said, receiving setup inquiries doesn’t surprise me.  And I applaud these clients for pursuing this path of improvement.

What does surprise me, however, is how few organizations seem to ask another, related question.  They rarely ask for feedback about the efficacy of their currently implemented process.  Many organizations seem to consider static analysis implementation a checkbox kind of activity.  Have you done it?  Check.  Good.

So today, I’ll talk about checking in on an existing static analysis implementation.  How should you evaluate your static analysis process?

Continue reading How to Evaluate Your Static Analysis Process