This post has been about a month in the offing. Back in August, I wrote about what the singleton pattern costs you. This prompted a good bit of discussion, most of which was (as it always is) anecdotal. So a month ago, I conceived of an experiment that I called the singleton challenge. Well, the results are in. I’m going to quantify the impact of the singleton design pattern on codebases.
I would like to offer an up-front caveat. I’ve been listening lately to a fascinating audiobook called “How to Measure Anything,” and it has some wisdom for this situation. Measurement is primarily about reducing uncertainty. And one of the driving lessons of the book is that you can measure things — reduce uncertainty — without getting published in a scientific journal.
I mention that because it’s what I’ve done here. I’ll get into my methodology momentarily, but I’ll start by conceding that I didn’t (and couldn’t) control for all variables. I looked for correlation as a starting point because going for causation might prove prohibitive. But I think I took a much bigger bite out of trying to quantify this than anyone has so far — or at least, than anyone whose work I’ve seen.
A Quick Overview of the Methodology
As I’ve mentioned in the past on this blog, I earn a decent chunk of my consulting income doing application portfolio assessments. I live and breathe static code analysis. So over the years, I’ve developed an arsenal of techniques and intellectual property.
This IP includes an extensive codebase assessor that makes use of the NDepend API to analyze codebases en masse, store the results, and report on them. So I took this thing and pointed it at GitHub. I then stored information about a lot of codebases.
But let’s get specific. Here’s a series of quick-hitter bullets about the experiment that I ran:
- I found this page with links to tons of C# projects on GitHub, so I used that as a “random” selection of codebases that I could analyze.
- I gave my mass analyzer an ordered list of the codebase URLs and turned it loose.
- Anything that didn’t download, decompress, or compile properly (migrating to .NET Core, restoring NuGet packages, and building from the command line), I discarded. This probably creates a bias toward better codebases.
- For the codebases that remained, I built all solutions in the directory structure and analyzed all compiled, non-third-party DLLs.
- I stored the results in my database and queried it to produce the figures in the rest of the post.
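To give a feel for the pipeline, here’s a minimal sketch of that download-build-discard loop. To be clear, this is my illustrative reconstruction, not the author’s actual tooling: the helper names (`RunStep`, `CollectUsable`) are hypothetical, and the real analyzer drives the NDepend API directly rather than shelling out as shown here.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical sketch of the mass-analysis loop: clone each repo, restore
// packages, build, and keep only the codebases where every step succeeds.
public static class MassAnalyzer
{
    // Run an external command; a nonzero exit code means the step failed.
    public static bool RunStep(string fileName, string args, string workingDir)
    {
        var psi = new ProcessStartInfo(fileName, args)
        {
            WorkingDirectory = workingDir,
            UseShellExecute = false
        };
        using var p = Process.Start(psi);
        if (p == null) return false;
        p.WaitForExit();
        return p.ExitCode == 0;
    }

    public static List<string> CollectUsable(IEnumerable<string> repoUrls)
    {
        var usable = new List<string>();
        foreach (var url in repoUrls)
        {
            var dir = Guid.NewGuid().ToString("N");
            // Short-circuit: any failed step discards the codebase, which
            // biases the sample toward codebases that actually build.
            bool kept = RunStep("git", $"clone --depth 1 {url} {dir}", ".")
                     && RunStep("dotnet", "restore", dir)
                     && RunStep("dotnet", "build --no-restore", dir);
            if (kept) usable.Add(dir);
        }
        // Analysis of the compiled, non-third-party DLLs (via the NDepend
        // API) and storage of the results would happen here.
        return usable;
    }
}
```

The short-circuit `&&` chain mirrors the discard rule: there’s no point restoring packages for a repo that didn’t clone, or building one that didn’t restore.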
I should also note that, while I invited anyone to run analysis on their own code, nobody took me up on it. (By all means, still do it, if you like.)
Singleton Design Pattern: The Results in Broad Strokes
First, let’s look at the scope of the experiment in terms of the code I crunched. I analyzed:
- 100 codebases
- 986 assemblies
- 5,086 namespaces
- 72,615 types
- 501,257 methods
- 1,495,003 lines of code
From there, I filtered the raw numbers down a bit. I won’t go into all of the details because that would make this an immensely long post. But suffice it to say that I discounted certain pieces of code, such as compiler-generated methods, default constructors, etc. I adjusted the dataset so we’d look exclusively at code that developers on these projects actually wrote.
Now, let’s look at some statistics regarding the singleton design pattern in these codebases. I used NDepend’s functionality for detecting singletons, and I also used it to distinguish between stateless singleton implementations and ones containing mutable state. Here’s how that breaks down:
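To make the stateless/mutable distinction concrete, here’s an illustrative pair of singletons. These are my own toy examples, not code from any analyzed project, and the class names are invented for the illustration.

```csharp
using System;
using System.Globalization;

// Stateless singleton: one instance, globally reachable, but no instance
// fields -- shared access is harmless because there is nothing to mutate.
public sealed class StatelessFormatter
{
    public static StatelessFormatter Instance { get; } = new StatelessFormatter();
    private StatelessFormatter() { }

    public string Format(decimal amount) =>
        amount.ToString("0.00", CultureInfo.InvariantCulture);
}

// Singleton with mutable state: the shared counter makes every caller's
// behavior depend on global, order-sensitive state -- the costlier variety.
public sealed class RequestCounter
{
    public static RequestCounter Instance { get; } = new RequestCounter();
    private RequestCounter() { }

    public int Count { get; private set; }   // mutable shared state
    public void Record() => Count++;
}
```

The first is essentially a namespaced pure function; the second means any code anywhere can observe or perturb `Count`, which is what makes the mutable-state variant worth counting separately.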