Ah, the old “versus” Google search. Invariably, you’re in the research stage of some decision when you type this word into a search engine. Probably not something like Coke vs Pepsi. Maybe “C# vs Java for enterprise projects” or “angular vs react.” Or if you landed here, perhaps you’re looking at “log4net vs NLog.”
With a search like this, you expect a certain standard script. The writer should describe each one anecdotally, perhaps with a history. Then comes the matrix with a list of features and checks and exes for each one, followed by a sober list of strengths and weaknesses. Then, with a flourish, I should finish with a soggy conclusion that it really depends on your needs, but I maybe kinda sorta like one better.
I’m not going to do any of that.
Log4net vs NLog: Which Should You Use? I Dunno.
Let’s be clear here. This post isn’t intended to compare features. It’s intended to compare the effects they have on codebases that use them—or, at least, what features of codebases correlate with their usage.
If you haven’t seen it before, I’m doing a series of posts in which we use an expansive amount of data to study codebase properties. It’s sort of like a Freakonomics-inspired look at source code. NDepend’s API and CQLinq query language provide a unique capability to gather data about codebases. So we’ve gathered that data, analyzed it, and run statistical regressions on it. Over the course of this series, we’ve written about questions such as
- How does functional-style programming affect codebases?
- What effects does unit testing have on codebases?
- What differences distinguish codebases that use singletons from ones that don’t?
Today, I’m going to look at a similar question: what different properties do codebases that use NLog exhibit from codebases that use log4net? Will this tell you which one you should use? Of course not. But will it be interesting? I sure think so.
Methodology and Caveats
Before diving into the results, let me speak a bit to methodology. We’ve got almost 600 codebases that we’ve pulled into our corpus, built, and analyzed. I started to think it would be interesting to see which NuGet packages correlated with different codebase properties. So I parsed the packages.config files in each codebase to get a list of all the NuGet packages it used. This means that a codebase counts as using a package if any of the assemblies in that codebase have it installed.
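To make that package-detection step concrete, here is a minimal sketch of how it could look in Python. It is not the tooling we actually used. The function names are made up for illustration, and it assumes the standard packages.config layout, where each installed package appears as a `<package id="..." />` element.

```python
import os
import xml.etree.ElementTree as ET


def packages_in_codebase(root_dir):
    """Return the set of NuGet package ids found in any packages.config under root_dir."""
    package_ids = set()
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            if name.lower() != "packages.config":
                continue
            tree = ET.parse(os.path.join(dirpath, name))
            # Each installed package is a <package id="..." version="..." /> element.
            for package in tree.getroot().iter("package"):
                package_id = package.get("id")
                if package_id:
                    package_ids.add(package_id)
    return package_ids


def uses_package(root_dir, package_id):
    """True if any packages.config under root_dir lists the package (case-insensitive)."""
    wanted = package_id.lower()
    return any(p.lower() == wanted for p in packages_in_codebase(root_dir))
```

Calling `uses_package(path_to_codebase, "NLog")` sorts a codebase into the NLog group as soon as a single packages.config anywhere under it mentions NLog, which is exactly the "installed anywhere" rule described above.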
Of our total codebases, 20 have NLog installed and 21 have log4net installed. Using those codebases, we split them into two “values” on an X-axis and ran regression analysis against them with dozens of quantified codebase properties (e.g., method cyclomatic complexity, type rank, etc.).
For the last few rounds of the study, we’ve used relationships with p-values below 0.05, meaning that, if no real relationship existed, we’d see a result at least this strong less than 5% of the time. With this study, because we were mainly looking to see if a narrative would emerge suggesting more in-depth research, we relaxed that standard to 0.15. This gives us a more fleshed-out story here, but it slightly raises the chances of some false positives.
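If the idea of regressing against a two-valued X-axis sounds odd, here is a rough sketch of what it boils down to. The numbers are invented purely for illustration and are not from our corpus. Also, rather than the regression tooling we used, the sketch swaps in an exact permutation test to get a p-value, because that needs nothing beyond the Python standard library.

```python
from itertools import combinations
from statistics import mean


def slope_binary(x, y):
    """OLS slope of y on x. With x coded 0/1, this equals the difference in group means."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for the difference in group means."""
    pooled = group_a + group_b
    observed = abs(mean(group_a) - mean(group_b))
    indices = range(len(pooled))
    hits = total = 0
    # Try every way of relabeling the pooled values into two groups of the same sizes.
    for chosen in combinations(indices, len(group_a)):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            hits += 1
        total += 1
    return hits / total


# Hypothetical lines of code per method for four codebases in each group.
log4net_loc = [9.0, 11.0, 10.0, 12.0]
nlog_loc = [8.0, 9.0, 7.0, 10.0]

x = [0] * len(log4net_loc) + [1] * len(nlog_loc)  # 0 = log4net, 1 = NLog
y = log4net_loc + nlog_loc

slope = slope_binary(x, y)
p_value = permutation_p_value(log4net_loc, nlog_loc)
print(f"slope = {slope:+.2f} lines per method, p = {p_value:.3f}")
```

With these made-up numbers, the slope is -2 (the NLog group averages two fewer lines per method) and the p-value works out to 10/70, or about 0.14. That would miss the 0.05 cutoff from earlier rounds but pass the relaxed 0.15 one used here.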
So this gives us some things to bear in mind as you read through the rest of the article:
- We have not (yet) quantified how heavily each codebase actually uses the package. We only know that it’s installed.
- Compared to the last study, any given “statistically significant” relationship has a slightly higher chance of being a false positive.
- This evaluation has nothing to do with the code of NLog or log4net. It looks only at the code of the codebases using them.
- Correlation (what we’re discussing here) does not mean causation. None of this is to say that use of these packages causes these differences in code. It could be the other way around or it could be that there’s a third, common cause.
- The idea here is mainly to find directions for future, more rigorous research. This is a blog post—not a white paper or a scientific publication. I want to raise awareness that we can use data in relatively unprecedented ways to study codebases.
So that said, let’s take a look at the findings.
1. NLog Codebases Are a Little “Cleaner” than log4net Ones at the Method Level
Across these codebases, differences jumped out at the method level. We have two relationships with a p-value of around 0.1, and both see serious differences in the quantities they represent.
- Codebases using NLog have two fewer lines of code per method.
- Codebases using NLog have 1.2 fewer variables per method.
Now, those p-values aren’t great, but the trend also isn’t isolated. If we were to relax the p-values a little further, the NLog codebases also have less nesting depth, less cyclomatic complexity, and smaller fan-out. So, while we might expect a false positive among all of that, the broader trend slopes toward NLog codebases in terms of method simplicity.
2. NLog Codebases Feel Somewhat More Dated
The relative simplicity of the NLog codebases at the method level makes this section somewhat unexpected. The NLog codebases have a number of properties that make them feel somewhat more dated than the log4net codebases. Here’s a list of differences:
- The NLog codebases have a substantially higher rate of explicit interface implementation, perhaps indicating naming collisions encouraged by the API. (This relationship has a p-value around 0.01, incidentally.)
- But on the whole, the NLog codebases have way fewer interfaces and make less use of abstraction across the board.
- NLog codebases make a lot more use of operator overloading, indexers, and class constructors.
- NLog codebases also have a lot more mutable state, as indicated by the percentage of methods that change instance state.
- And, finally, NLog codebases have much larger instance sizes.
All of this combines to form sort of a loose gestalt of older practices—things that have gradually fallen out of favor over the years.
3. Log4net Codebases Have More Tests and Better Testability
Another area of stark difference has to do with unit tests among these codebases. It’s a little counterintuitive as well. You’d think that the log4net codebases, with their larger and less “clean” methods, would have fewer unit tests. But it’s emphatically the opposite.
The NLog codebases have 10% fewer test methods, measured as the percentage of total methods in the codebase that are test methods. Probably contributing to this is the fact that the NLog codebases have busier constructors, meaning more lines of code per constructor. If you combine that with fewer interfaces, less abstraction, more mutable state, and more class constructors, a picture emerges: relative simplicity at the method level, but growing testing difficulties as you zoom out to coarser-grained code elements.
Log4net codebases thus appear to be both easier to test and more tested.
4. Codebases Using These Libraries Have a Curious Relationship With Boxing and Unboxing
I’ll close with something that I find to be a bit mysterious. I suspect this might be directly attributable to API differences between the two logging packages, but I’m not experienced enough with the NLog API to say for sure. (This would make a good piece of follow-up research.)
- At both the method and type level, NLog codebases use less boxing than the log4net codebases.
- At both the method and type level, NLog codebases use more unboxing than log4net codebases.
Boxing and unboxing are the processes of converting value types to reference types and extracting value types from reference types, respectively. Generally speaking, the prevalence of these two tends to correlate in codebases.
So it’s a weird outcome that two sets of codebases would vary the way NLog and log4net do. One set really likes to convert value types to reference types, while the other really likes to extract value types from reference types.
So, What Should You Take Away?
So what’s the takeaway here? Is this guide going to help you decide between these two tools? I wouldn’t recommend using it that way, myself. I promise you that there’s nothing stopping you from keeping your methods compact while you use log4net, nor is there anything stopping you from writing unit tests while using NLog.
Rather, what I hope you take away from this is the idea that the packages and libraries you choose differ in more than just your experience of using them. They also exert influence on how you code and the nature of your code. So keep this in mind as you choose them, and ask yourself frequently what sorts of design decisions your tools steer you toward, however subtly.