I have been working on .NET development full-time since 2002, and one point still annoys me after all these years: the default .NET build behavior wastes resources, and very few .NET shops address the issue!
For example, if we build a solution with a console app referencing a library, by default the library DLL gets duplicated:
- one instance can be found in the library output directory
- one instance can be found in the console output directory because the console requires it to run
This duplication behavior can be easily solved with a Directory.Build.props file at the root of the solution with this content:
<?xml version="1.0" encoding="utf-8"?>
Now there is a single binary directory defined at the root of the solution and a single instance of the library DLL. Moreover, the console app is still runnable.
I know many developers who’d argue: “Who cares? I have a 2TB SSD and need to focus on getting the job done!”
- First, nowadays most builds occur in CI/CD on VMs with size limits and costs, so a build output that weighs many GB can make a real difference in terms of budget and feasibility. To quote a Roslyn engineer from the page “Always use hard links when building in CI”: “One effect of this is it makes us fit within the size budget of the built-in AzDo Windows agents, which are slower, but highly available”. Another quote: “This is used in our CI because it gets our build size down to a point where we can run on stock azure VM images (IIRC they have a ~10 Gig limit that we violate without hard links)”.
- Second, writing fewer bytes can prolong the life of an SSD, since flash cells endure only a limited number of write cycles.
- Third and foremost, what kind of developer are you? I like being a minimalist developer. If I need only 4 instances of a library (
\Release\net472) there is no need to get dozens of instances. Each unnecessary copy is a potential enemy that might not get refreshed properly and might eventually cause a problem. This is a bit like code duplication. The worst part is that the amount of waste grows with the square of the number of projects.
Moreover, this redundancy issue used to have a severe impact on build time, but this has been fixed (since VS 2013 if I remember correctly). Nevertheless, as we’ll see below, a significant impact remains (15% extra build duration with DLL duplication). Dealing with N copies of a DLL is necessarily slower than dealing with a single instance.
Case Study: The Roslyn \bin\artifacts directory footprint is 11.8 GB
The footprint of the \bin\artifacts directory of Roslyn v6.1 is 11.8 GB!
The 3.29 MB Roslyn DLL Microsoft.CodeAnalysis.dll is duplicated 117 times for a total footprint of 377 MB!
The 24 MB DLL Microsoft.CodeAnalysis.Test.Utilities.dll is duplicated 75 times for a footprint of 1.75 GB!
Scanning this directory with the WinDirStat tool sheds light on the set of Roslyn DLLs duplicated many times.
Interestingly enough, the WinDirStat treemap view of directory content was an inspiration for the NDepend Code Metrics view. Applied to the NDepend code base, which has 87% code coverage by tests, this zoomable view is of great help in deciding what to test next. Moreover, some code coverage rules prevent coverage regression on core classes that need to remain 100% covered.
Hard Link to the Rescue
It is possible to configure MSBuild to use NTFS hard links and drastically reduce the size of the binaries. Here is a quote from this link: “A hard link is a file that represents another file on the same volume without duplicating the data of that file. More than one hard link can be created to point at the same file. Hard links cannot link to a file that is on a different partition, volume or drive. Hard links on directories are not supported as it would lead to inconsistencies in parent directory entries.” Since MSBuild v4.0 there is support for hard links:
MSBuild.exe project.msbuild /p:CreateHardLinksForCopyLocalIfPossible=true;CreateHardLinksForCopyFilesToOutputDirectoryIfPossible=true;CreateHardLinksForCopyAdditionalFilesIfPossible=true;CreateHardLinksForPublishFilesIfPossible=true
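To avoid repeating these switches on every command-line build, the same four properties could be set once in a props file. The following is only a sketch: the property names are the ones from the command line above, while placing them in a Directory.Build.props file at the root of the solution is an assumption made for illustration.

```xml
<Project>
  <PropertyGroup>
    <!-- Assumption: this file is a Directory.Build.props at the solution root,
         so MSBuild imports it into every project below that folder. -->
    <CreateHardLinksForCopyLocalIfPossible>true</CreateHardLinksForCopyLocalIfPossible>
    <CreateHardLinksForCopyFilesToOutputDirectoryIfPossible>true</CreateHardLinksForCopyFilesToOutputDirectoryIfPossible>
    <CreateHardLinksForCopyAdditionalFilesIfPossible>true</CreateHardLinksForCopyAdditionalFilesIfPossible>
    <CreateHardLinksForPublishFilesIfPossible>true</CreateHardLinksForPublishFilesIfPossible>
  </PropertyGroup>
</Project>
```

As the "IfPossible" suffix suggests, MSBuild should fall back to a regular copy when a hard link cannot be created, for instance when source and destination sit on different volumes.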
However, a major issue is still open: “MSBuild disables hard linking when building in Visual Studio”.
In the linked issue we can see that Roslyn can be compiled using hard links. Here are the benefits:
| |Build Time|Binaries Size (Explorer)|Binaries Size (DU)|
|No hard link|4:00|13.9 GB|14.9 GB|
Do Care for your Build Output
The MSBuild hard link solution doesn’t sound ideal to me. Structuring the build output well deserves some care and effort. For example, since the inception of NDepend in 2004, we have segregated executables from libraries. Libraries are all compiled in a .\Lib folder. Over time a few extra folders appeared, but we never had to deal with the duplication syndrome.
We use AppDomain.CurrentDomain.AssemblyResolve to redirect the resolution of library assemblies from the executables at runtime. This API is still supported in .NET 7, 6 … despite being AppDomain-related. See the linked explanations of how we use this AssemblyResolve API in the context of the NDepend open-source PowerTools to resolve the NDepend library assemblies.
It is likely that your .NET shop suffers from this wasting syndrome. Enjoy being a minimalist developer: a developer who cares, a developer who enjoys working in a well-controlled environment. Make sure that each artefact in your development environment is required and is not unnecessarily duplicated. To do so you can get inspiration from the root Directory.Build.props file shown in the introduction (with <BaseOutputPath>..\bin</BaseOutputPath>) and adapt it to your own environment.
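As a starting point, here is a minimal sketch of what such a root Directory.Build.props could look like. Only the BaseOutputPath value comes from this article; the surrounding Project and PropertyGroup elements are the standard MSBuild skeleton. The sketch also assumes that every project folder sits directly under the solution root, because a relative path is resolved against each project's own directory.

```xml
<?xml version="1.0" encoding="utf-8"?>
<Project>
  <PropertyGroup>
    <!-- Value quoted in this article. Assumption: projects live one level below
         the solution root, so ..\bin resolves to one shared folder for all of them. -->
    <BaseOutputPath>..\bin</BaseOutputPath>
  </PropertyGroup>
</Project>
```

With SDK-style projects, the configuration and target framework are by default still appended to this base path, so each library ends up once per combination, for example under \Release\net472. If your projects are nested at different depths, a path built from $(MSBuildThisFileDirectory) is a common alternative to the relative ..\bin.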
There is actually a competition of minimalist coders: the 4KB demo scene. While we all have TB hard drives, some coders are pushing the limit of what can be achieved with only 4,096 bytes. This is freaking cool if you ask me.