NDepend

Improve your .NET code quality with NDepend

Case Study : Complex UI Testing

In the previous post, Case Study: 2 Simple Principles to achieve High Code Maintainability, I explained that layered code and a high test coverage ratio are 2 simple principles that can be objectively applied, validated and measured. When these 2 principles are applied they lead to High Code Maintainability: as a consequence, management saves money in the long term and developers are happy to work in a cleaner code base with principles that are easy to follow.

Large and complex UI 90%+ covered by tests

In this previous post I used the example of the new NDepend v2020.1 graph. This new tool is a large and complex UI with dozens of actions offered to the user and drastic performance requirements (it scales live on 100,000+ elements). This graph implementation is 90% covered by tests. Having a lot of UI code is no reason not to test it well. We didn’t spend a good part of our resources writing tests just for the sake of it. We did it because we know from experience that it will pay off: a few bugs will probably be reported, as for every 1.0 implementation, although the beta test phases already caught some. But we are confident that it won’t take a lot of resources to fix them. We can look forward to the future confidently (like properly supporting .NET 5, which will be released in November 2020). And indeed, 10 days after its 1.0 release no bug has been reported (nor logged) on this new graph although many users downloaded it: so far it looks rock-solid and we can focus on what’s next.

The picture below shows all namespaces, classes and methods of the graph implementation. Smaller rectangles are methods and the color of each rectangle indicates how well a method is covered by tests. Clearly we tolerate some gaps in UI code, while non-UI code like the Undo/Redo action implementations is 100% covered. Experience taught us how to balance our resources and that not everything has to be perfect to achieve high maintainability.

NDepend Graph Implementation 90% covered by tests

How did we achieve High Coverage Ratio on UI Code?

It is easy: we have a simple MVC (Model View Controller) design. Some controller classes contain the logic for all actions the user can do and those classes pilot the UI. Concretely in our scenario actions are: load/save, change group-by, change layout direction, zoom, generate a call graph for this method, change filters…

Then we wrote a test suite that first starts the UI and then invokes all actions. Each complex peculiarity of each action gets fully tested, hence complex actions get invoked several times by tests but differently each time, to make sure all scenarios get tested.
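A minimal sketch of this design, under stated assumptions: the names GraphModel, GraphController, ZoomBy and ChangeLayoutDirection are hypothetical and are not the actual NDepend types. It only illustrates the idea that each user action is a controller method, so a test can invoke it directly, several times and in different ways.

```csharp
using System;

// Hypothetical model: the state that the view renders.
public enum LayoutDirection { TopToBottom, LeftToRight }

public sealed class GraphModel {
    public double Zoom { get; set; } = 1.0;
    public LayoutDirection Direction { get; set; } = LayoutDirection.TopToBottom;
}

// Hypothetical controller: every user action is a public method,
// so a test can drive the UI the same way a user would.
public sealed class GraphController {
    public GraphModel Model { get; } = new GraphModel();

    // Raised after each action so that a view can redraw itself.
    public event Action ModelChanged;

    public void ZoomBy(double factor) {
        if (factor <= 0) throw new ArgumentOutOfRangeException(nameof(factor));
        // Clamp so that repeated zooming never produces a degenerate scale.
        Model.Zoom = Math.Min(16.0, Math.Max(0.0625, Model.Zoom * factor));
        ModelChanged?.Invoke();
    }

    public void ChangeLayoutDirection(LayoutDirection direction) {
        Model.Direction = direction;
        ModelChanged?.Invoke();
    }
}
```

A test would create the controller, subscribe a view (or a counter) to ModelChanged, then call ZoomBy with an ordinary factor and with a factor large enough to hit the clamp, so that both paths of the action get exercised.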

The video below shows the UI under testing: more than 40 actions get tested in less than a minute. It would take more than an hour to do all this work manually and any change in code could potentially ruin the validity of manual tests.

In such a complex UI there are many classes that are not directly related to UI. For example the graph of classes that describes the underlying model is tested separately.

As usual, a side benefit of writing tests is better design: the code gets structured in a way that makes it easy to invoke through tests. Concretely some abstractions are introduced (that wouldn’t make sense without tests), some classes and some methods get split, some logic gets refined and as a result developers are happy to live in a code base where the logic is smoothly implemented.

High Coverage Ratio is not Enough: Assertions to the rescue

Typically at this point comes the remark: but code coverage is not enough, results must be asserted. And indeed, if nothing gets asserted nothing gets tested, even if the code is entirely covered by tests. We want tests to fail when something goes wrong.

Of course our tests contain many assertions. For example, load/save actions are invoked and asserted this way:
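The actual NDepend test code is not reproduced here. As a minimal sketch of what a load/save assertion can look like, assume a hypothetical GraphSettings class that saves itself to a line of text and loads itself back; the test then asserts that a save followed by a load restores the same state.

```csharp
using System;
using System.Globalization;

// Hypothetical persisted state of the graph (not an NDepend type).
public sealed class GraphSettings {
    public string GroupBy { get; set; } = "Namespace";
    public bool TopToBottom { get; set; } = true;
    public double Zoom { get; set; } = 1.0;

    // Saves to a single line: GroupBy;TopToBottom;Zoom
    // (this toy format assumes GroupBy contains no ';').
    public string Save() =>
        string.Join(";", GroupBy, TopToBottom.ToString(),
                    Zoom.ToString("R", CultureInfo.InvariantCulture));

    public static GraphSettings Load(string text) {
        string[] parts = text.Split(';');
        if (parts.Length != 3) throw new FormatException("Expected 3 fields.");
        return new GraphSettings {
            GroupBy = parts[0],
            TopToBottom = bool.Parse(parts[1]),
            Zoom = double.Parse(parts[2], CultureInfo.InvariantCulture)
        };
    }
}

public static class LoadSaveTest {
    // Returns true when a save followed by a load restores the same state.
    public static bool RoundTripPreservesState(GraphSettings original) {
        GraphSettings reloaded = GraphSettings.Load(original.Save());
        return reloaded.GroupBy == original.GroupBy
            && reloaded.TopToBottom == original.TopToBottom
            && reloaded.Zoom == original.Zoom;
    }
}
```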

But these assertions are not enough. By definition the UI code contains tons of visual peculiarities represented by states that can potentially be corrupted. As a consequence our UI code is stuffed with thousands of assertions: everything that can be asserted gets asserted.

  • A Rectangle with width/height in certain range
  • The state of a node or an edge when another element gets selected (is it a caller, a callee…?).
  • The current application state when a new graph is demanded by the controller.
  • The graph UI contains many asynchronous computations to avoid UI freezing. This leads to many assertions to check that mutable states are not corrupted by concurrent accesses.

All those asserted states would be hard to reach from test code. However they get naturally accessed by the UI code itself, so this is the right place to assert that they are not corrupted.

Btw, we still use the good old System.Diagnostics.Debug.Assert(…) for that. It has several advantages:

  • It is simple.
  • It is understood by tools like Roslyn/ReSharper/CodeRush analyzers.
  • An assertion that fails cannot be missed, both when running automatic tests and when running manual tests on the Debug mode version.
  • Debug assertions are removed by the compiler in Release mode: assertions are not executed in production and users get better performance. The idea is to not consider users as testers: code released in production is supposed to be rock-solid. Assertions are like scaffolding that gets removed when a building gets delivered. If there is still a bug we’ll discover it from user feedback, from production logs or from our own manual tests.

Debug.Assert(…) is enough for us, and it is understandable that some other teams want a more sophisticated assertion framework. The key is to get into the habit of asserting everything that can be asserted when writing code (UI code or not). Each assertion is a guard that helps make the code rock-solid. Each assertion also improves code readability. At code-review time we’ve all been wondering: can this integer be zero? Can this string be empty? Can this reference be null? Hopefully C# 8 nullable reference types take care of the last question, but so many questions remain open without assertions.
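To make this concrete, here is a minimal sketch of assertions embedded in UI code rather than in tests. The NodeBox class and its size range are made up for the example. Debug.Assert is marked with [Conditional("DEBUG")], so these calls are compiled away when the DEBUG symbol is not defined.

```csharp
using System.Diagnostics;

// Hypothetical UI element; names and ranges are invented for this sketch.
public sealed class NodeBox {
    public const double MinSize = 1.0;
    public const double MaxSize = 10000.0;

    public double Width { get; private set; }
    public double Height { get; private set; }

    public NodeBox(double width, double height) {
        Resize(width, height);
    }

    public void Resize(double width, double height) {
        // Contracts on the arguments: an out-of-range size is a bug in the caller.
        Debug.Assert(width >= MinSize && width <= MaxSize, "width out of range");
        Debug.Assert(height >= MinSize && height <= MaxSize, "height out of range");
        Width = width;
        Height = height;
    }

    public double Area {
        get {
            double area = Width * Height;
            // Contract on the result, checked each time the UI reads the value.
            Debug.Assert(area >= MinSize * MinSize, "area below minimum");
            return area;
        }
    }
}
```

Any test that drives the UI through a resize or a redraw exercises these assertions for free, without the test having to reach into NodeBox itself.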

Design by Contract

This idea of stuffing code with assertions is actually an important software correctness methodology named DbC, Design by Contract, that is really worth knowing. Contracts mean much more than the usual approach with exceptions:

  • Explicitly throwing an exception says: zero is tolerated, it is not a bug, but you won’t get the result you’d like, be prepared to catch some exceptions.
  • Writing a contract says: don’t even think of passing a zero value. The real type of the argument is not Int32, it is [Int32 minus 0]. Ideally such a violation could be caught by compilers and analyzers (and is indeed sometimes caught, as we saw in the screenshot above).
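The difference between the two styles can be sketched side by side. The Ratio class below is a hypothetical example, not code from NDepend: the first method tolerates zero and reports it with an exception, the second states through a contract that zero must never be passed.

```csharp
using System;
using System.Diagnostics;

public static class Ratio {
    // Exception style: zero is a tolerated input and the caller
    // is expected to be prepared to catch the exception.
    public static int PercentOrThrow(int part, int total) {
        if (total == 0) throw new ArgumentException("total must not be zero", nameof(total));
        return part * 100 / total;
    }

    // Contract style: passing zero is a bug in the caller.
    // The assertion documents the real domain of the argument: Int32 minus 0.
    public static int Percent(int part, int total) {
        Debug.Assert(total != 0, "total must not be zero");
        return part * 100 / total;
    }
}
```

In a Release build the assertion is removed, so a caller that still violates the contract gets whatever the remaining code does (here a DivideByZeroException), which is why the violation has to be caught earlier, in Debug runs and tests.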

Conclusion

Any complex UI can be automatically tested as long as:

  • It is well designed with some controllers that pilot the UI and that can be invoked from tests.
  • UI code gets stuffed with assertions to make sure that no state becomes corrupted at runtime.

In short, assertions embedded in the tested code matter as much as assertions embedded in tests. If an assertion gets violated there is a problem, no matter where the assertion is located, and it must not be missed nor ignored. This powerful idea doesn’t apply only to UI code and is known as DbC, Design by Contract.

Actually in this post I added a third principle to achieve high code maintainability and high code correctness: layered code + high coverage ratio by test + contracts

Case Study: 2 Simple Principles to achieve High Code Maintainability

High Code Maintainability is the key to make both the management and the developers happy:

  • Maintainability lets a product evolve naturally at a sustained pace with controlled cost.
  • Maintainability lets developers add new features and improve existing ones without spending most of their time refactoring old dusty code and fixing bugs.

After 16 years of development on our product NDepend (first release in April 2004!) we came to the conclusion that:

Highly Maintainable Code can be achieved through two simple, objective and verifiable principles: Layered Architecture and High Test Coverage Ratio

Layered Architecture prevents entangled code, the well-known spaghetti code phenomenon. Dependencies are kept under control, and when it is time for the code to evolve, new classes and interfaces naturally integrate with existing ones.

High Test Coverage Ratio means that when code covered by tests gets refactored, existing tests get impacted. With little effort the developer discovers regression problems and fixes them before they go to production and become bugs to fix. The more code is covered by tests, the more you’ll benefit from this shield.

When writing a tool for developers, the most satisfying part is to challenge the tool on its own code: this practice is named dogfooding. We just completely rewrote the dependency graph of NDepend, so let’s use this important refactoring as a case study. Then we’ll see how to automate the validation of these principles.

Case Study: Layered Architecture

Let’s first present the layered architecture principle and then the test coverage principle.

See below a graph of the 250+ classes, interfaces and enumerations used to implement the new dependency graph. An SVG vector dependency graph of the 2,500+ classes, methods and fields is available here.

The class GraphController is selected:

  • The blue classes are the ones directly used by GraphController
  • The light-blue classes are the ones indirectly used by GraphController (indirectly means used by a class used by a class … used by GraphController). Clearly GraphController relies on everything.
  • The red classes are the ones mutually dependent with GraphController.

The NDepend Dependency Graph used to visualize its own code

Several things can be said on how this code is structured:

  • This is not an API so we can use namespaces the way we want. Here namespaces implement the concept of components.
  • Box size is proportional to the number of lines of code. We can see that the overall namespaces box size is well balanced. This is a good practice to avoid having a few monster components and tons of smaller components.
  • The biggest component in terms of number of classes and lines of code is the implementation of the Undo/Redo system. More than 30 actions are implemented (expand/collapse, change GroupBy, select/unselect, generate a call graph…). These actions are relatively low level in the structure. While they act on the entire system they are not coupled with the controller, the UI rendering or the layout computation.
  • The two lowest components are Base and Model. Both contain little logic and are used by almost all other components.

In the future, whether we add new actions on the graph or decide to improve the layout somehow, this architecture won’t undergo drastic modifications. Thanks to this view it’ll be easy to decide in which component to add our new classes or if new components should be added and what they can and cannot use.

Ideally the GraphControl class shouldn’t be entangled with the GraphController class. These two classes have been developed together. See below the coupling graph between GraphController and GraphControl. It has been obtained by double-clicking the red arrow between the two classes. It wouldn’t be difficult to introduce an interface to inject one implementation into the other, but we didn’t do it. This is the key when it comes to caring for maintainability: which move will offer the highest ROI? Not everything has to be perfect just for the sake of it. Experience shows that having only two classes entangled does not impact maintainability much. We estimated that spending our resources on satisfying the two principles has a better ROI in the long run.

Coupling Graph between GraphController class and GraphControl class

 

Case Study: High Test Coverage Ratio

The graph implementation is 90% covered by tests. Having a lot of UI code is no reason not to test it well. We didn’t spend a good part of our resources writing tests just for the sake of it. We did it because we know from experience that it will pay off: a few bugs will probably be reported, as for every 1.0 implementation, although the beta test phases already caught some. But we are confident that it won’t take a lot of resources to fix them. We can look forward to the future confidently (like properly supporting .NET 5, which will be released in November 2020).

The picture below shows all namespaces, classes and methods. Smaller rectangles are methods and the color of each rectangle indicates how well a method is covered by tests. Clearly we tolerate some gaps in UI code, while non-UI code like the Undo/Redo action implementations is 100% covered. Here also experience tells us how to balance our resources and that not everything has to be perfect to achieve high maintainability.

NDepend Graph Implementation 90% covered by tests

In terms of lines of code the NDepend Graph is not even 5% of the entire product; it is one tool in the toolset. The worst-case scenario would be that each tool implementation regularly produces some bugs: all our resources would be spent fixing them, we couldn’t continue adding value to the product and the business would probably die at some point. Not to mention the frustration of users of a buggy product.

Each year we fix a few dozen bugs that each impact few users, but that doesn’t take us more than a tiny percentage of our overall development resources. The overall code base is 86.5% covered by tests and is entirely layered: maintenance doesn’t cost us much.

Typically at this point comes the remark: but code coverage is not enough, results must be asserted by unit tests. And indeed, if nothing gets asserted nothing gets tested, even if the code is entirely covered by tests. We want tests to fail when something is going wrong. In the next post, Case Study : Complex UI Testing, I explain how millions of assertions get checked while running our test suite against the graph implementation.

Automatically Validate Layered Architecture and High Test Coverage Ratio

NDepend offers hundreds of default code rules but only 4 of them are used to validate these key points:

The fourth rule, Avoid namespaces mutually dependent, helps a lot when layering a large super-component. In this situation the first thing to do is to make sure that no pair of components use each other. For each matched pair of namespaces, this rule applies a heuristic and tells which type should not use which other type, and the same at method level. A technical-debt estimation is also given in terms of the development effort it will cost to fix each pair. Here it says that 11 man-days (8 hours a day) should be spent if someone decides to layer the NHibernate code base. Unfortunately this is not possible because it would break the thousands of client code bases bound to it. Note that an interest estimation is also given: how much development effort it costs per year if the issues are left unfixed. Here this rule estimates that leaving all those entangled pairs of namespaces unfixed costs the development team 5 man-days a year.

Avoid Namespaces Mutually Dependent with advice on what to do and cost estimations
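The core check behind such a rule, finding pairs of namespaces that use each other, can be sketched in a few lines. This is only an illustration of the idea under simple assumptions (dependencies given as plain namespace-to-namespace pairs); it is not how the NDepend rule is implemented, and NamespaceLayering is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NamespaceLayering {
    // Each dependency reads: a type in namespace From uses a type in namespace To.
    public static List<(string A, string B)> MutuallyDependentPairs(
        IEnumerable<(string From, string To)> dependencies) {

        // Keep distinct cross-namespace edges only; usage inside one namespace is irrelevant.
        var edges = new HashSet<(string From, string To)>(
            dependencies.Where(d => d.From != d.To));

        return edges
            .Where(e => edges.Contains((e.To, e.From)))           // the reverse edge also exists
            .Where(e => string.CompareOrdinal(e.From, e.To) < 0)  // report each pair only once
            .Select(e => (A: e.From, B: e.To))
            .OrderBy(p => p.A, StringComparer.Ordinal)
            .ThenBy(p => p.B, StringComparer.Ordinal)
            .ToList();
    }
}
```

An empty result means no two namespaces use each other directly, which is the first step toward layering; longer dependency cycles would need a separate check.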

These rules can be validated during the build process (Azure DevOps / TFS, Jenkins, TeamCity, Bamboo, SonarQube…) and the team can know when the new code written diverges from these two maintainability goals.

 

Conclusion: Objective, Verifiable, Simple

What is interesting with these two simple concepts, layering and code coverage, is that they can be objectively applied, validated and measured. Last year, in 2019, I wrote a blog post series on SOLID principles and there has been so much debate about how to apply them in the real world. SOLID principles are a great way to improve our understanding of Object Oriented Programming and how encapsulation, abstraction, polymorphism, inheritance … should be used and not used. But when it comes to writing maintainable code everyone has a different opinion.

If it is decided that the code structure should be layered there is not much debate about which part should be abstracted from other ones. If a class A should use a class B and B is in a higher layer than A, somehow an interface IB must be created at A level to inject the B implementation in A without breaking the layering.
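As a minimal sketch of this move, using the same names A, B and IB as in the paragraph above (the Describe and Run members are invented for the example):

```csharp
// Lower layer: A lives here, together with the interface IB it depends on.
public interface IB {
    string Describe(int value);
}

public sealed class A {
    private readonly IB _b;

    // The higher-layer implementation is injected; A never references B itself.
    public A(IB b) { _b = b; }

    public string Run() => _b.Describe(42);
}

// Higher layer: B knows about the lower layer, never the other way around.
public sealed class B : IB {
    public string Describe(int value) => "B saw " + value;
}
```

Some code in the higher layer (or a composition root) wires the two together with new A(new B()), so the dependency arrows all point downward.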

These 2 concepts emerged over the years because we had the utter need to produce maintainable code. What I really like is that they are simple. And KISS (Keep It Simple Stupid) is a great principle in software engineering.

If a third principle should be added it would definitely be about user documentation: we offer free email support to users but we also offer tons of embedded and online documentation. Every time a question starts to be asked a few times, we make sure that users can get the answer immediately from both a tooltip (or a smart UI change) and from the online documentation. Some other ISVs decide to make money with support. Personally I don’t find this fair because it is a clear incentive to produce poor documentation and hence friction for the user.

How did we obtain the images in this post

Let’s show that all those images in this post have been obtained within a few clicks.

  • First let’s search for Graph Panel in the entire NDepend code base (the matches get zoomed automatically).
  • Then let’s reset the metric view with NDepend.UI.Graph.* namespaces to get the colored treemap.
  • Then let’s go back to the graph and only keep NDepend.UI.Graph.* namespaces matched by the search.
  • Then un-group by parent assembly to get a graph made of namespaces only.
  • Then change the layout direction from Top to Bottom to have a nicer layout.
  • Then expand all namespaces to get all classes.
  • Finally expand all classes to get all methods and fields.

Using the NDepend Graph to obtain a clear view of the implementation of the Graph

 

Don’t rely on someone else to protect your software

This morning I stumbled on this post, Decompilation of C# code made easy with Visual Studio, on the Visual Studio blog. Basically VS will soon be able to not only decompile third-party code but also generate some sort of PDB information that will make the decompiled code debuggable. The promise is no more “No Symbols Loaded” or “Source Not Found” from within a debugging session in VS, and personally I find this awesome.

However most of this post’s comments are like “how do I protect my code from being decompiled then?!” and “Microsoft does not care about its customers’ intellectual property.” These comments are absurd. Since its inception in 2002, .NET compiled code can be decompiled and read crystal clear with popular tools like .NET Reflector, ILSpy, dotPeek… This is a direct consequence of having IL/byte code and a CLR with a JIT compiler. I wonder to what extent those who wrote these comments are aware of that.

Protect your Intellectual Property

The first step is to make sure that your EULA forbids decompiling your code, with something like: Licensee may not reverse-engineer, decompile, disassemble, modify, or translate the Product, or make any attempt to discover the source code of the Product;

The second step is to obfuscate your compiled code. Since 2002 those who want to protect their compiled code from decompilation just have to obfuscate it. There are mature free tools like ConfuserEx and paid tools available. This is what we have been doing successfully within our .NET shop since 2007.

Also, with .NET Native and AOT (Ahead of Time compilation) one can add a whole new layer of complexity by skipping the IL code and compiling directly to machine code (e.g. x64 instructions).

But keep in mind that obfuscators and AOT only protect the intellectual property to some extent. Your code’s best-kept secrets are still executable in both scenarios, which means they are still there. Someone skilled and ready to spend a large amount of time reverse engineering your code can still get access to your intellectual property.

The ultimate way to protect your intellectual property is to provide your services as online SaaS. This way nobody will ever have access to your code. For example the whole SEO industry is based on guessing what Google and other web search engines are doing. These algorithms are protected because nobody except Google employees has access to them. In many scenarios SaaS also means that one forces one’s clients to share their sensitive data, since it is processed on one’s servers, and in many scenarios this is not an option. By spying on your data Google and Facebook know you better than you do, but it seems that only a fraction of humanity objects to that. Ok I digressed…

Protect your software from hackers

Keep in mind that obfuscators and AOT don’t protect your code from hackers. It is still easy for any solid hacker to crack your license-checking layer and provide a free version of your software online as warez. The only possible protection against hackers who have access to your code is integrity checks. An integrity check is made of two parts:

  1. A layer that detects if some bytes of your compiled code have been tweaked (typically with a custom or standard hash function)
  2. A subtle malfunction of your software that prevents it from being used if the compiled code has been tweaked.

The whole point of an integrity check is to consume the hacker’s time. The word subtle is key: the malfunction is not a dumb exception that provokes a fail-fast. It must be something that appears minutes after the integrity check failed and finally makes the software totally unusable (like a massive memory leak, clearing some data, freezing some UIs, firing a timer to close the user session after a random number of minutes…)
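The detection part can be sketched with a standard hash function. This is a minimal illustration, not a hardened protection: the IntegrityCheck class is a hypothetical name, and where the expected hash is stored and how the result triggers a delayed malfunction are deliberately left out.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

public static class IntegrityCheck {
    // Hex-encoded SHA-256 of the given bytes.
    public static string Sha256Hex(byte[] bytes) {
        using (SHA256 sha = SHA256.Create()) {
            return BitConverter.ToString(sha.ComputeHash(bytes)).Replace("-", "");
        }
    }

    // True when the bytes still match the hash recorded at build time.
    public static bool IsUntampered(byte[] bytes, string expectedHex) =>
        string.Equals(Sha256Hex(bytes), expectedHex, StringComparison.OrdinalIgnoreCase);

    // Convenience overload for a file on disk, for example a compiled assembly.
    public static bool IsUntampered(string path, string expectedHex) =>
        IsUntampered(File.ReadAllBytes(path), expectedHex);
}
```

A single check like this is easy to spot and patch out, which is why the text insists on multiplying checks and on keeping their consequences far from the check itself.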

By multiplying the integrity checks and the corresponding subtle malfunctions one can only hope to discourage talented hackers from spending days, weeks or months cracking one’s software. But be aware that these guys are primarily driven by challenges… The good news is that with .NET there are tons of possible ways to write subtle integrity checks.

Conclusion

Protecting the intellectual property and the software itself is a difficult task. Don’t complain to Microsoft or anyone else that they should offer an out-of-the-box tool for that. If such a mainstream protection tool existed it would be a challenge for the most talented hackers and it wouldn’t resist long anyway.

The question you should ask is: is it worth spending resources to protect my assets instead of spending these resources to make my paying clients even happier? This is a difficult trade-off that must be carefully thought out. But always keep in mind that you must not rely on someone else to protect your software.

Not planning now to migrate your .NET 4.8 legacy is certainly a mistake

2020 will see the completion of the massive remodeling of the .NET platform initiated by Microsoft in November 2014 with the announcement of .NET Core, with the promise of an open-source, multi-platform and modernizable framework (thanks to no rock-solid backward compatibility constraint): everything that the .NET Framework isn’t. This U-turn in Microsoft’s plans for .NET is part of the new Microsoft strategy initiated by Satya Nadella, who became CEO of Microsoft in February 2014, succeeding Steve Ballmer.

.NET 5 will be released in November this year. Within 6 years Microsoft will have completed a platform shift like no other. The .NET Core brand was there to make clear that two .NET platforms were living side by side. But now we know that the .NET Framework 4.8 won’t evolve anymore and that all Microsoft efforts will be put into the continuation of .NET Core, with well-scheduled releases ahead. Let’s use the term .NET OSS to designate .NET Core, .NET 5, .NET 6 and later versions in the remainder of this post.

We can expect an early beta of .NET 5 before July 2020. Today in January 2020 the .NET 5.0 milestone is 72% achieved.

For now, the way to get prepared for .NET 5 and later is to migrate to .NET Core 3.1. Despite the branding change from .NET Core 3.1 to .NET 5 it is no mystery that .NET 5 will be mostly based on the current .NET Core platform.

The cost of migration from .NET 4.8 to .NET OSS can get pretty high, especially if the legacy relies on some deprecated APIs (like WCF, WF, WebForms or AppDomain). Thus it may seem attractive to stick with .NET 4.8 if your application is intended to run on Windows only.

Why is it a bad idea not to anticipate the migration now?

.NET 4.8 won’t evolve but some security patches will be provided for as long as we can foresee. However we can predict that .NET 4.8 will be quickly considered as a thing-of-the-past:

  • Developer mindset: .NET OSS and also C# will evolve. There will come a point where it’ll feel pretty awkward for .NET programmers working with 4.8 to not be able to use all the new goodies. Could you imagine programming with C# 3 nowadays?
  • Third-Party Libraries: The increasing gap between .NET 4.8 and .NET OSS will push authors of open-source libraries toward .NET OSS. The cost of maintaining two code bases will be too high for an OSS developer. If your .NET 4.8 application consumes some OSS libraries, not migrating it will put you in an awkward situation where you’ll have to maintain the consumed OSS code yourself! Certainly serious commercial libraries will be maintained on both platforms for a longer period of time, but not forever.
  • Performance: We can expect more and more performance improvements with .NET OSS.
  • Tooling: Tools will continue to evolve and, with time, fewer and fewer tools will support .NET 4.8 applications.

Recently I talked with Jean-Baptiste Evain, who develops the OSS library Cecil. Jb is also responsible for UnityVS at MS. Here at NDepend we have relied on Cecil for more than a decade. Cecil processes the bytes of compiled .NET assemblies and thus will obviously benefit from Span<T>, which is natively supported only on .NET Core. This concrete situation illustrates well the points mentioned above:

  • By using Span<T>, Jb would have to maintain two versions of Cecil, one relying on Span<T> and one for .NET 4.8, and he is not enthusiastic about it.
  • Even though these two versions will co-exist, because it is too early to discard the .NET 4.8 version of Cecil used by many serious projects, it is a matter of a few years until the .NET 4.8 version of Cecil gets deprecated.
  • Without Span<T> the .NET 4.8 version of Cecil will be slower.
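To illustrate why a library that reads assembly bytes benefits from Span<T>, here is a minimal sketch that reads a 32-bit field out of a buffer without copying anything. HeaderReader is a hypothetical helper written for this post; it is not taken from Cecil.

```csharp
using System;
using System.Buffers.Binary;

public static class HeaderReader {
    // Reads a little-endian 32-bit value at the given offset without copying any bytes.
    public static uint ReadUInt32At(ReadOnlySpan<byte> data, int offset) {
        // Slice creates a view over the same memory; no new array is allocated.
        ReadOnlySpan<byte> field = data.Slice(offset, sizeof(uint));
        return BinaryPrimitives.ReadUInt32LittleEndian(field);
    }
}
```

A byte[] converts implicitly to ReadOnlySpan<byte>, so callers can pass a whole file buffer and pick fields out of it by offset; Slice throws ArgumentOutOfRangeException if the requested range falls outside the buffer.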

Our case

NDepend is still running on .NET 4.8. NDepend is a CI tool, a standalone UI tool, an Azure DevOps extension and a Visual Studio extension. Developing an extension is a sensitive situation because we need to align our platform with the platform of the host. VS is such a massive application that I don’t expect it to run on .NET 5 in 2021. On the other hand VS is evolving so quickly nowadays that this possibility is not totally excluded. It is also possible that Microsoft takes an incremental approach and that the main VS process (devenv.exe) will remain on .NET Fx 4.8 for a while, while child processes run on .NET OSS (VS runs with quite a few child processes!).

At this point the reasonable move for us is to anticipate the migration to .NET OSS mostly by compiling as much code as possible against .NET Standard 2.0, supported by both .NET Fx 4.7.2+ and .NET Core. We also need to make sure that our WPF and Winforms code will be easily movable (which shouldn’t be a problem since most of the WPF/Winforms APIs are supported by .NET Core 3.1). We are also mulling over having our own child process(es), but all the UI part must remain in the main VS process.

We also keep in mind that it will be tricky to support the future VS version running on .NET OSS and previous VS versions running on .NET v4.8 for a few years.

Conclusion

Those like us still working on a large .NET 4.8 legacy are entering a zone of turbulence for the years to come. However, for all the reasons explained above, we can expect that before long (2023? 2025?) successful applications still running on .NET 4.8 will be the exception. Not anticipating legacy migration now is likely a strategic mistake.

4 Predictions for the Future of .NET

In May 2019, Microsoft officially announced .NET 5, the future of .NET: it will be based on all the .NET Core work already achieved. Here is the schedule announced:

On one hand the future of .NET has never been so bright. On the other hand this represents a massive move for all .NET development shops, especially for those that still target .NET Framework 4.x, which won’t evolve anymore. But not everything is clear from this announcement. Such a massive move will have many collateral consequences that we can only guess at for now. Certainly many points are not yet set in stone and are still debated.

Hence for large .NET legacy code bases some predictions must be made to plan now a seamless and timely migration toward the future of .NET. So let’s make some predictions: it’ll be interesting to come back in a few years and see how good or bad they were.

.NET Standard won’t evolve much

.NET Standard was introduced as a common API set that all .NET flavors must implement. .NET Standard superseded PCL (Portable Class Library). Now that several .NET frameworks will be unified on the .NET Core base, and that the .NET Framework 4.x won’t support future versions of .NET Standard, it sounds like the need for more .NET Standard APIs will decrease significantly. Actually .NET Framework 4.8 doesn’t even support the latest .NET Standard 2.1: “.NET Framework 4.8 will remain on .NET Standard 2.0 rather than implement .NET Standard 2.1”.

.NET Standard is certainly not dead yet: it is (and will be for years to come) an essential tool to compile code into portable components that can be reused across several .NET flavors. With this unification process, however, its long-term future is compromised.

Visual Studio will run on .NET 5 or 6 (and in an x64 process)

It has to. Imagine the consequences if 3 years from now (2019 Q4) the main Microsoft IDE for professional .NET development still runs on .NET Framework v4.8:

  • Engineers working on VS would lack access to all new .NET APIs, performance improvements and language improvements. They would remain locked in the past.
  • As a consequence they wouldn’t use their own tool (dogfooding) and dogfooding is a key aspect of developing tools for developers.
  • Overall the message sent wouldn’t be acceptable for the users.

On the other hand, if you know a bit about how VS works, imagine how massive this migration is going to be. For more than a decade there have been a lot of complaints from the community about Visual Studio not running in a 64-bit process. See some discussions on reddit here for example. If I remember well this x64 request was the most voted one when VS feedback was still handled by UserVoice. Microsoft has provided some technical explanations, like these ones from 10 years ago! If in 2019 Visual Studio still doesn’t run in an x64 process, this says a lot about how large and complex such a migration is.

It seems inevitable that this time the Visual Studio legacy will evolve toward what will be the future of .NET. One key benefit will be to run in an x64 process and have plenty of memory to work with very large solutions. Another implication is that all Visual Studio extensions, like our extension, must evolve too. Here at NDepend we are already preparing for it but it will take time, not because we’ll miss many APIs (we’ll mostly miss AppDomain) but because:

  • We depend on some third-parties that we’d like to get rid of to have full control over our migration, and overall code.
  • For several years we’ll have to support both future Visual Studio versions and Visual Studio 2019, 2017 and maybe 2015, which run on .NET Framework v4.x (btw we still support VS 2013/2012/2010 but this will have to be discarded to benefit from .NET Standard reused DLLs)

We cannot know yet whether Visual Studio vNext will run on .NET 5 or whether it’ll take more years until we see it running on .NET 6.

Btw here are 2 posts Quickly assess your .NET code compliance with .NET Standard and An in-depth analysis of .NET Core 3.0 support for WPF and Winforms APIs that can help plan your own legacy migration.

.NET will propose a cross-platform UI Framework: WPF or a similar XAML UI Framework

On October 4, 2019 Satya Nadella revealed why Windows may not be the future of Microsoft’s business. In August 2019 Microsoft provided a .NET Cross Platform UI Framework Survey. Clearly a .NET cross-platform UI Framework is wanted: the community is asking for it. So far Microsoft closed the debate about WPF: WPF won’t be multi-platform.

Let’s also be crystal clear. This (WPF cross platform) is a very hard project. If the cost was low, this would be a very different conversation and very likely a different outcome. We have enough trouble being compatible with OpenSSL and that’s just one library.  Rich Lander – Dec 5, 2018

But given the immense benefits that WPF running cross-platform would offer, I wouldn’t be surprised to see WPF become cross-platform within the next few years. Or at least a similar XAML UI framework. Moreover WPF is now open-source so who knows…

The Visual Studio UI is mostly based on WPF, hence one of the benefits of having WPF cross-platform would be a unique cross-platform Visual Studio: the same way Microsoft is now unifying the .NET Frameworks, they could unify the Visual Studio suite into a single cross-platform product.

Xamarin Forms and Avalonia are also natural candidates to be the .NET cross-platform UI Framework. But my subjective feeling is that those frameworks don’t receive enough love from the community. Also we have to keep in mind that Microsoft did a survey and that the community is massively asking for it.

Blazor is promised to a bright future

If you didn’t follow the recent Blazor evolution, the promises of this technology are huge:

  • Run .NET code in all browsers (like Silverlight)
  • with no browser plugin needed (unlike Silverlight)
  • with near-native performance
  • with components compiled to a compact binary format

This is all possible thanks to the WebAssembly (Wasm) format supported by most browsers.

WebAssembly (abbreviated Wasm) is a binary instruction format for a stack-based virtual machine. Wasm is designed as a portable target for compilation of high-level languages like C/C++/Rust, enabling deployment on the web for client and server applications.

Blazor was initially a personal project created by Steve Sanderson from Microsoft. It was first introduced during NDC OSLO in July 2017: the video is worth watching, and the comments show how enthusiastic the audience is. However Blazor is not yet finalized and still has some limitations: it doesn’t yet offer a decent debugging experience and the application size to download (a few MBs) is still too large because dependencies have to be loaded too. These limitations are currently being addressed (see here for debugging and here for download size: runtime code will be trimmed and cached, and usage of a CDN (Content Distribution Network) is mentioned).

The community is enthusiastic, the technology is maturing and there is no technological or political barrier in sight: the Blazor future looks bright. Don’t miss the Blazor FAQ to learn more.

.NET Core 3.0 New APIs

.NET Core 3.0 has just been released, see here the official announcement. In this post we’re going to explain how to list and explore the new APIs introduced since .NET Core 2.2 (this API diff is also available here).

To diff the API versions download NDepend trial, start VisualNDepend.exe, and click Compare 2 versions of a code base.

  • In the Older Build add assemblies in the folder: C:\Program Files\dotnet\shared\Microsoft.NETCore.App\2.2.5
  • In the Newer Build add assemblies in the folder: C:\Program Files\dotnet\shared\Microsoft.NETCore.App\3.0.0

Then click OK. Depending on your hardware it’ll take between a dozen seconds and a minute to analyze these two versions of .NET Core.

Here are the lists of new APIs obtained:

To obtain these lists we’ve edited 5 code queries and exported their results to HTML. For example to obtain the 28 new namespaces we edited the query:

And got this result, that also lists new types for each new namespace:

.NET Core 3.0 New Namespaces

The 4 other code queries used to list new types, new types and their members, and new methods and new fields introduced in existing types are:




Find API Breaking Changes in your .NET Libraries and Frameworks

If you are developing a framework, the last thing you want to happen when releasing a new version of your product is to break the code of your clients because of an API breaking change. For example, you want to make sure that all public methods you had in previous versions are here in the next version, unless you tagged some of them with System.ObsoleteAttribute for a while before deprecation.

Let’s underline that we are talking here about syntactic breaking changes, which can all be detected with tooling as we are about to see. On the other hand semantic breaking changes are code behavior changes that will break your client code. Those breaking changes cannot be found by a tool, but are typically caught by a solid test suite.

One key feature of NDepend is the ability to compare all results against a baseline snapshot. Code querying can explore changes since the baseline. For example this code query detects methods of an API that are no longer visible from clients:

This query code is pretty simple and readable but let’s detail it a bit:

  • The warnif count > 0 prefix transforms this code query into a code rule
  • We iterate on methods in the baseline thanks to codeBase.OlderVersion().Application.Methods
  • We want methods that were publicly visible in the baseline and not just public. Being declared as public is not enough for a method to be publicly visible. A public method can still be declared in a class declared as internal.
  • We don’t warn for public methods that were deemed as obsolete in the baseline
  • We handle both situations: A) publicly visible methods removed since the baseline and B) publicly visible methods not publicly visible anymore.
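The logic described in the bullets above can be sketched in plain C#, outside of NDepend. This is only an illustration: the MethodSnapshot record and the ApiDiff class are hypothetical names, and the two lists stand in for the baseline and the current code base.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified snapshot of a method; this is NOT the NDepend code model.
public sealed record MethodSnapshot(
    string FullName,         // e.g. "Lib.Parser.Parse(String)"
    bool IsPubliclyVisible,  // public AND declared in a publicly visible type
    bool IsObsolete);        // tagged with System.ObsoleteAttribute

public static class ApiDiff
{
    // Returns the baseline methods that clients could call and can no longer call.
    public static IReadOnlyList<string> BrokenMethods(
        IEnumerable<MethodSnapshot> baseline,
        IEnumerable<MethodSnapshot> current)
    {
        var currentByName = current.ToDictionary(m => m.FullName);
        return baseline
            // Only methods clients could see, and that were not already obsolete.
            .Where(m => m.IsPubliclyVisible && !m.IsObsolete)
            .Where(m =>
                !currentByName.TryGetValue(m.FullName, out var newer) // A) removed
                || !newer.IsPubliclyVisible)                            // B) hidden
            .Select(m => m.FullName)
            .OrderBy(name => name, StringComparer.Ordinal)
            .ToList();
    }
}
```

A method that was internal or obsolete in the baseline is never reported, which mirrors the filtering explained above.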

This query is pretty similar to the source of the default rule API Breaking Changes: Methods, which is a bit more sophisticated to handle more situations, like when the return type of a public method changes. This rule also presents results in a polished way.

Similar rules are available for publicly visible types and publicly visible fields.

Some other sorts of breaking changes are detected when, for example, a publicly visible interface or base class changes: in such a situation client code that implements such an interface or derives from such an abstract class will be broken.

Some rules also detect when a serializable type gets broken and when the Flags status of an enumeration changes.

Real Experiments

Comparing NHibernate v5.2.x against NHibernate v4.1.x, here are the breaking changes found: 36 types, 995 methods, 205 fields (including enumeration values) and 121 interfaces or abstract classes have been changed. I guess most of those breaking changes are about elements that were not intended to be seen by API consumers in the baseline version but a detailed analysis would be needed here.

NHibernate v5.2 vs v4.1 Breaking Changes

I’ve also compared .NET Core v2.2.5 with .NET Core v3.0 Preview 8 and fortunately didn’t find any breaking change. However NDepend has a heuristic code query to find types moved from one namespace or assembly to another, and here more than 200 types were matched, like the class ArrayList moved from the assembly System.Collections.NonGeneric.dll to the assembly System.Private.CoreLib.dll. Fortunately such changes are invisible to the clients consuming .NET Core as a NuGet package.

ArrayList moved from System.Collections.NonGeneric.dll to System.Private.CoreLib.dll

Find new APIs consumed and APIs not used anymore

A related and interesting topic is to review new APIs used by an application or APIs not used anymore. This is possible from the NDepend > Search Elements by Changes panel with the buttons Third Party Code Elements – Used Recently / Not Used Anymore. For example the screenshot below shows new API consumed by the application eShopOnWeb:

New APIs used by eShopOnWeb

Before releasing a new version, it is worth having a glance at API consumption changes. Doing so often sheds light on interesting points.

SOLID Design: The Dependency Inversion Principle (DIP)

After having covered the Open-Close Principle (OCP), the Liskov Substitution Principle (LSP), the Single Responsibility Principle (SRP) and the Interface Segregation Principle (ISP) let’s talk about the Dependency Inversion Principle (DIP) which is the D in the SOLID acronym. The DIP definition is:

a. High-level modules should not depend on low-level modules. Both should depend on abstractions.
b. Abstractions should not depend on details (concrete implementation). Details should depend on abstractions.

The DIP was introduced in the 90s by Robert C. Martin. Here is the original article.

A Dependency is a Risk

Like all SOLID principles, the DIP is about system maintainability and reusability. Inevitably some parts of the system will evolve and will be modified. We want a design that is resilient to changes. To prevent a change from breaking too much, we must:

  • first identify the parts of the code that are change-prone.
  • second avoid dependencies on those change-prone portions of code.

The Liskov Substitution Principle (LSP) and the Interface Segregation Principle (ISP) articles explain that interfaces must be carefully thought out. Both principles are two sides of the same coin:

  • ISP is the client perspective: If an interface is too fat probably the client sees some behaviors it doesn’t care for.
  • LSP is the implementer perspective: If an interface is too fat probably a class that implements it won’t implement all its behaviors. Some behavior will end up throwing something like a NotSupportedException.

Efforts put into applying ISP and LSP result in stable interfaces. As a consequence these well-designed interfaces are less subject to change than the concrete classes that implement them.

Also having stable interfaces results in improved reusability. I am pretty confident that the interface IDisposable will never change. My classes can safely implement it and this interface is re-used all over the world.

In this context, the DIP states that depending on interfaces is less risky than depending on concrete implementations. DIP is about transforming this code:

into this code:

DIP is about removing dependencies from high-level code (like the ClientCode() method) to low-level code, low-level code being implementation details like the SqlConnection class. For that we create interfaces like IDbConnection. Then both high-level code and low-level code depend on these interfaces. The key is that SqlConnection is not visible anymore from the ClientCode(). This way the client code won’t be impacted by implementation changes, like when replacing the SQL Server RDBMS implementation with MySql for example.
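Here is a minimal sketch of the resulting design. To keep it runnable without a database, IOrderStore, SqlServerOrderStore and MySqlOrderStore are hypothetical stand-ins for IDbConnection and the concrete connection classes discussed above:

```csharp
using System.Collections.Generic;

// Abstraction consumed by high-level code (hypothetical stand-in for IDbConnection).
public interface IOrderStore
{
    void Save(string orderId);
}

// Low-level detail: it depends on (implements) the abstraction.
public sealed class SqlServerOrderStore : IOrderStore
{
    public List<string> Saved { get; } = new List<string>();
    public void Save(string orderId) => Saved.Add("sqlserver:" + orderId);
}

// Another detail that can replace the first one without touching client code.
public sealed class MySqlOrderStore : IOrderStore
{
    public List<string> Saved { get; } = new List<string>();
    public void Save(string orderId) => Saved.Add("mysql:" + orderId);
}

// High-level code: it only sees IOrderStore, never a concrete store.
public static class Client
{
    public static void ClientCode(IOrderStore store) => store.Save("order-42");
}
```

Calling Client.ClientCode(new MySqlOrderStore()) instead of Client.ClientCode(new SqlServerOrderStore()) requires no change in ClientCode() itself.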

Let’s underline that this minimal code sample doesn’t do justice to the word Inversion in the DIP acronym. The inversion is about the relation between the interfaces introduced (to be consumed by high-level code) and the implementation details: implementation details depend on the interfaces, not the opposite. This is the inversion.

DIP and Dependency Injection (DI)

The acronym DI is used for Dependency Injection and since it is almost the same as the DIP acronym this causes confusion. The I stands for Inversion in one and Injection in the other, which adds to the confusion. Fortunately DI and DIP are very much related.

  • DIP states that classes that implement interfaces are not visible to the client code.
  • DI is about binding classes behind the interfaces consumed by client code.

DI means that some code, external to client code, configures which classes will be used at runtime by the client code. This is simple DI:
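A minimal sketch of such manual DI through a constructor follows. The names IClock, SystemClock, ReportHeader and CompositionRoot are hypothetical and only serve the illustration:

```csharp
using System;
using System.Globalization;

public interface IClock
{
    DateTime UtcNow { get; }
}

public sealed class SystemClock : IClock
{
    public DateTime UtcNow => DateTime.UtcNow;
}

// Client code: it receives its dependency and never creates it.
public sealed class ReportHeader
{
    private readonly IClock _clock;

    public ReportHeader(IClock clock) =>
        _clock = clock ?? throw new ArgumentNullException(nameof(clock));

    public string Build() =>
        "Report generated on " +
        _clock.UtcNow.ToString("yyyy-MM-dd", CultureInfo.InvariantCulture);
}

// Code external to the client decides which class sits behind IClock.
public static class CompositionRoot
{
    public static ReportHeader CreateReportHeader() =>
        new ReportHeader(new SystemClock());
}
```

Only CompositionRoot knows about SystemClock; ReportHeader depends on the IClock interface alone.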

Many .NET DI frameworks exist to offer flexibility in binding classes behind interfaces. Those frameworks are based on reflection and thus, they offer some kind of magic. The syntax looks like:

And then comes what is called a Service Locator. The client can use the locator to create instances of the concrete type without knowing it. It is like invoking a constructor on an interface:
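To make both ideas concrete without picking a particular framework, here is a hand-rolled sketch. TinyContainer, ILogger and PrefixLogger are hypothetical names; real DI frameworks rely on reflection and offer much more (lifetimes, nested dependencies and so on):

```csharp
using System;
using System.Collections.Generic;

public interface ILogger
{
    string Format(string message);
}

public sealed class PrefixLogger : ILogger
{
    public string Format(string message) => "[log] " + message;
}

// Hypothetical minimal container, written by hand for the illustration.
public sealed class TinyContainer
{
    private readonly Dictionary<Type, Func<object>> _factories =
        new Dictionary<Type, Func<object>>();

    // Binds a concrete class behind an interface.
    public void Register<TInterface, TImplementation>()
        where TImplementation : class, TInterface, new()
        => _factories[typeof(TInterface)] = () => new TImplementation();

    // Service-locator style resolution: like invoking a constructor on an interface.
    public TInterface Resolve<TInterface>()
    {
        if (!_factories.TryGetValue(typeof(TInterface), out var factory))
            throw new InvalidOperationException(
                "No binding registered for " + typeof(TInterface).Name);
        return (TInterface)factory();
    }
}
```

After container.Register<ILogger, PrefixLogger>(), client code can call container.Resolve<ILogger>() without ever naming PrefixLogger.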

Thus while DIP is about maintainable and reusable design, DI is about flexible design. Both are very much related. Let’s notice that the flexibility obtained from DI is especially useful for testing purposes. Being DIP compliant improves the testability of the code:
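As an illustration of this testability benefit, the sketch below replaces a mail sender with a test double. IMailSender, WelcomeService and FakeMailSender are hypothetical names:

```csharp
using System.Collections.Generic;

public interface IMailSender
{
    void Send(string to, string subject);
}

// Client code depends on the interface only.
public sealed class WelcomeService
{
    private readonly IMailSender _sender;

    public WelcomeService(IMailSender sender) => _sender = sender;

    public void Welcome(string email) => _sender.Send(email, "Welcome!");
}

// Test double: it records calls instead of sending real mail.
public sealed class FakeMailSender : IMailSender
{
    public List<(string To, string Subject)> Sent { get; } =
        new List<(string To, string Subject)>();

    public void Send(string to, string subject) => Sent.Add((to, subject));
}
```

A test can then create a FakeMailSender, pass it to WelcomeService, call Welcome() and assert on the content of Sent, with no mail server involved.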

DIP and Inversion of Control (IoC)

The word Inversion is used in both the DIP and IoC acronyms. This causes confusion. Remember that the word Inversion in the DIP acronym is about implementation details depending on interfaces, not the opposite. The word Inversion in the IoC acronym is about calls to a Library being transformed into callbacks from a Framework.

IoC is what differentiates a Framework from a Library. A library is typically a collection of functions and classes. On the other hand a framework also offers reusable classes but relies massively on callbacks. For example UI frameworks offer many callback points through graphical events:

The method m_ButtonOnClick() bound to the Button.OnClick event is a callback method. Instead of client code calling a framework method, the framework is responsible for calling back client code. This is an inversion in the control flow.
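The following sketch reproduces this callback mechanism without a real UI framework. The Button class here is a hypothetical minimal stand-in, and SimulateClick() plays the role of the framework detecting a user click:

```csharp
using System;

// Hypothetical minimal stand-in for a UI framework button.
public sealed class Button
{
    public event EventHandler OnClick;

    // Called by the "framework" when the user clicks.
    public void SimulateClick() => OnClick?.Invoke(this, EventArgs.Empty);
}

public sealed class Form
{
    private readonly Button m_Button = new Button();

    public int ClickCount { get; private set; }

    public Form()
    {
        // Client code registers a callback instead of calling the framework.
        m_Button.OnClick += m_ButtonOnClick;
    }

    public Button Button => m_Button;

    // Callback method: invoked by the button, not by client code.
    private void m_ButtonOnClick(object sender, EventArgs e) => ClickCount++;
}
```

The Form never calls m_ButtonOnClick() itself: the control flow goes from the button back to the client code.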

We can see that IoC is not related to DIP. However we can see Dependency Injection as a specialization of IoC: DI is an IoC used specifically to manage dependencies.

DIP and the Level metric

Several code metrics can be used to measure, and thus constrain, the usage of DIP. One of these metrics is the Level metric. The Level metric is defined as follows:

From this diagram we can infer that:

  • The Level metric is not defined for components involved in a dependency cycle. As a consequence null values can help tracking component dependency cycles.
  • The Level metric is defined for any dependency graph. Thus a Level metric can be defined at various granularities: methods, types, namespaces, assemblies.
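The sketch below computes such a metric on a small dependency graph. It assumes a simplified definition: Level is 0 for a component that uses nothing, otherwise 1 plus the highest Level among the components it uses directly, and undefined (null) for a component involved in a dependency cycle or depending on one. The LevelMetric class is a hypothetical name and third-party code is ignored:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class LevelMetric
{
    // uses: component -> components it directly uses.
    // Every component must appear as a key, even when it uses nothing.
    // Returns component -> Level, or null when the Level is undefined.
    public static Dictionary<string, int?> Compute(
        IReadOnlyDictionary<string, string[]> uses)
    {
        var levels = uses.Keys.ToDictionary(k => k, _ => (int?)null);
        bool progress = true;
        while (progress)
        {
            progress = false;
            foreach (var (component, used) in uses)
            {
                if (levels[component].HasValue) continue;

                // Wait until every used component has a known Level.
                if (used.Any(u => !levels.TryGetValue(u, out var l) || !l.HasValue))
                    continue;

                levels[component] =
                    used.Length == 0 ? 0 : 1 + used.Max(u => levels[u].Value);
                progress = true;
            }
        }
        // Components still at null are in a cycle or depend on one.
        return levels;
    }
}
```

With A using B, B using C, and X and Y using each other, the result is C = 0, B = 1, A = 2, while X and Y keep a null Level.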

DIP mostly states that types with Level 0 must be interfaces and enumerations (note that interfaces using other interfaces have a Level value higher than 0). If we say that a component is a group of types (like a namespace or an assembly) the DIP states that components with Level 0 must contain mostly interfaces and enumerations. With a quick code query like this one you can have a glance at the Level of types and check if most of the low-level types are interfaces:

The Level metric can also be used to track classes with high Level values: it is a good indication that some interfaces must be introduced to break the long chain of concrete code calls:

The class Program has a Level of 8 and if we look at the dependency graphs of types used from Program we can certainly see opportunities to introduce abstractions to be more DIP compliant:

DIP and the Abstractness vs. Instability Graph

Robert C. Martin not only coined the DIP but also proposed some code metrics to measure the DIP compliance. See these metrics definitions here. From these metrics an intriguing Abstractness vs. Instability diagram can be plotted. Here we plotted the 3 assemblies of the OSS eShopOnWeb application. This diagram has been obtained from an NDepend report:

  • The Abstractness metric is normalized: it takes its values in the range [0,1]. It measures the interfaces / classes ratio (1 means the assembly contains only interfaces and enumerations).
  • The Instability metric is normalized and measures the assembly’s resilience to change. In this context, being stable means that a lot of code depends on you (which is wrong for a concrete class and fine for an interface) and being unstable means the opposite: not much code depends on you (which is fine for a concrete class and wrong for an interface, a poorly used interface is potentially a waste of design efforts).

This diagram shows a balance between the two metrics and defines some green/orange/red zones:

  • A dot in the red Zone of Pain means that the assembly is mostly concrete and used a lot. This is a pain because all those concrete classes will likely undergo a lot of changes and each change will potentially impact a lot of code. An example of a class living in the Zone of Pain would be the String class. It is massively used but it is concrete: if a change should occur today in the String class the entire world would be impacted. Fortunately we can count on the String implementation to be both performant and bug-free.
  • A dot in the red Zone of Uselessness means that the assembly contains mostly interfaces and enumerations and is not much used. This makes these abstractions useless.
  • The Green zone revolves around the Main Sequence line. This line represents the right balance between both metrics. Containing mostly interfaces and being used a lot is fine. Containing mostly classes and not being used much is fine. And then come all the intermediate, well-balanced values between these 2 extremes represented by the Main Sequence line. The Distance from Main Sequence metric can be normalized and measures this balance. A value close to 0 means that the dot is near the line, in the green zone, and that the DIP is respected.
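These metrics can be sketched with a few lines of arithmetic. The formulas below are the ones usually attributed to Robert C. Martin and are stated here as an assumption: Abstractness is the ratio of abstract types to all types, Instability is Ce / (Ca + Ce) where Ca counts incoming dependencies and Ce outgoing ones, and the normalized distance is |A + I - 1|. The MainSequence class is a hypothetical name:

```csharp
using System;

public static class MainSequence
{
    // A: 0 means only concrete types, 1 means only abstract types.
    public static double Abstractness(int abstractTypes, int totalTypes) =>
        totalTypes == 0 ? 0.0 : (double)abstractTypes / totalTypes;

    // I: 0 means stable (a lot of code depends on the assembly),
    //    1 means unstable (the assembly depends on others, nobody depends on it).
    public static double Instability(int afferentCoupling, int efferentCoupling)
    {
        int total = afferentCoupling + efferentCoupling;
        return total == 0 ? 0.0 : (double)efferentCoupling / total;
    }

    // D: 0 means on the Main Sequence line, 1 means as far from it as possible.
    public static double NormalizedDistance(double abstractness, double instability) =>
        Math.Abs(abstractness + instability - 1.0);
}
```

A fully concrete assembly used by many others (A = 0, I = 0) gets a distance of 1 and lands in the Zone of Pain, while a fully abstract assembly used by many others (A = 1, I = 0) gets a distance of 0.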

Conclusion

Like the Open-Close Principle (OCP), the Liskov Substitution Principle (LSP) and the Interface Segregation Principle (ISP), the DIP is a key principle to wisely harness the OOP abstraction and polymorphism concepts in order to improve the maintainability, the reusability and the testability of your code. No principle is an island (except maybe the Single Responsibility Principle (SRP)) and they must be applied hand in hand.

This article concludes this series of SOLID posts. Being aware of SOLID principles is not enough: they must be kept in mind during every design decision. But they also must be constrained by the KISS principle, Keep It Simple Stupid, because as we explained in the post Are SOLID principles Cargo Cult? it is easy to write entangled code in the name of SOLID principles. Then one can learn from experience. Over the years, identifying the right abstractions and properly partitioning the business needs into well-balanced classes becomes natural.

Are SOLID principles Cargo Cult?

My last post about SOLID Design: The Single Responsibility Principle (SRP) generated some discussion on reddit. The discussion originated from a remark considering SOLID principles as a Cargo Cult. Taking into account the definition of Cargo Cult the metaphor is a bit provocative but it is not unfounded.

cargo cult is a belief system among members of a relatively undeveloped society in which adherents practice superstitious rituals hoping to bring modern goods supplied by a more technologically advanced society

The recent Boeing 737 Max fiasco revealed that some parts of its software had been outsourced to $9-an-hour engineers. Those engineers shouldn’t be blamed for not achieving top-notch software given the budget. Nevertheless it is clear that a lot of software written nowadays looks like this cargo cult plane. For many real-world developers, SOLID principles are superstitious rituals whose primary goal is to succeed during job interviews.

The SRP article underlines that SRP is the only SOLID principle not related to the usage of abstraction and polymorphism. SRP is about logic partitioning into code: which logic should be declared in which class. But SRP is so vague it is practically useless from its two definitions.

Definition 1: A class should have a single responsibility and this responsibility should be entirely encapsulated by the class.

Definition 2: A class should have one reason to change.

One can justify any class design choice by tweaking somehow what is a responsibility or what is a reason to change. In other words, as someone wrote in comment: Most people who “practice” it don’t actually know what it means and use it as an excuse to do whatever the hell they were going to do anyways.

We can feel bitterness in those comments, certainly coming from seasoned developers whose job is to fix mistakes of $9 an hour engineers.

SOLID Principles vs. OOP Patterns

We must remember that SOLID principles emerged in the 80s and 90s from the work of world-class OOP experts like Robert C. Martin (Uncle Bob) and Bertrand Meyer. Software writing is often considered as an art. Terminologies such as clean code or beautiful code have been widely used. But art is a subjective activity. In this context, SOLID principles necessarily remain vague and subject to interpretation. And this is what makes the difference between a SOLID principle and an OOP pattern:

  • A SOLID Principle is subjective. It helps to guide the usage of powerful concepts of Object Oriented Programming (OOP).
  • An OOP Pattern is objective. It is a set of recipes to implement a well identified situation with the OOP concepts.

Despite a limited number of keywords and operators, the OOP toolbelt of languages such as C# or Java is very rich. With a few dozen characters it is possible to write code that puzzles experts. C# especially gets richer and richer with many syntactic sugars to express complex situations with just a few characters. This power is a double-edged sword: seasoned developers can write neat and compact code. But on the other hand it is easy to misuse this power, especially for junior developers and all those who write code just to pay their bills.

Always keep in mind the KISS principle

Someone wrote in comments: “SOLID encourages abstraction, and abstraction increases complexity. It’s not always worth it, but it’s always presented as the non-plus ultra of good approaches.”

The only justification for abstraction in OOP is to simplify the implementation of a complex business rule.

  • Abstracting Circle, Rectangle and Triangle with an IShape interface will dramatically simplify the implementation of a shape drawing software.
  • On the other hand, creating an interface for each class is a waste of resources: not every concept in your program deserves an abstraction.
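The first bullet can be illustrated with a short sketch. Computing an area stands in for drawing here, and the Area member and the Canvas class are assumptions made for the example:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public interface IShape
{
    double Area { get; }
}

public sealed class Circle : IShape
{
    private readonly double _radius;
    public Circle(double radius) => _radius = radius;
    public double Area => Math.PI * _radius * _radius;
}

public sealed class Rectangle : IShape
{
    private readonly double _width, _height;
    public Rectangle(double width, double height) { _width = width; _height = height; }
    public double Area => _width * _height;
}

public sealed class Triangle : IShape
{
    private readonly double _base, _height;
    public Triangle(double @base, double height) { _base = @base; _height = height; }
    public double Area => _base * _height / 2.0;
}

public static class Canvas
{
    // One code path for every shape: no switch on the concrete type.
    public static double TotalArea(IEnumerable<IShape> shapes) =>
        shapes.Sum(shape => shape.Area);
}
```

Adding a new shape only requires a new class implementing IShape; Canvas.TotalArea() remains untouched.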

This is why the Keep It Simple Stupid (KISS) principle should always be kept in mind: don’t add extra implementation complexity on top of the business complexity.

SOLID and Static Analysis

I have been in the .NET static analysis industry since 2004. At that time I was consulting for large companies with massive legacy apps that were very costly to maintain. Books like Robert Martin’s Agile Principles, Patterns, and Practices made me realize that the source code is data. This data can be measured with code metrics. And the same way relational data can be crawled with SQL queries, code as data can be crawled with code queries. For example:

This query will objectively match complex methods not fully covered by tests. There are situations where one can argue that static analysis returns false positives but there is no justification for complex methods not well tested.
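The idea of querying code as data can be sketched in plain C# with LINQ over an in-memory list. The MethodMetrics record, the default threshold of 15 and the QualityRules class are illustrative assumptions; this is neither the NDepend code model nor its query language:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical per-method measurements, e.g. exported from an analysis tool.
public sealed record MethodMetrics(
    string Name,
    int CyclomaticComplexity,
    double PercentageCoverage);   // 0..100

public static class QualityRules
{
    // Matches complex methods that are not fully covered by tests,
    // the most complex ones first.
    public static IReadOnlyList<MethodMetrics> ComplexAndNotFullyCovered(
        IEnumerable<MethodMetrics> methods,
        int complexityThreshold = 15)   // assumed threshold, tune to your context
    {
        return methods
            .Where(m => m.CyclomaticComplexity > complexityThreshold)
            .Where(m => m.PercentageCoverage < 100.0)
            .OrderByDescending(m => m.CyclomaticComplexity)
            .ToList();
    }
}
```

The result is objective in the sense that two people running it on the same data get the same list; only the threshold is a matter of judgment.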

Not all aspects of SOLID principles can be objectively measured and verified. However static analysis can help bring objectiveness. For example:

SOLID and Testability

Regularly applying such rules will avoid taking SOLID too far to the point it becomes detrimental. However there are still all those aspects of SOLID, and code design in general, that must be left to creativity and interpretation. Experience in software development helps a lot here: over the years one refines his/her gut feeling about which design will increase flexibility and maintainability.

By definition junior developers have no experience. However anyone can relentlessly strive for 100% code coverage by tests. Being able to fully cover your code means, by definition, that your code is testable. Testability doesn’t come by chance. The properties that lead to full testability are the same properties that lead to high maintainability. Those properties include:

  • Easy-to-use API
  • Domain classes well isolated
  • Careful map of logic to classes
  • Short classes and short methods
  • Cohesive classes
  • Abstractions and polymorphism used judiciously
  • Careful management of state mutability

Advice to add up objectivity when applying SOLID principles

Not everyone is a senior developer with a passion for well-designed code. As a consequence Cargo Cult usage of SOLID principles is common. To improve the design some objectivity needs to be added to the development process. Here are my 3 pieces of advice for that:

  • KISS principle first, always strive for simplicity: if it is complicated it is not SOLID.
  • Use static analysis to automatically monitor some measurable aspects of SOLID. Gross violations of code quality rules and metrics are also SOLID principles violations.
  • Refactor your code until it becomes seamlessly 100% coverable by tests. Code that cannot be easily 100% covered by tests is not SOLID.

SOLID Design: The Single Responsibility Principle (SRP)

After having covered The Open-Close Principle (OCP) and The Liskov Substitution Principle (LSP) let’s talk about the Single Responsibility Principle (SRP) which is the S in the SOLID acronym. The SRP definition is:

A class should have a single responsibility and this responsibility should be entirely encapsulated by the class.

This raises the question: what is a responsibility in software design? There is no trivial answer, which is why Robert C. Martin (Uncle Bob) rewrote the SRP principle this way:

A class should have one reason to change.

This in turn raises the question: what is a reason to change?

SRP is a principle that cannot be easily inferred from its definition. Moreover the SRP leaves a lot of room for personal opinions and interpretations. So what is SRP about? SRP is about logic partitioning into code: which logic should be declared in which class. Something to keep in mind is that SRP is the only SOLID principle not related to the usage of abstraction and polymorphism.

The goal of this post is to propose objective and concrete guidelines to increase your classes’ compliance with SRP and, ultimately, increase the maintainability of your code.

SRP and Concerns

Typically the ActiveRecord pattern is used to exhibit a typical SRP violation. An ActiveRecord class has two responsibilities:

  • First an ActiveRecord object stores in-memory data retrieved from a relational database.
  • Second the record is active in the sense that data in-memory and data in the relational database are kept mirrored. For that, the CRUD (Create Read Update Delete) operations are implemented by the ActiveRecord.

To make things concrete an ActiveRecord class can look like that:
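A minimal sketch of such a class is shown below. A static dictionary is used as a stand-in for the relational table so that the example runs without a database; the member names are assumptions:

```csharp
using System.Collections.Generic;

public sealed class Employee
{
    // Stand-in for a relational table, only here to keep the sketch runnable.
    private static readonly Dictionary<int, string> s_Table =
        new Dictionary<int, string>();

    // Responsibility 1: hold in-memory data.
    public int Id { get; }
    public string Name { get; set; }

    public Employee(int id, string name)
    {
        Id = id;
        Name = name;
    }

    // Responsibility 2: persistence (CRUD) lives in the same class as the data.
    public void Save() => s_Table[Id] = Name;          // Create / Update

    public static Employee Load(int id) =>             // Read
        s_Table.TryGetValue(id, out var name) ? new Employee(id, name) : null;

    public void Delete() => s_Table.Remove(Id);        // Delete
}
```

Every consumer of this Employee class is exposed to Save(), Load() and Delete(), whether it cares about persistence or not.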

If Employee was a POCO class that doesn’t know about persistence and if the persistence was handled in a dedicated persistence layer the API would be improved because:

  • Not every Employee consumer wants to deal with persistence.
  • More importantly an Employee consumer really needs to know when an expensive DB roundtrip is triggered: if the Employee class is responsible for the persistence who knows if the data is persisted as soon as a setter is invoked?

Hence it is better to isolate the persistence layer accesses and make them more explicit. This is why at NDepend we promote rules like UI layer shouldn’t use directly DB types that can be easily adapted to enforce any sort of code isolation.

Persistence is what we call a cross-cutting concern, an aspect of the implementation that tends to spread all over the code. We can expect that most domain objects are concerned with persistence. Other cross-cutting concerns we want to separate domain objects from include: validation, log, authentication, error handling, threading, caching. The need to separate domain entities from those cross-cutting concerns can be handled by some OOP patterns like the decorator pattern for example. Alternatively some Object-Relational Mapping (ORM) frameworks and some Aspect-Oriented-Programming (AOP) frameworks can be used.
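One possible shape for this separation is sketched below: a POCO Employee plus a dedicated repository. IEmployeeRepository and InMemoryEmployeeRepository are hypothetical names, and the in-memory dictionary again replaces a real database:

```csharp
using System.Collections.Generic;

// POCO: data and domain behavior only, no knowledge of persistence.
public sealed class Employee
{
    public int Id { get; }
    public string Name { get; set; }

    public Employee(int id, string name)
    {
        Id = id;
        Name = name;
    }
}

// Persistence is explicit: a DB roundtrip only happens through this interface.
public interface IEmployeeRepository
{
    void Save(Employee employee);
    Employee Find(int id);   // returns null when the id is unknown
}

// In-memory implementation used here in place of a real database layer.
public sealed class InMemoryEmployeeRepository : IEmployeeRepository
{
    private readonly Dictionary<int, string> _rows = new Dictionary<int, string>();

    public void Save(Employee employee) => _rows[employee.Id] = employee.Name;

    public Employee Find(int id) =>
        _rows.TryGetValue(id, out var name) ? new Employee(id, name) : null;
}
```

With this design, setting employee.Name never touches the store: nothing is persisted until Save() is called explicitly.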

SRP and Reason to Change

Let’s consider this version of Employee:

The ComputePay() behavior is under the responsibility of the finance people and the ReportHours() behavior is under the responsibility of the operational people. Hence if a financial person needs a change to be implemented in ComputePay() we can assume this change won’t affect the ReportHours() method. Thus according to the version of SRP that states “a class should have one reason to change”, it is wise to declare these methods in different dedicated modules. As a consequence a change in ComputePay() cannot affect the behavior of ReportHours() and vice-versa. In other words we want these two parts of the code to be independent because they will evolve independently.

This is why Robert C. Martin wrote that SRP is about people: make sure that logic controlled by different people is implemented in different modules.
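One possible way to separate the two behaviors is sketched below. The PayCalculator and HoursReporter classes, the Employee properties and the pay formula are assumptions made for the illustration:

```csharp
// Plain data shared by both modules.
public sealed class Employee
{
    public string Name { get; }
    public decimal HourlyRate { get; }
    public int HoursWorked { get; }

    public Employee(string name, decimal hourlyRate, int hoursWorked)
    {
        Name = name;
        HourlyRate = hourlyRate;
        HoursWorked = hoursWorked;
    }
}

// Owned by the finance people: changes only when pay rules change.
public static class PayCalculator
{
    public static decimal ComputePay(Employee employee) =>
        employee.HourlyRate * employee.HoursWorked;
}

// Owned by the operational people: changes only when reporting needs change.
public static class HoursReporter
{
    public static string ReportHours(Employee employee) =>
        employee.Name + ": " + employee.HoursWorked + "h";
}
```

A change requested by finance now touches PayCalculator only, and a change requested by operations touches HoursReporter only.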

SRP and High-Cohesion

The SRP is about encapsulating logic and data in a class because they fit well together. Fit well means that the class is cohesive in the sense that most methods use most fields. Actually the cohesion of a class can be measured with the Lack of Cohesion Of Methods (LCOM) metric. See below an explanation of LCOM (extracted from this great Stuart Celarier placemat). What matters is to understand that if all methods of a class are using all instance fields, the class is considered utterly cohesive and has the best LCOM score, which is 0 or close to 0.

Typically the effect of an SRP violation is to partition a class’s methods and fields into groups with few connections. The fields needed to compute the pay of an employee are not the same as the fields needed to report pending work. This is why the LCOM metric can be used to measure adherence to SRP and take action. You can use the rule Avoid types with poor cohesion to track classes with poor cohesion between methods and fields.
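The sketch below computes an LCOM value for a class described as a list of methods with the fields each one accesses. It assumes the commonly documented formula LCOM = 1 - sum(MF) / (M * F), where M is the number of methods, F the number of instance fields and MF the number of methods accessing a given field. The Cohesion class is a hypothetical name:

```csharp
using System.Collections.Generic;
using System.Linq;

public static class Cohesion
{
    // fields: the instance fields of the class.
    // fieldsUsedByEachMethod: one entry per method, listing the fields it accesses.
    public static double Lcom(
        string[] fields,
        IReadOnlyList<string[]> fieldsUsedByEachMethod)
    {
        int methodCount = fieldsUsedByEachMethod.Count;
        int fieldCount = fields.Length;
        if (methodCount == 0 || fieldCount == 0) return 0.0;

        // For each field, count the methods accessing it, then sum over all fields.
        int sumMF = fields.Sum(field =>
            fieldsUsedByEachMethod.Count(used => used.Contains(field)));

        return 1.0 - (double)sumMF / (methodCount * fieldCount);
    }
}
```

A class whose two methods both use both fields scores 0 (utterly cohesive), while a class whose two methods each use a different field scores 0.5, which hints at two groups that could live in separate classes.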

SRP and Fat Code Smells

While we can hardly find an easy definition for what a responsibility is, we noticed that adhering to SRP usually results in classes with a good LCOM score. On the other hand, not adhering to SRP usually leads to the God class phenomenon: a class that knows too much and does too much. Such a god class is usually too large: violations of rules like Avoid types too big, Avoid types with too many methods, Avoid types with too many fields are good candidates to spot god classes and refactor them into a finer-grained design.

Guidelines to adhere to SRP

Here is a set of objective and concrete guidelines to adhere to SRP:

  • Domain classes must be isolated from Cross-Cutting Concerns: code responsible for persistence, validation, log, authentication, error handling, threading, caching…
  • When implementing your domain, favor POCO classes that do not have any dependency on an external framework. Note that a POCO class is not necessarily a fields and properties only class, but can implement logic/behavior related to its data.
  • Use your understanding of the Domain to partition code: logics related to different business functions should be kept separated to avoid interference.
  • Regularly check the Lack of Cohesion Of Methods (LCOM) score of your classes.
  • Regularly check for too large and too complex classes.