
SOLID Design: The Open-Closed Principle (OCP)

The Open-Closed Principle (OCP) is the O in the well-known SOLID acronym.

Bertrand Meyer is generally credited with originating the term open/closed principle, which appeared in his 1988 book Object-Oriented Software Construction. Its original definition is:

  • A module will be said to be open if it is still available for extension. For example, it should be possible to add fields to the data structures it contains, or new elements to the set of functions it performs.
  • A module will be said to be closed if it is available for use by other modules. This assumes that the module has been given a well-defined, stable description (the interface in the sense of information hiding)

Based on these definitions, the principle is usually expressed and summarized this way:

Modules should be open for extension and closed for modification.

There have been (and still are) a lot of debates about the meaning of this simple definition and its implications for the use of Object-Oriented Programming (OOP) in the real world.

In this post I’ll try to be as concrete and concise as possible.

The classical code example to explain OCP

The classical code example used to explain OCP can be translated into C# this way:
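The original snippet is not reproduced here; a minimal C# sketch of this classical example could look like the following (the Renderer class name is illustrative, while Circle, Square and DrawShapes() come from the discussion below):

```csharp
using System.Collections.Generic;

public class Circle { public double Radius { get; set; } }
public class Square { public double Side  { get; set; } }

public class Renderer
{
    // The drawing logic switches on the concrete shape types.
    public void DrawShapes(IEnumerable<object> shapes)
    {
        foreach (var shape in shapes)
        {
            if (shape is Circle circle)
            {
                DrawCircle(circle);
            }
            else if (shape is Square square)
            {
                DrawSquare(square);
            }
        }
    }

    private void DrawCircle(Circle circle) { /* circle drawing details omitted */ }
    private void DrawSquare(Square square) { /* square drawing details omitted */ }
}
```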

I find this example a bit crude. One would have to have no idea of what OOP is to write such code. It can obviously be refactored to something like this:
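Again the original snippet is not shown; here is a minimal sketch of the refactored version, reusing the illustrative Renderer class:

```csharp
using System.Collections.Generic;

// Each shape now knows how to draw itself through the IShape abstraction.
public interface IShape
{
    void Draw();
}

public class Circle : IShape
{
    public double Radius { get; set; }
    public void Draw() { /* circle drawing details omitted */ }
}

public class Square : IShape
{
    public double Side { get; set; }
    public void Draw() { /* square drawing details omitted */ }
}

public class Renderer
{
    // Closed for modification: adding Triangle or Pentagon requires no change here.
    public void DrawShapes(IEnumerable<IShape> shapes)
    {
        foreach (var shape in shapes)
        {
            shape.Draw();
        }
    }
}
```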

Let’s look at the implication of this refactoring:

  • We introduced a new abstraction, IShape, to represent a concept we already had in mind. Indeed, in the first code sample the method DrawShapes() already accepted a sequence of shapes. With the new IShape abstraction our design is now open to accepting more shapes like Triangle or Pentagon.
  • The method DrawShapes() will draw any new shape without any need for modification. In other words, the DrawShapes() method implementation is closed.

This is how the OCP is usually presented. It is all about anticipating future changes in the system in a way that:

  • When a change occurs, existing code is left untouched. Existing code here is the DrawShapes() concrete method body, the IShape interface and the Circle and Square classes.
  • When a change occurs, the new code implementing the change is written in new classes that implement the existing abstraction. New classes here could be Triangle and Pentagon.

The Point of Variation principle

Another way to see the OCP is the Point of Variation (PV) principle that states:

Identify points of predicted variation and create a stable interface around them.

I find PV more understandable than OCP because it is actionable: first, identify potential variations; second, build the proper abstractions around these variations.

The real challenge: Anticipation

But the real challenge is anticipation, and anticipating is hard. If anticipation were easy we'd all be Bitcoin billionaires. In the real world, when you anticipate, the risk is high that:

  1. We anticipate variations that won't actually vary. This is what the YAGNI principle says (You Aren't Gonna Need It): always implement things when you actually need them, never when you just foresee that you'll need them. Developing and maintaining an abstraction has a cost, and if we don't end up needing it, this cost is pure waste.
  2. We don't anticipate the variation that will really be needed. But once the need for variation becomes real, it is your responsibility as a developer to refactor and create the right abstractions and the right stable code that acts upon these abstractions. This is the fool me once, don't fool me twice idea: I am not supposed to foresee what I'll need, but I am supposed to identify and then write the right abstraction when I need it.

OCP in the real world

Here is a pragmatic approach to OCP:

  • In any case the KISS principle applies (Keep It Simple, Stupid): don't underestimate the difficulty of anticipating, and don't waste your resources creating abstractions you won't need.
  • Write automated tests: one of the greatest benefits of writing tests is that, for a while, you must look at your code from the client's perspective. If your code contains areas that are difficult to cover with tests, it certainly means that your code should be refactored to be easily 100% testable. Experience shows that when refactoring from poorly testable code to fully testable code, the need for the right abstractions naturally pops up.
  • Some static analyzers can help you pinpoint typical OCP violations:
    • when downcasting references (i.e., casting from a base class or interface to a subclass or leaf class),
    • when using the is or as operators (as in the first example above).
    • NDepend has the rule Base class should not use derivatives: matches of this rule are obvious violations of the OCP.
  • Keep in mind the fool me once, don't fool me twice idea. You must refactor code as soon as the need to abstract some concept is identified. Of course, sometimes this is not possible, for instance when tons of client code depend upon your API: in this situation you cannot easily refactor and often you'll have to live with the wrong design. This is why public API design is such a sensitive topic: you have no other choice than to do your best to anticipate, and to accept living with your past design mistakes.

Be open to more than one variation with the Visitor pattern

Finally let’s underline that in the real world, data objects (like the shapes here) don’t implement themselves algorithm such as drawing. Experience tells that this is a clear violation of OCP because when a new algorithm is needed on data objects, like persistence in addition of drawing for example, all shape classes must be modified again. This is also a violation of the Single Responsibility Principle (SRP, the S in SOLID) because a shape class has now two responsibilities: 1) holding the shape data 2) drawing the shape.

Hence we now have two variations: we need a way to abstract both the shapes and the algorithms applied to the shapes, in order to write something like algorithm.ApplyOn(shape). This sort of call on two abstractions is named a double dispatch call: the implementation really invoked depends both on the IShape object's type and the IAlgorithm object's type. If you have N shapes and M algorithms you need [N x M] implementations.

Fortunately, the visitor pattern helps implement double dispatch. The code with the new persistence algorithm would then look like this:
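The original snippet is not reproduced here; below is a minimal C# sketch of a visitor-based design. The visitor interface plays the role of the algorithm abstraction mentioned above, and the DrawVisitor and PersistVisitor names are illustrative:

```csharp
// The "algorithm" abstraction: one Visit() overload per concrete shape.
public interface IShapeVisitor
{
    void Visit(Circle circle);
    void Visit(Square square);
}

// IShape is reworked to expose Accept(), the entry point of double dispatch.
public interface IShape
{
    void Accept(IShapeVisitor visitor);
}

public class Circle : IShape
{
    public double Radius { get; set; }
    public void Accept(IShapeVisitor visitor) { visitor.Visit(this); } // dispatch on the shape type
}

public class Square : IShape
{
    public double Side { get; set; }
    public void Accept(IShapeVisitor visitor) { visitor.Visit(this); }
}

public class DrawVisitor : IShapeVisitor
{
    public void Visit(Circle circle) { /* draw the circle */ }
    public void Visit(Square square) { /* draw the square */ }
}

// New algorithm: no shape class needs to be modified.
public class PersistVisitor : IShapeVisitor
{
    public void Visit(Circle circle) { /* persist the circle */ }
    public void Visit(Square square) { /* persist the square */ }
}
```

A call like shape.Accept(new PersistVisitor()) then dispatches both on the shape type and on the algorithm type.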


SOLID Design: The Liskov Substitution Principle (LSP)

The Liskov substitution principle is the L in the well known SOLID acronym. The original principle definition is:

Methods that use references to base classes must be able to use objects of derived classes without knowing it.

At first glance this principle is pretty easy to understand. At second glance it seems redundant with the OOP concept of polymorphism. After all, the whole point of polymorphism is to consume an abstraction without knowing the implementation behind it, isn't it?

However, it is a good thing that the community emphasizes the Liskov substitution principle. This principle is in fact a caveat for developers that polymorphism is both powerful and tricky: in the real world, the use of polymorphism often leads to dead-end situations, so it must be used wisely.

LSP is often summarized with a counter-example to the Duck Test: “If it looks like a duck and quacks like a duck but needs batteries, you probably have the wrong abstraction.”

Let’s detail some common polymorphism pitfalls that the Liskov substitution principle attempts to prevent by reminding the developer to adopt the client’s perspective.

Prevent situations where a method cannot be implemented

When a class implements an interface or derives from a base class, refactoring tools like Visual Studio's built-in refactorings, ReSharper or CodeRush propose to insert stubs for the abstract methods to implement. Typically the default body of such an inserted method is throw new NotImplementedException();.

Obviously this behavior must remain temporary and must not be released to production. Client code that holds a reference to the interface or the base class doesn't expect to get a NotImplementedException raised upon a method call. This is why NDepend has a default rule named Do implement methods that throw NotImplementedException, to prevent such situations.

On the other hand, if it doesn't make sense to implement an abstract method, it clearly means that the design is wrong. Here is such a wrong design, assuming that all birds can fly:
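A minimal sketch of such a wrong design (the Eagle and Penguin classes are illustrative):

```csharp
// Wrong design: the Bird abstraction assumes that every bird can fly.
public abstract class Bird
{
    public abstract void Fly();
}

public class Eagle : Bird
{
    public override void Fly() { /* flying logic */ }
}

public class Penguin : Bird
{
    // A penguin cannot fly: the only "implementation" left is to throw.
    public override void Fly() { throw new System.NotImplementedException(); }
}
```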

This code could then be refactored to:
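A sketch of a possible refactoring, where flying is pushed down into a more refined abstraction (BirdThatCanFly is an illustrative name):

```csharp
public abstract class Bird
{
    // members common to all birds
}

public abstract class BirdThatCanFly : Bird
{
    public abstract void Fly();
}

public class Eagle : BirdThatCanFly
{
    public override void Fly() { /* flying logic */ }
}

public class Penguin : Bird
{
    // No Fly() method to implement: the type hierarchy now reflects reality.
}
```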

The problem was that Bird with its Fly() method was too coarse; we needed some refinement because not all birds can fly. From my experience, such wrong assumptions about interfaces and base classes happen quite often in the real world. When you stumble on such a situation, see it as a good starting point for refactoring … if possible. Indeed, sometimes refactoring is not an option when many clients already depend on the wrong design.

Example of a LSP violation in the .NET framework design

One dreaded LSP violation is .NET System.Array implementing the ICollection<T> interface. Hence Array has to implement the ICollection<T>.Add() method, but calling this method on an array throws a NotSupportedException at runtime:
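A minimal repro sketch:

```csharp
using System;
using System.Collections.Generic;

class Program
{
    static void Main()
    {
        int[] array = new int[3];
        ICollection<int> collection = array; // legal: a single-dimensional array implements ICollection<T>

        // Compiles fine, but throws NotSupportedException at runtime
        // because an array has a fixed size.
        collection.Add(4);
    }
}
```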

The C# compiler doesn't even warn about such a simple erroneous program.

Of course we'd need to ensure that ICollection<T>.IsReadOnly is false before modifying a collection through an IList<T> or ICollection<T> reference, but frankly this is an error-prone design. I can remember stumbling on this situation quite a few times during the 15 years I have been programming with C#/.NET.

Moreover, refactoring this original design mistake is not an option anymore (it wasn't even when .NET Core was introduced), since millions of programs rely on this API. IReadOnlyCollection<T> was introduced with .NET 4.5 in 2012, but the original design cannot be changed.

Think twice before applying the ISA trick

Another common example used to explain the Liskov substitution principle is the Rectangle/Square paradigm. A square is a rectangle, isn't it? So we should be able to write code like this:
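A minimal sketch of such code (the property implementations and the Client class are illustrative):

```csharp
public class Rectangle
{
    public virtual int Width  { get; set; }
    public virtual int Height { get; set; }
}

public class Square : Rectangle
{
    // A square keeps Width == Height, so setting one silently changes the other.
    public override int Width
    {
        get => base.Width;
        set { base.Width = value; base.Height = value; }
    }

    public override int Height
    {
        get => base.Height;
        set { base.Width = value; base.Height = value; }
    }
}

class Client
{
    static void Grow(Rectangle rectangle)
    {
        int width = rectangle.Width;
        rectangle.Height += 10;
        // The client legitimately assumes that changing the height of a rectangle
        // leaves its width untouched; with a Square this assumption breaks.
        System.Diagnostics.Debug.Assert(rectangle.Width == width);
    }
}
```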

Clearly this is a wrong usage of the is-a principle: yes, a square is a rectangle, but the client doesn't expect the width to get modified as well when modifying the height of the rectangle.

Here also, such wrong design emerges quite often in the real world. For example, when dealing with a complex control hierarchy with dozens of fields to maintain at various levels, it can become quite tricky to maintain coherence in your objects' states at runtime.

Code Contract to help?

In theory, code contracts could help here. A contract is one or several assertions that prevent corrupted state. When a corrupted state is reached at runtime (whether in production or at test run time), such an assertion must fail fatally, because continuing to run with a corrupted state can potentially lead to more corrupted states.

Often contracts are used at the interface level to improve the semantics by adding constraints to the classes that implement the interface. We might want to force all implementations of IRectangle.set_Width to leave the Height value untouched. Using Microsoft Code Contracts, we could try to write something like this:
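A sketch of what we would like to express; note that PreserveValue() is purely hypothetical, which is precisely the point made just after:

```csharp
using System.Diagnostics.Contracts;

[ContractClass(typeof(RectangleContract))]
public interface IRectangle
{
    int Width  { get; set; }
    int Height { get; set; }
}

[ContractClassFor(typeof(IRectangle))]
internal abstract class RectangleContract : IRectangle
{
    public int Height { get; set; }

    public int Width
    {
        get { return default(int); }
        set
        {
            // What we would like to state: setting Width must leave Height untouched.
            // PreserveValue() does not exist in Microsoft Code Contracts; this is wishful code.
            // Contract.Ensures(PreserveValue(this.Height));
        }
    }
}
```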

Unfortunately, as far as I know, Microsoft Code Contracts has no support for such a PreserveValue() possibility. More generally, it seems that Microsoft Code Contracts doesn't receive much love nowadays despite how useful code contracts can be. Granted, C# 8 non-nullable references address many of the situations usually verified through contracts or assertions, but not all contracts are about nullable references, as this example suggests.

Use polymorphism with great caution

These classical Bird and Rectangle examples show well how polymorphism and inheritance can quickly lead to rotten design, even in apparently simple situations. The situation of the Array class implementing ICollection<T> shows that in practice LSP violations just happen.

From my point of view, what the Liskov substitution principle really says is that when writing an API relying on polymorphism, you should first take the point of view of the client of your API before writing any interface or any class hierarchy.

  • Can all birds really fly? What happens if I try to call Fly() on a bird that cannot fly?
  • Is a square really a rectangle? What happens if I change the width of a square?

In the real world this looks like:

  • Can all collections really be modified? What happens if I add or remove an element from an array?
  • Are all controls scrollable? What happens if a scrollbar is displayed on a control that should not scroll?
  • Does withdrawal apply to all bank accounts? What happens if we try to withdraw money from a locked long-term deposit account? Should the withdrawal fail badly in this situation, or should we prevent such a situation with an IAccountThatAuthorizeWithdraw abstraction?

A pattern emerges here: for each member of your interface you should ask yourself: does this member apply seamlessly to all objects that will implement this interface?

Notice that this API-writing principle is more general; it doesn't only apply to polymorphism: when writing an API, first take the point of view of the client of your API. This is the only way to achieve an elegant API that clients will love to consume. This is another way to explain Test-Driven Development (TDD), where client code must be written first, for test and design purposes, before the code itself.

In a previous post I explained why your classes should be declared as sealed when possible and why NDepend proposes the default rule Class with no descendant should be sealed if possible. In the real world, a class is never well designed for inheritance by chance and thus should be sealed by default. Designing a class well for inheritance requires quite an extensive understanding of your domain. As with everything in life, it takes a lot of effort to build something that others will find easy to use.


Identify .NET Code Structure Patterns with no Effort

The two pillars of code maintainability are automatic testing and clean code structure.

  • Testing is used to regularly challenge code correctness and detect regressions early. Testing can easily be assessed with numbers like the code coverage ratio and the number of assertions tested.
  • A clean code structure prevents the phenomenon of spaghetti code: entangled code that is hard to understand and hard to maintain. However, assessing the code structure cannot be achieved through numbers the way testing can. Moreover, the structure emerges from a myriad of details buried in many source files, and thus appropriate tooling is needed.

For most engineers, the code dependency graph is the tool of choice for exploring code structure. A boxes-and-arrows graph is intuitive and well adapted to visualizing a small number of dependencies. However, to visualize complex portions of code, the Dependency Structure Matrix (DSM) is better adapted. See below the same set of 34 namespaces visualized with the NDepend Dependency Graph and the NDepend Dependency Matrix.

NDepend dependency graph

 

NDepend Dependency Matrix

If the concept of dependency matrix is something new to you, it is important to note that:

  • The Matrix headers’ elements represent graph boxes
  • The Matrix non-empty cells correspond to graph arrows. Numbers on the cells represent a measure of the coupling in terms of the number of methods and fields involved. In a symmetric matrix, a pair of blue and green cells is symmetric because both cells represent the same thing: the blue cell represents A uses B and the green cell represents B is used by A.

Here is a 5 minutes introduction video if you are not familiar with the dependency matrix:

Clearly the graph is more intuitive, but apart from the two red arrows that represent two pairs of mutually dependent namespaces, this graph tells us little about the overall structure.

On the other hand, the matrix algorithm naturally attempts to layer code elements, exhibits dependency cycles, and shows which elements are used a lot or not at all. Let's enumerate some structural patterns that can be visualized at a glance with the dependency matrix:

Layers

One pattern that a DSM makes obvious is a layered structure (i.e., an acyclic structure). When the matrix is triangular, with all blue cells in the lower-left triangle and all green cells in the upper-right triangle, the structure is perfectly layered. In other words, the structure doesn't contain any dependency cycle.

On the right part of the snapshot, the same layered structure is represented with a graph. All arrows point in the same left-to-right direction. The problem with the graph is that its layout doesn't scale: here we can barely see the big picture of the structure. If the number of boxes were multiplied by 2, the graph would become completely unreadable. The DSM representation, on the other hand, wouldn't be affected; we say that the DSM scales better than the graph.

Notice that NDepend proposes two rules out of the box to control layering by preventing dependency cycles from appearing: ND1400 Avoid namespaces mutually dependent and ND1401 Avoid namespaces dependency cycles.

Interestingly enough, most graph layout algorithms rely on the graph being acyclic. To compute the layout of a graph with cycles, these algorithms temporarily discard some dependencies in order to work on a layered graph, and then add the discarded dependencies back in the last step of the computation.

Cycles

If a structure contains a cycle, the cycle is displayed as a red square on the DSM. We can see that inside the red square, green and blue cells are mixed across the diagonal. There are also some black cells that represent mutual direct usage (i.e., A is using B and B is using A).

The NDepend DSM comes with the option Indirect Dependency. An indirect dependency between A and B means that A is using something, that is using something, that is using something … that is using B. Below is the same DSM with a cycle, but in indirect mode. We can see that the red square is filled up with only black cells. It just means that given any elements A and B in the cycle, A and B are indirectly and mutually dependent.

Here is the same structure represented with a graph. The red arrow shows that several elements are mutually dependent. But the graph is of no help in highlighting all the elements involved in the parent cycle.

Notice that NDepend provides a button to highlight cycles in the DSM (if any). If the structure is layered, this button has the effect of triangularizing the matrix and keeping non-empty cells as close as possible to the diagonal.

High Cohesion / Low-Coupling

The idea of high cohesion (inside a component) / low coupling (between components) is popular nowadays. But if one cannot measure and visualize dependencies, it is hard to get a concrete evaluation of cohesion and coupling. The DSM is good at showing high cohesion. In the DSM below, an obvious squared aggregate around the diagonal is displayed. It means that the elements involved in the square have high cohesion: they are strongly dependent on each other. Moreover, we can see that they are layered since there is no cycle. They are certainly candidates to be grouped into a parent artifact (such as a namespace or an assembly).

On the other hand, the fact that most cells around the square are empty advocates for low coupling between the elements of the square and the other elements.

In the DSM below, we can see 2 components with high cohesion (the upper and the lower square) and pretty low coupling between them.

While refactoring, having such an indicator can be pretty useful for knowing whether there are opportunities to split coarse components into several more fine-grained ones.

Too many responsibilities

The popular Single Responsibility Principle (SRP) states that a class shouldn't have more than one reason to change. Another way to interpret the SRP is that a class shouldn't use too many different other types. If we extend the idea to other levels (assemblies, namespaces and methods), then certainly, if a code element uses dozens of other different code elements (at the same level), it has too many responsibilities. Often the term God class or God component is used to qualify such a piece of code.

A DSM can help pinpoint code elements with too many responsibilities. Such a code element is represented by a column with many blue cells and a row with many green cells. The DSM below exposes this phenomenon.

Popular Code Elements

A popular code element is used by many other code elements. Popular code elements are unavoidable (think of the String class for example).

A popular code element is not a flaw. However, it is advised that popular elements be interfaces and enumerations. This way consumers rely on abstractions and not on implementation details. The benefit is that consumers are broken less often, because abstractions are less subject to change than implementations.

A popular code element is represented by columns with many green cells and by rows with many blue cells. The DSM below highlights a popular code element.

Something to notice is that when one keeps the code structure perfectly layered, popular components are naturally kept at a low level. Indeed, a popular component cannot de facto use many things: because popular components are low-level, they cannot use something at a higher level. That would create a dependency from low level to high level and break the acyclic property of the structure.

Mutual dependencies

You can see the coupling between 2 components by right-clicking a non-empty cell and selecting the menu Open this dependency.

If the opened cell was black as in the snapshot above (i.e., if A and B are mutually dependent), then the resulting rectangular matrix will contain both green and blue cells (and possibly black cells as well), as in the snapshot below.

In this situation, you'll often notice a deficit of green or blue cells (3 blue cells for 1 green cell here). This is because even if 2 code elements are mutually dependent, there often exists a natural level order between them. For example, consider the System.Threading namespace and the System.String class. They are mutually dependent; they both rely on each other. But the matrix shows that Threading is much more dependent on String than the opposite (there are many more blue cells than green cells). This confirms the intuition that Threading is at a higher level than String.

Ensure that your classes are declared as sealed when possible

Inheritance is one of the pillars of OOP. However, in the real world, most classes are not designed to be properly inheritable.

Properly designing a class to be inheritable is a tricky task. One must anticipate state initialization, state correctness, state mutability and also non-public method calls from descendants. Anticipating code re-use well is hard and requires experience. As a consequence, a class never ends up well designed for inheritance by chance.

However, in both C# and VB.NET a class is inheritable by default. The C# keyword sealed and the VB.NET keyword NotInheritable must be specified explicitly. Since there is no compiler warning, most developers don't bother to seal their classes. Hence most classes are not properly designed for inheritance, yet are implicitly declared as inheritable.

As if writing high-quality code were not complicated enough, here we have a language pitfall.

This is why NDepend proposes the default rule Class with no descendant should be sealed if possible. The CQLinq source code of this rule is pretty straightforward:
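The rule's source is not reproduced here; the sketch below only approximates its shape, and the exact NDepend code-model member used to count derived classes may differ from what actually ships with the product:

```
// Approximate sketch, not the verbatim CQLinq rule shipped with NDepend.
warnif count > 0
from t in JustMyCode.Types where
   t.IsClass &&
  !t.IsSealed &&
  !t.IsStatic &&
  !t.IsPublic &&               // comment out this clause if the code base has no public API
   t.DerivedTypes.Count() == 0 // the exact property name may differ
select new { t, t.Visibility }
```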

As explained in the rule's comments, by default you won't get warnings for public non-sealed classes with no descendants. Indeed, it might well be that the analyzed code base is a framework and that descendants of such classes are not available at analysis time. But if the analyzed code base is an application with no public API, it is important to comment out this clause so that most public classes get declared as sealed.

If you are not convinced that most classes should be sealed, there are also non-obvious reasons. Here is a quote from Jeffrey Richter from his essential book CLR via C#:

When defining a new type, compilers should make the class sealed by default so that the class cannot be used as a base class. Instead, many compilers, including C#, default to unsealed classes and allow the programmer to explicitly mark a class as sealed by using the sealed keyword. Obviously, it is too late now, but I think that today’s compilers have chosen the wrong default and it would be nice if it could change with future compilers. There are three reasons why a sealed class is better than an unsealed class:

  • Versioning: When a class is originally sealed, it can change to unsealed in the future without breaking compatibility. (…)
  • Performance: (…) if the JIT compiler sees a call to a virtual method using a sealed type, the JIT compiler can produce more efficient code by calling the method non-virtually. (…)
  • Security and Predictability: A class must protect its own state and not allow itself to ever become corrupted. When a class is unsealed, a derived class can access and manipulate the base class’s state if any data fields or methods that internally manipulate fields are accessible and not private.(…)

In the same vein, you can also refer to this 2004 Eric Lippert post where he explains why most .NET Framework classes are sealed by default.

The continuous adaptation of Visual Studio extensions

One could think that developing an extension for a two-decades-old product as mature as Visual Studio would be headache-free.

Not really. Visual Studio is a big, big beast used daily by millions of professional developers. Version after version, Microsoft spends an astounding amount of effort improving it and making sure it respects the latest standards in terms of dev platforms, IDE features, usability and performance.

This implies a lot of deep changes, and all VS extensions must adapt to these changes … or die. Most of this adaptation effort comes from the support of new platforms (.NET Core, Xamarin, the upcoming .NET 5…), but here I'd like to focus on recent VS integration changes.

Deferred extensions loading

On Monday, December 11th, 2017, all VS partners received an email announcing VS package deferred loading, also referred to as extension asynchronous initialization. In this mode VS extensions are loaded after VS startup and initialization. As a consequence, VS startup is not slowed down by extensions.

Implementing this scenario required serious changes, described here, but deferred loading became mandatory only with VS 2019.1 (16.1), released in spring 2019, around 18 months after the initial announcement. VS extension ISVs got plenty of time (and plenty of high-quality support from Microsoft engineers) to implement and test this async loading scheme.
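As a rough illustration, here is a minimal sketch of what opting in looks like with the VS SDK's AsyncPackage (the package class name and the GUID are placeholders, not NDepend's actual code):

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.VisualStudio.Shell;

// The package declares that it allows background loading and derives from
// AsyncPackage instead of Package, so VS can initialize it after startup.
[PackageRegistration(UseManagedResourcesOnly = true, AllowsBackgroundLoading = true)]
[Guid("00000000-0000-0000-0000-000000000000")] // placeholder GUID
public sealed class MyExtensionPackage : AsyncPackage
{
    protected override async Task InitializeAsync(
        CancellationToken cancellationToken,
        IProgress<ServiceProgressData> progress)
    {
        // Heavy initialization can run on a background thread here...
        await base.InitializeAsync(cancellationToken, progress);

        // ...then switch to the UI thread only for the work that really requires it.
        await JoinableTaskFactory.SwitchToMainThreadAsync(cancellationToken);
    }
}
```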

VS2019 Extensions Menu

In February 2019, while developing NDepend support for VS 2019, we downloaded the latest VS2019 preview, installed our NDepend extension and immediately noticed there was no longer an NDepend top-level menu. After investigation we found our NDepend main menu, and other extensions' menus, under a new Extensions menu, with no possibility to get extension menus back into the main bar.

Visual Studio 2019 Extensions Menu

This change impacts millions of extension users and hundreds of partner ISVs but was never publicly announced ahead of time nor debated. As an immediate consequence, someone asked to get rid of this new Extensions menu and this generated a long list of complaints.

I guess this major (and blunt) change was initially provoked by the VS2019 compact menu look, but I am not sure: even on a 1920-pixel-wide monitor (DPI 100%) there is still plenty of horizontal space to show extension menus.

Visual Studio 2019 Compact Menu

Sometime in April 2019, someone anonymously published on the discussion a link to a new extension that puts extension menus back in the main bar.

Getting back Extensions in Main Menu

I guess the VS APIs used to achieve this magic are not well documented, hence we can wonder whether this initiative comes from Microsoft or not. At least users can now get their extension menus back in the main bar. But the Extensions in Main Menus extension has been downloaded only 800 times in around 3 months. It means only a tiny subset of VS extension users are aware of it and actually using it.

Personally, I still hope Microsoft will reconsider this move and embed a similar option in a future VS version, letting users choose which features (out-of-the-box VS features or extension features) deserve direct access from the main bar.

VS2019 Optimize rendering for screens with different pixel densities

We published NDepend v2019.2.0 on March 14th, 2019, with proud support for VS2019 and .NET Core 3. Two weeks later a user reported severe UI bugs we had never observed. The repro steps were easy:

  • install .NET Fx 4.8,
  • enable the new VS2019 option Optimize rendering for screens with different pixel densities,
  • run VS2019 with the NDepend extension in a multi-monitor environment with a different DPI scale for each monitor.

Optimize rendering for screens with different pixel densities

We then discussed with VS engineers. They pointed us to this documentation and kindly offered quite a bit of custom advice: Per-Monitor Awareness support for Visual Studio extenders.

Actually this change is pretty deep for (the tons of) WPF and Winforms controls rendered in the Visual Studio UI. This took us time, but on July 3rd, 2019 we released NDepend v2019.2.5 with full support for this Optimize rendering for screens with different pixel densities option. It required quite tricky fixes, and we wish we had had a few months to adapt to this major breaking change, as we had for the deferred loading explained above. As far as I know, extension ISVs were not informed in advance. On the other hand, I'd like to underline the awesome support we got from Visual Studio engineers (as always, actually); thanks again to all of them.

I noticed that ReSharper pops up a dialog at VS2019 startup (I'm not sure whether they have everything fixed at the time of writing).

Resharper Message related to Optimize rendering for screens with different pixel densities

I also noticed that up to VS2019 16.1.3 there were still some bugs in the VS UI; see the Project Properties and Diagnostics Tool windows below. I just tried with 16.1.5 and it seems fixed.

VS 2019 16.1.3 UI issues (now fixed)

My point is that this great improvement (Optimize rendering for screens with different pixel densities) should have been highlighted ahead of time, during the VS2019 preview period, to avoid the wide range of UI bugs experienced by users during the early months of VS2019 RTM.

What can we expect next?

With the .NET 5 public announcement on May 6th, 2019, the .NET community now knows that sooner or later most .NET Core or .NET Fx applications will have to be migrated to benefit from the latest innovations and improvements.

Hopefully, migrating .NET Core apps will be quite straightforward since .NET 5 is .NET Core vNext. But migrating .NET Fx apps will be (more or less) painful. Although the WPF and Winforms desktop APIs are already supported by .NET Core 3, some other major APIs are not, and won't be, supported (ASP.NET Web Forms, WCF, Windows Workflow, .NET Remoting, AppDomain…).

For migration plans that are too complex, here is the Microsoft recommendation: What do you do with your older applications that you are not spending much engineering time on? We recommend leaving these on .NET Framework. (…) .NET Framework will continue to be supported and will receive minor updates. Even here at Microsoft, many large products will remain on .NET Framework. There are absolutely no changes to support and that will not change in the future. .NET Framework 4.8 is the latest version of .NET Framework and will continue to be distributed with future releases of Windows. If it is installed on a supported version of Windows, .NET Framework 4.8 will continue to be supported too.

Visual Studio 2019 still runs in a 32-bit process on the .NET Fx 4 CLR, with a massive WPF-based UI and some COM usage.

Can we expect VS2019 to run on .NET 5 sometime in 2021? Or in a later version? Or never?

Only the future will tell, but I hope VS extension ISVs will be advised ahead of time so they can anticipate the migration and continue to offer a seamless integration experience to users.