NDepend

Improve your .NET code quality with NDepend

10 Visual Studio Solution Explorer Productivity Tips

The Visual Studio Solution Explorer panel is like home for Visual Studio users. It presents all projects, source files and items thanks to a treeview layout.

This panel is quite sophisticated and it is likely that you don’t use all the power of this great tool. Here are some tips:

1) Automatically Track active Item in Solution Explorer

On a large Visual Studio solution with hundreds or thousands of source files, the Solution Explorer button Sync with Active Document is often clicked to locate the active document in the treeview.

Visual Studio Solution Explorer Sync with Active Document

This sync can be done automatically. For that you just need to check: Visual Studio top menu > Tools > Options… > Projects and Solutions > General > Track Active Item in Solution Explorer.

Track Active Item in Solution Explorer

2) Improve Visual Studio Startup Performance

Visual Studio startup is faster with these 3 settings:

  • Allow parallel project initialization checked
  • Reopen documents on solution load unchecked
  • Restore Solution Explorer project hierarchy state on solution load unchecked

This tip is especially useful to accelerate Visual Studio Startup when opening a large solution.

Improve Visual Studio Startup Performance

3) Search in Visual Studio Solution Explorer

The shortcut Ctrl+; sets the focus to the Visual Studio Solution Explorer search textbox so you can start typing your search term(s). If the Solution Explorer panel is not visible, this shortcut makes it visible. When displaying search results the treeview hierarchy is preserved, so you know the location of matched items.

The search setting Search within file contents is enabled by default but can be turned off to restrict the search to file names only.

Search in Visual Studio Solution Explorer

The search setting Search within external items is enabled by default to match files in External Dependencies folders. Turning it off prevents searching within the C/C++ External Dependencies folder. This setting does not apply to C#, VB.NET and F# projects.

Search within external items

4) Browse Types, Methods and Fields

In the Visual Studio Solution Explorer a source file item can be expanded to browse the type(s), methods and fields declared in the source file. Type, method and field items are shown in the same order as their declarations in the source file. These items can be double-clicked to open the source declaration and right-clicked to show the related menus.

Visual Studio Solution Explorer Types, Methods and Fields Items

5) Filter Open Files or Pending Changes

Two filters are proposed:

  • Open Files Filter: This filter is especially useful when a code review or a change spans many files.
  • Pending Changes Filter: Often a refactoring session spans several files and this filter lets you keep track of the changed files.

It is also possible to create your own filter as explained in this documentation: Extend the Solution Explorer filter

Open Files or Pending Changes Filters

6) Quickly Set the Startup Project

Most developers I know set up the startup project this way: 1) find the project in the Solution Explorer, 2) right-click the project and find the Set as Startup Project menu in the long list of menus.

Here is a quicker way: select the Solution item and press Alt+Enter to open the Solution Property Pages dialog. There you can quickly choose the startup project in a combo box and even define multiple startup projects:

Quickly Set the Startup Project

7) Preview Selected Items on Project

When Preview Selected Items is checked, selecting a source file automatically opens it in the preview tab. This feature is widely used, but what you might not know is that it also works to preview the XML content of project files!

Preview Selected Item Works on Project Item

8) Scope to a Single Project

Often a development session concerns only a single project. You can restrict the Visual Studio Solution Explorer to a single project with the project right-click menu Scope to This.

Scope to a Single Project

9) Multiple Solution Explorer Views

I wish the Scope to This facility shown above were available when right-clicking several projects, to restrict the focus to a few projects, but it is not.

However it is possible to work with multiple Solution Explorer views, each one scoped to a project. To do so, right-click a project and click the menu New Solution Explorer View. Notice that extra Solution Explorer view(s) are not persisted: when you close and re-open Visual Studio they are gone.

New Solution View

10) Switch to Folder Views

Most of the time the Visual Studio Solution Explorer displays assets structured in a logical way. For example solution folders used to group projects don’t necessarily represent real OS file system folders. It is often useful to switch to a physical presentation of the solution folders and files. This can be achieved with the Switch Views combo box.

Solution Explorer Folder View

11) Bonus: Explore Solution Architecture

The powerful new NDepend dependency graph is like a second Solution Explorer that focuses on the solution architecture. Folders and files can be dragged and dropped from the Visual Studio Solution Explorer onto the graph.

Drag and Drop from the Visual Studio Solution Explorer to the NDepend Dependency Graph

There are many more facilities presented in the videos below:

  • It handles very large solutions made of dozens or hundreds of projects live.
  • Right-clicking a Solution Explorer item like a project or a source file shows a menu to generate a graph of the item's children, callers and callees.
  • Box size in the graph is proportional to the number of lines of code, which gives an intuitive view.
  • Items in the graph can be searched, expanded and collapsed the same way as in the Visual Studio Solution Explorer.
  • The graph has a navigation bar to quickly generate a call graph, a coupling graph, a class diagram, a graph of dependency cycles, a graph of changes since the baseline and more.

 

Case Study: 2 Simple Principles to achieve High Code Maintainability

High Code Maintainability is the key to make both the management and the developers happy:

  • Maintainability lets a product evolve naturally at a sustained pace with controlled cost.
  • Maintainability lets developers add new features and improve existing ones without spending most of their time refactoring old dusty code and fixing bugs.

After 16 years of development on our product NDepend (first release in April 2004!) we came to the conclusion that:

Highly Maintainable Code can be achieved through two simple, objective and verifiable principles: Layered Architecture and High Test Coverage Ratio

Layered Architecture prevents entangled code, the well-known spaghetti code phenomenon. Dependencies get mastered, and when it is time for the code to evolve, new classes and interfaces naturally integrate with existing ones.

High Test Coverage Ratio means that when code covered by tests gets refactored, existing tests get impacted. With little effort the developer discovers regression problems and fixes them before they go to production and become bugs. The more code is covered by tests, the more you benefit from this shield.

When writing a tool for developers, the most satisfying part is to challenge the tool on its own code: this practice is called dogfooding. We just completely rewrote the dependency graph of NDepend, so let's use this important refactoring as a case study. Then we'll see how to automate the validation of these principles.

Case Study: Layered Architecture

Let’s first present the layered architecture principle and then the test coverage principle.

See below a graph of the 250+ classes, interfaces and enumerations used to implement the new dependency graph. A 2,500+ classes, methods and fields SVG vector dependency graph is available here.

The class GraphController is selected:

  • The blue classes are the ones directly used by GraphController
  • The light-blue classes are the ones indirectly used by GraphController (indirectly means used by a class used by a class … used by GraphController). Clearly GraphController relies on everything.
  • The red classes are the ones mutually dependent with GraphController.
The NDepend Dependency Graph used to visualize its own code

Several things can be said on how this code is structured:

  • This is not an API so we can use namespaces the way we want. Here namespaces implement the concept of components.
  • Box size is proportional to the number of lines of code. We can see that the overall namespaces box size is well balanced. This is a good practice to avoid having a few monster components and tons of smaller components.
  • The biggest component in terms of number of classes and lines of code is the implementation of the Undo/Redo system. More than 30 actions are implemented (expand/collapse, change GroupBy, select/unselect, generate a call graph…). These actions are relatively low level in the structure. While they act on the entire system they are not coupled with the controller, the UI rendering or the layout computation.
  • The two lowest components are Base and Model. Both contain little logic and are used by almost all other components.

In the future, whether we add new actions on the graph or decide to improve the layout somehow, this architecture won’t undergo drastic modifications. Thanks to this view it’ll be easy to decide in which component to add our new classes or if new components should be added and what they can and cannot use.

Ideally the GraphControl class shouldn't be entangled with the GraphController class, but these two classes have been developed together. See below the coupling graph between GraphController and GraphControl, obtained by double-clicking the red arrow between the two classes. It wouldn't be difficult to introduce an interface to inject one implementation into the other, but we didn't do it. This is the key when it comes to caring for maintainability: which move offers the highest ROI? Not everything has to be perfect just for the sake of it. Experience shows that having only two classes entangled does not impact maintainability much. We estimated that spending our resources on satisfying the two principles has a better ROI in the long run.

Coupling Graph between GraphController class and GraphControl class

 

Case Study: High Test Coverage Ratio

The graph implementation is 90% covered by tests. The fact that it contains a lot of UI code doesn't mean it shouldn't be well tested. We didn't spend a good part of our resources writing tests just for the sake of it. We did it because we know from experience that it will pay off: a few bugs will probably be reported, as for every 1.0 implementation, although the beta test phases already caught some. But we are confident that it won't take a lot of resources to fix them. We can look forward to the future confidently (like properly supporting .NET 5, which will be released in November 2020).

The picture below shows all namespaces, classes and methods. Smaller rectangles are methods, and the color of each rectangle indicates how well a method is covered by tests. Clearly we tolerate some gaps in UI code, while non-UI code like the Undo/Redo action implementations is 100% covered. Here also experience tells us how to balance our resources, and that not everything has to be perfect to achieve high maintainability.

NDepend Graph Implementation 90% covered by tests

In terms of lines of code the NDepend graph is not even 5% of the entire product; it is one tool in the toolset. The worst-case scenario would be that each tool implementation regularly spits out bugs: all our resources would be spent fixing them, we couldn't continue adding value to the product and the business would probably die at some point. Not to mention the frustration of the users of a buggy product.

Each year we fix a few dozen bugs that each impact few users, and that doesn't take more than a tiny percentage of our overall development resources. The overall code base is 86.5% covered by tests and is entirely layered: maintenance doesn't cost us much.

Typically at this point comes the remark: but code coverage is not enough, results must be asserted by unit tests. And indeed, if nothing gets asserted nothing gets tested, even if the code is entirely covered by tests. We want tests to fail when something goes wrong. In the next post, Case Study: Complex UI Testing, I explain how millions of assertions get checked while running our test suite against the graph implementation.

Automatically Validate Layered Architecture and High Test Coverage Ratio

NDepend offers hundreds of default code rules but only 4 of them are needed to validate these two key points (layered architecture and high test coverage):

The fourth rule, Avoid namespaces mutually dependent, helps a lot to layer a large super-component. In this situation the first thing to do is to make sure there is no pair of components that use each other. For each pair of namespaces matched, this rule applies a heuristic and tells which type should not use which other type, and the same at method level. A technical-debt estimation is also given in terms of the development effort it will cost to fix each pair. Here it says that 11 man-days (8 hours a day) should be spent if someone decided to layer the NHibernate code base. Unfortunately this is not possible because it would break the thousands of client code bases bound to it. Let's also note that an interest estimation is given as well, in terms of: how much development effort does it take per year if the issues are left unfixed? Here this rule estimates that not fixing all those pairs of entangled namespaces costs the development team 5 man-days a year.

Avoid Namespaces Mutually Dependent with advices on what to do and costs estimation

These rules can be validated during the build process (Azure DevOps / TFS, Jenkins, TeamCity, Bamboo, SonarQube…) and the team can know when the new code written diverges from these two maintainability goals.

 

Conclusion: Objective, Verifiable, Simple

What is interesting with these two simple concepts, layering and code coverage, is that they can be objectively applied, validated and measured. Last year, in 2019, I wrote a blog post series on SOLID principles and there has been so much debate about how to apply them in the real world. SOLID principles are a great way to improve our understanding of Object Oriented Programming and how encapsulation, abstraction, polymorphism, inheritance … should and should not be used. But when it comes to writing maintainable code everyone has a different opinion.

If it is decided that the code structure should be layered, there is not much debate about which parts should be abstracted from others. If a class A must use a class B and B is in a higher layer than A, an interface IB must be created at A's level to inject the B implementation into A without breaking the layering.

These 2 concepts emerged over the years because we had a pressing need to produce maintainable code. What I really like is that they are simple. And KISS (Keep It Simple Stupid) is a great principle in software engineering.

If a third principle should be added it would definitely be about user documentation: we offer free email support to users, but we also offer tons of embedded and online documentation. Every time a question starts to be asked a few times, we make sure that users can get the answer immediately, both from a tooltip (or a smart UI change) and from the online documentation. Some other ISVs decide to make money from support. Personally I don't find this fair because it is a clear incentive to produce rotten documentation and hence friction for users.

How we obtained the images in this post

Let's show how all those images in this post were obtained within a few clicks:

  • First let’s search for Graph Panel in the entire NDepend code base (they get zoomed automatically).
  • Then let’s reset the metric view with NDepend.UI.Graph.* namespaces to get the colored treemap.
  • Then let’s go back to the graph and only keep NDepend.UI.Graph.* namespaces matched by the search.
  • Then un-group by parent assembly to get a graph made of namespaces only.
  • Then change the layout direction from Top to Bottom to have a nicer layout.
  • Then expand all namespaces to get all classes.
  • Finally expand all classes to get all methods and fields.
Using the NDepend Graph to obtain a clear view of the implementation of the Graph

 

Mythical Man-Month: 10 lines per developer day

The mythical book The Mythical Man-Month states that, no matter the programming language chosen, a professional developer writes on average 10 lines of code (LoC) per day.

After 14 years of full-time development on the tool NDepend I’d like to elaborate a bit here.

Let's start with the definition of a logical Line of Code. Basically, a logical LoC is a PDB sequence point, except the sequence points corresponding to the opening and closing method braces. So here, for example, is a method with 5 logical lines of code:
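The original snippet was a screenshot; here is a minimal sketch of what such a method looks like, with one logical LoC per statement and the braces excluded:

```csharp
using System;

class Sample
{
    // A sketch (the original sample was an image): a 5 logical LoC method.
    static double Average(int a, int b, int c)
    {
        int sum = a + b;          // 1
        sum += c;                 // 2
        double avg = sum / 3.0;   // 3
        Console.WriteLine(avg);   // 4
        return avg;               // 5
    }
}
```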

I already hear readers complaining that LoC has nothing to do with productivity. Bill Gates once said “Measuring software productivity by lines of code is like measuring progress on an airplane by how much it weighs.”

And indeed, measured over a few days or a few weeks, LoC has nothing to do with productivity. As a full-time developer, some days I write 200 LoC in a row, some days I spend 8 hours fixing a pesky bug without adding a single LoC. Some days I clean dead code and remove some LoC. Other days I refactor existing code without, all in all, adding a single LoC. Some days I create a large and complex UI control and the designer automatically generates 300 additional LoC. Some days are dedicated solely to performance enhancement or writing tests…

What is interesting is the average number of LoC over the long term. And if I do the simple math, our average is around 80 LoC per day. Let's note that we are strict on high code quality standards, both in terms of code structure and formatting, and in terms of testing and code coverage ratio (see the last picture of this post, which shows the NDepend code coverage map). For a code quality tool for developers, being strict on code quality means dogfooding☺.

So this average score of 80 LoC produced per day doesn't sacrifice code quality, and it is a sustainable rhythm. Things get interesting with LoC after calibration: counting LoC becomes an accurate estimation tool. After coding and measuring dozens of features delivered in this particular development context, the size of any feature can be estimated accurately in terms of LoC. Hence, with simple math, the time it will take to deliver a feature to production can be accurately estimated. To illustrate this, here is a decorated treemap view of the NDepend code base (K means 1,000 LoC). This view is obtained from the NDepend metric view panel with handmade coloring to illustrate my point. The small rectangles are methods grouped by parent classes, parent namespaces and parent assemblies. A rectangle's area is proportional to the corresponding method's #LoC.

treemap code-metric

Thanks to this map, I can compare the size in terms of LoC of most components. Coupling this information with the fact that the average coding score is 80 LoC per day, and looking back at the time each component cost, we have an accurate method to tune our way of coding and estimate future schedules.

Of course not all components are equal. Most of them are the result of a long evolutionary coding process. For example, the code model has undergone much more refactoring since the beginning than, say, the dependency matrix, which was delivered out-of-the-box after a few months of development.

This picture reveals something else interesting. We can see that all these years spent polishing the tool to meet high professional standards in terms of ergonomics and performance actually consumed quite a few LoC. Obviously, building a performant code query engine based on C# LINQ, which is now the backbone of the product, took years. This feature alone now weighs 34K LoC. More surprisingly, just having clean Project Properties UI management and model takes (model + UI) = (4K + 7K) = 11K LoC, while a flagship feature such as the interactive Dependency Graph only consumes 8K LoC, less than the Project Properties implementation. Of course the interactive Dependency Graph capitalizes a lot on the existing infrastructure developed for other features, including the Dependency Model. But as a matter of fact, it took the same amount of effort to develop the Dependency Graph as to develop a polished Project Properties model and UI.

All this confirms an essential lesson for everyone in charge of an ISV. It is cheap and easy to develop a nice and flashy prototype application that will attract enthusiastic users. What is really costly is to transform it into something usable, stable, clean and fast, with every possible ergonomic candy to make the user's life easier. And it is these non-functional requirements that make the difference between a product used by only a few dozen enthusiasts and a product used by the masses.

To finish, it is also interesting to visualize the code base through the prism of code coverage ratio. The NDepend code base being 86% covered, by comparing both pictures we can easily see which parts are almost 100% covered and which parts need more testing effort.

Find API Breaking Changes in your .NET Libraries and Frameworks

If you are developing a framework, the last thing you want when releasing a new version of your product is to break the code of your clients because of an API breaking change. For example, you want to make sure that all public methods you had in previous versions are still present in the next version, unless you tagged some of them with System.ObsoleteAttribute for a while before removal.

Let’s underline that we are talking here of syntactic breaking changes that can be all detected with tooling as we are about to see. On the other hand semantic breaking changes are code behavior changes that will break your client code. Those breaking changes cannot be found by a tool, but are typically caught by a solid test suite.

One key feature of NDepend is to be able to compare all results against a baseline snapshot. Code querying can explore changes since the baseline. For example, this code query detects methods of an API that are no longer visible to clients:
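The original query was shown as a screenshot. Here is a sketch reconstructed from the bullet points below, assuming standard CQLinq members such as OlderVersion(), NewerVersion(), IsPubliclyVisible and IsObsolete:

```csharp
// warnif count > 0 turns the query into a rule that warns when it matches something.
warnif count > 0
from m in codeBase.OlderVersion().Application.Methods
where m.IsPubliclyVisible &&      // publicly visible in the baseline...
     !m.IsObsolete                // ...and not tagged with ObsoleteAttribute
let newM = m.NewerVersion()
where newM == null ||             // A) removed since the baseline
     !newM.IsPubliclyVisible      // B) still there but no longer publicly visible
select new { m, newM }
```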

This query code is pretty simple and readable but let’s detail it a bit:

  • The warnif count > 0 prefix transforms this code query into a code rule
  • We iterate on methods in the baseline thanks to codeBase.OlderVersion().Application.Methods
  • We want methods that were publicly visible in the baseline and not just public. Being declared as public is not enough for a method to be publicly visible. A public method can still be declared in a class declared as internal.
  • We don’t warn for public methods that were deemed as obsolete in the baseline
  • We handle both situations: A) publicly visible methods removed since the baseline and B) publicly visible methods not publicly visible anymore.

This query is pretty similar to the source of the default rule API Breaking Changes: Methods, which is a bit more sophisticated and handles more situations, like when a public method's return type changes. This rule also presents results in a more polished way.

Similar rules are available for publicly visible types and publicly visible fields.

Other sorts of breaking changes are detected when, for example, a publicly visible interface or base class changes: in such situations, client code that implements the interface or derives from the abstract class will be broken.

Some rules also detect when a serializable type gets broken and when an enumeration's Flags status changes.

Real Experiments

Comparing NHibernate v5.2.x against NHibernate v4.1.x, here are the breaking changes found: 36 types, 995 methods, 205 fields (including enumeration values) and 121 interfaces or abstract classes have been changed. I guess most of those breaking changes concern elements that were not intended to be seen by API consumers in the baseline version, but a detailed analysis would be needed here.

NHibernate v5.2 vs v4.1 Breaking Changes

I've also compared .NET Core v2.2.5 with .NET Core v3.0 Preview 8 and fortunately didn't find any breaking change. However, NDepend has a heuristic code query to find types moved from one namespace or assembly to another, and here more than 200 types were matched, like the class ArrayList moved from the assembly System.Collections.NonGeneric.dll to the assembly System.Private.CoreLib.dll. Fortunately such changes are invisible to clients consuming .NET Core as a NuGet package.

ArrayList moved from System.Collections.NonGeneric.dll to System.Private.CoreLib.dll

Find new APIs consumed and APIs not used anymore

A related and interesting topic is to review new APIs used by an application or APIs not used anymore. This is possible from the NDepend > Search Elements by Changes panel with the buttons Third Party Code Elements – Used Recently / Not Used Anymore. For example the screenshot below shows new API consumed by the application eShopOnWeb:

New APIs used by eShopOnWeb

Before releasing a new version, it is interesting to have a glance at API consumption changes. Doing so often sheds light on interesting points.

Answers to arguments against 100% coverage

I've been enthusiastic about 100% coverage for more than a decade. The large NDepend code base we are working on will soon reach 90% overall coverage. Most tested classes are 100% covered.

In the heatmap below small rectangles are methods. Clusters of rectangles are classes, namespaces and assemblies. By importing code coverage into this heatmap we can see at a glance that most classes are green: they are 100% covered. This heatmap also shows, in red and orange, areas with room for improvement. Areas in gray represent untestable code tagged with System.Diagnostics.CodeAnalysis.ExcludeFromCodeCoverageAttribute.

Not everybody agrees with 100% coverage, and the same counter-arguments pop up again and again, so I'll take the chance to address them one by one.

I don’t waste my time writing tests to cover trivial code

Often the example of trivial code proposed is property getter and setter. Indeed writing tests to cover such get/set methods doesn’t bring any value. But such trivial code is meant to be used by real code, isn’t it?

For example, suppose we have a Contact class with a property like Contact.FirstName: this property is meant to be used. The module that builds an invoice and prints the contact name on it certainly uses it. And if the invoice-building module is well tested, Contact.FirstName gets called implicitly at test time.
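A sketch of the idea, with hypothetical names (Contact, InvoiceBuilder, an NUnit-style test):

```csharp
using NUnit.Framework;

public class Contact
{
    public string FirstName { get; set; }   // trivial code: no dedicated test written for it
}

public class InvoiceBuilder
{
    public string BuildHeader(Contact contact) => $"Invoice for {contact.FirstName}";
}

public class InvoiceBuilderTests
{
    [Test]
    public void Header_contains_contact_first_name()
    {
        var header = new InvoiceBuilder().BuildHeader(new Contact { FirstName = "Ada" });
        // Asserting on the real behavior implicitly covers the FirstName getter and setter.
        Assert.IsTrue(header.Contains("Ada"));
    }
}
```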

So if some trivial code is not covered by tests, the idea is not to write dumb tests to cover it; the idea is to ask yourself why the trivial code is not already implicitly covered by tests that exercise real code. And you will certainly find room for improvement in the test suite in charge of testing the real code that depends upon the trivial uncovered code. A future post will go through the benefits of 100% coverage but, as this point suggests, one key benefit is that a hole in 100% covered code is always an indicator of something interesting to improve.

90% coverage is enough

Often teams get satisfied with 90% or 95% coverage. At the scale of a large code base such a score is outstanding. But at the scale of a class or a component, 100% must be the goal.

The key is testability. We all have experienced some easy to test code and some hard to test code. The maintainability is hard to measure but the testability is something concrete. And good testability leads to good maintainability. In other words if a component can be fully tested seamlessly it means this component will be easy to maintain.

On the other hand, in the real world we often end up with a component that contains a small portion that is hard to test. And we get satisfied with 90 or 95% coverage. But this is sweeping dust under the carpet: this small portion of code is not testable because it is not well designed and as a consequence it is bug-prone. Hence we end up not testing the most sensitive part of the code that likely concentrates most of the problems!

If, after having written a test suite, you end up with a few if/else blocks that are hard to test, do yourself a favor: refactor to reach seamless full testability.

What matters is what gets asserted in tests, not the amount of coverage

Suppose you have a 1,000-line linear method: it contains only a linear list of processing with no if/then/switch/case… Suppose you have a test with zero assertions invoking this method. You end up with 1,000 lines of code 100% covered but not tested at all.

This fictitious situation, often proposed as a counter-argument to 100% coverage, doesn't reflect reality. In the real world, if one works hard to get a complex class 100% covered, I cannot imagine that the tests don't contain a healthy amount of assertions. No matter whether a TDD test-first approach is used or tests get written at the same time as the code under test: writing tests leads to thinking more and thinking better. One doesn't write a test without having a few points in mind to assert.

But there is more. Why should only tests contain assertions? The code itself is well suited to contain assertions. For more than a decade we have stuffed the NDepend code base with assertions. Everything that can be asserted gets asserted. We end up with more than 26K assertions. A good half of those come from non-nullable references; C# 8 nullable reference types will relieve those assertions. But all the other assertions will remain: non-empty strings, special string formats, non-zero counters, IDictionary.ContainsKey(), non-zero denominators of divisions, !object.IsDisposed, loop invariants… And all those assertions are checked at test time and fail tests when violated. Ultimately most tests end up with a 1/10 ratio between the number of assertions checked in tests and the number of assertions checked in code.

We still rely on the good old System.Diagnostics.Debug.Assert(), but this is an implementation detail. Any assertion library can be used from the code itself (including a custom one, or the Code Contracts library, which would have deserved more love). The only thing that matters is to tune assertions to fail a test upon violation, typically by throwing an exception. A related topic is whether those assertions should also be checked at production time. This is another interesting debate with pros and cons.
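As a sketch of how an assertion violation can be turned into a test failure (this is an assumption about the plumbing, not necessarily how NDepend does it), the Debug/Trace listeners can be redirected to one that throws:

```csharp
using System;
using System.Diagnostics;

// Turns any Debug.Assert()/Trace.Assert() failure into an exception,
// so the enclosing test fails instead of a dialog being shown.
sealed class ThrowingTraceListener : TraceListener
{
    public override void Fail(string message, string detailMessage)
        => throw new InvalidOperationException($"Assertion failed: {message} {detailMessage}");

    public override void Write(string message) { }
    public override void WriteLine(string message) { }
}

// Typical one-time test-suite setup (on .NET Core, Debug shares Trace.Listeners):
//   Trace.Listeners.Clear();
//   Trace.Listeners.Add(new ThrowingTraceListener());
```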

Some code just cannot be tested

Indeed there are some API calls that just cannot be tested, like calls to System.Windows.MessageBox. Such calls need to be mocked. This way the code is well split between 100% covered code and code tagged with System.Diagnostics.CodeAnalysis.ExcludeFromCodeCoverageAttribute. Dependency Injection can be used to inject the untestable implementation into the tested code.
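A sketch of that split, with hypothetical names:

```csharp
using System.Diagnostics.CodeAnalysis;

public interface IMessageBoxService
{
    void ShowError(string message);
}

// The untestable API call is isolated here and excluded from the coverage goal.
[ExcludeFromCodeCoverage]
public sealed class MessageBoxService : IMessageBoxService
{
    public void ShowError(string message)
        => System.Windows.MessageBox.Show(message);
}

// Tested code only depends on IMessageBoxService; tests inject a fake implementation of it.
```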

Again testability is a central characteristic of code. Partitioning code according to its testability makes sense.

However, there are situations where such untestable APIs tend to proliferate. Then your choice is either to mock everything or to accept living with untested code. But being in this situation is a sign that such an API is immature because it is not test-friendly. Thus something radical must be done about it: get rid of it, contribute to it (if OSS), fork it…

UI Code is untestable

Since testing code has become a popular practice, UI code has always been treated as an awkward case: there are some frameworks dedicated to UI testing but such practice remains tedious and increases test maintenance. This is not satisfying.

As for any code, UI code must be written with testability in mind. UI code can contain UI logic but should not contain business logic. For example, the UI class shouldn't contain the implementation of an IsContactReadOnly() method (business logic) but can call such a method declared in a non-UI, testable class, to decide whether a Contact First Name textbox should be read-only or editable (UI logic). Ideally UI code looks as much as possible like an empty shell.
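A sketch of that split, with hypothetical names (ContactRules, a Contact.IsArchived property, ContactForm, m_FirstNameTextBox):

```csharp
// Business logic: lives in a plain, fully testable non-UI class.
public static class ContactRules
{
    public static bool IsContactReadOnly(Contact contact) => contact.IsArchived;
}

// UI logic: the form only applies the business decision to the textbox.
public partial class ContactForm /* : Form */
{
    void RefreshFirstNameTextBox(Contact contact)
        => m_FirstNameTextBox.ReadOnly = ContactRules.IsContactReadOnly(contact);
}
```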

Nowadays web development dominates desktop development. The backend emits HTML, CSS and JavaScript textual content. This content, seen as data, represents easily testable material. However, when possible, it is preferable to test the IsContactReadOnly() business logic directly from the backend code rather than to test it indirectly, for example by checking whether an HTML textbox is read-only or editable. Fortunately, modern web platforms like ASP.NET Core propose testing strategies.

Desktop UI is a different beast to test and there is no magic. To test our UI-specific code we've invested in an architecture where the UI code can be piloted, both from the main-form code and from tests. Such an architecture relies massively on mediator classes: any UI portion can pilot any other UI portion through these mediators. These mediators are used at production runtime, and tests supersede them to pilot the UI at test time. Tests dedicated to UI testing have few assertions, but such tests are not useless: they execute all the assertions stuffed in our UI code. At the end of the day some of our UI classes remain not fully covered, but close to it. The small portions left untested don't contain error-prone code but untestable UI peculiarities, like complex designer code or some DPI-related code that can only be tested manually. Experience has proved that this approach saved us from regression bugs many times.


100% coverage is a sensitive topic. In the NDepend team we are 100% coverage driven. In a future post we'll detail the benefits of being 100% coverage driven. But I wanted first to detail the non-trivial reasons that make 100% coverage worth it in most situations.

 

Static Analysis and Dependency Injection

For quite some years now, we (the NDepend team) have received requests to resolve Dependency Injection; see this page on our User Voices. Lately we've been considering such support carefully but came to the conclusion that it would be awkward. On the other hand we have always prioritized feature development based on user feedback and needs, so our conclusions are not definitive yet. I write this post not only to explain our point of view, but also to trigger more productive discussion.

Static Analysis

First let's underline that NDepend is a static analyzer. It gathers 100% of its data from the code itself, and also parses additional data like code coverage. Static analysis stands in opposition to dynamic analysis, which gathers data from runtime execution.

  • Static analysis deals with classes, methods and design measures like lines of code or complexity. It is design-time analysis.
  • Dynamic analysis deals with objects and runtime measures like performance or memory consumption. It is run-time analysis.

Dependency Injection (DI)

Recently I explained the basics of Dependency Injection (DI) and how it relates to DIP in the post SOLID Design: The Dependency Inversion Principle (DIP). DI is all about depending on interfaces instead of depending on classes. Here is a simple DI code sample: ClientCode() depends on IDbConnection and not on SqlConnection. The power of DI is that ClientCode() can now run on RDBMS other than SQL Server, like Oracle or SQLite. ClientCode() can now also be unit-tested seamlessly by mocking the interface IDbConnection.
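The original sample was a screenshot; here is a minimal sketch of what it describes (connection string and query are placeholders):

```csharp
using System.Data;
using System.Data.SqlClient;

class Demo
{
    // High-level code only knows the IDbConnection abstraction...
    static void ClientCode(IDbConnection connection)
    {
        connection.Open();
        using (var command = connection.CreateCommand())
        {
            command.CommandText = "SELECT COUNT(*) FROM EMPLOYEE";
            var count = command.ExecuteScalar();
        }
        connection.Close();
    }

    // ...while the caller chooses the concrete implementation to inject.
    static void Main()
    {
        ClientCode(new SqlConnection("connection string here"));
    }
}
```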

Stable Dependency Principle (SDP)

DI is an application of the Stable Dependency Principle (SDP), which is not part of the SOLID principles but is closely related to the Dependency Inversion Principle (DIP).

SDP states: dependencies should point in the direction of stability. Stability is a measure of the likelihood of change: the more likely something is to change, the less stable it is.

The Liskov Substitution Principle (LSP) and the Interface Segregation Principle (ISP) articles explain how interfaces must be carefully thought out. As a result an interface is a stable code element that is less likely to change than the classes that implement it. Following the SDP, it is better design to rely on interfaces than on concrete classes.

The Users need: supporting DI through Static Analysis

In this context, users want NDepend to be able to parse DI code like this:
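As a sketch of the kind of DI registration code in question (the original sample was a screenshot; Microsoft.Extensions.DependencyInjection is used here as one arbitrary example, any container would do):

```csharp
using System.Data;
using System.Data.SqlClient;
using Microsoft.Extensions.DependencyInjection;

var services = new ServiceCollection();
// The concrete SqlConnection is bound to the IDbConnection abstraction at registration time.
services.AddTransient<IDbConnection>(_ => new SqlConnection("connection string here"));
var provider = services.BuildServiceProvider();

// Consumers only ever ask for IDbConnection; the container resolves the concrete type at run-time.
using (var connection = provider.GetRequiredService<IDbConnection>())
{
    connection.Open();
}
```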

Concretely, users want NDepend to bind all calls to an IDbConnection member to a call to the corresponding SqlConnection member. For example, a dependency toward SqlConnection.Open() should be created for any method calling IDbConnection.Open().

But why would this be useful? Here is a user feedback:

We have layered architecture: Foundation, Feature, Project. Following current NDepend report, everything follows right dependencies. Project depends on Feature and Foundation. Feature depends on Foundation. However due to some of DI registrations, we have complex dependencies (…) there is dependency from Foundation to Project in reality. Interface is declared in Foundation, but implementation is in Project and registered via DI. It means that Foundation projects, that are core and stable part of solution, depend on Project level, which is not so stable and reliable.

“In reality” is highlighted in bold because the user really means at run-time. At run-time Foundation depends on Project, the same way that at run-time ClientCode() depends on SqlConnection. But at design-time Foundation doesn't depend on Project, the same way that at design-time ClientCode() doesn't depend on SqlConnection. At design-time ClientCode() only knows about IDbConnection.

From this, one infers that dependencies should go from Project towards Foundation not only at design time but also at runtime. In other words, one feels that implementing an adapter in Project and injecting it into a Foundation component “is not so stable and reliable”, but that's not how those principles are meant and described.

As a matter of fact, being able to inject Project adapters into Foundation components (where the Project adapter implements one of Foundation’s interfaces) is the core of the Dependency Inversion Principle (DIP). Without it, it is impossible to create flexible systems. This can be unreliable, but only if one doesn’t follow the Liskov Substitution Principle (LSP), when Foundation interfaces are not well thought-out, like for example when ICollection<T> forces the Array class to implement ICollection<T>.Add() which obviously cannot be supported.

Conclusion

Hence it looks like this feature request results from a confusion between design-time dependencies and run-time dependencies. The purpose of design is to reduce the cost of change; anything else is not design. Managing design-time dependencies is at the heart of design. All SOLID principles, and all principles surrounding SOLID like the Stable Dependency Principle (SDP), are about design-time dependencies.

Moreover, adding DI support would be a never-ending story, and I suspect it cannot be done right without information collected at run-time. DI containers are very complex beasts internally, their behavior often changes from one major version to the next, and there are dozens of DI containers in .NET (Autofac, UnityContainer, NInject, StructureMap, MEF, SimpleInjector, Castle.Windsor, LightInject, Prism, Spring.NET… just to name a few).

Again, all our developments are driven by user needs, but for this particular request we're not sure it would make sense. We leave the discussion open on this post and on the User Voices page. Whether you agree or disagree with our position, feel free to comment.

 

 

SOLID Design: The Dependency Inversion Principle (DIP)

After having covered the Open-Closed Principle (OCP), the Liskov Substitution Principle (LSP), the Single Responsibility Principle (SRP) and the Interface Segregation Principle (ISP), let's talk about the Dependency Inversion Principle (DIP), which is the D in the SOLID acronym. The DIP definition is:

a. High-level modules should not depend on low-level modules. Both should depend on abstractions.
b. Abstractions should not depend on details (concrete implementation). Details should depend on abstractions.

The DIP was introduced in the 1990s by Robert C. Martin. Here is the original article.

A Dependency is a Risk

As with all SOLID principles, DIP is about system maintainability and reusability. Inevitably some parts of the system will evolve and be modified. We want a design that is resilient to change. To avoid a change breaking too much, we must:

  • first, identify the parts of the code that are change-prone;
  • second, avoid dependencies on those change-prone code portions.

The Liskov Substitution Principle (LSP) and the Interface Segregation Principle (ISP) articles explain that interfaces must be carefully thought out. Both principles are two sides of the same coin:

  • ISP is the client perspective: if an interface is too fat, the client probably sees behaviors it doesn't care about.
  • LSP is the implementer perspective: if an interface is too fat, a class that implements it probably won't implement all of its behaviors. Some behaviors will end up throwing something like a NotSupportedException.

Efforts put into applying ISP and LSP result in interface stability. As a consequence these well-designed interfaces are less subject to change than the concrete classes that implement them.

Also having stable interfaces results in improved reusability. I am pretty confident that the interface IDisposable will never change. My classes can safely implement it and this interface is re-used all over the world.

In this context, the DIP states that depending on interfaces is less risky than depending on concrete implementations. DIP is about transforming this code:
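(The original snippet was a screenshot; this sketch shows the idea.)

```csharp
using System.Data.SqlClient;

// Before: high-level code depends directly on the concrete SqlConnection class.
static void ClientCode()
{
    using (var connection = new SqlConnection("connection string here"))
    {
        connection.Open();
        // ... use the connection ...
    }
}
```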

into this code:
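(Again a sketch of the original screenshot.)

```csharp
using System.Data;

// After: high-level code depends only on the IDbConnection abstraction;
// the concrete implementation (SqlConnection, a MySQL connection, a test fake...) is injected.
static void ClientCode(IDbConnection connection)
{
    connection.Open();
    // ... use the connection ...
}
```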

DIP is about removing dependencies from high-level code (like the ClientCode() method) to low-level code, low-level code being implementation details like the SqlConnection class. For that we create interfaces like IDbConnection. Then both high-level code and low-level code depend on these interfaces. The key is that SqlConnection is no longer visible from ClientCode(). This way the client code won't be impacted by implementation changes, like replacing the SQL Server RDBMS implementation with MySQL for example.

Let's underline that this minimal code sample doesn't do justice to the word Inversion in the DIP acronym. The inversion concerns the introduced interfaces (consumed by high-level code) and the implementation details: the implementation details depend on the interfaces, not the opposite. That is the inversion.

DIP and Dependency Injection (DI)

The acronym DI is used for Dependency Injection and, since it is almost the same as the DIP acronym, this provokes confusion. The I stands for Inversion in one and Injection in the other, which may add to the confusion. Fortunately, DI and DIP are very much related.

  • DIP states that classes that implement interfaces are not visible to the client code.
  • DI is about binding classes behind the interfaces consumed by client code.

DI means that some code, external to client code, configures which classes will be used at runtime by the client code. This is simple DI:
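A sketch of such external configuration code, following on from the previous snippets:

```csharp
// The composition root, external to ClientCode(), decides which implementation is used at runtime.
static void Main()
{
    IDbConnection connection = new SqlConnection("connection string here");
    ClientCode(connection);   // ClientCode() never sees the SqlConnection type
}
```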

Many .NET DI frameworks exist to offer flexibility in binding classes behind interfaces. Those frameworks are based on reflection and thus, they offer some kind of magic. The syntax looks like:
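The exact API differs per container; here is a sketch in the spirit of most of them:

```csharp
// Registration by reflection: the container binds the concrete class behind the interface.
var container = new Container();
container.Register<IDbConnection, SqlConnection>();
```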

And then comes what is called a Service Locator. The client can use the locator to create instances of the concrete type without knowing it. It is like invoking a constructor on an interface:
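A sketch of service-locator usage (the method name varies per container: Resolve, GetInstance, GetService…):

```csharp
// "Invoking a constructor on an interface": the locator returns the concrete instance bound above.
IDbConnection connection = container.GetInstance<IDbConnection>();
```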

Thus while DIP is about maintainable and reusable design, DI is about flexible design. Both are very much related. Let’s notice that the flexibility obtained from DI is especially useful for testing purposes. Being DIP compliant improves the testability of the code:
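For example, a sketch of a unit test where a hand-written fake (a hypothetical FakeDbConnection class) is injected instead of a real database:

```csharp
[Test]
public void ClientCode_opens_the_connection()
{
    var fake = new FakeDbConnection();   // hypothetical in-memory implementation of IDbConnection
    ClientCode(fake);
    Assert.IsTrue(fake.WasOpened);       // no SQL Server needed to test the client logic
}
```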

DIP and Inversion of Control (IoC)

The Inversion word is used both in DIP and IoC acronyms. This provokes confusion. Remember that the word Inversion in the DIP acronym is about implementation details depending on interfaces, not the opposite. The Inversion word in the IoC acronym is about calls to Library transformed into callbacks from Framework.

IoC is what differentiates a Framework from a Library. A library is typically a collection of functions and classes. On the other hand, a framework also offers reusable classes but relies massively on callbacks. For example, UI frameworks offer many callback points through graphical events:
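For instance, a sketch using the WinForms Click event:

```csharp
// The framework calls back m_ButtonOnClick() when the user clicks: control is inverted.
m_Button.Click += m_ButtonOnClick;

void m_ButtonOnClick(object sender, EventArgs e)
{
    // client code executed on the framework's initiative
}
```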

The method m_ButtonOnClick() bound to the Button.OnClick event is a callback method. Instead of client code calling a framework method, the framework is responsible for calling back client code. This is an inversion in the control flow.

We can see that IoC is not related to DIP. However, Dependency Injection can be seen as a specialization of IoC: DI is IoC used specifically to manage dependencies.

DIP and the Level metric

Several code metrics can be used to measure, and thus constrain, the usage of DIP. One of these metrics is the Level metric, defined as follows: a code element that uses no other application element has a Level of 0; otherwise its Level is 1 plus the highest Level of the elements it uses; the metric is left undefined for elements involved in a dependency cycle.

From this definition we can infer that:

  • The Level metric is not defined for components involved in a dependency cycle. As a consequence, null values can help track component dependency cycles.
  • The Level metric is defined for any dependency graph. Thus a Level value can be computed at various granularities: methods, types, namespaces, assemblies.

DIP mostly states that types with Level 0 must be interfaces and enumerations (note that interfaces using other interfaces have a Level value higher than 0). If we say that a component is a group of types (like a namespace or an assembly), the DIP states that components with Level 0 should contain mostly interfaces and enumerations. With a quick code query like this one you can have a glance at type Levels and check whether most low-level types are interfaces:
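A sketch of such a query (CQLinq member names like Level, IsInterface and IsEnumeration assumed from the standard NDepend code model):

```csharp
from t in Application.Types
where t.Level != null
orderby t.Level
select new { t, t.Level, t.IsInterface, t.IsEnumeration }
```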

The Level metric can also be used to track classes with high Level values: it is a good indication that some interfaces must be introduced to break the long chain of concrete code calls:
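A sketch of a rule flagging such classes (the threshold of 6 is arbitrary):

```csharp
warnif count > 0
from t in Application.Types
where t.IsClass && t.Level > 6
orderby t.Level descending
select new { t, t.Level }
```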

The class Program has a Level of 8 and if we look at the dependency graphs of types used from Program we can certainly see opportunities to introduce abstractions to be more DIP compliant:

DIP and the Abstractness vs. Instability Graph

Robert C. Martin not only coined the DIP but also proposed some code metrics to measure the DIP compliance. See these metrics definitions here. From these metrics an intriguing Abstractness vs. Instability diagram can be plotted. Here we plotted the 3 assemblies of the OSS eShopOnWeb application. This diagram has been obtained from an NDepend report:

  • The Abstractness metric is normalized: it takes its values in the range [0,1]. It measures the interfaces/classes ratio (1 means the assembly contains only interfaces and enumerations).
  • The Instability metric is normalized and measures the assembly's resilience to change. In this context, being stable means that a lot of code depends on you (which is wrong for a concrete class and fine for an interface) and being unstable means the opposite: not much code depends on you (which is fine for a concrete class and wrong for an interface; a poorly used interface is potentially a waste of design effort).
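For reference, the standard definitions behind these metrics (from Robert C. Martin's metrics paper) are: Abstractness A = number of abstract types / total number of types; Instability I = Ce / (Ca + Ce), where Ce is the efferent (outgoing) coupling and Ca the afferent (incoming) coupling; and Distance from the Main Sequence D = |A + I − 1|.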

This diagram shows a balance between the two metrics and defines some green/orange/red zones:

  • A dot in the red Zone of Pain means that the assembly is mostly concrete and used a lot. This is a pain because all those concrete classes will likely undergo a lot of changes and each change will potentially impact a lot of code. An example of a class living in the Zone of Pain would be the String class. It is massively used but it is concrete: if a change occurred today in the String class the entire world would be impacted. Fortunately we can count on the String implementation to be both performant and bug-free.
  • A dot in the red Zone of Uselessness means that the assembly contains mostly interfaces and enumerations and is not much used. This makes these abstractions useless.
  • The Green zone revolves around the Main Sequence line. This line represents the right balance between both metrics. Containing mostly interfaces and being used a lot is fine. Containing mostly classes and not being used much is fine. And then come all the intermediate, well-balanced values between these two extremes, represented by the Main Sequence line. The Distance from Main Sequence metric is normalized and measures this balance. A value close to 0 means that the dot is near the line, in the green zone, and that the DIP is respected.

Conclusion

Like the Open-Closed Principle (OCP), the Liskov Substitution Principle (LSP) and the Interface Segregation Principle (ISP), the DIP is a key principle to wisely harness the OOP abstraction and polymorphism concepts in order to improve the maintainability, reusability and testability of your code. No principle is an island (except maybe the Single Responsibility Principle (SRP)) and they must be applied hand in hand.

This article concludes this series of SOLID posts. Being aware of SOLID principles is not enough: they must be kept in mind during every design decision. But they must also be constrained by the KISS principle (Keep It Simple Stupid) because, as we explained in the post Are SOLID principles Cargo Cult?, it is easy to write entangled code in the name of SOLID principles. Then one can learn from experience. With the years, identifying the right abstractions and properly partitioning business needs into well-balanced classes becomes natural.

Are SOLID principles Cargo Cult?

My last post about SOLID Design: The Single Responsibility Principle (SRP) generated some discussion on reddit. The discussion originated from a remark considering SOLID principles as a cargo cult. Taking into account the definition of cargo cult, the metaphor is a bit provocative but it is not unfounded.

cargo cult is a belief system among members of a relatively undeveloped society in which adherents practice superstitious rituals hoping to bring modern goods supplied by a more technologically advanced society

The recent Boeing 737 Max fiasco revealed that some parts of its software had been outsourced to $9-an-hour engineers. Those engineers shouldn't be blamed for not achieving top-notch software given the budget. Nevertheless it is clear that a lot of software written nowadays looks like this cargo cult plane. For many real-world developers, SOLID principles are superstitious rituals whose primary goal is to succeed at job interviews.

The SRP article underlines that SRP is the only SOLID principle not related to the usage of abstraction and polymorphism. SRP is about partitioning logic into code: which logic should be declared in which class. But, judging from its two definitions, SRP is so vague that it is practically useless.

Definition 1: A class should have a single responsibility and this responsibility should be entirely encapsulated by the class.

Definition 2: A class should have one reason to change.

One can justify any class design choice by tweaking somehow what is a responsibility or what is a reason to change. In other words, as someone wrote in a comment: Most people who “practice” it don’t actually know what it means and use it as an excuse to do whatever the hell they were going to do anyways.

We can feel bitterness in those comments, certainly coming from seasoned developers whose job is to fix mistakes of $9 an hour engineers.

SOLID Principles vs. OOP Patterns

We must remember that SOLID principles emerged in the 80s and 90s from the work of world-class OOP experts like Robert C. Martin (Uncle Bob) and Bertrand Meyer. Software writing is often considered an art. Terms such as clean code or beautiful code are widely used. But art is a subjective activity. In this context, SOLID principles necessarily remain vague and subject to interpretation. And this is what makes the difference between a SOLID principle and an OOP pattern:

  • A SOLID Principle is subjective. It helps to guide the usage of powerful concepts of Object Oriented Programming (OOP).
  • An OOP Pattern is objective. It is a set of recipes to implement a well identified situation with the OOP concepts.

Despite a restrained number of keywords and operators, the OOP toolbelt of languages such as C# or Java is very rich. With a few dozen characters it is possible to write code that puzzles experts. C# especially gets richer and richer, with much syntactic sugar to express complex situations in just a few characters. This power is a double-edged sword: seasoned developers can write neat and compact code, but on the other hand it is easy to misuse this power, especially for junior developers and all those who write code just to pay their bills.

Always keep in mind the KISS principle

Someone wrote in comments: “SOLID encourages abstraction, and abstraction increases complexity. It’s not always worth it, but it’s always presented as the non-plus ultra of good approaches.”

The only justification for abstraction in OOP is to simplify the implementation of complex business rules.

  • Abstracting Circle, Rectangle and Triangle with an IShape interface will dramatically simplify the implementation of shape-drawing software.
  • On the other hand, creating an interface for each class is a waste of resources: not every concept in your program deserves an abstraction.

This is why the Keep It Simple Stupid (KISS) principle should always be kept in mind: don't add extra implementation complexity on top of the business complexity.

SOLID and Static Analysis

I have been in the .NET static analysis industry since 2004. At that time I was consulting for large companies with massive legacy apps that were very costly to maintain. Books like Robert Martin’s Agile Principles, Patterns, and Practices made me realize that the source code is data. This data can be measured with code metrics. And the same way relational data can be crawled with SQL queries, code as data can be crawled with code queries. For example:
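A sketch of such a CQLinq query (the thresholds and the exact metric names are assumptions):

```csharp
// Complex methods that are not fully covered by tests.
warnif count > 0
from m in Application.Methods
where m.CyclomaticComplexity > 10 &&
      m.PercentageCoverage < 100
orderby m.CyclomaticComplexity descending
select new { m, m.CyclomaticComplexity, m.PercentageCoverage }
```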

This query will objectively match complex methods not fully covered by tests. There are situations where one can argue that static analysis returns false positives but there is no justification for complex methods not well tested.

Not all aspects of the SOLID principles can be objectively measured and verified. However, static analysis can help bring objectivity. For example:
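For instance, a sketch of a rule that adds objectivity to the Interface Segregation Principle by flagging fat interfaces (the threshold of 10 is arbitrary):

```csharp
warnif count > 0
from i in Application.Types
where i.IsInterface && i.Methods.Count() > 10
select new { i, NbMethods = i.Methods.Count() }
```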

SOLID and Testability

Regularly applying such rules will avoid taking SOLID too far to the point it becomes detrimental. However there are still all those aspects of SOLID, and code design in general, that must be left to creativity and interpretation. Experience in software development helps a lot here: over the years one refines his/her gut feeling about which design will increase flexibility and maintainability.

By definition junior developers have no experience. However, anyone can relentlessly strive for 100% code coverage by tests. Being able to fully cover your code means, by definition, that your code is testable. Testability doesn't come by chance. The properties that lead to full testability are the same properties that lead to high maintainability. Those properties include:

  • Easy-to-use APIs
  • Well-isolated domain classes
  • Careful mapping of logic to classes
  • Short classes and short methods
  • Cohesive classes
  • Abstractions and polymorphism used judiciously
  • Careful management of state mutability

Advice to add objectivity when applying SOLID principles

Not everyone is a senior developer with a passion for well-designed code. As a consequence, cargo cult usage of SOLID principles is common. To improve the design, some objectivity needs to be added to the development process. Here are my 3 pieces of advice for that:

  • KISS principle first, always strive for simplicity: if it is complicated it is not SOLID.
  • Use static analysis to automatically monitor some measurable aspects of SOLID. Gross violations of code quality rules and metrics are also violations of the SOLID principles.
  • Refactor your code until it becomes seamlessly 100% coverable by tests. Code that cannot be easily 100% covered by tests is not SOLID.

 

 

SOLID Design: The Single Responsibility Principle (SRP)

After having covered the Open-Closed Principle (OCP) and the Liskov Substitution Principle (LSP), let’s talk about the Single Responsibility Principle (SRP), which is the S in the SOLID acronym. The SRP definition is:

A class should have a single responsibility and this responsibility should be entirely encapsulated by the class.

This leads to the question: what is a responsibility in software design? There is no trivial answer, which is why Robert C. Martin (Uncle Bob) restated the SRP this way:

A class should have one reason to change.

Which leads to another question: what is a reason to change?

SRP is a principle that cannot be easily inferred from its definition. Moreover, SRP leaves a lot of room for opinion and interpretation. So what is SRP about? SRP is about partitioning logic into code: which logic should be declared in which class. Something to keep in mind is that SRP is the only SOLID principle not related to the usage of abstraction and polymorphism.

The goal of this post is to propose objective and concrete guidelines to increase your classes’ compliance with SRP and, ultimately, the maintainability of your code.

SRP and Concerns

The ActiveRecord pattern is often used to exhibit a typical SRP violation. An ActiveRecord class has two responsibilities:

  • First, an ActiveRecord object stores in memory data retrieved from a relational database.
  • Second, the record is active in the sense that the in-memory data and the data in the relational database are kept mirrored. For that, the CRUD (Create Read Update Delete) operations are implemented by the ActiveRecord class.

To make things concrete, an ActiveRecord class can look like this:
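A minimal C# sketch, with illustrative member names and persistence details:

    // Sketch: domain data and persistence logic (CRUD) live in the same class.
    public class Employee
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public decimal Salary { get; set; }

        // Persistence responsibilities mixed into the domain class:
        public static Employee Read(int id) { /* SELECT ... FROM Employee */ return new Employee { Id = id }; }
        public void Create() { /* INSERT INTO Employee ... */ }
        public void Update() { /* UPDATE Employee SET ... */ }
        public void Delete() { /* DELETE FROM Employee ... */ }
    }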

If Employee were a POCO class that didn’t know about persistence, and if persistence were handled in a dedicated persistence layer, the API would be improved because:

  • Not all Employee consumers want to deal with persistence.
  • More importantly, an Employee consumer really needs to know when an expensive DB round trip is triggered: if the Employee class is responsible for persistence, who knows whether the data is persisted as soon as a setter is invoked?

Hence it is better to isolate persistence-layer accesses and make them more explicit. This is why at NDepend we promote rules like UI layer shouldn’t use directly DB types, which can easily be adapted to enforce any sort of code isolation.
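A minimal sketch of such a separation, with a hypothetical IEmployeeRepository sitting in a dedicated persistence layer:

    // Sketch: Employee is a POCO; DB round trips are explicit and isolated
    // behind a dedicated abstraction (names are illustrative).
    public class Employee
    {
        public int Id { get; set; }
        public string Name { get; set; }
        public decimal Salary { get; set; }
    }

    public interface IEmployeeRepository
    {
        Employee Load(int id);          // the DB round trip is explicit for the caller
        void Save(Employee employee);
        void Delete(int id);
    }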

Persistence is what we call a cross-cutting concern: an aspect of the implementation that tends to spread all over the code. We can expect most domain objects to be concerned with persistence. Other cross-cutting concerns we want to keep domain objects separated from include: validation, logging, authentication, error handling, threading and caching. The need to separate domain entities from those cross-cutting concerns can be handled by an OOP pattern like the Decorator pattern, for example. Alternatively, some Object-Relational Mapping (ORM) frameworks and some Aspect-Oriented Programming (AOP) frameworks can be used.

SRP and Reason to Change

Let’s consider this version of Employee:
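A minimal sketch (the formulas and signatures are illustrative):

    // Sketch: two behaviors owned by different people end up in the same class.
    public class Employee
    {
        public string Name { get; set; }
        public decimal HourlyRate { get; set; }

        // Controlled by the finance people
        public decimal ComputePay(int hoursWorked) => hoursWorked * HourlyRate;

        // Controlled by the operational people
        public string ReportHours(int hoursWorked) => $"{Name} worked {hoursWorked}h";
    }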

The ComputePay() behavior is under the responsibility of the finance people, while the ReportHours() behavior is under the responsibility of the operational people. Hence if a finance person needs a change to be implemented in ComputePay(), we can assume this change won’t affect the ReportHours() method. Thus, according to the version of SRP that states “a class should have one reason to change”, it is wise to declare these methods in different dedicated modules. As a consequence, a change in ComputePay() has no risk of affecting the behavior of ReportHours(), and vice versa. In other words, we want these two parts of the code to be independent because they will evolve independently.

This is why Robert C. Martin wrote that SRP is about people: make sure that logic controlled by different people is implemented in different modules.

SRP and High-Cohesion

SRP is about encapsulating logic and data in a class because they fit well together. Fitting well means that the class is cohesive, in the sense that most methods use most fields. Cohesion of a class can actually be measured with the Lack of Cohesion Of Methods (LCOM) metric; this great Stuart Celarier placemat explains it well. What matters is to understand that if all methods of a class use all instance fields, the class is considered utterly cohesive and gets the best LCOM score, which is 0 or close to 0.

Typically, the effect of an SRP violation is to partition a class’s methods and fields into groups with few connections. The fields needed to compute the pay of an employee are not the same as the fields needed to report pending work. This is why the LCOM metric can be used to measure adherence to SRP and take action. You can use the rule Avoid types with poor cohesion to track classes with poor cohesion between methods and fields.
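A minimal CQLinq sketch in the spirit of that rule (the thresholds are illustrative, not the exact defaults):

    // Sketch: flag large classes whose methods share few fields (poor cohesion).
    warnif count > 0
    from t in Application.Types
    where t.LCOM > 0.8 &&      // 0 = utterly cohesive; values close to 1 = poor cohesion
          t.NbMethods > 10 &&
          t.NbFields > 10
    select new { t, t.LCOM, t.NbMethods, t.NbFields }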

SRP and Fat Code Smells

While we can hardly find an easy definition of what a responsibility is, we have noticed that adhering to SRP usually results in classes with a good LCOM score. On the other hand, not adhering to SRP usually leads to the God class phenomenon: a class that knows too much and does too much. Such a god class is usually too large: violations of rules like Avoid types too big, Avoid types with too many methods and Avoid types with too many fields are good candidates for spotting god classes and refactoring them into a finer-grained design.

Guidelines to adhere to SRP

Here are a set of objective and concrete guidelines to adhere to SRP:

  • Domain classes must be isolated from cross-cutting concerns: code responsible for persistence, validation, logging, authentication, error handling, threading, caching…
  • When implementing your domain, favor POCO classes that do not have any dependency on an external framework. Note that a POCO class is not necessarily a fields-and-properties-only class; it can implement logic/behavior related to its data.
  • Use your understanding of the domain to partition code: logic related to different business functions should be kept separate to avoid interference.
  • Regularly check the Lack of Cohesion Of Methods (LCOM) score of your classes.
  • Regularly check for too large and too complex classes.

 

SOLID Design: The Liskov Substitution Principle (LSP)

The Liskov substitution principle is the L in the well known SOLID acronym. The original principle definition is:

Methods that use references to base classes must be able to use objects of derived classes without knowing it.

At first glance this principle is pretty easy to understand. At second glance it seems redundant with the OOP concept of polymorphism. After all, the whole point of polymorphism is to consume an abstraction without knowing the implementation behind it, isn’t it?

However, it is a good thing that the community emphasizes the Liskov substitution principle. This principle is in fact a caveat for developers that polymorphism is both powerful and tricky: in the real world, the usage of polymorphism often leads to dead-end situations, so it must be used wisely.

LSP is often summarized with a counter-example to the Duck Test: “If it looks like a duck, quacks like a duck, but needs batteries – you probably have the wrong abstraction.”

Let’s detail some common polymorphism pitfalls that the Liskov substitution principle attempts to prevent by reminding the developer to adopt the client’s perspective.

Prevent situations where a method cannot be implemented

When a class implements an interface or derives from a base class, refactoring tools like the Visual Studio refactorings, ReSharper or CodeRush propose to insert stubs for the abstract methods to implement. Typically the default body of such an inserted method is throw new NotImplementedException();.

Obviously this behavior must remain temporary and must not be released to production. Client code that holds a reference to the interface or the base class doesn’t expect a NotImplementedException to be raised upon a method call. This is why NDepend has a default rule named Do implement methods that throw NotImplementedException to prevent such situations.

On the other hand, if it doesn’t make sense to implement an abstract method, it clearly means that the design is wrong. Here is such a wrong design, assuming that all birds can fly:
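A minimal sketch of this wrong design:

    // Sketch: the base class wrongly assumes that all birds can fly.
    public abstract class Bird
    {
        public abstract void Fly();
    }

    public class Eagle : Bird
    {
        public override void Fly() { /* soar */ }
    }

    public class Penguin : Bird
    {
        // A penguin cannot fly: throwing is the only option left.
        public override void Fly() { throw new System.NotImplementedException(); }
    }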

This code could then be refactored to:
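One possible refactoring, sketched below, moves Fly() to a finer-grained abstraction:

    // Sketch: only birds that can actually fly expose Fly().
    public abstract class Bird { }

    public abstract class FlyingBird : Bird
    {
        public abstract void Fly();
    }

    public class Eagle : FlyingBird
    {
        public override void Fly() { /* soar */ }
    }

    public class Penguin : Bird
    {
        // Nothing to fake: Penguin no longer has to pretend it can fly.
    }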

The problem was that Bird with its Fly() method was too coarse; we needed some refinement because not all birds can fly. From my experience, such wrong assumptions about interfaces and base classes happen quite often in the real world. When you stumble upon such a situation, see it as a good starting point for refactoring… if possible. Indeed, sometimes refactoring is not an option because many clients already depend on the wrong design.

Example of an LSP violation in the .NET framework design

One dreaded LSP violation is .NET System.Array implementing the ICollection<T> interface. Hence Array has to implement the ICollection<T>.Add() method, but calling this method on an array throws a NotSupportedException at runtime:
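Here is a minimal illustration:

    // The array is legally seen through ICollection<int>, yet Add() is not supported.
    int[] array = new int[3];
    System.Collections.Generic.ICollection<int> collection = array;
    collection.Add(4);   // compiles fine, throws NotSupportedException at runtime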

The C# compiler doesn’t even warn about such a simple erroneous program.

Of course we’d need to ensure that ICollection<T>.IsReadOnly is false before modifying a collection through an IList<T> or ICollection<T> reference, but frankly this is error-prone design. I can remember having stumbled upon this situation quite a few times during the 15 years I have been programming with C#/.NET.

Moreover, refactoring this original design mistake is no longer an option, not even when .NET Core was introduced, since millions of programs rely on this API. IReadOnlyCollection<T> was introduced with .NET 4.5 in 2012, but the original design cannot be changed.

Think twice before applying the is-a trick

Another common example used to explain the Liskov substitution principle is the Rectangle/Square paradigm. A square is-a rectangle, isn’t it? So we should be able to write code like this:
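A minimal sketch of this tempting, but flawed, design:

    // Sketch: Square "is-a" Rectangle, so it overrides the setters to stay square.
    public class Rectangle
    {
        public virtual int Width  { get; set; }
        public virtual int Height { get; set; }
        public int Area => Width * Height;
    }

    public class Square : Rectangle
    {
        public override int Width
        {
            get => base.Width;
            set { base.Width = value; base.Height = value; }
        }
        public override int Height
        {
            get => base.Height;
            set { base.Width = value; base.Height = value; }
        }
    }

    // Client code written against Rectangle is broken by Square:
    //   Rectangle r = new Square();
    //   r.Width = 2; r.Height = 5;
    //   // the client expects an area of 10, but gets 25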

Clearly this is a wrong usage of the is-a principle: yes, a square is a rectangle, but the client doesn’t expect the width to change when modifying the height of the rectangle.

Here also, such wrong design emerges quite often in the real world. For example, when dealing with a complex control hierarchy with dozens of fields to maintain at various levels, it can become quite tricky to maintain the coherence of your objects’ states at runtime.

Code Contracts to help?

In theory, code contracts could help here. A contract is one or several assertions that prevent corrupted states. When a corrupted state is reached at runtime (whether in production or at test run time), such an assertion must fail loudly, because continuing to run with a corrupted state can potentially lead to more corrupted states.

Often contracts are used at the interface level to improve the semantics by adding constraints to the classes that implement the interface. We might want to force all implementations of IRectangle.set_Width to leave the Height value untouched. With Microsoft Code Contracts we could try to write something like this:
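The attempt could look like the sketch below; note that Contract.PreserveValue() is hypothetical and does not exist in the Microsoft Code Contracts API:

    using System.Diagnostics.Contracts;

    [ContractClass(typeof(RectangleContract))]
    public interface IRectangle
    {
        int Width  { get; set; }
        int Height { get; set; }
    }

    [ContractClassFor(typeof(IRectangle))]
    internal abstract class RectangleContract : IRectangle
    {
        public int Width
        {
            get { return default(int); }
            set
            {
                // Hypothetical contract: setting Width must leave Height untouched.
                // Contract.PreserveValue(this.Height);   // no such API exists
            }
        }

        public int Height
        {
            get { return default(int); }
            set
            {
                // Symmetric hypothetical contract for Height.
                // Contract.PreserveValue(this.Width);
            }
        }
    }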

Unfortunately, as far as I know, Microsoft Code Contracts has no support for such a PreserveValue() capability. More generally, it seems that Microsoft Code Contracts doesn’t receive much love nowadays, despite how useful code contracts can be. Granted, C# 8 non-nullable references address many of the situations usually verified through contracts or assertions, but not all contracts are about nullable references, as this example suggests.

Use polymorphism with great caution

These classical Bird and Rectangle examples show well how polymorphism and inheritance can quickly lead to rotten design, even in apparently simple situations. The situation of the Array class implementing ICollection<T> shows that in practice LSP violations do happen.

From my point of view, what the Liskov substitution principle really says is that when writing an API relying on polymorphism, you should first take the point of view of the client of your API before writing any interface or any class hierarchy.

  • Can all birds really fly? What happens if I try to call Fly() on a bird that cannot fly?
  • Is a square really a rectangle? What happens if I change the width of a square?

In the real world this looks like:

  • Can all collections really be modified? What happens if I add an element to, or remove an element from, an array?
  • Are all controls scrollable? What happens if a scrollbar is displayed on a control that should not scroll?
  • Does withdrawal apply to all bank accounts? What happens if we try to withdraw money from a locked long-term deposit account? Should the withdrawal fail badly in this situation, or should we prevent it with an IAccountThatAuthorizeWithdraw abstraction?

A pattern emerges here: for each member of your interface you should ask yourself: does this member apply seamlessly to all objects that will implement this interface?

Notice that this API-writing principle is more general; it doesn’t only apply to polymorphism: when writing an API, first take the point of view of the client of your API. This is the only way to achieve an elegant API that clients will love to consume. This is another way to explain Test-Driven Development (TDD), where client code must be written first, for test and design purposes, before writing the code itself.

In a previous post I explained why your classes should be declared as sealed when possible and why NDepend proposes the default rule Class with no descendant should be sealed if possible. In the real world, a class is never well designed for inheritance by chance and thus should be sealed by default. Designing a class well for inheritance requires quite an extensive understanding of your domain. As with everything in life, it requires a lot of effort to build something that others will find easy to use.