NDepend

Improve your .NET code quality with NDepend

Top 10 Visual Studio Refactoring Tips

With version 2019, Visual Studio is now mature when it comes to refactoring. This post proposes a tour of what are, in my opinion, the top 10 most useful refactoring actions.

Short GIFs are an excellent way to get started with these Visual Studio tips. See other related posts also based on short GIFs here:

1) Renaming an Identifier

With Ctrl+R,R you can rename any code identifier: a variable, a field, a class… The renaming experience is pretty clean when only one source file is concerned, since all references in the file get updated live while typing the new name:

Renaming a variable in Visual Studio

When renaming an identifier impacts several source files, like when renaming a class, several UI details improve the user experience:

  • In the top-right panel we see: Renaming X references in Y files.
  • In the top-right panel we get the checkbox Rename file, which is quite useful since the file name and the class name are often in sync.
  • We get the possibility to preview all changes in all files in a Preview Changes window.
Renaming a class in Visual Studio

2) Extract Method

The hotkey Ctrl+. triggers the Quick Actions and Refactorings menu. If you are not yet used to Ctrl+., I’d advise training on this one especially, because it is a powerful shortcut. You’ll see that most of the other refactorings presented here rely on the Ctrl+. hotkey.

If one or several statements are selected in a method, the Extract method and Extract local function menus are proposed. A large tooltip is shown immediately to preview the change. Then just press Enter and finish the refactoring action by naming the NewMethod identifier.

Extract Method with Visual Studio
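Here is a minimal before/after sketch of what the action does (the type and method names are hypothetical, and the code Visual Studio generates may differ slightly):

    using System.Collections.Generic;

    public static class InvoiceBefore
    {
        // The two summation lines are selected, then Ctrl+. > Extract method.
        public static decimal ComputeTotal(IEnumerable<decimal> prices)
        {
            decimal total = 0;
            foreach (var price in prices) { total += price; }
            return total * 1.2m; // add VAT
        }
    }

    public static class InvoiceAfter
    {
        // Visual Studio generated a NewMethod that we immediately renamed to SumPrices.
        public static decimal ComputeTotal(IEnumerable<decimal> prices)
        {
            decimal total = SumPrices(prices);
            return total * 1.2m; // add VAT
        }

        private static decimal SumPrices(IEnumerable<decimal> prices)
        {
            decimal total = 0;
            foreach (var price in prices) { total += price; }
            return total;
        }
    }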

3) Remove and Sort Usings

To make your code cleaner it is recommended to keep, in each source file, the list of usings ordered alphabetically with unnecessary usings removed. Both actions can be done automatically with Visual Studio top menu > Edit > IntelliSense > Remove and Sort Usings. I wish a default shortcut were assigned to this common refactoring (of course you can still assign a shortcut to this action from Visual Studio top menu > Tools > Options > Keyboard).

Remove and Sort Usings with Visual Studio

Actually it is possible to remove all unnecessary usings at once in a Visual Studio project or solution, thanks to the light bulb that appears in the code editor gutter when selecting an unnecessary using (shown grayed out).

Remove unnecessary usings with Visual Studio

4) Add Missing Usings when Pasting

When pasting some code it is quite irritating to get errors because of missing usings. Once the code has been pasted, you can press Ctrl+. to ask Visual Studio to add the missing usings for you:

Add missing usings after pasting with Visual Studio

5) Generate Property from Constructor and Generate Constructor from Properties

Once again the magical Ctrl+. hotkey can be used, when selecting a parameter in a constructor signature, to generate the corresponding property.

However I haven’t found a way to generate several properties at once. I’d expect that selecting several parameters and then pressing Ctrl+. would propose to generate them all in a row, but it is not the case, and there is no Create properties for all parameters menu.
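For a single parameter, the result looks roughly like the following sketch (hypothetical Customer class; the exact generated code may differ):

    public class CustomerBefore
    {
        // Caret on the 'name' parameter, then Ctrl+. to generate the property.
        public CustomerBefore(string name)
        {
        }
    }

    public class CustomerAfter
    {
        public CustomerAfter(string name)
        {
            Name = name;
        }

        // The generated property, initialized from the constructor parameter.
        public string Name { get; }
    }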

Generate property from constructor with Visual Studio

In the same way you can generate a constructor from the selected properties.

Generate constructor from selected properties with Visual Studio

Both quick actions also work with fields.

6) foreach to LINQ

When the editor caret is over a foreach loop, the hotkey Ctrl+. proposes to convert the foreach loop to a LINQ query. This conversion is not always possible but it is quite smart. For example it can convert an if(condition){ continue; } within the loop into a where LINQ clause.

Notice that loops are typically faster than LINQ queries, because the compiler can optimize loops while LINQ queries rely extensively on method calls. But in non-performance-critical code regions, LINQ queries are often a more readable, concise and maintainable way of writing code.
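Here is a minimal sketch of the kind of transformation described, with a hypothetical Customer class (the exact query Visual Studio generates may differ):

    using System.Collections.Generic;
    using System.Linq;

    public class Customer
    {
        public string Name { get; set; }
        public bool IsActive { get; set; }
    }

    public static class Report
    {
        // Before: a foreach loop with an if(condition){ continue; } filter.
        public static List<string> ActiveNamesLoop(IEnumerable<Customer> customers)
        {
            var names = new List<string>();
            foreach (var customer in customers)
            {
                if (!customer.IsActive) { continue; }
                names.Add(customer.Name);
            }
            return names;
        }

        // After "Convert to LINQ": the continue statement became a where clause.
        public static List<string> ActiveNamesLinq(IEnumerable<Customer> customers)
        {
            return (from customer in customers
                    where customer.IsActive
                    select customer.Name).ToList();
        }
    }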

Converting a foreach loop into a LINQ query with Visual Studio

7) Extract Interface

When the editor caret is over a class name, the hotkey Ctrl+. proposes a quick-actions list that includes: extract an interface from the class members. Ctrl+R,I can be used instead to directly show the Extract Interface dialog. The Extract Interface dialog proposes to define the interface name (set by default to “I{class name}”) and the list of members to include in the interface.

Extracting an interface from a class with Visual Studio

8) Move Type to Namespace

When the editor caret is over a type name, the hotkey Ctrl+. proposes a quick-actions list that includes: move to namespace. The Move to namespace dialog is smart enough to propose IntelliSense and auto-completion based on existing namespaces. However the type’s source file is not moved automatically to the folder corresponding to the chosen namespace; this must be done manually in the Solution Explorer. I hope this move will be automated in the future.

Move class to namespace with Visual Studio

9) Add Parameter to a Method

You can add a named parameter at a method call location. The hotkey Ctrl+. then proposes the refactoring Add parameters to the method called. The method signature is refactored with the extra parameter, but other calls to the method are left untouched. It means that these other calls now provoke a compile error and must be fixed with the extra argument.

Add parameter to a method with Visual Studio

10) Convert to interpolated string and simplify interpolation

When the caret is over a string.Format() call, the hotkey Ctrl+. proposes the refactoring Convert to interpolated string. String interpolation with the syntax $”{parameter}” was introduced with C# 6 in 2015. Then, if possible, some elements like calls to PadLeft() are grayed out in the code. This is a sign that the string interpolation can be simplified, once again with the hotkey Ctrl+. over those grayed elements.
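The following sketch illustrates both steps with hypothetical variables; the simplification Visual Studio actually proposes may vary:

    public static class LabelFormatter
    {
        public static string Describe(string name, int count)
        {
            // Before: a string.Format() call; caret over it, then Ctrl+. > Convert to interpolated string.
            string before = string.Format("{0} ({1})", name, count);

            // After the conversion:
            string after = $"{name} ({count})";

            // A grayed-out element like ToString()/PadLeft() can then be simplified once more,
            // typically into an alignment component:
            string padded = $"{count.ToString().PadLeft(5)}"; // before simplification
            string simplified = $"{count,5}";                 // after simplification

            return before + after + padded + simplified;
        }
    }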

Convert to interpolated string with Visual Studio

Conclusion

Over the years, thanks to the massive effort put into Roslyn, Visual Studio has become better and better when it comes to the refactoring actions proposed out of the box. Many more refactorings than these 10 are proposed: see the list of refactorings and the list of quick actions.

When we talk about Visual Studio and refactoring, the case of ReSharper immediately comes into the discussion. For more than a decade ReSharper has been the tool of choice to improve productivity, with many refactoring actions and other great features. My independent opinion in the VS vs. R# debate (in 2020) is that R# is still a bit more powerful, even though VS is now quite mature. However it seems to me that the maturity VS has reached with refactoring is not well known enough, hence I hope this post can help spread the word. And if you ask: yes, I still code with R# in VS despite R# slowing down the IDE a bit, but I find myself using it less often. Within the next years we can expect both VS improvements in terms of refactoring and R# improvements, especially in terms of performance.


Actually the word refactoring really has two meanings:

  • Short and quick productivity refactoring actions, as presented here.
  • Large-scale refactorings that are necessary when the architecture of a legacy code base no longer fits the planned evolution and maintainability requirements.

Large-scale refactoring deserves to be discussed extensively. The number one prerequisite for a successful large-scale refactoring is a solid understanding of the legacy code architecture. This is where the tool NDepend, with its new dependency graph and dependency matrix, can really help. Here are two short videos that explain how:

Case Study: Complex UI Testing

In the previous post Case Study: 2 Simple Principles to achieve High Code Maintainability I explained that layered code and a high test coverage ratio are two simple principles that can be objectively applied, validated and measured. When these two principles are applied they lead to high code maintainability: as a consequence, management saves money in the long term and developers are happy to work in a cleaner code base with principles that are easy to follow.

Large and complex UI 90%+ covered by tests

In this previous post I used the example of the new NDepend v2020.1 graph. This new tool is a large and complex UI with dozens of actions proposed to the user and drastic performance requirements (it scales live to 100,000+ elements). This graph implementation is 90% covered by tests. The fact that there is a lot of UI code doesn’t mean it should not be well tested. We didn’t spend a good part of our resources writing tests just for the sake of it. We did it because we know from experience that it’ll pay off: a few bugs will probably be reported, as for any 1.0 implementation, although the beta test phases already caught some. But we are confident that it won’t take a lot of resources to fix them, and we can look forward to the future confidently (like properly supporting .NET 5, which will be released in November 2020). And indeed, 10 days after its 1.0 release, no bug has been reported (nor logged) on this new graph although many users downloaded it: so far it looks rock-solid and we can focus on what’s next.

The picture below shows all namespaces, classes and methods of the graph implementation. Smaller rectangles are methods, and the color of each rectangle indicates how well a method is covered by tests. Clearly we tolerate some gaps in UI code, while non-UI code like the Undo/Redo action implementations is 100% covered. Experience taught us how to balance our resources, and that everything does not have to be perfect to achieve high maintainability.

NDepend Graph Implementation 90% covered by tests

How did we achieve High Coverage Ratio on UI Code?

It is easy: we have a simple MVC (Model View Controller) design. Some controller classes contain the logic for all the actions the user can do, and those classes pilot the UI. Concretely, in our scenario the actions are: load/save, change group-by, change layout direction, zoom, generate a call graph for a method, change filters…

Then we wrote a test suite that first starts the UI and then invokes all actions. Each complex peculiarity of each action gets fully tested, hence complex actions get invoked several times by tests but differently each time, to make sure all scenarios get tested.

The video below shows the UI under testing: more than 40 actions get tested in less than a minute. It would take more than an hour to do all this work manually and any change in code could potentially ruin the validity of manual tests.

In such a complex UI there are many classes that are not directly related to the UI. For example, the cluster of classes that describes the underlying model is tested separately.

As usual, a side benefit of writing tests is better design: the code gets structured in a way that makes it easy to invoke it through tests. Concretely, some abstractions are introduced (that wouldn’t make sense without tests), some classes and some methods get split, some logic gets refined, and as a result developers are happy to live in a code base where the logic is smoothly implemented.

High Coverage Ratio is not Enough: Assertions to the rescue

Typically at this point comes the remark: but code coverage is not enough, results must be asserted. And indeed, if nothing gets asserted, nothing gets tested even if the code is entirely covered by tests. We want tests to fail if something can go wrong.

Of course our tests contain many assertions; for example, load/save actions are invoked and asserted this way:
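The original test code is shown as a screenshot; the sketch below only gives the flavor of such an assertion, with hypothetical GraphController and GraphState types (the real NDepend test code differs):

    using NUnit.Framework;

    [TestFixture]
    public class GraphPersistenceTests
    {
        [Test]
        public void SaveThenLoad_RestoresAnEquivalentGraphState()
        {
            var controller = new GraphController();
            controller.ExpandAll();
            GraphState stateBeforeSave = controller.CurrentState;

            controller.Save("graph.ndgraph");
            controller.Load("graph.ndgraph");

            // The reloaded state must be equivalent to the state that was saved.
            Assert.IsTrue(stateBeforeSave.IsEquivalentTo(controller.CurrentState));
        }
    }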

But these assertions are not enough. By definition the UI code contains tons of visual peculiarities represented by states that can potentially be corrupted. As a consequence our UI code is stuffed with thousands of assertions: everything that can be asserted gets asserted.

  • A Rectangle with width/height within a certain range.
  • The state of a node or an edge when another element gets selected (is it a caller, a callee…?).
  • The current application state when a new graph is demanded by the controller.
  • The graph UI contains many asynchronous computations to avoid UI freezing. This leads to many assertions to check that mutable states are not corrupted by concurrent accesses.

All those asserted states would be hardly reachable from test code. However they naturally get accessed by the UI code itself, so that is the right place to assert that they are not corrupted.
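As an illustration, here is a hedged sketch of what such an in-code assertion can look like, with a hypothetical GraphNodeView class:

    using System.Diagnostics;
    using System.Drawing;

    internal sealed class GraphNodeView
    {
        internal Rectangle Bounds { get; set; }
        internal bool IsDisposed { get; private set; }

        internal void OnSelected()
        {
            // The UI code asserts its own state where it naturally accesses it.
            Debug.Assert(!IsDisposed, "A disposed node cannot be selected");
            Debug.Assert(Bounds.Width > 0 && Bounds.Height > 0, "Node bounds must be valid");
            // ... highlight the callers and callees of the selected node ...
        }
    }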

By the way, we still use the good old System.Diagnostics.Debug.Assert(…) for that; it has several advantages:

  • It is simple.
  • It is understood by tools like Roslyn/Resharper/CodeRush analyzers.

  • An assertion that fails cannot be missed both when running automatic tests and when running manual tests on the Debug mode version.
  • Debug assertions are removed by the compiler in Release mode: assertions are not executed in production and users get better performance. The idea is to not consider users as testers: code released in production is supposed to be rock-solid. Assertions are like scaffolding that gets removed when a building gets delivered. If there is still a bug we’ll discover it from user feedback, from production logs or from our own manual tests.

Debug.Assert(…) is enough for us, and it is understandable that some other teams want a more sophisticated assertion framework. The key is to get into the habit of asserting everything that can be asserted when writing code (UI code or not). Each assertion is a guard that helps make the code rock-solid. Each assertion also improves code readability. At code-review time we’ve all wondered: can this integer be zero? Can this string be empty? Can this reference be null? Fortunately C# 8 non-nullable reference types discard the last question, but many questions remain open without assertions.

Design by Contracts

This idea of stuffing the code with assertions is actually an important software correctness methodology named DbC, Design by Contract, which is really worth knowing. Contracts mean much more than the usual approach with exceptions:

  • Explicitly throwing an exception says: zero is tolerated, it is not a bug, but you won’t get the result you’d like, be prepared to catch some exceptions.
  • Writing a contract says: don’t even think of passing a zero value. The real type of the argument is not Int32, it is [Int32 minus 0]. Ideally such violations could be caught by compilers and analyzers (and they are indeed sometimes caught, as we saw in the screenshot above). The sketch below contrasts the two styles.
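Here is a minimal sketch of both approaches (the method names are hypothetical):

    using System;
    using System.Diagnostics;

    public static class MathUtils
    {
        // Exception style: zero is tolerated, it is not a bug, callers must be prepared to catch.
        public static int DividePerException(int dividend, int divisor)
        {
            if (divisor == 0)
                throw new ArgumentException("divisor must not be zero", nameof(divisor));
            return dividend / divisor;
        }

        // Contract style: passing zero is a bug in the caller; the assertion documents
        // that the real type of divisor is [Int32 minus 0].
        public static int DividePerContract(int dividend, int divisor)
        {
            Debug.Assert(divisor != 0);
            return dividend / divisor;
        }
    }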

Conclusion

Any complex UI can be automatically tested as long as:

  • It is well designed with some controllers that pilot the UI and that can be invoked from tests.
  • UI code gets stuffed with assertions to make sure that no state becomes corrupted at runtime.

In short, assertions embedded in the code under test matter as much as assertions embedded in tests. If an assertion gets violated there is a problem, no matter the assertion’s location, and it must not be missed nor ignored. This powerful idea doesn’t apply only to UI code and is known as DbC, Design by Contract.

Actually in this post I added a third principle to achieve high code maintainability and high code correctness: layered code + high coverage ratio by tests + contracts.

Case Study: 2 Simple Principles to achieve High Code Maintainability

High Code Maintainability is the key to make both the management and the developers happy:

  • Maintainability lets a product evolve naturally at a sustained pace with controlled cost.
  • Maintainability lets developers add new features and improve existing ones without spending most of their time refactoring old dusty code and fixing bugs.

After 16 years of development on our product NDepend (first release in April 2004!) we came to the conclusion that:

Highly Maintainable Code can be achieved through two simple, objective and verifiable principles: Layered Architecture and High Test Coverage Ratio

Layered Architecture prevents entangled code, the well-known spaghetti code phenomenon. Dependencies get mastered, and when it is time for the code to evolve, new classes and interfaces naturally integrate with existing ones.

High Test Coverage Ratio means that when code covered by tests gets refactored, existing tests get impacted. With little effort the developer discovers regression problems and fixes them before they go to production and become bugs to fix. The more code is covered by tests, the more you’ll benefit from this shield.

When writing a tool for developers, the most satisfying part is to challenge the tool on its own code: this practice is named dogfooding. We just completely rewrote the dependency graph of NDepend, so let’s use this important refactoring as a case study. Then we’ll see how to automate the validation of these principles.

Case Study: Layered Architecture

Let’s first present the layered architecture principle and then the test coverage principle.

See below a graph of the 250+ classes, interfaces and enumerations used to implement the new dependency graph. An SVG vector dependency graph of the 2,500+ classes, methods and fields is available here.

The class GraphController is selected:

  • The blue classes are the ones directly used by GraphController
  • The light-blue classes are the ones indirectly used by GraphController (indirectly means used by a class used by a class … used by GraphController). Clearly GraphController relies on everything.
  • The red classes are the ones mutually dependent with GraphController.
The NDepend Dependency Graph used to visualize its own code

Several things can be said on how this code is structured:

  • This is not an API so we can use namespaces the way we want. Here namespaces implement the concept of components.
  • Box size is proportional to the number of lines of code. We can see that the namespace box sizes are overall well balanced. It is good practice to avoid having a few monster components and tons of smaller ones.
  • The biggest component in terms of number of classes and lines of code is the implementation of the Undo/Redo system. More than 30 actions are implemented (expand/collapse, change GroupBy, select/unselect, generate a call graph…). These actions are relatively low level in the structure. While they act on the entire system they are not coupled with the controller, the UI rendering or the layout computation.
  • The two lowest components are Base and Model. Both contain little logic and are used by almost all other components.

In the future, whether we add new actions on the graph or decide to improve the layout somehow, this architecture won’t undergo drastic modifications. Thanks to this view it’ll be easy to decide in which component to add our new classes or if new components should be added and what they can and cannot use.

Ideally the GraphControl class shouldn’t be entangled with the GraphController class. These two classes have been developed together. See below the coupling graph between GraphController and GraphControl, obtained by double-clicking the red arrow between the two classes. It wouldn’t be difficult to introduce an interface to inject one implementation into the other, but we didn’t do it. This is the key when it comes to caring for maintainability: which move will offer the highest ROI? Not everything has to be perfect just for the sake of it. Experience shows that having only two classes entangled does not impact maintainability much. We estimated that spending our resources on satisfying the two principles has a better ROI in the long run.

Coupling Graph between GraphController class and GraphControl class

 

Case Study: High Test Coverage Ratio

The graph implementation is 90% covered by tests. The fact that there is a lot of UI code doesn’t mean it should not be well tested. We didn’t spend a good part of our resources writing tests just for the sake of it. We did it because we know from experience that it’ll pay off: a few bugs will probably be reported, as for every 1.0 implementation, although the beta test phases already caught some. But we are confident that it won’t take a lot of resources to fix them, and we can look forward to the future confidently (like properly supporting .NET 5, which will be released in November 2020).

The picture below shows all namespaces, classes and methods. Smaller rectangles are methods, and the color of each rectangle indicates how well a method is covered by tests. Clearly we tolerate some gaps in UI code, while non-UI code like the Undo/Redo action implementations is 100% covered. Here also, experience tells us how to balance our resources, and that everything does not have to be perfect to achieve high maintainability.

NDepend Graph Implementation 90% covered by tests

In terms of lines of code the NDepend graph is not even 5% of the entire product; it is one tool in the toolset. The worst-case scenario would be that each tool implementation regularly spits out bugs: all our resources would be spent fixing them, we couldn’t continue adding value to the product, and the business would probably die at some point. Not to mention the frustration of users of a buggy product.

Each year we fix a few dozen bugs that each impact a few users, but that doesn’t take more than a tiny percentage of our overall development resources. The overall code base is 86.5% covered by tests and is entirely layered: maintenance doesn’t cost us much.

Typically at this point comes the remark: but code coverage is not enough, results must be asserted by unit tests. And indeed, if nothing gets asserted, nothing gets tested even if the code is entirely covered by tests. We want tests to fail when something goes wrong. In the next post Case Study: Complex UI Testing I explain how millions of assertions get checked while running our test suite against the graph implementation.

Automatically Validate Layered Architecture and High Test Coverage Ratio

NDepend offers hundreds of default code rules but only 4 of them are used to validate these key points:

The fourth rule Avoid namespaces mutually dependent helps a lot to layer a large super-component. In this situation the first thing to do is to make sure there is no pair of components that use each other. For each such pair of namespaces matched, this rule has a heuristic and tells which type should not use which other type, and the same at method level. A technical-debt estimation is also given in terms of the development effort it’ll cost to fix each pair. Here it says that 11 man-days (8 hours a day) should be spent if someone decides to layer the NHibernate code base. Unfortunately this is not possible because it would break the thousands of client code bases bound to it. Let’s also note that an interest estimation is given in terms of: how much development effort does it take per year if I leave the issues unfixed. Here the rule estimates that not fixing all those pairs of entangled namespaces costs the development team 5 man-days a year.

Avoid Namespaces Mutually Dependent with advices on what to do and costs estimation

These rules can be validated during the build process (Azure DevOps / TFS, Jenkins, TeamCity, Bamboo, SonarQube…) and the team can know when the new code written diverges from these two maintainability goals.

 

Conclusion: Objective, Verifiable, Simple

What is interesting with these two simple concepts, layering and code coverage, is that they can be objectively applied, validated and measured. Last year, in 2019, I wrote a blog post series on SOLID principles and there has been so much debate about how to apply them in the real world. SOLID principles are a great way to improve our understanding of Object-Oriented Programming and how encapsulation, abstraction, polymorphism, inheritance… should and should not be used. But when it comes to writing maintainable code, everyone has a different opinion.

If it is decided that the code structure should be layered, there is not much debate about which part should be abstracted from which: if a class A should use a class B and B is in a higher layer than A, an interface IB must be created at A’s level to inject the B implementation into A without breaking the layering.

These two concepts emerged over the years because we had a dire need to produce maintainable code. What I really like is that they are simple. And KISS (Keep It Simple Stupid) is a great principle in software engineering.

If a third principle should be added, it would definitely be about user documentation: we offer free email support to users, but we also offer tons of embedded and online documentation. Every time a question starts to be asked a few times, we make sure that users can get the answer immediately, both from a tooltip (or a smart UI change) and from the online documentation. Some other ISVs decide to make money with support. Personally I don’t find this fair because it is a clear incentive to produce rotten documentation and hence friction for the user.

How did we obtain the image in this post

Let’s show how all the images in this post were obtained within a few clicks.

  • First let’s search for Graph Panel in the entire NDepend code base (they get zoomed automatically).
  • Then let’s reset the metric view with NDepend.UI.Graph.* namespaces to get the colored treemap.
  • Then let’s go back to the graph and only keep NDepend.UI.Graph.* namespaces matched by the search.
  • Then un-group by parent assembly to get a graph made of namespaces only.
  • Then change the layout direction from Top to Bottom to have a nicer layout.
  • Then expand all namespaces to get all classes.
  • Finally expand all classes to get all methods and fields.
Using the NDepend Graph to obtain a clear view of the implementation of the Graph

 

Mythical man month: 10 lines per developer day

The mythical book The Mythical Man-Month states that no matter the programming language chosen, a professional developer writes on average 10 lines of code (LoC) per day.

After 14 years of full-time development on the tool NDepend I’d like to elaborate a bit here.

Let’s start with the definition of a logical Line of Code. Basically, a logical LoC is a PDB sequence point, except the sequence points corresponding to the opening and closing method braces. So here, for example, we have a method with 5 logical lines of code:
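The original snippet is a screenshot; a hypothetical method of that shape could look like this, with one sequence point per statement (the opening and closing braces are not counted):

    using System;

    public static class Scoring
    {
        public static int ComputeScore(int a, int b)
        {
            int sum = a + b;           // 1
            int product = a * b;       // 2
            int score = sum + product; // 3
            Console.WriteLine(score);  // 4
            return score;              // 5
        }
    }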

I already hear readers complaining that LoC has nothing to do with productivity. Bill Gates once said: “Measuring software productivity by lines of code is like measuring progress on an airplane by how much it weighs.”

And indeed, measured over a few days or a few weeks, LoC has nothing to do with productivity. As a full-time developer, some days I write 200 LoC in a row, some days I spend 8 hours fixing a pesky bug without adding a single LoC. Some days I clean dead code and remove some LoC. Other days I refactor existing code without, all in all, adding a single LoC. Some days I create a large and complex UI control and the editor automatically generates 300 additional LoC. Some days are dedicated solely to performance enhancement or writing tests…

What is interesting is the average number of LoC obtained over the long term. And if I do the simple math, our average is around 80 LoC per day. Let’s be precise: we are strict on high code quality standards, both in terms of code structure and formatting, and in terms of testing and code coverage ratio (see the last picture of this post, which shows the NDepend code coverage map). For a code quality tool for developers, being strict on code quality means dogfooding ☺.

So this average score of 80 LoC produced per day doesn’t sacrifice code quality, and it is a sustainable rhythm. Things get interesting with LoC after calibration: counting LoC becomes an accurate estimation tool. After coding and measuring dozens of features delivered in this particular development context, the size of any new feature can be estimated accurately in terms of LoC. Hence, with simple math, the time it’ll take to deliver a feature to production can be accurately estimated. To illustrate this, here is a decorated treemap view of the NDepend code base (K means 1,000 LoC). This view is obtained from the NDepend metric view panel, with handmade coloring to illustrate my point. The small rectangles are methods grouped by parent classes, parent namespaces and parent assemblies. A rectangle’s area is proportional to the corresponding method’s #LoC.

treemap code-metric

Thanks to this map, I can compare the size in terms of LoC of most components. Coupling this information with the fact that the average coding score is 80 LoC per day, and looking back at the cost in time of each component, we have an accurate method to tune our way of coding and estimate future schedules.
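To make the estimation concrete: at this pace, a feature comparable in size to the 8K LoC interactive dependency graph mentioned below represents roughly 8,000 / 80 = 100 developer-days of work.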

Of course not all components are equal. Most of them are the result of a long evolutionary coding process. For example, the code model has undergone much more refactoring since the beginning than, say, the dependency matrix, which was delivered out of the box after a few months of development.

This picture reveals something else interesting. We can see that all these years spent polishing the tool to meet high professional standards in terms of ergonomics and performance actually consumed quite a few LoC. Obviously, building the performant code query engine based on C# LINQ that is now the backbone of the product took years. This feature alone now weighs 34K LoC. More surprisingly, just having a clean Project Properties UI and model takes (model + UI) = (4K + 7K) = 11K LoC, while a flagship feature such as the interactive Dependency Graph only consumes 8K LoC, less than the Project Properties implementation. Of course the interactive Dependency Graph capitalizes a lot on the existing infrastructure developed for other features, including the Dependency Model. But as a matter of fact, it took about the same amount of effort to develop the Dependency Graph as to develop a polished Project Properties model and UI.

All this confirms an essential lesson for everyone in charge of an ISV. It is easy and cheap to develop a nice and flashy prototype application that will attract enthusiast users. What is really costly is to transform it into something usable, stable, clean and fast, with all possible ergonomic candy to make the life of the user easier. And it is all these non-functional requirements that make the difference between a product used by a few dozen enthusiasts only, and a product used by the masses.

To finish, it is also interesting to visualize the code base through the prism of the code coverage ratio. The NDepend code base being 86% covered, by comparing both pictures we can easily see which parts are almost 100% covered and which parts need more testing effort.

Business Complexity vs. Implementation Complexity

It is good software design practice to make sure that methods can be viewed entirely in the code editor, which typically shows 30 to 45 lines at a time. The root of this principle is easy to grasp: scrolling up and down a too-large method impedes code readability.

Refactoring a too-large method or a too-large class implies creating several code elements that are smaller in terms of number of textual lines. But ultimately the code behavior doesn’t change. In other words the business complexity remains the same, but the refactoring session reduces the implementation complexity (at least we hope so).

Software complexity is a subjective measure relative to human cognitive capabilities. Something is complex when it requires effort to be understood by a human. Software complexity is a two-dimensional measure. To understand a piece of code one must understand both:

  • What this piece of code is doing at runtime, the behavior of the code, this is the business complexity.
  • How the actual implementation solves the business needs at runtime, this is the implementation complexity.

The whole point of software design, SOLID principles, code maintainability, patterns… is to avoid adding implementation complexity to the business complexity. But where does implementation complexity come from? There are 5 main sources of implementation complexity:

Too Large Code

We already mentioned this one. It is easy to limit implementation complexity by abiding by simple code rules that check thresholds on code metrics, like a method’s number of lines of code and complexity, or classes with too many methods. The Single Responsibility Principle (SRP) also contributes to smaller implementations: fewer responsibilities for a class necessarily means less code.

Lack of Abstractions

An abstraction such as an abstract method, an interface or even a delegate is the minimum knowledge required to invoke an implementation at runtime without knowing it at design time. In other words, an abstraction hides implementation details. Abstraction is a great weapon to reduce implementation complexity because polymorphism can replace quantities of if/else/switch/case. Concretely, code like this:
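(The original snippet is a screenshot; below is a minimal sketch of the idea with hypothetical notification types.)

    using System;

    public enum NotificationKind { Email, Sms, Push }

    public sealed class Notification
    {
        public NotificationKind Kind { get; set; }
        public string Message { get; set; }
    }

    public static class Dispatcher
    {
        // Each new kind of notification forces a modification of this switch.
        public static void Send(Notification notification)
        {
            switch (notification.Kind)
            {
                case NotificationKind.Email: Console.WriteLine("email: " + notification.Message); break;
                case NotificationKind.Sms:   Console.WriteLine("sms: "   + notification.Message); break;
                case NotificationKind.Push:  Console.WriteLine("push: "  + notification.Message); break;
            }
        }
    }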

Can be replaced with code like this:
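(Again a hedged sketch: polymorphism replaces the switch, and new notification kinds no longer require modifying the dispatcher.)

    using System;

    public interface INotification
    {
        void Send();
    }

    public sealed class EmailNotification : INotification
    {
        public string Message { get; set; }
        public void Send() => Console.WriteLine("email: " + Message);
    }

    public sealed class SmsNotification : INotification
    {
        public string Message { get; set; }
        public void Send() => Console.WriteLine("sms: " + Message);
    }

    public static class Dispatcher
    {
        // The abstraction hides which concrete notifications exist.
        public static void Send(INotification notification) => notification.Send();
    }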

Abstraction also reduces implementation complexity because it frees the developer’s mind of implementation details.

The S in SOLID is the SRP (mentioned above) and is not related to abstraction. But the OLID principles are all about using abstractions wisely:

How do we check for good usage of abstractions? There is no magic stick like thresholds for limiting too-large code elements. However the Abstractness vs. Instability graph and metrics, and the Level metric, are a good start to identify code areas that need more abstractions. They are both described in this post about the Dependency Inversion Principle.

State Mutability at Runtime

A common source of implementation complexity is mutable state. The human brain is not properly wired to anticipate what is really happening at runtime in a program. While reviewing a class, it is hard to imagine how many instances will exist simultaneously at runtime and how the state of each of these instances will evolve over time. This is actually THE major source of problems when dealing with a multi-threaded program.

If a class is immutable (if the state of all its instances doesn’t change at runtime once the constructor has been called), its runtime behavior immediately becomes much easier to grasp, and as a bonus, one doesn’t need to synchronize access to immutable objects. See here a post explaining immutable classes in depth.

A method is a pure function if it produces its outputs from its inputs without modifying any state. Like immutable classes, pure functions are also a good means of reducing implementation complexity.
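A minimal sketch of both properties, with a hypothetical Money type:

    // Immutable: the state is set once in the constructor and never changes,
    // so instances can be shared across threads without synchronization.
    public sealed class Money
    {
        public decimal Amount { get; }
        public string Currency { get; }

        public Money(decimal amount, string currency)
        {
            Amount = amount;
            Currency = currency;
        }

        // Pure function: the output depends only on the inputs and no state is modified;
        // a new instance is returned instead of mutating this one.
        public Money Add(Money other) => new Money(Amount + other.Amount, Currency);
    }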

Some code rules exist to enforce immutability, like Structures should be immutable or Fields should be marked as ReadOnly when possible. Being immutable for a class, or pure for a method, is such an essential property that dedicated C# keywords would be welcome, like readonly for fields. Here is a proposal for C# support of an immutable keyword. For now, an ImmutableAttribute or a PureAttribute can be used to tag such elements, and code rules can check that Types tagged with ImmutableAttribute must be immutable or Methods tagged with PureAttribute must be pure.

Over Coupling

When trying to re-engineer/understand/refactor a large piece of code (I mean something made of dozens of classes, like a big namespace or an assembly), the difficulty is directly proportional to the amount of coupling between the code elements considered. Both dependency graphs below show dependencies between 36 classes, but the left one contains 64 edges and the right one contains 133 edges: which one would you prefer to deal with?

One key strategy to control coupling is to layer components and make sure that there are no dependency cycles. This can be checked on namespaces, for example, with the code rule Avoid namespaces dependency cycles. Using abstractions is also a good way to limit over-coupling: if an interface has N implementations, then relying only on the interface is virtually like depending on the N underlying classes.

Lack of Unit Tests

Software testing is a large topic I won’t cover here. But one key benefit of writing tests (apart from enforcing business rules and detecting regressions early) is to ensure that the code is testable. For code, being testable means less coupling, more abstraction and overall simpler code. Ultimately, if the code is easily testable we can safely assume that its implementation complexity is kept low. Here also, code rules like Code should be tested can help enforce high testability.

One Measure for All

There are more sources of implementation complexity, but those 5 are certainly the biggest culprits. To reduce this unnecessary complexity one must be able to measure it. But how can too-large code, bad design, poorly tested code and more be unified in a single metric?

As we saw, most of these aspects can be enforced with code rules. A code rule produces issues, and for each issue the rule can estimate the cost to fix it and the annual cost of leaving it unfixed. A famous analogy with the financial field says that:

  • The estimated cost to fix code smells is the Technical-Debt: a measure of the implementation complexity.
  • The estimated annual cost to let code smells unfixed is the Annual Interest of the Debt: a measure of the business operation cost induced by poorly written and poorly tested code.

These estimations can be expressed in developer time and ultimately in money, which can be used by management to make the right decisions.

Static Analysis and Dependency Injection

For quite some years now, we (the NDepend team) have received requests about resolving Dependency Injection; see this page on our User Voice. Lately we’ve been considering such support carefully but came to the conclusion that it’d be awkward. On the other hand, we have always prioritized feature development based on user feedback and needs, so our conclusions are not definitive yet. I write this post not only to expose our point of view, but also, hopefully, to trigger more productive discussions.

Static Analysis

First let’s underline that NDepend is a static analyzer. It gathers 100% of its data from the code itself, and it also parses more data like code coverage. Static analysis stands in opposition to dynamic analysis, which gathers data from runtime execution.

  • Static analysis deals with classes, methods and design measures like lines of code or complexity. It is design-time analysis.
  • Dynamic analysis deals with objects and runtime measures like performance or memory consumption. It is run-time analysis.

Dependency Injection (DI)

Recently I explained the basics of Dependency Injection (DI) and how it relates to DIP in the post SOLID Design: The Dependency Inversion Principle (DIP). DI is all about depending on interfaces instead of depending on classes. Here is a simple DI sample: ClientCode() depends on IDbConnection and not on SqlConnection. The power of DI is that ClientCode() can now run on other RDBMS than SQL Server, like Oracle or SQLite. ClientCode() can also be unit-tested seamlessly by mocking the interface IDbConnection.
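The original snippet is a screenshot; a minimal sketch of the idea could be:

    using System.Data;
    using System.Data.SqlClient;

    public static class OrderModule
    {
        // ClientCode() only knows about the IDbConnection abstraction...
        public static void ClientCode(IDbConnection connection)
        {
            connection.Open();
            using (IDbCommand command = connection.CreateCommand())
            {
                command.CommandText = "SELECT COUNT(*) FROM Orders";
                command.ExecuteScalar();
            }
            connection.Close();
        }

        // ...while the caller decides which concrete implementation gets injected.
        public static void Main()
        {
            ClientCode(new SqlConnection("Data Source=.;Integrated Security=true"));
        }
    }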

Stable Dependency Principle (SDP)

DI is an application of the Stable Dependency Principle (SDP), which is not part of the SOLID principles but is closely related to the Dependency Inversion Principle (DIP).

SDP states: dependencies should point in the direction of stability. Stability is a measure of the likelihood of change: the more likely a code element is to change, the less stable it is.

The Liskov Substitution Principle (LSP) and Interface Segregation Principle (ISP) articles explain how interfaces must be carefully thought out. As a result, an interface is a stable code element that is less likely to change than the classes that implement it. Following the SDP, it is better design to rely on interfaces than to rely on concrete classes.

The Users need: supporting DI through Static Analysis

In this context, users want NDepend to be able to parse DI code like the following.
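The original snippet is a screenshot; a hedged sketch of such registration code, using Microsoft.Extensions.DependencyInjection as an example container (users rely on many different containers), could be:

    using System.Data;
    using System.Data.SqlClient;
    using Microsoft.Extensions.DependencyInjection;

    public static class CompositionRoot
    {
        public static ServiceProvider Build()
        {
            var services = new ServiceCollection();

            // The container binds the IDbConnection abstraction to the SqlConnection implementation.
            services.AddScoped<IDbConnection>(
                _ => new SqlConnection("Data Source=.;Integrated Security=true"));

            return services.BuildServiceProvider();
        }
    }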

Concretely, users want NDepend to bind all calls to an IDbConnection member with a call to the corresponding SqlConnection member. For example, a dependency toward SqlConnection.Open() should be created for any method calling IDbConnection.Open().

But why would this be useful? Here is a user feedback:

We have layered architecture: Foundation, Feature, Project. Following current NDepend report, everything follows right dependencies. Project depends on Feature and Foundation. Feature depends on Foundation. However due to some of DI registrations, we have complex dependencies (…) there is dependency from Foundation to Project in reality. Interface is declared in Foundation, but implementation is in Project and registered via DI. It means that Foundation projects, that are core and stable part of solution, depend on Project level, which is not so stable and reliable.

“In reality” is highlighted in bold because the user really meant at run-time. At run-time Foundation depends on Project, the same way that at run-time ClientCode() depends on SqlConnection. But at design-time Foundation doesn’t depend on Project, the same way that at design-time ClientCode() doesn’t depend on SqlConnection. At design-time ClientCode() only knows about IDbConnection.

From this, one infers that dependencies should go from Project towards Foundation not only at design time but also at runtime. In other words, one feels that implementing an adapter in Project and injecting it into a Foundation component “is not so stable and reliable”, but that’s not how those principles are meant and described.

As a matter of fact, being able to inject Project adapters into Foundation components (where the Project adapter implements one of Foundation’s interfaces) is the core of the Dependency Inversion Principle (DIP). Without it, it is impossible to create flexible systems. This can be unreliable, but only if one doesn’t follow the Liskov Substitution Principle (LSP), when Foundation interfaces are not well thought out, like, for example, when ICollection<T> forces the Array class to implement ICollection<T>.Add(), which obviously cannot be supported.

Conclusion

Hence it looks like this feature request results from a confusion between design-time dependencies and run-time dependencies. The purpose of design is to reduce the cost of change; anything else is not design. Managing design-time dependencies is at the heart of design. All SOLID principles, and all principles surrounding SOLID like the Stable Dependency Principle (SDP), are about design-time dependencies.

Moreover, adding DI support would be a never-ending story, and I guess it cannot be done right without information collected at run-time. DI containers are very complex beasts internally, their behavior often changes from one major version to the next, and there are dozens of DI containers in .NET (Autofac, UnityContainer, NInject, StructureMap, MEF, SimpleInjector, Castle.Windsor, LightInject, Prism, Spring.NET… just to name a few).

Again, all our development is driven by user needs, but for this particular need we’re not sure it’d make sense. We leave the discussion open on this post and on the User Voice page. Whether you agree or disagree with our position, feel free to comment.

 

 

SOLID Design: The Open-Close Principle (OCP)

The Open-Close principle (OCP) is the O in the well known SOLID acronym.

Bertrand Meyer is generally credited with originating the term open/closed principle, which appeared in his 1988 book Object-Oriented Software Construction. Its original definition is:

  • A module will be said to be open if it is still available for extension. For example, it should be possible to add fields to the data structures it contains, or new elements to the set of functions it performs.
  • A module will be said to be closed if it is available for use by other modules. This assumes that the module has been given a well-defined, stable description (the interface in the sense of information hiding).

Upon these definitions the principle is usually expressed and summarized this way:

Modules should be open for extension and closed for modification.

There have been (and still are) a lot of debates about the meaning of this simple definition and its implications for the usage of Object-Oriented Programming (OOP) in the real world.

In this post I’ll try to be as concrete and concise as possible.

The classical code example to explain OCP

The classical code example used to explain OCP can be translated into C# this way:
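The original snippet is not reproduced here; it is usually presented along these lines (a sketch):

    using System;
    using System.Collections.Generic;

    public class Circle { public double Radius { get; set; } }
    public class Square { public double Side { get; set; } }

    public static class Drawer
    {
        // Each new shape forces a modification of this method.
        public static void DrawShapes(IEnumerable<object> shapes)
        {
            foreach (object shape in shapes)
            {
                if (shape is Circle circle)
                    Console.WriteLine("Drawing a circle of radius " + circle.Radius);
                else if (shape is Square square)
                    Console.WriteLine("Drawing a square of side " + square.Side);
            }
        }
    }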

I find this example a bit crude. I believe one must have no idea of what OOP is to write such code. It can obviously be refactored to something like this:
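A sketch of the refactored version:

    using System;
    using System.Collections.Generic;

    public interface IShape
    {
        void Draw();
    }

    public class Circle : IShape
    {
        public double Radius { get; set; }
        public void Draw() => Console.WriteLine("Drawing a circle of radius " + Radius);
    }

    public class Square : IShape
    {
        public double Side { get; set; }
        public void Draw() => Console.WriteLine("Drawing a square of side " + Side);
    }

    public static class Drawer
    {
        // Closed for modification: new IShape implementations (Triangle, Pentagon…)
        // are drawn without touching this method.
        public static void DrawShapes(IEnumerable<IShape> shapes)
        {
            foreach (IShape shape in shapes)
                shape.Draw();
        }
    }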

Let’s look at the implication of this refactoring:

  • We introduced a new abstraction IShape to represent a concept that we already had in mind. Indeed, in the first code sample the method DrawShapes() already accepted a sequence of shapes. With the new IShape abstraction our design is now open to accept more shapes like Triangle or Pentagon.
  • The method DrawShapes() will draw any new shape without any need for modification. In other words the DrawShapes() method implementation is closed.

Here is how the OCP is usually presented. It is all about anticipating future changes in the system in such a way that:

  • When a change occurs, existing code is left untouched. Existing code here is the DrawShapes() concrete method body, the IShape interface and the Circle and Square classes.
  • When a change occurs, the new code implementing the change is written in new classes that implement existing abstractions. New classes here could be Triangle and Pentagon.

The Point of Variation principle

Another way to see the OCP is the Point of Variation (PV) principle that states:

Identify points of predicted variation and create a stable interface around them.

I find PV more understandable than OCP because it is actionable. First identify potential variations, second build the proper abstractions around these variations.

The real challenge: Anticipation

But the real challenge is anticipation, and anticipating is hard. If anticipation were easy we’d all be bitcoin billionaires. In the real world, when you anticipate, the risk is high that:

  1. We anticipate variations that won’t vary. This is what the YAGNI principle says (You aren’t gonna need it): always implement things when you actually need them, never when you just foresee that you might need them. Developing and maintaining an abstraction has a cost, and if we don’t end up needing it, this cost is pure waste.
  2. We don’t anticipate the variation that will really be needed. But once the need for a variation becomes real, it is your responsibility as a developer to refactor and create the right abstractions and the right stable code that acts upon these abstractions. This is the fool me once, don’t fool me twice idea: I am not supposed to foresee what I’ll need, but I am supposed to identify and then write the right abstraction when I need it.

OCP in the real world

Here is a pragmatic approach to OCP:

  • In any case the KISS principle applies (Keep It Simple Stupid): don’t underestimate the difficulty of anticipating and don’t waste your resources creating abstractions you won’t need.
  • Write automatic tests: one of the greatest benefits of writing tests is that, for a while, you must look at your code from the client’s perspective. If your code contains areas that are difficult to cover by tests, it certainly means that your code should be refactored to be easily 100% testable. Experience shows that when refactoring from poorly testable code to fully testable code, the need for the right abstractions naturally pops up.
  • Some static analyzers can help you pinpoint typical OCP violations:
    • when downcasting references (i.e. casting from a base class or interface to a subclass or leaf class),
    • when using the is or as operators (as in the first example above).
    • NDepend has the rule Base class should not use derivatives: matches of this rule are obvious violations of the OCP.
  • Keep in mind the fool me once, don’t fool me twice idea. You must refactor code as soon as the need to abstract some concept is identified. Of course sometimes it is not possible, for instance if tons of client code depend upon your API: in this situation you cannot easily refactor and you’ll often have to live with the wrong design. This is why public API design is such a sensitive topic: you have no other choice than to do your best to anticipate, and to accept living with your past design mistakes.

Be open to more than one variation with the Visitor pattern

Finally, let’s underline that in the real world, data objects (like the shapes here) don’t implement algorithms such as drawing themselves. Experience shows that this is a clear violation of the OCP, because when a new algorithm is needed on the data objects, like persistence in addition to drawing for example, all shape classes must be modified again. This is also a violation of the Single Responsibility Principle (SRP, the S in SOLID) because a shape class now has two responsibilities: 1) holding the shape data, 2) drawing the shape.

Hence we now have two variations: we need a way to abstract both the shapes and the algorithms applied to the shapes, in order to write something like algorithm.ApplyOn(shape). This sort of call on two abstractions is called double dispatching: the implementation actually invoked depends both on the IShape object’s type and on the IAlgorithm object’s type. If you have N shapes and M algorithms you need [N x M] implementations.

Fortunately the Visitor pattern helps implement double dispatching. The code with the new persistence algorithm would then look like this:
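The original snippet is not reproduced here; a hedged sketch of the Visitor-based version, reworking the shape classes above and adding the persistence algorithm, could be:

    using System;
    using System.IO;

    public interface IShape
    {
        void Accept(IShapeVisitor visitor);
    }

    public interface IShapeVisitor // one visitor per algorithm
    {
        void Visit(Circle circle);
        void Visit(Square square);
    }

    public class Circle : IShape
    {
        public double Radius { get; set; }
        public void Accept(IShapeVisitor visitor) => visitor.Visit(this);
    }

    public class Square : IShape
    {
        public double Side { get; set; }
        public void Accept(IShapeVisitor visitor) => visitor.Visit(this);
    }

    public class DrawVisitor : IShapeVisitor
    {
        public void Visit(Circle circle) => Console.WriteLine("Drawing a circle of radius " + circle.Radius);
        public void Visit(Square square) => Console.WriteLine("Drawing a square of side " + square.Side);
    }

    // The new persistence algorithm is added without modifying the shape classes.
    public class PersistVisitor : IShapeVisitor
    {
        public void Visit(Circle circle) => File.AppendAllText("shapes.txt", "circle " + circle.Radius + Environment.NewLine);
        public void Visit(Square square) => File.AppendAllText("shapes.txt", "square " + square.Side + Environment.NewLine);
    }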

 

 

 


Service Oriented Architecture: A Dead Simple Explanation

Service-oriented architecture (SOA) has been with us for a long time. The term first appeared in 1998, and since then it’s grown in popularity. It’s also branched into several variants, including microservice architecture. While microservices dominate the landscape, reports of SOA’s death have been greatly exaggerated. So, let’s go over what SOA is. We’ll cover why it’s an architectural pattern that isn’t going anywhere. Then we’ll see how you can apply its design concepts to your work. Continue reading Service Oriented Architecture: A Dead Simple Explanation


REST vs. RESTful: The Difference and Why the Difference Doesn’t Matter

What’s the difference between a REST API and a RESTful one? Is there a difference? This sounds like the kind of academic question that belongs on Reddit. But then you find yourself in a design session, and the person across the table is raising their voice.

The short answer is that REST stands for Representational State Transfer. It’s an architectural pattern for creating web services. A RESTful service is one that implements that pattern.

The long answer starts with “sort of” and “it depends” and continues with more complete definitions.

Continue reading REST vs. RESTful: The Difference and Why the Difference Doesn’t Matter