Improve your .NET code quality with NDepend

Lack of Cohesion of Methods: What Is This And Why Should You Care?

Lack of cohesion of methods (sometimes abbreviated LCOM) is one of those things that occurs fairly high up on the software hierarchy of needs.  What’s the “software hierarchy of needs?”  It’s a thing that I just made up, shamelessly copying Maslow’s Hierarchy of Needs.  (Though in a quick search to see how effectively I’ve planted this flag, I see that Scott Hanselman had the idea before me.  But mine looks a little different than his because I’m talking about the software, rather than the humans writing it.)

Anyway, here’s the hierarchy.

  • The cost of change stays relatively flat because the software is highly maintainable.
  • You can change it over time in response to evolving requirements and market realities.
  • It performs and behaves acceptably in production.
  • Technically, it fulfills the functional requirements and user value proposition.
  • It simply exists in production.

The software hierarchy of needs.

When it comes to something like lack of cohesion of methods, you’re talking about the top two categories.  Early in your career, you’ll tend to have worries like just wheezing past the finish line with a feature and like getting your code to do what you want it to.  With practice and time, you then start worrying about non-functional concerns and maintainability.  And when you start worrying about that, you should start paying attention to lack of cohesion of methods (among other things).

So for the rest of this post, I’ll focus on this one idea: lack of cohesion of methods.  What is it, how do you measure it, and why should you care?

First of All, What is Cohesion?

Before getting into what “lack of cohesion” means, it’s probably worth covering the idea of cohesion.   So, to the dictionary!

The act or state of cohering, uniting, or sticking together.

In the simplest terms, things are cohesive when they stick together and stay together.  That applies to the wide world, as well as to software.  When you describe your group at work as cohesive, people will understand you to say that they work well together and get along.

In the world of source code, the same idea applies, albeit more narrowly.  If we have something like an assembly or a class and we say that it’s cohesive, we mean that its innards are coherently interrelated.

Cohesion and Lack of Cohesion Briefly Demonstrated

Instead of further description, let’s illustrate with some code.  First up, here’s a cohesive class, using C# 7 language features to make the class compact.
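A minimal sketch of what such a class might look like follows.  The names NumberManipulator and _number appear later in this post; the member names Number, AddOne, and SubtractOne are hypothetical and chosen only for illustration.

```csharp
// Sketch of a cohesive class: every member works with the single _number field.
// Number, AddOne, and SubtractOne are hypothetical names used for illustration.
public class NumberManipulator
{
    private int _number;

    // Expression-bodied property that reads the field.
    public int Number => _number;

    // Both methods modify the same field.
    public void AddOne() => _number++;

    public void SubtractOne() => _number--;
}
```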

This class has a field, a property, and two methods.  Notice that both methods and the property all reference the field.  This is a cohesive class, with its functionality entirely wrapped up in manipulating the number field.

Now, consider another class.
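Again, this is only a sketch of what such a class could look like.  The class name NonCohesiveNumberManipulator is used later in this post; the field and method names are hypothetical.

```csharp
// Sketch of a non-cohesive class: each method touches only its own field.
// All field and method names here are hypothetical, for illustration only.
public class NonCohesiveNumberManipulator
{
    private int _firstNumber;
    private int _secondNumber;
    private int _thirdNumber;

    // Each method returns the updated value of the one field it owns.
    public int IncrementFirst() => ++_firstNumber;

    public int IncrementSecond() => ++_secondNumber;

    public int IncrementThird() => ++_thirdNumber;
}
```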

This class has three fields and three methods.  Each method refers to its own field, and there’s no interrelation of which to speak.  You look at this class and justifiably ask yourself whether these methods and their referenced fields even belong in the same class.  (Indeed, you probably wonder why not just have a single method and field and create multiple instances of it to achieve this goal — and you should wonder that.)

Lack of Cohesion of Methods and Design Principles

When it comes to your code, you want to shoot for having cohesion in your codebases.  Think of it with the mantra, “Things that need to change together should exist together.”  In this vein, cohesion and LCOM tie in with two design ideas you may have heard of: the single responsibility principle (SRP) and shotgun surgery.

The SRP, one of the so-called SOLID principles, states that a code element should have a single reason to change.  Admittedly, this is a little subjective, but you can understand it by thinking of our example classes above.  Why would someone change NumberManipulator?  Well, you’d change it in order to allow clients of the class to do different things to _number.  What about for the non-cohesive version?  More or less the same basic reasoning but for three different versions of _number.  Yikes.

It’s hard to look at a codebase and easily know whether you’re getting the SRP entirely right.  But you can recognize when you’re getting it badly wrong.  One symptom is shotgun surgery: adding a feature requires you to touch the codebase in many places (mimicking the spray of a shotgun).  Shotgun surgery indicates a lack of cohesion, because the things that need to change at the same time are found all over the place.

Lack of Cohesion of Methods: Getting Specific

So you can see that the relatively granular metric/concern of LCOM actually relates to important and broad software design principles.  People have been recommending that you write cohesive code, even if they haven’t been using that term.

Let’s look now at LCOM the metric, rather than cohesion the concept.  LCOM is specifically a code metric that pertains to classes in your codebase.  To dig in, let’s look at how NDepend computes it.

LCOM for a class will range between 0 and 1, with 0 being totally cohesive and 1 being totally non-cohesive.  This makes sense since a low “lack of cohesion” score would mean a lot of cohesion.

Here’s how the calculation works.  For each field in the class, you count the methods that reference it, and then you add all of those up across all fields.  You then divide that by the count of methods times the count of fields, and you subtract the result from one.  So, for instance, consider the classes above.

  • NumberManipulator = 1 – (3/4) = 0.25 LCOM.
  • NonCohesiveNumberManipulator = 1 – (3/12) = 0.75 LCOM.
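Written as a formula, the calculation described above might be expressed as follows.  The symbols are my own shorthand rather than NDepend notation: F is the set of fields, M is the set of methods, and m_f is the number of methods that reference field f.

```latex
\[
\mathrm{LCOM} = 1 - \frac{\sum_{f \in F} m_f}{|M| \cdot |F|}
\]

% Plugging in the two example classes (4 methods each; 1 field and 3 fields):
\[
1 - \frac{3}{4 \cdot 1} = 0.25
\qquad\qquad
1 - \frac{1 + 1 + 1}{4 \cdot 3} = 0.75
\]
```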

If you’re wondering about the phantom fourth method, consider that each of these classes technically has a default constructor, which counts.  To get perfect cohesion, you would need a constructor to refer to the lone field as well.
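To make the arithmetic concrete, here is a small sketch that computes the score from raw counts.  LcomCalculator and Compute are hypothetical names for illustration, not part of NDepend, and returning 0 for a class with no fields or no methods is an assumption of this sketch.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not an NDepend API: computes LCOM from raw counts.
public static class LcomCalculator
{
    // methodsPerField: for each field, the number of methods that reference it.
    // methodCount: total methods in the class, constructors included.
    public static double Compute(IReadOnlyCollection<int> methodsPerField, int methodCount)
    {
        // Assumption of this sketch: with no fields or no methods, report 0.
        if (methodsPerField.Count == 0 || methodCount == 0)
        {
            return 0.0;
        }

        double referenceTotal = methodsPerField.Sum();
        return 1.0 - referenceTotal / (methodCount * methodsPerField.Count);
    }
}
```

Calling LcomCalculator.Compute(new[] { 3 }, 4) gives 0.25 for NumberManipulator, and LcomCalculator.Compute(new[] { 1, 1, 1 }, 4) gives 0.75 for NonCohesiveNumberManipulator.  If a constructor also referenced the lone field, the first call would become Compute(new[] { 4 }, 4), which gives 0.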

NDepend for LCOM

Of course, you don’t need to actually compute this stuff for yourself.  Here’s a screenshot of NDepend’s query and rule editor from the playpen codebase where I created this example.

You should absolutely use static analysis tooling at your disposal to compute lack of cohesion of methods.  Out of the box, NDepend will both compute it for you and warn you when it gets excessive for types in your codebase.  You can then adjust these thresholds and warnings as you see fit.

Why Does LCOM Matter to You?

I’ll round out the explanation now by discussing the significance.  So far, I’ve talked about this obliquely, pointing out that cohesive types, and cohesion in your codebase in general, help you conform to the SRP and avoid shotgun surgery.  But from a broader perspective…so what?

Well, it all boils down to maintenance.  Every time you touch a codebase to implement new functionality or to fix a defect, you introduce risk.  So if each change requires you to wander all over the codebase, you increase that risk.  Likewise, if you have classes that mash incoherent and unrelated functionality together, you have a much greater chance of breaking thing A when you’re trying to fix thing B.

As a metric, LCOM alerts you to pockets of your code that are becoming risky.  And, more than that, it can jolt you into recognizing a suboptimal design.  If a warning from NDepend brings you to the class NonCohesiveNumberManipulator, you might look at that and say, “Huh. Why don’t we just use multiple instances of a class with a single field?”  And the answer could be simply that nobody ever thought to do that before.
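As a rough sketch of that alternative, three instances of a single-field class could stand in for the three-field class.  NumberManipulator is repeated here so the example stands alone; its member names and the RefactoringSketch class are hypothetical.

```csharp
// Single-field class, repeated so this sketch is self-contained.
// Member names are hypothetical, for illustration only.
public class NumberManipulator
{
    private int _number;

    public int Number => _number;

    public void AddOne() => _number++;

    public void SubtractOne() => _number--;
}

// Hypothetical demonstration: three independent instances replace three fields.
public static class RefactoringSketch
{
    public static (int First, int Second, int Third) Run()
    {
        var first = new NumberManipulator();
        var second = new NumberManipulator();
        var third = new NumberManipulator();

        // Each instance changes only its own state.
        first.AddOne();
        second.AddOne();
        second.AddOne();
        third.SubtractOne();

        return (first.Number, second.Number, third.Number);
    }
}
```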

As I said at the beginning of the post, these concerns sit fairly high on the software hierarchy of needs.  But you should never stop seeking to improve, so it’s important that you make yourself aware of things like LCOM.

Published by

Erik Dietrich

I'm a passionate software developer and active blogger. Read about me at my site.
