NDepend

Improve your .NET code quality with NDepend

code smells fish

Easy to Miss Code Smells

The concept of a code smell is, perhaps, one of the most evocative in our profession.  The name itself has a levity factor to it, conjuring a mental image of one’s coworkers writing code so bad that it actually emits a foul odor.  But the metaphor has a certain utility as well in the “where there’s smoke, there may be fire” sense.

In case you’re not familiar, a code smell is an observable feature of the code (the smoke) that often belies a deeper existing problem (the fire).  When you say that a code smell exists, what you’re communicating is “you may be justified here, but I’m skeptical – in my experience this is probably a design flaw.”

Of course, accusing code of having a smell is only slightly less incendiary to the author than accusing code of being flat out bad.  Them’s fightin’ words, as they say.  But, for all the arguments and all of the righteous indignation that code smell accusations have generated over the years, their usefulness is undeniable.

No doubt you’ve heard of some of the most common and easiest to visualize code smells.  The God Class, Primitive Obsession, and Inappropriate Intimacy all come to mind.  These indicate, respectively a class in your code base doing way too much, a tendency to use primitive types when you should take advantage of classes, and a module or class that breaks encapsulation by knowing too many details about another.  The combination of their visual memorability and their wisdom has prodded us over the years to break things down, to create cohesive objects, and to preserve encapsulation.

I would argue, however, that there are many more code smells out there than the big, iconic ones that get a lot of attention.  I’d like today to discuss a few that I don’t think are as commonly known.  I’ll make the case for why, once you’ve mastered avoiding the well-known ones, you should watch for these as well.

Bloated Constructor

One smell that I see rear its head a lot, but that might not get as much play as others is the bloated constructor.  The constructor is where you should put the logic necessary to satisfy your object’s preconditions.  And, that’s it.  Nothing else belongs in there.

For one thing, bloated constructors make your classes difficult to test because a lot of code has to be executed just to get the things instantiated.  But, beyond that, it makes your application inflexible as well, and it’s often indicative of a wholesale, procedural approach to code.  If you see bloated constructors, you can usually bet that objects are not really organized by behaviors.

Excessive Using

I wanted to give this a memorable name, and “excessive using” is where I landed.  What I’m referring to here is having a class that uses a lot of types (which is often quickly visible in C# with a lot of using statements at the top of the file).  This is a smell indicating that your type is probably doing too much.

Many years ago, I encountered a class where the author had to use the full type qualifier for a class in the Excel library called something like “DataSet.”  The reason?  It was because there was a name collision with something in the SQL driver that was also called dataset.  The deeper, underlying problem?  This one class was dealing with spreadsheets and databases.  If there’s a more obvious prima facie violation of the Single Responsibility Principle, I can’t recall it.

If you have a class using a large number of other types, whether in your codebase or in a third party or framework libraries, there’s a good chance it’s doing too much.  And a class doing too much is a maintenance headache.

Lazy Loading

This one might seem a bit controversial, but I would argue that the presence of lazy loading in your application is a smell.   Surely I must be crazy, right?   I mean, lazy loading is an elegant solution  to two simultaneous problems: how else do you avoid speculative performance hits while presenting a clean façade to consumers of your API?  Lazy loading ensures you’ll incur the performance hit at the last responsible moment and then not again.

All of that is true, and all of that is the reason that “code smell” is not simply called “code problem.”  The trouble with lazy loading comes in a more insidious form, however.  The trouble is that in many codebases it winds up being overused and broadly shared.  The data access object manager’s lazy loading needs to log, so it triggers the logging lazy loader, which, naturally, triggers the configuration management lazy loader, which itself triggers… well, you get the idea.

Lazy loading is powerful and, when used tactically, it retains that power while helping you.  But the trouble is that lazy-loading constructs have two distinct operating modes, and those modes are opaque to their consumers.  In this fashion, a powerful construct can wind up turning your code into a confusing quagmire that’s nearly impossible to reason about at compile time.

Ask and Tell

The last lesser known smell that I’ll mention I call “Ask and Tell.”  With this, I’m referring to methods that return values and have side effects at the same time.  For instance, consider a method that takes a string, writes that string to a file, and then returns a Boolean indicating whether or not the operation succeeded.  This is “ask and tell,” because invoking the method involves telling the API to do something (write to the file) and asking it a question (“what happened?”).

There is a less well-known principle in programming known as “command-query separation,” which advises that any method should either be a side-effect free query about the state of something or a void operation that changes the state of something.  It should not be both.

The attraction of this principle comes in how elegantly it allows you to reason about APIs, provided they comply.  If anything returns a value you to you, you can count on it not having any side effects in computing that value.  Thus if you have an Invoice object and you ask it what the total is, you can count on it not doing something weird like charging a credit card or dumping information to a database.  (As an aside, the aforementioned lazy loading is an ipso facto violation of this principle).  If you find yourself looking at a void method, on the other hand, you know it will have an effect.

Like lazy loading, ask and tell methods are not something to crusade to eradicate from your codebase.  But if you find yourself using them frequently, you should probably rethink your APIs and class designs.

Code Smells: Not Hard to Detect

All four of these code smells are relatively straightforward to detect.  If you’re an NDepend user, the first two would be simple tweaks to out of the box metrics/stats.  The latter two would be fairly straightforward to take a crack at with CQLinq.

But whether you use an automated tool to detect them or simply through code review/inspection, they’re worth looking out for.  You could easily tie yourself in knots trying to conform to all principles and avoid all code smells, but being aware of them and keeping your eyes open can only make you a better developer and your code less likely to draw accusations of odor.

Published by

Erik Dietrich

I'm a passionate software developer and active blogger. Read about me at my site.

Comments:

  1. I have used ‘code and tell’ for the last 40+ years. Originally, it was the standard way of most languages (e.g. C). It is still needed today. If you have a query e.g. CanISaveThisValue and it returns True, you would call SaveThisValue and would be oblivious to the fact that the disc had crashed between the calls and the value was lost. The old SaveValue would return True if successful or False if not possible or if the save failed. To break Code And Tell, you need three calls: CanISaveThisValue, SaveThisValue, and DidTheSaveActuallyWork.

    I suppose that Exceptions are the modern way of dealing with the situation, but (IMHO, Exceptions are major code smells – they break flow (possibly though multiple procedure calls) and are a nightmare to build into languages). Exceptions are little more than ON ERROR GOTOs rebadged.

    ‘Code and tell’ is coming back into vogue with the ? (is not null) operator in C#, Java, and even VB meaning ‘if not null’ e.g. as myobj?.methodToRun().

    Maybe ‘code and tell’ is a smell, but it is safer than two steps to do one thing, exceptions, and ‘is not null’ operator.

  2. @jsc42
    Exceptions, in a language with decent support, are entirely different to On Error GoTo. In C++, for example, they ensure that resources allocated within the procedures between the exception and the handler and freed correctly, they help improve the readability of code (if used correctly) and offer near-zero overhead when no errors occur.

    They even fit well with the functional model, as they can be effectively modelled using monadic theory.
    On Error GoTo has none of those desirable properties.

    With your approach, every call that may possible fail gets swamped in the noise of 3 lines of code to handle errors. In what possible way is this sane?

    1. I don’t jsc42 meant to compare ON ERROR GOTO’s and exceptions from a technical point of view. Essentially they deal with the same problem using the same approach. Exceptions might do more and their syntax might read more naturally, but the approach is still “Here’s some code for when something goes wrong”. I am not a big fan of exceptions but I know they can be a necessity. So when I don’t need them, I don’t use them.

      One way to work around this “code smell” without using exceptions would be to have callbacks. e.g. OnLoadSuccess, OnLoadFailure, etc. Now if you have a synchronous void function whose states can only be success or failure, you should have stuck to your if-else and made the function return a boolean.

  3. In your ‘Ask and Tell’ example, I would say that returning a Boolean indicating success/failure is really just an ack/nack of whether the message was handled. In a synchronous situation, the Boolean indicates whether the message was properly processed (to the point that the method knows); in an asynchronous situation, the Boolean indicates if the message was received or not (not processed).

    I think a better example would be the canonical banking example where you deposit an amount via the method and the method also returns your balance. That method encapsulates both a command (depositing the money) and a query (what’s my new balance). To me, that’s a smell and indicates that the method should be broken into two.

    Great article — thanks!!

  4. My #1 first code smell is class size. If you break the 10/100 rule, it is a smell (10 lines per method, 100 lines per class). http://www.rhyous.com/2014/02/10/the-10100-rule-following-this-one-development-rule-will-improve-your-code

    I like the bloated Constructor one a lot. I especially see this with the over use of Dependency Injection (DI) and Inversion of Control (IoC) containers. A constructor ends up with about ten required parameters in order to even create it. Ugh!

    The Excessive Usings is a good sign of a class that is breaking the Single Responsibility Principle.

    I am not sure about the Lazy Loading one. I would argue that the lack of lazy loading is a code smell. No one should ever create an object with a property type of List and leave that object uninitialized. Either it is going to crash later when the next developer uses it or every use of the object will require a second line of code to instantiate the property. Both are poor designs. Similarly in a C# property, a List property should be initialized by the getter. In Java, Lists usually have getters not setters and the getter initializes the object if it is not initialized.

    [csharp]
    public List MyList
    {
    get { return _MyList ?? (_MyList = new List(); )}
    } private List _MyList;
    [/csharp]

    But perhaps you are talking higher level. Perhaps you are not talking about simple property initialization, but you are talking about larger lazy loading. Because one would never “log” when lazy loading a List object.

    If you are talking about basic property lazy loading, you have it backwards. That lack of it is a code smell. If you mean that someone is already breaking SOLID all over the place and have massive objects that are lazy loading and creating massive chains of other loaded objects, then “chained lazy loading” is a smell, but basic lazy loading is not.

  5. Really dislike the “Ask and Tell” example because it assumes a kind of single threaded universe. In fact, the entire Go language leverages the “Ask and Tell” pattern with its multiple return values. If you’re a C# developer (because this is ndepend), it also exists in the Framework itself. For example, Stream.Read() and other stream methods exercise exactly this pattern.

    Most REST endpoints are configured as an “Ask and Tell”. If I POST a Post to Facebook or LinkedIn, they return as ID or some other data about the Post they created. In your vision of “Ask and Tell”, I never get that data back or I have to hack some state on the object making the call, which is equally “code smelly”.

    I earnestly think that most “write” operations should return key information about the thing they just did.
    – If I send a write to the DB, I want the object, as it existed in the DB, as a return value.
    – If I send a POST to a REST endpoint, I want some detail about the action to be returned.
    – If I say “create me a file”, I want a handle to that file to be returned.

  6. Thanks for the good example – your banking account example is good because just getting the value afterwards does not indicate whether the amount was changed by the current transaction having succeeded or some other transaction running in parallel while the current one has failed.

    The original example is bad as it does not provide a sane and simple way of checking the result:

    exceptions? might be useful if it rarely fails, but in some cases failure can happen under regular conditions as well, thus an exception would clobber up the program flow (they are called “exceptions” rather than “expecteds” for a reason 😉

    callbacks? usually result in a more complicated program flow, especially when nested
    saving result and querying it in a separate call? results in a more complicated code and gets really awkward in case the corresponding object is modified by several threads or processes in parallel

    result parameters? can´t see any benefit over return values. depends on api style, though

    Apart from that I really miss the “const” type modifier in most languages – it helps a lot in self-documenting and type checking which functions and objects are immutable within a certain area or api part and which are not.

Leave a Reply

Your email address will not be published. Required fields are marked *