NDepend

Improve your .NET code quality with NDepend

Our experience with using third-party libraries

NDepend is a tool that helps .NET developers write beautiful code. The project was started in April 2004. It is now used by more than 6 000 companies worldwide.

In more than a decade, many decisions were made, each with important consequences on the code base evolution process, sometime good consequences, sometime less good. Relentlessly dog-fooding (i.e using NDepend to analyze and improve the NDepend code base) helped us a lot to obtain more maintainable code, less bugs, and to improve the tool usability and features.

When it comes to working on and maintaining a large code base for several years, some of the most important decisions made are related to relying (or not) on some third-party libraries. Choosing whether or not to rely on a library is a double-edged sword decision that can, over time, bring a lot of value or cause a lot of friction. Ultimately users won’t make a difference between your bugs and third-party libraries bugs, their problems become your problems. Consider this:

  • Sometimes the team just needs a fraction of a library and it may be more cost-effective, and more maintainable with time, to develop your own.
  • Sometimes the licensing of a free or commercial library will prevent you from achieving your goals.
  • Sometimes the library looks bright and shiny but becomes an obsolete project a few months later and precious resources will have to be spent maintaining others code, or migrating toward a trendy new library.
  • Sometimes the library code and/or authors are so fascinating that you’ll be proud to contribute and be part of it.

Of course we all hope for the last case, and we had the chance to experience this a few times for real. Here are some libraries we’ve had great success with:

Mono.Cecil (OSS)

Mono.Cecil is an open source project developed by Jean-Baptiste Evain that can read and write data nested in .NET assemblies, including compiled IL code. NDepend has relied on Cecil for almost a decade to read IL code and other data. Both the library and the support provided are outstanding. The performance of the library is near optimal and all bugs reported were fixed in a matter of days or even hours. For our usage, the library is actually close to bug free. If only all libraries had this level of excellence…

DevExpress WinForm (Commercial)

NDepend has also relied on DevExpress WinForm for almost a decade to improve the UI look and feel. NDepend is a Visual Studio extension and DevExpress WinForm makes smooth visual integration with Visual Studio. Concretely, thanks to this library we achieved the exact same Visual Studio theming and skinning, docking controls a la Visual Studio, menus, bars and special controls like BreadCrumb to explore directories. We have never been disappointed with DevExpress WinForm. The bugs we reported were fixed quickly, it supports high DPI ratio and it is rock solid in production.

Microsoft Automatic Graph Layout MSAGL (OSS)

NDepend has relied on MSAGL for several years to draw all sorts of graphs of dependencies between code elements including Call Graphs, Class Inheritance Graphs, Coupling Graphs, Path and Cycle Graphs…  This library used to be commercial but nowadays OSS.  Here also the bugs we reported were fixed quickly, it supports high DPI ratio and it is perfectly stable in production.

NRefactory (OSS)

NDepend used to have a C# Code Query LINQ editor in 2012, a few years before Roslyn became RTM. We wanted to offer users a great editing experience with code completion and documentation tooltips on mouse-hovering. At that time NRefactory was the best choice and it proved with the years to be stable in production. Nowadays Roslyn would certainly be a better choice, but since our initial investment NRefactory still does the job well, we didn’t feel the need (yet) to move to Roslyn.

 

Here are a few things we prefer to keep in-house:

Licensing

While there are libraries for licensing, these are vital, sensitive topics that require a lot of flexibility with time, and we preferred to avoid delegating it. This came at the cost of plenty of development/support and significant levels of acquired expertise. Even taking into account that these efforts could have been spent on product features, we still don’t regret our choice.

The licensing layer is a cornerstone in our relation with our users community and it cannot suffer from any compromise. As a side remark, several times I observed that the cost of developing a solid licensing layer postponed promising projects to become commercial for a while.

Persistence

As most of application, NDepend persists and restores a lot of data, especially the large amount of data in analysis results. We could have used relational or object databases, but for a UI tool embedded in VS, the worst thing would be to slow down the UI and make the user wait. We decided only optimal performance is acceptable for our users, and optimal performance in persistence means using custom streams of bytes. The consequence of this decision is less flexibility: each time our data schema evolves, we need to put in extra effort to keep ascendant compatibility.

I underline that most of the time it is not a good idea to develop a custom persistence layer because of the amount of work and expertise required. But taken account our particular needs and goals, I think we took the right decision.

Production Logs

I explained here about our production logs system. We consider it an essential component to making NDepend a super-stable product. Here also, we could have used a third-party library. We capitalize on our own logging system because, year after year, we customized it with a plethora of production information, which was required to help fix our very own problems. We kept the system lightweight and flexible, and it still helps us improve the overall stability and correctness of our products.

Dependency Matrix and Treemap UI Components

These UI components were developed years ago and are still up to date. Both then and now, I believe there is no good third-party alternative that meets all our requirements in terms of layout and performance. A few times, we received propositions to buy those components, but we are not a component provider and don’t have plans for that.

 

In this post I unveiled a few core choices we made over the years. We hope this information will be useful for other teams.

exploring technical debt codebase

Exploring the Technical Debt In Your Codebase

Recently, I posted about how the new version of NDepend lets you compute tech debt.  In that post, I learned that I had earned a “B” out of the box.  With 40 minutes of time investment, I could make that an “A.”  Not too shabby!

In that same post, I also talked about the various settings in and around “debt settings.”  With debt settings, you can change units of debt (time, money), thresholds, and assumptions of working capacity.  For folks at the intersection of tech and business, this provides an invaluable way to communicate with the business.

But I really just scratched the surface with that mention.  You’re probably wondering what this looks like in more detail.  How does this interact with the NDepend features you already know and love?  

Well, today, I’d like to take a look at just that.

To start, let’s look at the queries and rules explorer in some detail.

Introducing Quality Gates

Take a look at this screenshot, and you’ll notice some renamed entries, some new entries, and some familiar ones.

In the past, “Code Smells” and “Code Regressions” had the names “Code Quality” and “Code Quality Regression,” respectively.  With that resolved, the true newcomers sit on top: Quality Gates and Hot Spots.  Let’s talk about quality gates.

Continue reading Exploring the Technical Debt In Your Codebase

Managing Code Analysis Statistics with the NDepend API

If you’re familiar with NDepend, you’re probably familiar with the Visual Studio plugin, the out of the box metrics, the excellent visualization tools, and the iconic Zone of Uselessness/Zone of Pain chart.  These feel familiar to NDepend users and have likely found their way into the normal application development process. NDepend has other features as well, however, some of which I do not necessarily hear discussed as frequently.  The NDepend API has membership in that “lesser known NDepend features club.”  Yes, that’s right — if you didn’t know this, NDepend has an API that you can use.

You may be familiar, as a user, with the NDepend power tools.  These include some pretty powerful capabilities, such as duplicate code detection, so it stands to reason that you may have played with them or even that you might routinely use them.  But what you may not realize is the power tools’ source code accompanies the installation of NDepend, and it furnishes a great series of examples on how to use the NDepend API.

NDepend’s API is contained in the DLLs that support the executable and plugin, so you needn’t do anything special to obtain it.  The NDepend website also treats the API as a first class citizen, providing detailed, excellent documentation.   With your NDepend installation, you can get up and running quickly with the API.

Probably the easiest way to introduce yourself is to open the source code for the power tools project and to add a power tool, or generally to modify that assembly.  If you want to create your own assembly to use the power tools, you can do that as well, though it is a bit more involved.  The purpose of this post is not to do a walk-through of setting up with the power tools, since that can be found here.  I will mention two things, however, that are worth bearing in mind as you get started.

  1. If you want to use the API outside of the installed project directory, there is additional setup overhead.  Because it leverages proprietary parts of NDepend under the covers, setup is more involved than just adding a DLL by reference.
  2. Because of point (1), if you want to create your own assembly outside of the NDepend project structure, be sure to follow the setup instructions exactly.

A Use Case

I’ve spoken so far in generalities about the API.  If you haven’t already used it, you might be wondering what kinds of applications it has, besides simply being interesting to play with.  Fair enough.

One interesting use case that I’ve experienced personally is getting information out of NDepend in a customized format.  For example, let’s say I’m analyzing a client’s codebase and want to cite statistical information about types and methods in the code.  Out of the box, what I do is open Visual Studio and then open NDepend’s query/rules editor.  This gives me the ability to create ad-hoc CQLinq queries that will have the information I need.

But from there, I have to transcribe the results into a format that I want, such as a spreadsheet.  That’s fine for small projects or sample sizes, but it becomes unwieldy if I want to plot statistics in large codebases.  To address this, I have enlisted the NDepend API.

Continue reading Managing Code Analysis Statistics with the NDepend API

4 Ways Custom Code Metrics Improve A Development Team

One of the things that has surprised me over the years is how infrequently people take advantage of custom code metrics.  I say this not from the perspective of a geek with esoteric interest in a subject, wishing other people would share my interest.  Rather, I say this from the perspective of a business man, making money, and wondering why I seem to have little competition.

As I’ve mentioned before, a segment of my consulting practice involves strategic code assessments that serve organizations in a number of ways.  When I do this, the absolute most important differentiator is my ability to tailor metrics to the client and specific codebases on the fly.  Anyone can walk in, install a tool, and say, “yep, your cyclomatic complexity in this class is too high, as evidenced by this tool I installed saying ‘your cyclomatic complexity in this class is too high.'”  Not just anyone can come in and identify client-specific idiosyncrasies and back those findings with tangible data.

But, if they would invest some up-front learning time in how to create custom code metrics, they’d be a lot closer.

Being able to customize code metrics allows you to reason about code quality in very dynamic and targeted terms, and that is valuable.  But you might think that, unless you want a career in code base assessment, value doesn’t apply to you.  Let me assure you that it does, albeit not in a quite as direct way as it applies to me.

Custom code metrics can help make your team better and they can do so in a variety of ways.  Let’s take a look at a few.

Continue reading 4 Ways Custom Code Metrics Improve A Development Team

The Power of CQLinq for Developers

I can still remember my reaction to Linq when I was first exposed to it.  And I mean my very first reaction.  You’d think, as a connoisseur of the programming profession, it would have been, “wow, groundbreaking!”  But, really, it was, “wait, what?  Why?!”  I couldn’t fathom why we’d want to merge SQL queries with application languages.

Up until that point, a little after .NET 3.5 shipped, I’d done most of my programming in PHP, C++ and Java (and, if I’m being totally honest, a good bit of VB6 and VBA that I could never seem to escape).  I was new to C#, and, at that time, it didn’t seem much different than Java.  And, in all of these languages, there was a nice, established pattern.  Application languages were where you wrote loops and business logic and such, and parameterized SQL strings were where you defined how you’d query the database.  I’d just gotten to the point where ORMs were second nature.  And now, here was something weird.

But, I would quickly realize, here was something powerful.

The object oriented languages that I mentioned (and whatever PHP is) are imperative languages.  This means that you’re giving the compiler/interpreter a step by step series of instructions on how to do something.  “For an integer i, start at zero, increment by one, continue if less than 10, and for each integer…”   SQL, on the other hand, is a declarative language.  You describe what you want, and let something else (e.g. the RDBMS server) sort out the details.  “I want all of the customer records where the customer’s city is ‘Chicago’ and the customer is less than 40 years old — you figure out how to do that and just give me the results.”

And now, all of a sudden, an object oriented language could be declarative.  I didn’t have to write loop boilerplate anymore!

Continue reading The Power of CQLinq for Developers

The Better Code Book – Our MVPs of 2015

We firmly believe spaghetti belongs on the dinner table and not in code. Our mission when starting NDepend was to create a tool to make best coding practices easier to maintain and improve. Writing has always been part of our message (see Patrick Smacchia’s work on CodeBetter.com) and we are proud to present our favorite pieces of writing from around the web in the last year, collected in what we are calling the Better Code Book.

We wanted to focus not only on how people use NDepend to improve their code for developers and architects, but also how to use static analysis in a broader, management sense. We are extremely grateful for our contributors in this project. Let us introduce them:

Bjørn Einar Bjartnes is a developer at the Norwegian Broadcasting Corporation. His current role is a backend developer at the API team, serving web, mobile, TV clients and more metadata about programs- and video-streams. He holds a MSc in Engineering Cybernetics and has a background from the petroleum industry, which has probably shaped his view on systems design. Also, Bjørn is active in the local F# Meetup and a proud member of the lambda club, playing with all things useless related to computers. You can also follow him on Twitter: @bjartnes

Jack Robinson is a twenty-something student in his final year of a degree in Software Engineering at Victoria University of Wellington. Currently an Intern Developer at Xero, he enjoys writing clean code, playing a board game or two with his friends, or just sitting down and watching a good film. You can read not just his musings on computer science, but also reviews on films and more at his website jackrobinson.co.nz

Prasad Narravula is a programmer, architect, consultant, and problem-solving leader.  He helps teams in agile development essentials- feedback loops to fail fast, enabling (engineering) practices, iterative and incremental design, starting at the right place, discovery, and learning. When time permits, he writes at ObjectCraftworks.com.

Erik Dietrich, founder of DaedTech LLC, is a programmer, architect, development coach, writer, Pluralsight author, and technologist. You can read his writing and find out more about him at http://www.daedtech.com/ and you can follow him on Twitter @daedtech.

Anthony Sciamanna is a software developer from Philadelphia, PA who has worked in the industry for nearly 20 years. He specializes in leading and coaching development teams, improving development practices for cross-functional teams, Test-Driven Development (TDD), unit testing, pair programming, and other Agile / eXtreme Programming (XP) practices. He can be contacted via his website: anthonysciamanna.com

Tomasz Jaskula is a software craftsman, founder and organizer of Paris user groups for F# and Domain Driven Design. He focuses on creating software delivering true business value which aligns with the business’s strategic initiatives and bears solutions with clearly identifiable competitive advantage. He is currently working for a big French bank building reactive applications in F# and C#. In his free time, he runs a startup project on applying machine learning with F# to the recruitment field, speaks at conferences and user groups, and writes blogs and articles for a French magazine for coders called “Programmez !” You can visit his site jaskula.fr

Continue reading The Better Code Book – Our MVPs of 2015

With Code Metrics, Trends are King

Here’s a scene that’s familiar to any software developer.  You sit down to work with the source code of a new team or project for the first time, pull the code from source control, build it, and then notice that there are literally thousands of compiler warnings.  You shudder a little and ask someone on the team about it, and he gives a shrug that is equal parts guilty and “whatcha gonna do?”  You shake your head and vow to get the warning situation under control.

If you’re not a software developer, what’s going on here isn’t terribly hard to understand.  The compiler is the thing that turns source code into a program, and the compiler warning is the compiler’s way of saying, “you’ve done something icky here, but not icky enough to be a show-stopping error.”  If the team’s code has thousands of compiler warnings, there’s a strong likelihood that all is not well with the code base.  But getting that figure down to zero warnings is going to be a serious effort.

As I’ve mentioned before on this blog, I consult on different kinds of software projects, many of which are legacy rescue efforts.  So sitting down to a new (to me) code base and seeing thousands of warnings is commonplace for me.  When I point the runaway warnings out to the team, the observation is generally met with apathetic resignation, and when I point it out to management, the observation is generally met with some degree of shock.  “Well, let’s get it fixed, and why is it like this?!”  (Usually, they’re not shocked by the idea that there are warts — they know that based on the software’s performance and defect counts — but by the idea that such a concrete, easily metric exists and is being ignored.)

Continue reading With Code Metrics, Trends are King

The Most Important Code Metrics You’ve Never Heard Of

Oh, how I hope you don’t measure developer productivity by lines of code. As Bill Gates once ably put it, “measuring software productivity by lines of code is like measuring progress on an airplane by how much it weighs.”  No doubt, you have other, better reasoned code metrics that you capture for visible progress and quality barometers.  Automated test coverage is popular (though be careful with that one).  Counts of or trends in defect reduction are another one.  And of course, in our modern, agile world, spring velocity is ubiquitous.

But today, I’d like to venture off the beaten path a bit and take you through some metrics that might be unfamiliar to you, particularly if you’re no longer technical (or weren’t ever).  But don’t leave if that describes you — I’ll help you understand the significance of these metrics, even if you won’t necessarily understand all of the nitty-gritty details.

Perhaps the most significant factor here is that the code metrics I’ll go through can be tied, relatively easily, to stakeholder value in projects.  In other words, I won’t just tell you the significance of the metrics in terms of what they say about the code.  I’ll also describe what they mean for people invested in the project’s outcome.

Type Rank

It’s possible that you’ve heard of the concept of Page Rank.  If you haven’t, page rank was, for a long time, the method by which Google determined which sites on the internet were most important.  This should make intuitive sense on some level.  Amazon has a high page rank — if it went down, millions of lives would be disrupted, stocks would plummet, and all sorts of chaos would ensure.  The blog you created that one time and totally meant to add to over the years has a low page rank — no one, yourself included, would notice if it stopped working.

It turns out that you can actually reason about pieces of code in a very similar way.  Some bits of code in the code base are extremely important to the system, with inbound and outbound dependencies.  Others exist at the very periphery or are even completely useless (see the section below on dead code).  Not all code is created equally.  This scheme for ranking code by importance is called “Type Rank” (at least at the level of type granularity — methods can also be ranked).

You can use Type Rank to create a release riskiness score.  All you’d really need to do is have a build that tabulated which types had been modified and what their type rank was, and this would create a composite index of release riskiness.  Each time you were gearing up for deployment, you could look at the score.  If it were higher than normal, you’d want to budget extra time and money for additional testing efforts and issue remediation strategies.

Cohesion

Cohesion of modules in a code base can loosely be described as “how well is the code base organized?”  To put it a bit more concretely, cohesion is the idea that things with common interest are grouped together while unrelated things are not.  A cohesive house would have specialized rooms for certain purposes: food preparation, food consumption, family time, sleeping, etc.  A non-cohesive house would have elements of all of those things strewn about all over the house, resulting in a scenario where a broken refrigerator fan might mean you couldn’t sleep or work at your desk due to noise.

Keeping track of the aggregate cohesiveness score of a codebase will give you insight into how likely your team is to look ridiculous in the face of an issue.  Code bases with low cohesion are ones in which unrelated functionality is bolted together inappropriately, and this sort of thing results in really, really odd looking bugs that can erode your credibility.

Imagine speaking on your team’s behalf and explaining a bug that resulted in a significant amount of client data being clobbered.  When pressed for the root cause, you had to look the person asking directly in the eye and say, “well, that happened because we changed the font of the labels on the login page.”

You would sound ridiculous.  You’d know it.  The person you were talking to would know it.  And you’d find your credibility quickly evaporating.  Keeping track of cohesion lets you keep track of the likelihood of something like that.

Dependency Cycles

So far, I’ve talked about managing risk as it pertains to defects: the risk of encountering them on release, and the risk of encountering weird or embarrassing ones.  I’m going to switch gears, now, and talk about the risk of being caught flat-footed, unable to respond to a changing environment or a critical business need.

Dependency cycles in your code base represent a form of inappropriate coupling.  These are situations where two or more things are mutually dependent in an architectural world where it is far better for dependencies to flow one way.  As a silly but memorable example, consider the situation of charging your phone, where your phone depends on your house’s electrical system to be charged.  Would you hire an electrician to come in and create a situation where your house’s electricity depended on the presence of your charging phone?

All too often, we do this in code, and it creates situations as ludicrous as the phone-electrical example would.  When the business asks, “how hard would it be to use a different logging framework,” you don’t want the answer to be, “we’d basically have to rewrite everything from scratch.”  That makes as much sense as not being able to take your phone with you anywhere because your appliances would stop working.

So, keep an eye out for dependency cycles.  These are the early warning light indicators that you’re heading for something like this.

Dead Code

One last thing to keep an eye out for is dead code.  Dead code is code that can never possibly be called during the running application’s lifecycle.  It just sits in your codebase taking up space to no good end.

That may sound benign, but every line of code in your code base carries a small, cognitive maintenance weight.  The more code there is, the more results come back in text searches of the code base, the more files there are to lose and confuse developers, and the more general friction is encountered when working with the system.  This has a very real cost in the labor required to maintain the code.

Use Code Metrics Wisely

These are metrics about which fewer people know, so the industry isn’t rife with stories about people gaming them, the way it is with something like unit test coverage.  But that doesn’t mean they can’t be gamed.  For instance, it’s possible to have a nightmarish code base without any actual dead code — perversely, dead code could be eliminated by finding everything useless in the code base and implementing calls to it.

The code metrics I’ve outlined today, if you make them big and visible to all, should serve as a conversation starter.  Why did we introduce a dependency cycle?  Should we be concerned about the lack of cohesion in modules?  Use them in this fashion, and your group can save real money and produce better output.  Use them in the wrong fashion, and they’ll be just another ineffective management bludgeon straight out of a Dilbert comic.

Let’s Build a Metric: Using CQLinq to Reason about Application State

I’ve been letting the experiments run for a bit before posting results so as to give all participants enough time to submit, if they so choose.  So, I’ll refresh everyone’s memory a bit here.  Last time, I published a study of how long it took, in seconds (self reported) for readers to comprehend a series of methods that varied by lines of code.  (Gist here).  The result was that comprehension appears to vary roughly quadratically with the number of logical lines of code.  The results of the next study are now ready, and they’re interesting!

Off the cuff, I fully expected cyclomatic complexity to drive up comprehension time faster than the number of lines of code.  It turns out, however, that this isn’t the case.  Here is a graph of the results of people’s time to comprehend code that varied only by cyclomatic complexity.  (Gist here).

SecondsVsCyclomaticComplexity

If you look at the shape of this graph, the increase is slightly more aggressive than linear, but not nearly as aggressive as the increase that comes with an increase in lines of code.  When you account for the fact that a control flow statement is also a line of code, it actually appears that conditionals are easier to comprehend than the mathematical statements from the first experiment.

Because of this finding, I’m going to ignore cyclomatic complexity for the time being in our rough cut time to comprehend metrics.  I’ll assume that control flow statements impact time to comprehend as lines of code more than as conditional branching scenarios.  Perhaps this makes sense, too, since understanding all of the branching of a method is probably an easier task than testing all paths through it.

As an aside, one of the things I love about NDepend is that it lets me be relatively scientific about the approach to code.  I constantly have questions about the character and makeup of code, and NDepend provides a great framework for getting answers quickly.  I’ve actually parlayed this into a nice component of my consulting work — doing professional assessments of code bases and looking for gaps that can be assessed.

Going back to our in-progress metric, it’s going to be important to start reasoning about other factors that pertain to methods.  Here are a couple of the original hypotheses from earlier in the series that we could explore next.

  • Understanding methods that refer to class fields take longer than purely functional methods.
  • Time to comprehend is dramatically increased by reference to global variables/state.

If I turn a critical eye to these predictions, there are two key components: scope and popularity.  By scope, I mean, “how closely to the method is this thing defined?”  Is it a local variable, defined right there in the method?  Is it a class field that I have to scroll up to find a definition of?  Is it defined in some other file somewhere (or even some other assembly)?  One would assume that having to pause reading the method, navigate to some other file, open it, and read to find the definition of a variable would mean a sharp spike in time to comprehend versus an integer declared on the first line of the method.

And, by popularity, I mean, how hard is it to reason about the state of the member in question?  If you have a class with a field and two methods that use it, it’s pretty easy to understand the relationship and what the field’s value is likely to be.  If we’re talking about a global variable, then it quickly becomes almost unknowable what the thing might be and when.  You have to suck the entirety of the application’s behavior into your head to understand all the things that might happen in your method.

I’m not going to boil that ocean here, but I am going to introduce a few lesser known bits of awesomeness that come along for the ride in CQLinq.  Take a look at the following CQLinq.

If your reaction is anything like mine the first time I encountered this, you’re probably thinking, “you can do THAT?!” Yep, you sure can. Here’s what it looks like against a specific method in my Chess TDD code base.

MethodFieldsAndParametersResults

The constructor highlighted above is shown here:

BoardConstructor

As you can see, it has one parameter, uses two fields, and assigns both of those fields.

When you simply browse through the out of the box metrics that come with NDepend, these are not the kind of things you notice immediately.  The things toward which most people gravitate are obvious metrics, like method size, cyclomatic complexity, and test coverage.  But, under the hood, in the world of CQLinq, there are so many more questions that you can answer about a code base.

Stay tuned for next time, as we start exploring them in more detail and looking at how we can validate potential hypotheses about impact on time to comprehend.

And if you want to take part in this on going experiment, click below to sign up.




Join the Experiment



Let’s Build a Metric: Incorporating Results and Exploring CQLinq

It turns out I was wrong in the last post, at least if the early returns from the second experiment are to be believed.  Luckily, the scientific method allows for wrongness and is even so kind as to provide a means for correcting it.  I hypothesized that time to comprehend would vary at a higher order with cyclomatic complexity than with lines of code.  This appears not to be the case.  Hey, that’s why we are running the experiments, right?

By the way, as always, you can join the experiment if you want.
You don’t need to have participated from the beginning by any stretch, and you can opt in or out for any given experiment as suits your schedule.

Join the Experiment

 

Results of the First Experiment

Recall that the first experiment asked people to record time to comprehend for a series of methods that varied by number of lines of code.  To keep the signal to noise ratio as high as possible, the methods were simply sequential arithmetic operations, operating on an input and eventually returning a transformed output.  There were no local variables or class level fields, no control flow statements, no method invocations, and no reaching into global state.  Here is a graph of the results from doing this on 3 methods, with 1, 5, and 10 logical lines of code.

LogicalLinesOfCodeComprehensionTime

So as not to overburden anyone with work, and because it’s still early, the experiment contained three methods, yielding three points.  Because this looked loosely quadratic, I used the three points to generate a quadratic formula, which turned out to be this.

GraphOfComprehensionVsLLOC

It’s far from perfect, but this gives us our first crack at shaping time to comprehend as something experimental, rather than purely hypothetical.  Let’s take a look at how to do this using NDepend in Visual Studio.  Recall all the way back in the second post in this series that I defined a metric for time to comprehend.  It was essentially a placeholder for the concept, pending experimental results.

All we’re doing is setting the unit we’ve defined, “Seconds,” equal to the number of lines of code in a method.  But hey, now that we’ve got some actual data, let’s go with it!  The code for this metric now looks like this.

I’ve spread on multiple lines for the sake of readability and with a nod to the notion that this will grow as time goes by. Also to note is that I’ve included, for now, the number of logical lines of code as a handy reference point.

Exploring CQLinq Functionality

This is all fine, but it’s a little hard to read.  As long as we’re here, let’s do a brief foray into NDepend’s functionality.  I’m talking specifically about CQLinq syntax.  If you’re going to get as much mileage as humanly possible out of this tool, you need to become familiar with CQLinq.  It’s what will let you define your own custom ways of looking at and reasoning about your code.

I’ve  made no secret that I prefer fluent/expression Linq syntax over the operator syntax, but there are times when the former isn’t your best bet.  This is one of those times, because I want to take advantage of the “let” keyword to define some things up front for readability.  Here’s the metric converted to the operator syntax.

With that in place, let’s get rid of the cumbersome repetition of “m.NbLinesOfCode” by using the let keyword. And, while we’re at it, let’s give NbLinesOfCode a different name. Here’s what that looks like in CQLinq.

That looks a lot more readable, huh? It’s now something at least resembling the equation pictured above. But there are a few more tweaks we can make here to really clean this thing up, and they just so happen to demonstrate slightly more advanced CQLinq functionality. We’ll use the let keyword to define a function instead of a simple assignment, and then we’ll expand the names out a bit to boot. Here’s the result.

Pretty darned readable, if I do say so myself! It’s particularly nice the way seconds is now expressed — as a function of our LengthFactor equation. As we incorporate more results, this approach will allow this thing to scale better with readability, as you’ll be able to see how each consideration contributes to the seconds.

So, what does it look like? Check it out.

Updated Seconds Metric with CQLinq

Now we can examine the code base and get a nice readout of our (extremely rudimentary) calculation of how long a given method will take to understand.  And you know what else is cool?  The data points of 8.6 seconds for the 1 LLOC method and 51 for the 5 LLOC method.  Those are cool because those were the experimental averages, and seeing them in the IDE means that I did the math right. 🙂

So we finally have some experimental progress and there’s some good learning about CQLinq here.  Stay tuned for next time!