The legendary book The Mythical Man-Month states that, no matter the programming language chosen, a professional developer will write on average 10 lines of code (LoC) per day.
After 14 years of full-time development on the tool NDepend, I’d like to elaborate a bit here.
Let’s start with the definition of a logical line of code. Basically, a logical LoC is a PDB sequence point, except for the sequence points that correspond to a method’s opening and closing braces. So here, for example, we have a method with 5 logical lines of code:
I already hear readers complaining that LoC has nothing to do with productivity. Bill Gates once said: “Measuring software productivity by lines of code is like measuring progress on an airplane by how much it weighs.”
And indeed, measured over a range of a few days or a few weeks, LoC has nothing to do with productivity. As a full-time developer, some days I write 200 LoC in a row, and some days I spend 8 hours fixing a pesky bug without adding a single LoC. Some days I clean up dead code and remove some LoC. Some other days I refactor existing code without, all in all, adding a single LoC. Some days I create a large and complex UI control and the editor automatically generates 300 additional LoC. Some days are dedicated solely to performance enhancements or writing tests…
What is interesting is the average number of LoC obtained over the long term. And if I do the simple math, our average is around 80 LoC per day. Let me point out that we hold ourselves to strict code quality standards, both in terms of code structure and formatting, and in terms of testing and code coverage ratio (see the last picture of this post, which shows the NDepend code coverage map). For a code quality tool for developers, being strict on code quality means dogfooding☺.
So this average of 80 LoC produced per day doesn’t come at the expense of code quality, and it is a sustainable rhythm. Things get interesting with LoC after calibration: counting LoC becomes an accurate estimation tool. After coding and measuring dozens of features achieved in this particular development context, the size of any feature can be estimated accurately in terms of LoC. Hence, with simple math, the time it’ll take to deliver a feature to production can be accurately estimated. To illustrate this fact, here is a decorated treemap view of the NDepend code base, where K means 1,000 LoC. This view is obtained from the NDepend metric view panel, with handmade coloring to illustrate my point. The small rectangles are methods, grouped by parent classes, parent namespaces and parent assemblies. The area of a rectangle is proportional to the #LoC of the corresponding method.
Thanks to this map, I can compare the size in terms of LoC of most components. Coupling this information with the fact that the average coding score is 80 LoC per day, and looking back on the cost in time of each component, we have an accurate method to tune our way of coding and estimate future schedules.
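To make the "simple math" concrete, here is a minimal Python sketch of the calibration and estimation steps. The feature names, sizes and durations in the history are hypothetical numbers picked for illustration (they are not NDepend measurements), and the helpers `calibrate` and `estimate_days` are names I made up for this sketch.

```python
from dataclasses import dataclass


@dataclass
class DeliveredFeature:
    name: str
    loc: int      # logical LoC the feature added to the code base
    days: float   # developer-days actually spent on it


def calibrate(history):
    """Return the long-run average of LoC produced per developer-day."""
    total_loc = sum(f.loc for f in history)
    total_days = sum(f.days for f in history)
    if total_days <= 0:
        raise ValueError("history must cover a positive number of days")
    return total_loc / total_days


def estimate_days(estimated_loc, loc_per_day):
    """Convert an estimated feature size in LoC into developer-days."""
    if loc_per_day <= 0:
        raise ValueError("loc_per_day must be positive")
    return estimated_loc / loc_per_day


# Hypothetical history of delivered features (illustrative values only).
history = [
    DeliveredFeature("feature A", loc=4_000, days=50),
    DeliveredFeature("feature B", loc=8_000, days=100),
    DeliveredFeature("feature C", loc=2_400, days=30),
]

rate = calibrate(history)                 # 14,400 LoC / 180 days
print(f"calibrated rate: {rate:.1f} LoC/day")
print(f"6,000 LoC feature: {estimate_days(6_000, rate):.1f} developer-days")
```

The calibration deliberately divides total LoC by total days instead of averaging per-feature rates, so that a small feature with an unusual rate does not skew the long-term figure.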
Of course, not all components are equal. Most of them are the result of a long, evolutionary coding process. For example, the code model has undergone much more refactoring since the beginning than, say, the dependency matrix, which was delivered out-of-the-box after a few months of development.
This picture reveals something else interesting. We can see that all these years spent polishing the tool to meet high professional standards in terms of ergonomics and performance actually consumed quite a few LoC. Obviously, building a performant code query engine based on C# LINQ, which is now the backbone of the product, took years. This feature alone now weighs in at 34K LoC. More surprisingly, just having a clean Project Properties model and UI takes (model + UI) = (4K + 7K) = 11K LoC, while a flagship feature such as the interactive Dependency Graph consumes only 8K LoC, less than the Project Properties implementation. Of course, the interactive Dependency Graph capitalizes a lot on the existing infrastructure developed for other features, including the Dependency Model. But as a matter of fact, it took about the same amount of effort to develop the Dependency Graph as to develop a polished Project Properties model and UI.
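As a rough illustration, the component sizes quoted above can be converted into developer-days with the 80 LoC per day average. This is only a sketch: it assumes the average applies uniformly to every component, which is a simplification since not all components are equal, and the helper name `developer_days` is mine.

```python
LOC_PER_DAY = 80  # long-run average quoted in this post

# Component sizes in logical LoC, as quoted in the text (K = 1,000 LoC).
components_loc = {
    "Code query engine (C# LINQ)": 34_000,
    "Project Properties (model + UI)": 4_000 + 7_000,
    "Interactive Dependency Graph": 8_000,
}


def developer_days(loc, loc_per_day=LOC_PER_DAY):
    """Rough effort estimate: size in LoC divided by the average daily output."""
    return loc / loc_per_day


for name, loc in components_loc.items():
    print(f"{name}: {loc:,} LoC, about {developer_days(loc):.1f} developer-days")
```

Under that uniform-rate assumption, the Project Properties implementation and the Dependency Graph land in the same order of magnitude, while the query engine is several times larger.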
All this confirms an essential lesson for everyone in charge of an ISV. It is lightweight and easy to develop a nice and flashy prototype application that’ll attract enthusiastic users. What is really costly is to transform it into something usable, stable, clean and fast, with all the ergonomic candy possible to make the life of the user easier. And it is all these non-functional requirements that make the difference between a product used by only a few dozen enthusiastic users and a product used by the masses.
To finish, it is also interesting to visualize the code base through the prism of code coverage ratio. The NDepend code base is 86% covered, and by comparing both pictures we can easily see which parts are almost 100% covered and which parts need more testing effort.