NDepend can help identify duplicated C# code. We first present a case study, then explain how the detection works.
Chasing C# Code Clones: A Case Study
First, start NDepend.PowerTools.exe, found in the NDepend redistributable. The Power Tools source code is in the NDepend.PowerTools.SourceCode folder. Press the “e” key to start the Search For Code Duplicate power tool.
Then select an NDepend project. Here we selected a project that we created to analyze NodaTime version 3.1.0. Here is a web report obtained on this project.
Within a few seconds, potential C# code duplicates are identified and listed individually. Pressing the “o” key opens the source declarations.
For example, here is a duplicate that was found:
Understanding the Code Duplicate Heuristic
The algorithm behind this code duplicate Power Tool is simple yet highly effective in practice. It works by identifying groups of methods that use the same members—such as calling the same methods, reading from or writing to the same fields. These groups are referred to as “suspect sets.” The suspect sets are then ranked based on how many common members they share.
The algorithm follows three key steps:
- Investigate each method (including third-party ones) to determine whether its callers could be considered suspects. Frequently called methods, such as those from System.Collections.Generic, are discarded to reduce false positives.
- Merge the suspect sets obtained from the first step.
- Sort suspect sets based on a weight calculated by the number of common members called.
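The three steps above can be sketched in a few lines. This is only an illustration under stated assumptions, not the Power Tool's actual C# source: the function `find_suspect_sets`, the thresholds `max_callers` and `min_shared`, and the method names in the sample data are all hypothetical, and the usage graph is written by hand rather than read from an NDepend code model.

```python
from collections import defaultdict
from itertools import combinations


def find_suspect_sets(uses, max_callers=3, min_shared=2):
    """uses maps a method name to the set of members it calls, reads or writes.

    max_callers and min_shared are assumed thresholds chosen for this sketch.
    """
    # Invert the usage graph: member -> methods that use it.
    callers = defaultdict(set)
    for method, members in uses.items():
        for member in members:
            callers[member].add(method)

    # Step 1: the callers of a rarely used member are candidate suspects.
    # Members used by many methods are skipped as too common to be a signal.
    pairs = set()
    for member, users in callers.items():
        if not 2 <= len(users) <= max_callers:
            continue
        for a, b in combinations(sorted(users), 2):
            if len(uses[a] & uses[b]) >= min_shared:
                pairs.add((a, b))

    # Step 2: merge suspect pairs that share a method into larger sets.
    merged = []
    for pair in sorted(pairs):
        group = set(pair)
        rest = []
        for existing in merged:
            if existing & group:
                group |= existing
            else:
                rest.append(existing)
        merged = rest + [group]

    # Step 3: weight each set by the number of members that all of its
    # methods have in common, then list the heaviest sets first.
    def weight(group):
        return len(set.intersection(*(uses[m] for m in group)))

    ranked = [(weight(g), sorted(g)) for g in merged]
    return sorted(ranked, key=lambda item: (-item[0], item[1]))


# Hypothetical usage data: every name below is made up for the example.
parser_members = {"Stream.Peek", "Stream.Read", "List.Add", "Parser.Fail"}
uses = {
    "Parser.ReadInt": set(parser_members),
    "Parser.ReadLong": set(parser_members),
    "Parser.ReadShort": set(parser_members),
    "Cache.Load": {"List.Add", "Disk.Open", "Cache.entries"},
    "Cache.Reload": {"List.Add", "Disk.Open", "Cache.entries", "Log.Warn"},
    "Report.Print": {"List.Add", "Console.Write"},
}
result = find_suspect_sets(uses)
for set_weight, methods in result:
    print(set_weight, methods)
```

In this sample, `List.Add` is used by all six methods, so it is skipped in the first step, much like the frequently called collection methods mentioned above. The three parser methods are first found as pairs and then merged into a single suspect set, which ranks ahead of the two cache methods because its members share more.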
Pros and Cons
The duplicates identified by this algorithm are generally highly relevant. One of its key advantages over other algorithms is its resilience to minor modifications in copy-pasted code, meaning it isn’t easily fooled by slightly altered duplicates. Another strength is that the algorithm can be run directly on IL code, without requiring the source code. It’s worth noting that while this post shows examples with two methods in a suspect set, a suspect set can actually contain more than two methods.
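To make the resilience point concrete, here is a small sketch that compares two copies of the same logic in two ways. The C# snippets, the member names, and the hand-written member sets are assumptions made for the example. In practice the sets would come from analyzing the compiled code, not from a literal in a script.

```python
# Two hypothetical C# method bodies: the second is a copy of the first
# with its local variables renamed.
original = """
int total = 0;
foreach (var item in order.Items) total += item.Price;
logger.Info(total);
"""
edited_copy = """
int sum = 0;
foreach (var line in order.Items) sum += line.Price;
logger.Info(sum);
"""

# A naive textual comparison: which non-empty lines are identical?
original_lines = set(original.splitlines()) - {""}
copy_lines = set(edited_copy.splitlines()) - {""}
identical_lines = original_lines & copy_lines

# Members used by each body, written by hand for this sketch.
# Renaming locals does not change which members are called or read.
members_original = {"Order.Items", "Item.Price", "Logger.Info"}
members_copy = {"Order.Items", "Item.Price", "Logger.Info"}
shared_members = members_original & members_copy

print(len(identical_lines), "identical lines")
print(len(shared_members), "shared members")
```

The line-by-line comparison finds nothing in common because every line was touched by the renaming, while the member-based comparison still sees a complete overlap.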