NDepend

Improve your .NET code quality with NDepend

Improve C# code performance with Span<T>

C# 7.2 introduced the structure System.Span<T>. First we’ll present a concrete example where Span<T> helps achieve better performance. Then we’ll explain what makes Span<T> so special.

Span<T> primary goal is to avoid allocating new objects on the heap when one needs to work with a contiguous region of arbitrary memory. Performance gain is twofold:

  • A) the allocation on heap operation is not performed
  • B) less pressure on the Garbage Collector (GC) since it doesn’t need to track non-allocated objects.

Using Span<T> to parse a comma separated uint string

Let’s use Span<T> to obtain an array of uint from the string "163,496,691,1729".

  • Without Span<T> one would use "163,496,691,1729".Split(','). This call allocates four strings and an array to reference these four strings. Then uint.Parse(string) is used to parse each sub-string.
  • With Span<T> (actually with ReadOnlySpan<char> because the content of a string is immutable) the input string gets sliced into four spans. Because ReadOnlySpan<T> is a structure, each span is concretely a few bytes added on the current thread stack. Stack allocation is super fast and the GC is not impacted by values allocated on the stack. Then uint.Parse(ReadOnlySpan<char>) is used to parse each slice.

The method uint.Parse(ReadOnlySpan<char>) shows that it’s not just about Span<T> itself: many .NET APIs related to string and other memory representations have been extended to accept Span<T> or ReadOnlySpan<T>.

Here is pseudo code and diagrams that summarizes both approaches:

C# Span<T>

Benchmarking Span<T> performance gain

Below is the complete code that can be pasted in a C# 6 Program.cs source file. The NuGet package BenchmarkDotNet needs to be referenced. Here is the github project BenchmarkDotNet. Before digging into Benchmark.NET results, let’s note that:

  • A third method GetUIntArrayWithAstuteParsing() shows an optimized way to parse "163,496,691,1729" without the need of Span<T>.
  • In the real-world the number of uint in the comma separated string input wouldn’t be known upfront. A List<uint> would be used to store uint values parsed until we get them all. But here we want to highlight that no allocation is made by Span<T>. Thus to avoid cluttering the result uint[] arrayToFill is pre-allocated with the proper length.

For each cases, Benchmark.NET measures both memory allocation and durations. Here is how it presents the results:

  • GetUIntArrayWithAstuteParsing() is the fastest way and doesn’t allocate anything. The performance gain comes from the fact that we wrote our own dedicated uint parsing implementation. This shows well that despite new goodies in the framework, best performance is often achieve with well though-out algorithm.
  • GetUIntArrayWithSpan() is 38% faster than GetUIntArrayWithSplit(). This is great but the essential saving is that nothing gets allocated. In a real world scenario where this method would be used to parse millions of uint values, a lot of GC pressure would be saved.

Span<T> implementation is base on managed pointer!!

Most of Span<T> articles I read stop here. We now have a new cool way to avoid allocating sub-strings. But the key is that the super-performant implementation of Span<T> required heavy changes in the runtime. Let’s explain what happened.

The Span<T> source code (in .NET 7.0 and upper) shows that it contains two fields.

The _length value is internally multiplied by sizeof(T) to obtain the end address of the slice. Thus the slice in memory is the range [_reference, _reference + _length*sizeof(T)].

_reference is a managed pointer field (or ref field). The ref field feature is a new feature added in C# 11. Before that the implementation of Span<T> (in .NET 6.0, 5.0…) used an internal trick to reference a managed pointer through an internal ref struct named ByReference<T>.

Span<T> is declared as a ref struct. A structure marked with ref, is a special structure that can only live on the thread stack. This way it can hold a managed pointer as a field (ref field explained above). The feature ref struct was released with C# 7.2 just to make the implementation of Span<T> upon a managed pointer possible. If the .NET team did all this efforts this is because a Span<T> implementation based on managed pointer has significant advantages:

  • Safe: Managed pointers are pointers but they belong to the safe world. There is no need to be within an unsafe scope to work with Span<T>.
  • Performance wise: Span<T> performance overhead is close to zero because even if they are managed, managed pointers are pointers and thus they have almost no overhead. Managed pointers are managed in the sense: A) the C# compiler refuses code that could lead to a managed pointer pointing to an invalid memory and B) if a managed pointer points to an object on the heap, the runtime takes care of updating such pointer if the GC relocates the pointed object.
  • Flexibility: Managed pointer can reference any kind of memory (object on heap, unmanaged buffer, value on the stack, field with an object, a slot within an array, a position within a string…). The Span<T> implementation benefits from this flexibility that makes its API and implementation concise. Because the memory pointed is typed as ref T, there is no need to bother if it’s a string, a slot of an array or a location in stack.
  • Thread safe: A fortunate consequence of being stack-only is that a Span<T> instance belongs to a single thread. This makes Span<T> de-facto thread-safe.

Managed pointer, ref struct ,ref field, extended usage of the keyword ref, is an interesting topic and I dedicated an entire article to it: Managed pointers, Span<T>, ref struct, C#11 ref fields and the scoped keyword

No restriction with Memory<T>

The struct Memory<T> is similar to Span<T> but without the ref struct restrictions. It can be used as a field of a class for example. As a consequence Memory<T> doesn’t have this special relation with the GC and is a bit less performant.  This performance loss is because its implementation has 3x fields instead of 2x: instead of having a special ByReference<T> pointer, Memory<T> needs to reference the _object and then the _index in the object.

I wanted to benchmark the comma separated string code above with Memory<T> but realized that there is no uint.Parse(Memory<T>) API which suggests Memory<T> didn’t get as much love as Span<T>.

Span<T> and the .NET Framework

Because Span<T> and ByReference<T> imply significant updates on the runtime GC, they were not ported to the .NET Framework. They are only available on the .NET Core runtime (.NET 5, .NET 6…) since version 2.1. Here are Microsoft engineers discussions about it: Fast Span is too fundamental change to be quirklable in reasonable way.”.

However an implementation of Span<T>  exists for .NET Framework, it is referred as slow span. To use it the Nuget package System.Memory must be referenced. This implementation is similar to the Memory<T> implementation with 3x fields:

Also when referencing the System.Memory package from a .NET Framework project you won’t get APIs like uint.Parse(Span<T>) which makes it less attractive.

Span<T> API

As we saw Span<T> is especially suitable to improve performance when working with string because it provides a way to manipulate sub-string without any allocation. However Span<T> is a generic type and can be used on any kind of element like byte. The complete Span<T> API including extension methods is huge because of all the methods overloaded. Here is a simplified API below:

Notice above the unsafe constructor that takes a void* pointer. Span can work on any kind of memory including unmanaged memory. It thus represents a simpler way to work with pointers and unmanaged memory.

However what makes Span<T>/ReadOnlySpan<T> shines is that more than 5.000 methods in the .NET API use it. This is shown by the screenshot below taken from NDepend analyzing the .NET 6 framework in the directory C:\Program Files\dotnet\shared\Microsoft.NETCore.App\6.0.2:

Span used everywhere in .NET API

Span<T> vs. Array

At this point one might wonder how Span<T> differs from array and especially ArraySegment<T>.

  • Span<T> has special relation with GC that makes it more performant than ArraySegment<T> in stack-only scenarios.
  • ArraySegment<T> is limited to managed memory while we saw Span<T> can handle also unmanaged memory.
  • ArraySegment<T> doesn’t provide a read-only view while ReadOnlySpan<T> does.
  • The confusion between Span<T> and array comes from the fact that Span<T> is a view on some data and most of the time this data is represented through an array. So array is still needed, Span<T> is a just a convenient view on it.

Conclusion

Span<T>/ReadOnlySpan<T> are special API that required heavy runtime modifications to offer improvements in very high performance critical scenarios. Not everyone needs its power but for those that needs it, it is a game changer.

 

My dad being an early programmer in the 70's, I have been fortunate to switch from playing with Lego, to program my own micro-games, when I was still a kid. Since then I never stop programming.

I graduated in Mathematics and Software engineering. After a decade of C++ programming and consultancy, I got interested in the brand new .NET platform in 2002. I had the chance to write the best-seller book (in French) on .NET and C#, published by O'Reilly and also did manage some academic and professional courses on the platform and C#.

Over my consulting years I built an expertise about the architecture, the evolution and the maintenance challenges of large & complex real-world applications. It seemed like the spaghetti & entangled monolithic legacy concerned every sufficiently large team. As a consequence, I got interested in static code analysis and started the project NDepend in 2004.

Nowadays NDepend is a full-fledged Independent Software Vendor (ISV). With more than 12.000 client companies, including many of the Fortune 500 ones, NDepend offers deeper insight and full control on their application to a wide range of professional users around the world.

I live with my wife and our twin kids Léna and Paul in the beautiful island of Mauritius in the Indian Ocean.

Comments:

Comments are closed.