NDepend Blog

Improve C# code performance with Span<T>

April 24, 2024 9 minutes read

Welcome to our exploration of System.Span<T> and System.ReadOnlySpan<T>, two powerful structures introduced in C# 7.2 back in 2017. As a type-safe way to access contiguous memory regions, Span<T> can manage sequences of bytes stored on the heap, the stack, or even in unmanaged memory. This flexibility makes Span<T> a robust tool in a developer’s arsenal.

Span<T> in C# is not your everyday structure. It is declared as a ref struct which means it is restricted to stack allocation only. We will explain how this design choice brings more performance but also more restrictions. For instance, Span<T> cannot be used as a field in a class or harnessed within asynchronous methods.

In this blog post, we will delve into practical examples and benchmarks to demonstrate how Span<T> can enhance performance. Additionally, we’ll discuss why Span<T> often outperforms typical C# code and how to use it efficiently. Join us as we uncover the potential of Span<T> for streamlining your code and boosting its execution speed.

C# Programming with Span<T>

In this section, we’ll dive into the practical use of Span<T> in C# programming by exploring a few code samples. This will help us understand how Span<T> can be employed to enhance code performance through more efficient data manipulation and memory management.

Basic Usage of Span<T>

First, let’s look at a simple example that demonstrates how to initialize and use Span<T> for basic operations. In this example, Span<int> is created from an array of integers. We then modify the first element of the Span, which also modifies the original array, demonstrating the by-reference nature of Span<T>.
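The original listing is not reproduced here, so the following is a minimal sketch of the idea described above, assuming an arbitrary array named numbers (the variable names and values are illustrative, not taken from the article):

```csharp
using System;

// Illustrative data; the array name and values are assumptions.
int[] numbers = { 1, 2, 3, 4, 5 };

// T[] converts implicitly to Span<T>. The span is a view over the array: nothing is copied.
Span<int> span = numbers;

// Writing through the span writes into the underlying array.
span[0] = 42;

Console.WriteLine(numbers[0]);  // prints 42
Console.WriteLine(span.Length); // prints 5
```

Because the span and the array refer to the same memory, reading numbers[0] after the write observes the new value.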

Slicing with Span<T>

Span<T> excels in creating slices of data without allocating new memory. Here’s how you can create slices. This example showcases how to slice a Span<byte> to focus on a specific segment of the array without copying the data, demonstrating the efficiency of Span<T>.
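As a hedged illustration of the slicing described above (the buffer contents and variable names are assumptions), Slice returns a new Span<byte> over the same memory:

```csharp
using System;

// Illustrative buffer; contents are assumptions.
byte[] buffer = { 0, 1, 2, 3, 4, 5, 6, 7 };
Span<byte> whole = buffer;

// Slice(start, length): a view over buffer[2], buffer[3], buffer[4]. No copy, no allocation.
Span<byte> middle = whole.Slice(2, 3);

// Slice(start): everything from index 5 to the end.
Span<byte> tail = whole.Slice(5);

middle[0] = 99; // writes buffer[2]
tail.Clear();   // zeroes buffer[5], buffer[6] and buffer[7]

Console.WriteLine(string.Join(",", buffer)); // prints 0,1,99,3,4,0,0,0
```

Each slice is only a (reference, length) pair, which is why slicing costs the same regardless of how large the underlying buffer is.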

String and ReadOnlySpan<char>

A ReadOnlySpan<char> is highly effective for executing read-only, memory-efficient operations on strings in C#. Here’s a straightforward example showing how to manipulate a substring within a string using ReadOnlySpan<char>, without incurring extra memory allocation. Conversely, using String.Substring(1, 3) would actually allocate a new string object containing "234":
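The sketch below shows one way this could look. The input string "12345" is an assumption chosen so that the slice (1, 3) yields "234"; the variable name subStringSpan follows the name used later in the article.

```csharp
using System;

// Assumed input: any string whose characters 1..3 are "234" would do.
string text = "12345";

// A view over the characters '2', '3', '4' inside the original string: no allocation.
ReadOnlySpan<char> subStringSpan = text.AsSpan(1, 3);

// Parse directly from the span, without creating an intermediate string.
uint i = uint.Parse(subStringSpan);

// For comparison: Substring allocates a new string object containing "234".
string copy = text.Substring(1, 3);

Console.WriteLine(i);    // prints 234
Console.WriteLine(copy); // prints 234
```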

Span<T> APIs

As we’ve seen, Span<T> is particularly effective at improving performance when dealing with strings, because it enables the manipulation of substrings without any memory allocation. However, Span<T> is a generic type and can be used with other element types, such as byte. The complete Span<T> API, including extension methods, is huge because of all the overloaded methods. Here is a simplified API:

Notice above the unsafe constructor that takes a void* pointer. Span can work on any kind of memory including unmanaged memory. It thus represents a simpler way to work with pointers and unmanaged memory like in this code sample:
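A minimal sketch of this idea follows, assuming the project is compiled with AllowUnsafeBlocks enabled. The buffer size and values are illustrative. Marshal.AllocHGlobal is used here as one possible source of unmanaged memory; the article's own sample may have used a different one.

```csharp
using System;
using System.Runtime.InteropServices;

// Requires <AllowUnsafeBlocks>true</AllowUnsafeBlocks> in the project file.
const int length = 4; // illustrative size
IntPtr native = Marshal.AllocHGlobal(length * sizeof(int));
int sum = 0;
try
{
    unsafe
    {
        // Wrap the raw pointer once; from here on, access is bounds-checked.
        Span<int> span = new Span<int>(native.ToPointer(), length);

        span.Fill(7); // 7, 7, 7, 7
        span[0] = 1;  // 1, 7, 7, 7

        foreach (int value in span)
            sum += value;
    }
}
finally
{
    // Unmanaged memory is not tracked by the GC and must be released explicitly.
    Marshal.FreeHGlobal(native);
}

Console.WriteLine(sum); // prints 22
```

Only the construction of the span needs the pointer. Any code that later receives the Span<int> can stay in the safe subset of C#.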

Above we were able to call uint i = uint.Parse(subStringSpan); because a new overload of uint.Parse(ReadOnlySpan<char>) exists in the .NET Base Class Library (BCL). What truly sets Span<T> and ReadOnlySpan<T> apart is their widespread integration into the BCL. This fact is illustrated in the screenshot below. It shows NDepend analyzing the .NET 8 framework in the directory C:\Program Files\dotnet\shared\Microsoft.NETCore.App\8.0.0:

Span used everywhere in .NET API

Span<T> vs. Array

At this point, one might wonder how Span<T> differs from standard arrays and especially the structure ArraySegment<T>.

  • Span<T> has a special relation with GC that makes it more performant than ArraySegment<T> in stack-only scenarios.
  • ArraySegment<T> is limited to managed memory, while Span<T> can also handle unmanaged memory, as we saw.
  • ArraySegment<T> doesn’t provide a read-only view while ReadOnlySpan<T> does.
  • The confusion between Span<T> and arrays arises from the fact that Span<T> is a view on some data. Most of the time this data is stored in an array, so arrays are still needed. In this context, Span<T> is just a convenient view over an array.
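To make the comparison concrete, here is a small sketch (with assumed data and names) that views the same array through an ArraySegment<T>, a Span<T> and a ReadOnlySpan<T>, and also creates a span over stack memory, which ArraySegment<T> cannot do:

```csharp
using System;

int[] data = { 10, 20, 30, 40, 50 }; // illustrative data

// Both are views over data[1], data[2], data[3].
ArraySegment<int> segment = new ArraySegment<int>(data, 1, 3);
Span<int> span = data.AsSpan(1, 3);

// Span<T> converts implicitly to a read-only view.
ReadOnlySpan<int> readOnly = span;

segment[0] = 21; // writes data[1]
span[1] = 31;    // writes data[2]
// readOnly[2] = 41; // does not compile: the ReadOnlySpan<T> indexer is read-only

// A span can also wrap memory that is not an array at all, here a stack buffer.
Span<int> onStack = stackalloc int[3];
onStack.Fill(1);

Console.WriteLine(string.Join(",", data)); // prints 10,21,31,40,50
```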

Improving some C# code performance with Span<T>

Now let’s put Span<T> to work and see how it can significantly boost performance in a practical, real-world scenario.

In this section, we will use Span<T> to obtain an array of uint from the string "163,496,691,1729".

  • Without Span<T> one would use "163,496,691,1729".Split(','). This call allocates four strings and an array to reference these four strings. Then uint.Parse(string) is used to parse each sub-string.
  • Actually, we will use ReadOnlySpan<char> because the content of a string is immutable.
  • With ReadOnlySpan<T> the input string gets sliced into four spans. Because ReadOnlySpan<T> is a ref struct, each of its instances occupies only a few bytes on the current thread stack. Stack allocation is very fast, and values allocated on the stack put no pressure on the GC. Then uint.Parse(ReadOnlySpan<char>) is used to parse each slice.

Here is a pseudo-code and some diagrams that summarize both approaches:

C# Span<T>

Benchmarking Span<T> performance gain

Below is the complete code that can be pasted into a C# Program.cs source file. To run this benchmark you need to reference the NuGet package BenchmarkDotNet. Here is the GitHub project: BenchmarkDotNet. Before digging into the BenchmarkDotNet results, let’s note that:

  • A third approach with the method GetUIntArrayWithAstuteParsing() presents an optimized method for parsing "163,496,691,1729" without the requirement of using Span<T>.
  • In the real world, the number of uint in the comma-separated string input may not be known in advance. Typically, a List<uint> would be used to store uint values parsed until all of them are obtained. But here we want to demonstrate that no allocation is made by Span<T>. Thus to avoid cluttering the performance result, uint[] arrayToFill is pre-allocated with the proper length.
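The benchmark listing itself is not reproduced here. The sketch below shows how the three approaches could be written. The method names come from the article, but their signatures and bodies are assumptions, and the BenchmarkDotNet attributes and runner are left out so that the code runs without any package reference.

```csharp
using System;

const string input = "163,496,691,1729";

// Pre-allocated result arrays, one per approach, so results can be compared.
uint[] viaSplit = new uint[4];
uint[] viaSpan = new uint[4];
uint[] viaAstute = new uint[4];

GetUIntArrayWithSplit(input, viaSplit);
GetUIntArrayWithSpan(input, viaSpan);
GetUIntArrayWithAstuteParsing(input, viaAstute);

Console.WriteLine(string.Join(",", viaSpan)); // prints 163,496,691,1729

// Approach 1: Split allocates one string per value plus the string[] that holds them.
static void GetUIntArrayWithSplit(string s, uint[] arrayToFill)
{
    string[] parts = s.Split(',');
    for (int i = 0; i < parts.Length; i++)
        arrayToFill[i] = uint.Parse(parts[i]);
}

// Approach 2: walk the string with ReadOnlySpan<char> slices; no heap allocation.
static void GetUIntArrayWithSpan(string s, uint[] arrayToFill)
{
    ReadOnlySpan<char> remaining = s.AsSpan();
    int index = 0;
    while (true)
    {
        int comma = remaining.IndexOf(',');
        if (comma < 0)
        {
            arrayToFill[index] = uint.Parse(remaining); // last value
            break;
        }
        arrayToFill[index++] = uint.Parse(remaining.Slice(0, comma));
        remaining = remaining.Slice(comma + 1);
    }
}

// Approach 3: hand-written digit accumulation, no Span<T> needed.
// Assumes well-formed input: only digits and commas, no overflow.
static void GetUIntArrayWithAstuteParsing(string s, uint[] arrayToFill)
{
    int index = 0;
    uint current = 0;
    foreach (char c in s)
    {
        if (c == ',')
        {
            arrayToFill[index++] = current;
            current = 0;
        }
        else
        {
            current = current * 10 + (uint)(c - '0');
        }
    }
    arrayToFill[index] = current;
}
```

The third method skips all the validation and culture handling that uint.Parse performs, which is a plausible source of its speed advantage but also means it is only safe on trusted input.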

For each case, BenchmarkDotNet measures both memory allocation and duration. Here is how it presents the results:

  • GetUIntArrayWithAstuteParsing() is the fastest way and doesn’t allocate anything. The performance gain comes from the fact that we wrote our own dedicated uint parsing implementation. This clearly illustrates that, despite the presence of new features in the framework, the best performance often results from well-thought-out algorithms.
  • GetUIntArrayWithSpan() is 38% faster than GetUIntArrayWithSplit(). This is already a significant win. However, the core of performance gain is that there is no heap allocation. In a real-world scenario where this method would be used to parse millions of uint values, a lot of GC pressure would be saved.

Explanations About the Magic Behind Span<T> Implementation

Many articles discussing Span<T> tend to conclude at this point. We’ve introduced an efficient approach to sidestep the need for allocating sub-strings. However, the critical aspect lies in the substantial runtime modifications necessary to achieve this performant implementation of Span<T>. Let’s explain what happened.

The Span<T> source code shows that it contains two fields.

The _length value is the number of elements of type T in the slice. It is internally multiplied by sizeof(T) to obtain the size of the slice in bytes. Thus the slice in memory is the range [_reference, _reference + _length*sizeof(T)).
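The private fields cannot be read directly, but the relationship between the reference, the length and sizeof(T) can be observed from the outside. The following sketch uses MemoryMarshal.GetReference and the Unsafe helper class for that purpose; the data and variable names are assumptions.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

int[] data = { 10, 20, 30, 40, 50 }; // illustrative data
Span<int> slice = data.AsSpan(1, 3); // views 20, 30, 40

// The span's reference is a managed pointer to its first element, here data[1].
ref int first = ref MemoryMarshal.GetReference(slice);
bool startsAtData1 = Unsafe.AreSame(ref first, ref data[1]);

// Advancing the reference by (Length - 1) elements reaches the last element, data[3].
ref int last = ref Unsafe.Add(ref first, slice.Length - 1);
bool endsAtData3 = Unsafe.AreSame(ref last, ref data[3]);

// Size of the viewed range in bytes: Length * sizeof(int) = 3 * 4.
long byteSize = (long)Unsafe.ByteOffset(ref first, ref last) + sizeof(int);

Console.WriteLine(startsAtData1); // prints True
Console.WriteLine(endsAtData3);   // prints True
Console.WriteLine(byteSize);      // prints 12
```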

_reference is a managed pointer field (or ref field). The ref field feature was added in C# 11 and .NET 7.0. Before that, the implementation of Span<T> (in .NET 6.0 and earlier) used an internal trick to hold a managed pointer through an internal ref struct named ByReference<T>.

Span<T> is declared as a ref struct. A structure marked with ref is a special structure that can only be allocated on the thread stack. This way it can hold a managed pointer as a field (the ref field explained above).

The advantages of managed pointers

ref struct was released with C# 7.2 just to make the implementation of Span<T> through a managed pointer possible. The .NET team went to all this effort because basing the Span<T> implementation on a managed pointer has significant advantages:

  • Safe: Managed pointers are pointers but they belong to the safe world. There is no need to declare an unsafe scope to work with Span<T>.
  • Performance wise: The performance overhead of Span<T> is nearly negligible. This is because managed pointers, even though they are managed, are essentially regular pointers. Consequently, they incur minimal overhead. The management of these pointers includes two key aspects:
    • A) the C# compiler refuses code that could lead to a managed pointer pointing to an invalid memory and
    • B) if a managed pointer points to an object on the heap, the runtime automatically handles the updating of such pointers in the event of the GC relocating the referenced object
  • Flexibility: A managed pointer can point to various kinds of memory, including an object on the heap, an unmanaged buffer, a value on the stack, a field within an object, a slot within an array, or a position within a string. The Span<T> implementation benefits from this flexibility, which keeps its API and implementation concise. Because the memory pointed to is typed as ref T, there is no need to care whether it is a string, a slot of an array or a location on the stack.
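  • Thread safe: A fortunate consequence of being stack-only is that a Span<T> instance belongs to a single thread and cannot be shared between threads. This makes the Span<T> instance itself de-facto thread-safe. The memory it points to can still be reached from other threads through other references.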
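The flexibility point can be illustrated with a short sketch. The helper Sum is hypothetical: it is written once against ReadOnlySpan<int> and then called with memory from an array on the heap, a slice of that array, and a buffer on the stack.

```csharp
using System;

// Hypothetical helper: it neither knows nor cares where the memory lives.
static int Sum(ReadOnlySpan<int> values)
{
    int total = 0;
    foreach (int value in values)
        total += value;
    return total;
}

int[] onHeap = { 1, 2, 3 };
Span<int> onStack = stackalloc int[] { 4, 5, 6 };

int heapSum = Sum(onHeap);            // T[] converts implicitly to ReadOnlySpan<T>
int stackSum = Sum(onStack);          // Span<T> converts implicitly to ReadOnlySpan<T>
int sliceSum = Sum(onHeap.AsSpan(1)); // view over the last two elements: 2 + 3

Console.WriteLine(heapSum);  // prints 6
Console.WriteLine(stackSum); // prints 15
Console.WriteLine(sliceSum); // prints 5
```

None of these calls needs an unsafe scope, which is the safety point made in the list above.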

Managed pointers, ref struct, ref fields and the extended usage of the keyword ref are an interesting topic, and we dedicated an entire article to it: Managed pointers, Span<T>, ref struct, C#11 ref fields and the scoped keyword

No stack-only restriction with Memory<T>

The structures System.Memory<T> and System.ReadOnlyMemory<T> were introduced alongside System.Span<T> and System.ReadOnlySpan<T> in the same release.

Memory<T> shares similarities with Span<T> but it is a regular structure. It doesn’t have the ref struct stack-only restrictions. This makes it suitable for use as a field in a class, for instance. However, this lack of constraint also means Memory<T> doesn’t have this special relation with the GC. Consequently, it is slightly less performant. This performance loss arises from the fact that its implementation has three fields instead of two: instead of having a special ref pointer, Memory<T> needs to reference both the _object and then the _index in the object.
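As a hedged sketch of that difference (data and names are assumptions), a Memory<T> can be stored in a heap-allocated collection, which a Span<T> cannot, and its Span property gives access to a span at the point where the data is actually processed:

```csharp
using System;
using System.Collections.Generic;

int[] data = { 1, 2, 3, 4, 5, 6 };        // illustrative data
Memory<int> memory = data.AsMemory(2, 3); // views 3, 4, 5

// Memory<T> is a regular struct: it can live on the heap, for example inside a List<T>.
var chunks = new List<Memory<int>> { memory };
// var bad = new List<Span<int>>(); // does not compile: a ref struct cannot be a type argument here

// Get a Span<T> only where the elements are read or written.
Span<int> span = chunks[0].Span;
span[0] = 30; // writes data[2]

// The same bridge works for span-based parsing APIs.
ReadOnlyMemory<char> digits = "163,496".AsMemory(0, 3);
uint parsed = uint.Parse(digits.Span);

Console.WriteLine(data[2]); // prints 30
Console.WriteLine(parsed);  // prints 163
```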

I wanted to benchmark the comma-separated string code above with Memory<T>. Then I realized that there is no uint.Parse(Memory<T>) API, which suggests Memory<T> didn’t get as much love as Span<T>.

Span<T> and the .NET Framework

Because Span<T> and ref fields require significant changes to the runtime and the GC, they were not ported to the .NET Framework. Span<T> is only available natively on the .NET Core runtime, since version 2.1 (and thus in .NET 7, .NET 8…). Here is a discussion among Microsoft engineers about it: “Fast Span is too fundamental change to be quirklable in reasonable way.”

However, an implementation of Span<T> exists for the .NET Framework. It is referred to as slow span. To use it, reference the NuGet package System.Memory from your .NET Framework project. This implementation is similar to the Memory<T> implementation, with three fields:

Also, when referencing the System.Memory package from a .NET Framework project, you won’t get APIs such as uint.Parse(ReadOnlySpan<char>), which makes it less attractive.

Conclusion

In this article, we’ve explored the innovative Span<T> and ReadOnlySpan<T> structures and their applications in refining code for peak performance.

Span<T> and ReadOnlySpan<T> hold a unique and significant place within the .NET Base Class Library. These types required substantial runtime modifications to deliver performance enhancements in extremely high-performance, critical scenarios. While not everyone may require their capabilities, for those who do, they can be a game-changing tool.

 
