NDepend Blog


Managed pointers, Span, ref struct, C#11 ref fields and the scoped keyword

October 25, 2022


The concept of managed pointer has existed in the .NET runtime and C# since the inception of the platform in the early 2000s. Managed pointers belong mostly to the pointer world, which makes them well suited for performance-critical scenarios. However, unlike regular pointers, extra care from the compiler and the runtime makes their usage safe.

Only recently, from C# 7.0 (2017) to C# 11 (2022), the .NET engineers improved the runtime and the language constructs around managed pointers to unleash their flexibility and performance gain. These improvements mostly consist of more expressions supporting the C# keyword ref, which is dedicated to managed pointers.

These new constructs around the keyword ref are often not well understood. Many articles talking about the new ref features don’t even mention managed pointers. Only language design discussions (that are de facto quite verbose) and a few insider posts (that often have a narrow focus) capture the primary intention. Some scattered Stack Overflow answers also contain interesting remarks. So I decided to write the present article to attempt to tell the whole story, using code samples to illustrate these recent C# evolutions.

Let’s keep in mind that managed pointers mostly benefit CPU-intensive scenarios on low-level constructs, like parsing a large data set. You might not use managed pointers every day but it is certainly worth knowing about their capabilities.

Managed Pointer

The concept of managed pointer has been there since the early days of C# and .NET. A managed pointer is like a pointer except that the runtime keeps track of it, and thus it doesn’t require an unsafe scope. The code below shows one managed pointer that points to an integer on the thread stack and another managed pointer that points to a string object. The method changes the pointed integer value and then the pointed string object. During a GC relocation phase (when objects are being compacted) the location of the pointed string object can change, which leads the runtime to update the pointer. This is why it is qualified as managed.
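A minimal sketch of this idea, assuming hypothetical names (`ManagedPointerDemo`, `Update`). Strictly speaking, the `ref string` parameter points to the caller's string variable, which in turn references the string object:

```csharp
using System;

static class ManagedPointerDemo
{
    // Both 'ref' parameters are managed pointers: no unsafe scope is needed.
    public static void Update(ref int i, ref string s)
    {
        i = 6;        // writes through the pointer into the caller's stack slot
        s = "world";  // makes the caller's string variable reference another object
    }

    static void Main()
    {
        int number = 5;          // lives on the thread stack
        string text = "hello";   // variable on the stack, string object on the heap
        Update(ref number, ref text);
        Console.WriteLine($"{number} {text}"); // prints "6 world"
    }
}
```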

Interior pointer

The code below shows that a managed pointer can point to the location of a field nested within the layout of an object, and even to a particular slot within an array. Again, the GC can modify the in-memory location of the object and then update the managed pointers. This kind of managed pointer, pointing inside an object or an array, is known as an interior pointer.
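As a sketch, with a hypothetical `Person` class and `Increment` helper, the same `ref int` parameter can receive a pointer to a field of a heap object or to an array slot:

```csharp
using System;

class Person { public int Age; }

static class InteriorPointerDemo
{
    // The callee cannot tell whether 'value' points to the stack or inside a heap object.
    public static void Increment(ref int value) => value++;

    static void Main()
    {
        var person = new Person { Age = 40 };
        int[] numbers = { 1, 2, 3 };

        Increment(ref person.Age);  // interior pointer to a field of an object
        Increment(ref numbers[1]);  // interior pointer to a slot of an array

        Console.WriteLine($"{person.Age} {numbers[1]}"); // prints "41 3"
    }
}
```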

Keeping managed pointer safe

Because managed pointers are not in unsafe scopes, the compiler must prevent a managed pointer from pointing to memory that is no longer valid. In the example below, the variable j lives on the stack frame of the UpdateRef() method. Thus j doesn’t exist anymore when UpdateRef() returns. As a consequence, the managed pointer i, which is located outside the UpdateRef() stack frame (maybe in the caller’s stack frame), cannot point to the location of j, which has a narrower escape scope.

managed reference escape scope
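The situation can be sketched as follows. The method name `UpdateRef` and the variables `i` and `j` come from the text; the body is an assumption, and the rejected line is left commented out so that the sample compiles:

```csharp
using System;

static class EscapeScopeDemo
{
    public static void UpdateRef(ref int i)
    {
        int j = 42;      // j lives in UpdateRef()'s stack frame
        // i = ref j;    // rejected by the compiler: j has a narrower escape scope than i
        i = j;           // copying the value through the pointer is fine
    }

    static void Main()
    {
        int value = 0;
        UpdateRef(ref value);
        Console.WriteLine(value); // prints 42
    }
}
```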

Managed pointer vs. managed reference

From now on keep in mind that:

  • A managed pointer can point to anything on the stack or on the heap. When the GC relocates an object, if a managed pointer points to the object or to its interior, the GC updates the managed pointer.
  • A managed pointer always lives on the stack. The rationale behind this restriction is that if a managed pointer could live on the heap, this would require overly complex heuristics at GC relocation time, which would go against the performance expectations of managed pointers.
  • While a managed reference can only reference an object, managed pointers are more flexible. A managed pointer can point to any kind of memory representation including:
    • a method local variable
    • a method parameter in or out
    • a location on the stack
    • an object
    • a field of an object
    • an element of an array
    • a string or a location within a string
    • unmanaged memory buffer
  • Managed pointers are good for performance, because in many scenarios they don’t require any new object to be created on the heap, which would put pressure on the GC.

If you want to dig deeper into managed pointer and interior pointer concepts (discussion about IL code generated, JITed assembly code, GC implication…) you can read this great article written by Konrad Kokosa.

Managed pointers vs. pointers, unsafe, and pinning

Managed pointers belong to the safe world:

  • because the GC takes care of updating them when it relocates a pointed object
  • and because the compiler prevents situations where a managed pointer could reference some invalid memory.

Regular pointers do require the keyword unsafe to be used. And when a pointer points to a location in the managed heap, the object that holds the location must be pinned first. Pinning prevents the GC from relocating the object while some pointer work is performed on the object’s memory representation. Pinning can thus harm performance, which is exactly what we don’t want when we work with pointers.

C# 7.0 ref local and ref return

C# 7.0 extended the usage of the ref keyword. A local variable can be a managed pointer, this is illustrated by the example below:
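A minimal sketch of a ref local, with hypothetical names (`RefLocalDemo`, `DoubleSecond`):

```csharp
using System;

static class RefLocalDemo
{
    public static int[] DoubleSecond(int[] array)
    {
        ref int second = ref array[1]; // ref local: an interior pointer to array[1]
        second *= 2;                   // writes directly into the array, no copy involved
        return array;
    }

    static void Main()
    {
        int[] numbers = DoubleSecond(new[] { 1, 2, 3 });
        Console.WriteLine(string.Join(",", numbers)); // prints 1,4,3
    }
}
```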

Also a method can return a managed pointer:
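A sketch of a ref return. The method name `GetRef` comes from the text; its body and the surrounding class are assumptions:

```csharp
using System;

static class RefReturnDemo
{
    public static ref int GetRef()
    {
        int[] array = { 1, 2, 3 };  // allocated on the heap
        return ref array[1];        // returns an interior pointer to the second element
    }

    static void Main()
    {
        ref int element = ref GetRef(); // the caller only holds the interior pointer
        element = 20;                   // writes into the array, which is kept alive
        Console.WriteLine(element);     // prints 20
    }
}
```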

This example is not trivial. First, notice that the array is allocated on the heap and lives longer than the method GetRef() that builds it, so it is fine to return an interior pointer to one of its elements. Second, the returned interior pointer is then the only GC root of the array. The array itself can no longer be reached from code once the method has returned, because there is no way to obtain the array reference from its interior pointer!

C# 7.2 ref struct and Span<T>

C# 7.2 introduced the notion of ref struct mostly to provide a fast and flexible implementation for Span<T>. Span<T> lets us work with a contiguous region of arbitrary memory. For example a span of char can map a substring of a string object (since strings are immutable, this is a ReadOnlySpan<char>). While the method string.Substring(int startIndex, int length) returns a new string object, a substring represented by a span is the memory slice within the string itself. Thus working with substrings through spans is much more performant because it doesn’t require any new string allocation.
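As an illustration, here is a small sketch with a hypothetical `ParseYear` helper that reads the first four characters of a date string without allocating a substring:

```csharp
using System;

static class SpanSubstringDemo
{
    public static int ParseYear(string date)
    {
        // AsSpan(start, length) returns a ReadOnlySpan<char> that views the
        // characters of 'date' in place: no new string is allocated.
        ReadOnlySpan<char> year = date.AsSpan(0, 4);
        return int.Parse(year);
    }

    static void Main()
    {
        Console.WriteLine(ParseYear("2022-10-25")); // prints 2022
    }
}
```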

Here is what we will see in this section:

  • Span<T> internally holds a managed pointer to the pointed memory slice.
  • Like all managed pointers, the one held by a Span<T> must live on the thread stack. Thus Span<T> must also live on the thread stack.
  • This is why ref struct was created: to force a Span<T> to always live on the thread stack through compile-time restrictions.
  • Because managed pointers are flexible and can point to virtually anything, Span<T> can address any memory scenario through a limited API.
  • As an added bonus, living on the stack makes Span<T> thread-safe. This removes any need for synchronization and, as a consequence, unleashes some additional performance gain.

Span<T> implementation in .NET 6.0 (and earlier)

I explained Span<T> in detail in this post: Improve C# code performance with Span<T>. Here I’d like to focus on the link between Span<T> and ref struct. Span<T> is declared as a ref struct. In .NET 6.0 (and prior) Span<T> internally held a managed pointer to the pointed memory through a field typed with ByReference<T> (see this source code here):

ByReference<T> is a ref struct declared as internal. Thus it couldn’t be used outside of the .NET Base Class Library. Basically it was an internal trick to hold a managed pointer as a field of a ref struct. ByReference<T> was only used in the context of Span<T> and ReadOnlySpan<T> as shown by the NDepend code query against the .NET 6.0 impl below. TypedReference is also matched which is an internal runtime helper.

ByReference of T usage in net6.0

Span<T> implementation in .NET 7.0 (and later)

Since C# 11 and .NET 7.0 Span<T> can rely on the new ref field C# 11 language construct (explained later in the present article) and its new implementation is now:

By the way, one super cool fact not underlined yet is that in this code sample, the managed pointer points to a generic T! ref T can be pretty much anything: a char location within a string, an int location within an array, a byte location within a stackalloc buffer. Hats off to the .NET team!!

Span<T> implementation in .NET Framework v4.x

The runtime itself was significantly updated to make Span<T> faster thanks to its internal managed pointer. The update was important enough that it was not applied to the .NET Framework 4.x, as explained here by Microsoft engineers: “Fast Span is too fundamental change to be quirklable in reasonable way.”

However, an implementation of Span<T> exists for the .NET Framework. It is referred to as slow span. To use it, the NuGet package System.Memory must be referenced.

Span<T> and thread-safety

Finally, the fact that Span<T> always lives on the stack makes it de facto thread-safe. This is the opportunity for an additional performance gain. Remember that Span<T> has two fields: _pointer and _length. In a concurrent scenario, modifying several states atomically requires a lock. Without such a lock we end up with the struct tearing phenomenon: the possibility of an inconsistent structure state in a concurrent environment. But since Span<T> is de facto thread-safe we don’t need such a lock.

Memory<T>: a slower Span<T>

Span<T> is fairly flexible and has almost no overhead. A slower kind of memory slice also exists through the struct Memory<T>. This implementation is conceptually similar to the .NET Fx slow Span<T> implementation with three fields (see above). But since it is a struct and not a ref struct, a Memory<T> can be nested as a field within an object layout on the heap.

The stack only restriction of ref struct

Not only was the runtime updated for Span<T>, but the language concept of ref struct was also introduced. A ref struct is a struct that can only live on the thread stack. Remember that there are numerous ways to end up with a struct instance on the object heap:
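A few of these ways, sketched with a hypothetical `MyStruct` (the list is not exhaustive):

```csharp
using System;

struct MyStruct { public int Value; }

class Holder { public MyStruct Field; }  // embedded in the layout of a heap object

static class StructOnHeapDemo
{
    // The captured parameter is hoisted into a heap-allocated closure object.
    public static Func<int> Capture(MyStruct s) => () => s.Value;

    static void Main()
    {
        var s = new MyStruct { Value = 1 };

        object boxed = s;                       // boxing copies the struct into a heap object
        var array = new MyStruct[] { s };       // array elements live inside the array object
        var holder = new Holder { Field = s };  // field of a class instance
        Func<int> closure = Capture(s);         // captured by a lambda

        Console.WriteLine($"{((MyStruct)boxed).Value} {array[0].Value} {holder.Field.Value} {closure()}");
        // prints "1 1 1 1"
    }
}
```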

All those are prohibited at compile time as soon as MyStruct becomes a ref struct:

ref struct must live on the stack

A ref struct instance can only be used as a local variable, as a method parameter or return value, and as a field of another ref struct:
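A sketch of the allowed usages, with hypothetical types `MyRefStruct` and `Wrapper`. The usages rejected by the compiler are left as comments:

```csharp
using System;

ref struct MyRefStruct
{
    public Span<int> Data;   // a ref struct may contain another ref struct
    public MyRefStruct(Span<int> data) { Data = data; }
}

ref struct Wrapper
{
    public MyRefStruct Inner;  // field of a ref struct: allowed
}

// class Holder { MyRefStruct Field; }  // rejected: field of a class

static class RefStructUsageDemo
{
    public static int Sum(MyRefStruct s)  // method parameter: allowed
    {
        int total = 0;
        foreach (int value in s.Data) total += value;
        return total;
    }

    // Return value: allowed (the int[] converts implicitly to Span<int>).
    public static MyRefStruct Wrap(int[] array) => new MyRefStruct(array);

    static void Main()
    {
        MyRefStruct local = Wrap(new[] { 1, 2, 3 });  // local variable: allowed
        // object boxed = local;                      // rejected: boxing
        // var array = new MyRefStruct[1];            // rejected: array element
        // Func<int> f = () => Sum(local);            // rejected: captured by a lambda
        Console.WriteLine(Sum(local)); // prints 6
    }
}
```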

I’ve read that ref struct was misnamed and that stackonly struct would have been better. Those that wrote that didn’t understand the primary goal of ref struct which is to hold a managed pointer as a field. In C# the keyword ref means managed pointer so it is perfectly named.

Interestingly enough, I learned about ref struct peculiarities when some NDepend users got false positives on the rule Don’t use obsolete types, methods or fields. The compiler tags ref structs with ObsoleteAttribute to prevent them from being used by older versions of C# that don’t know about the stack-only restrictions of ref struct. Hence the false positives when using a ref struct, which are fixed in the next version.

C# 7.2 conditional ref expression

C# 7.2 also added some language facilities to use managed pointers with the ?: ternary conditional operator, as illustrated by the code sample below:
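A minimal sketch with a hypothetical `Pick` method. Both branches of the conditional are managed pointers, and so is the result:

```csharp
using System;

static class ConditionalRefDemo
{
    public static ref int Pick(bool first, int[] a, int[] b)
    {
        // The whole conditional expression yields a managed pointer.
        ref int chosen = ref (first ? ref a[0] : ref b[0]);
        return ref chosen;
    }

    static void Main()
    {
        int[] a = { 1 }, b = { 2 };
        Pick(false, a, b) = 20;               // writes through the returned pointer
        Console.WriteLine($"{a[0]} {b[0]}");  // prints "1 20"
    }
}
```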

C# 7.2 readonly ref

C# 7.2 also introduced the in parameter modifier to prevent modifying the state of a structure passed as an argument. With this modifier the compiler can pass a reference to the structure instance instead of copying it, which is the default behavior. Just passing a reference is safe because the compiler prevents the in structure instance from being modified. This is quite a welcome performance gain, especially when working with larger structures:

C#7.2 in keyword
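A sketch of the in modifier, assuming a hypothetical `Vector4` structure:

```csharp
using System;

struct Vector4 { public double X, Y, Z, W; }

static class InParameterDemo
{
    // 'in' passes a read-only managed pointer instead of copying the 32-byte struct.
    public static double Sum(in Vector4 v)
    {
        // v.X = 0;   // rejected: an 'in' parameter cannot be modified
        return v.X + v.Y + v.Z + v.W;
    }

    static void Main()
    {
        var v = new Vector4 { X = 1, Y = 2, Z = 3, W = 4 };
        Console.WriteLine(Sum(in v)); // prints 10
        Console.WriteLine(Sum(v));    // 'in' is optional at the call site, prints 10
    }
}
```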

The same benefit applies with a returned structure instance with the C# 7.2 ref readonly modifier. A managed pointer is returned instead of copying back the structure instance.

C#7.2 ref readonly

Structure copying can also be prevented, for both passed and returned instances, with the new C# 7.2 concept of readonly struct:

C#7.2 readonly struct
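The two features can be sketched together, with a hypothetical `Point3D` readonly struct returned through a ref readonly property:

```csharp
using System;

// Every member of a readonly struct is read-only, so the compiler never
// needs a defensive copy before calling a member through a read-only pointer.
readonly struct Point3D
{
    public Point3D(double x, double y, double z) { X = x; Y = y; Z = z; }
    public double X { get; }
    public double Y { get; }
    public double Z { get; }
    public double LengthSquared() => X * X + Y * Y + Z * Z;
}

static class RefReadonlyDemo
{
    private static readonly Point3D s_origin = new Point3D(0, 0, 0);

    // Returns a read-only managed pointer to the static field instead of a copy.
    public static ref readonly Point3D Origin => ref s_origin;

    static void Main()
    {
        ref readonly Point3D origin = ref Origin;   // no copy
        // origin = new Point3D(1, 1, 1);           // rejected: the pointee is read-only
        Console.WriteLine(origin.LengthSquared());  // prints 0
    }
}
```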

See more on this C# 7.2 addition here.

C# 7.3 ref re-assignment

C# 7.3 introduced ref re-assignment illustrated by the short program below:
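A minimal sketch, with hypothetical names:

```csharp
using System;

static class RefReassignDemo
{
    public static (int, int) Run()
    {
        int a = 1, b = 2;
        ref int r = ref a;  // r points to a
        r = 10;             // a is now 10
        r = ref b;          // ref re-assignment: r now points to b
        r = 20;             // b is now 20
        return (a, b);
    }

    static void Main()
    {
        Console.WriteLine(Run()); // prints (10, 20)
    }
}
```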

Managed pointer arithmetic

The ref re-assignment feature above opens the door to pointer arithmetic on managed pointers. Also, the class System.Runtime.CompilerServices.Unsafe exposes methods to perform pointer arithmetic with managed and unmanaged pointers. It was released with .NET Core 3.0 and is available as a NuGet package for .NET Fx 4.x.

Such pointer arithmetic is used for example in the implementation of Span<T>.Slice() (available here) and also reproduced below. The call to Unsafe.Add() relies on pointer arithmetic to translate the managed pointer to the proper offset of the slice start:
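The sketch below is not the Span<T>.Slice() source. It is a hypothetical `SumFrom` helper that applies the same idea: Unsafe.Add() offsets a managed pointer by a number of elements. Since Unsafe performs no bounds checks, the helper validates the index itself:

```csharp
using System;
using System.Runtime.CompilerServices;

static class ManagedPointerArithmeticDemo
{
    public static int SumFrom(int[] array, int start)
    {
        if ((uint)start >= (uint)array.Length)
            throw new ArgumentOutOfRangeException(nameof(start));

        // Translate the managed pointer from element 0 to the start of the slice.
        ref int sliceStart = ref Unsafe.Add(ref array[0], start);

        int sum = 0;
        for (int i = 0; i < array.Length - start; i++)
            sum += Unsafe.Add(ref sliceStart, i);  // read the element at sliceStart + i
        return sum;
    }

    static void Main()
    {
        Console.WriteLine(SumFrom(new[] { 10, 20, 30, 40 }, 2)); // prints 70
    }
}
```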

More on pointer arithmetic in safe code can be found in this article: Unsafe array access and pointer arithmetics in C#

C# 8.0 disposable ref structs

As we’ve seen, a ref struct cannot implement an interface and thus cannot implement IDisposable. However, since C# 8.0 a ref struct can dispose of its internal state and the resources it holds through an accessible void Dispose() method. Thus the using pattern can be applied, and the Dispose() method is called when leaving the using scope.
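A sketch of this pattern, assuming a hypothetical `RentedBuffer` ref struct that wraps an array rented from ArrayPool<byte>:

```csharp
using System;
using System.Buffers;

ref struct RentedBuffer
{
    private byte[] _array;

    public RentedBuffer(int minimumLength)
    {
        _array = ArrayPool<byte>.Shared.Rent(minimumLength);
    }

    // The rented array may be longer than requested.
    public Span<byte> Span => _array;

    // No interface involved: the compiler binds 'using' to this method by pattern.
    public void Dispose()
    {
        if (_array == null) return;
        ArrayPool<byte>.Shared.Return(_array);
        _array = null;
    }
}

static class DisposableRefStructDemo
{
    public static int FillAndSum(int count)
    {
        using (var buffer = new RentedBuffer(count))
        {
            Span<byte> span = buffer.Span.Slice(0, count);
            int sum = 0;
            for (int i = 0; i < span.Length; i++)
            {
                span[i] = (byte)i;
                sum += span[i];
            }
            return sum;
        } // buffer.Dispose() is called when leaving the using scope
    }

    static void Main()
    {
        Console.WriteLine(FillAndSum(4)); // prints 6
    }
}
```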

C# 11 ref field

Some users asked the .NET runtime team to make the internal ByReference<T> trick available to all .NET developers. With a bit of cleverness it was possible to embed a 1-length Span<T> to get access to it (see here). A C# 11 ref field does exactly that: it holds a managed pointer within a field. Of course, because of the managed pointer stack requirement, a ref field can only be declared as a field of a ref struct.

The C# language specification for ref field is quite long and verbose, and I hope that with everything explained above you now get the point. Of course some extra care is needed when it comes to ref fields.

Unlike other ref scenarios, with ref fields it is easy to end up with a null managed pointer.

managed ref field can be null

It is possible to test for a managed pointer nullity through helpers within the class Unsafe mentioned above.
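A sketch with a hypothetical `IntRef` ref struct (this requires C# 11 and .NET 7 or later). A default instance holds a null managed pointer, which Unsafe.IsNullRef() can detect without dereferencing it:

```csharp
using System;
using System.Runtime.CompilerServices;

ref struct IntRef
{
    public ref int Target;  // C# 11 ref field

    public IntRef(ref int target)
    {
        Target = ref target;
    }

    // True when the ref field was never ref-assigned.
    public bool IsNull => Unsafe.IsNullRef(ref Target);
}

static class NullRefFieldDemo
{
    static void Main()
    {
        IntRef empty = default;            // Target is a null managed pointer
        Console.WriteLine(empty.IsNull);   // prints True
        // int oops = empty.Target;        // would throw NullReferenceException at run time

        int x = 5;
        var bound = new IntRef(ref x);
        bound.Target = 6;                  // writes through the ref field into x
        Console.WriteLine($"{bound.IsNull} {x}"); // prints "False 6"
    }
}
```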

Let’s explore an interesting use-case for ref field. A linked list whose nodes live on the stack (this example comes from the ref field language design discussion):

Of course Span<T> remains the shiniest application of managed pointer flexibility and performance gain. But this short example is a good indication that a significant range of algorithms and runtime constructs can benefit from this new possibility. Just keep in mind that the size of a thread stack is limited: by default 4MB on 64-bit and 1MB on 32-bit. A StackOverflowException could easily result from a wrong usage of this StackLinkedListNode<T>. And keep in mind too that ref fields are not just for stack-only data: often a Span<T> points to a memory slice that belongs to the heap.

readonly ref readonly

A C# 11 ref field can be declared as ref, ref readonly, readonly ref, or readonly ref readonly. What? This looks complex but is actually straightforward. The readonly modifier applies either to the reference itself or to the referenced value. This is illustrated by the screenshot below:

C#11 readonly ref readonly
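The four combinations can be sketched with a hypothetical `Refs` ref struct (C# 11). The assignments rejected by the compiler are left as comments:

```csharp
using System;

ref struct Refs
{
    public ref int A;                    // can be re-pointed, pointee can be written
    public ref readonly int B;           // can be re-pointed, pointee is read-only
    public readonly ref int C;           // cannot be re-pointed, pointee can be written
    public readonly ref readonly int D;  // cannot be re-pointed, pointee is read-only

    public Refs(ref int value)
    {
        A = ref value;
        B = ref value;
        C = ref value;
        D = ref value;
    }
}

static class ReadonlyRefDemo
{
    public static (int, int) Run()
    {
        int x = 1, y = 2;
        var refs = new Refs(ref x);

        refs.A = 10;        // writes x
        refs.A = ref y;     // re-points A to y
        refs.A = 20;        // writes y
        // refs.B = 10;     // rejected: the pointee of B is read-only
        refs.B = ref y;     // re-pointing B is allowed
        refs.C = 30;        // writes x
        // refs.C = ref y;  // rejected: C cannot be re-pointed
        // refs.D = 40;     // rejected: the pointee of D is read-only
        // refs.D = ref y;  // rejected: D cannot be re-pointed
        return (x, y);
    }

    static void Main()
    {
        Console.WriteLine(Run()); // prints (30, 20)
    }
}
```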

As we saw earlier, since C# 7.2 the compiler is smart enough to prevent some unnecessary copy thanks to the readonly usage.

C# 11 structure returning a managed pointer to its field

With C# 11, a structure (not necessarily a ref struct) can now return a managed pointer to its field. This example of frugal list is also proposed in the ref field language design discussion:
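A reduced sketch inspired by that frugal list idea, not the original listing. It assumes the [UnscopedRef] attribute from System.Diagnostics.CodeAnalysis (.NET 7), which is what lets a struct member return a managed pointer to one of its own fields:

```csharp
using System;
using System.Diagnostics.CodeAnalysis;

struct FrugalList<T>
{
    private T _item0;
    private T _item1;
    private T _item2;

    public ref T this[int index]
    {
        // Without [UnscopedRef] the compiler rejects returning a field of 'this' by reference.
        [UnscopedRef]
        get
        {
            switch (index)
            {
                case 0: return ref _item0;
                case 1: return ref _item1;
                case 2: return ref _item2;
                default: throw new ArgumentOutOfRangeException(nameof(index));
            }
        }
    }
}

static class FrugalListDemo
{
    public static (int, int, int) Run()
    {
        var list = new FrugalList<int>();
        list[0] = 1;                  // assigns through the returned managed pointer
        ref int second = ref list[1]; // managed pointer to a field of the local struct
        second = 2;
        return (list[0], list[1], list[2]);
    }

    static void Main()
    {
        Console.WriteLine(Run()); // prints (1, 2, 0)
    }
}
```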

C# 11 scoped keyword

The new keyword scoped defines a compile-time guarantee about the lifetime of managed pointers.

Its usage is best explained with the two screenshots below. scoped is like a contract that a method shows to its callers. It guarantees that a managed pointer passed as a parameter won’t escape the scope of the method.

Of course the compiler doesn’t accept to pass a managed pointer to a method that doesn’t use the scoped keyword.

C#11 scoped keyword restriction 2
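A sketch of the contract, with hypothetical methods `CopyToHeap` and `Caller` (C# 11). The caller may return the result only because the parameter is declared scoped:

```csharp
using System;

static class ScopedParameterDemo
{
    // 'scoped' promises callers that 'input' is neither returned nor stored.
    public static Span<int> CopyToHeap(scoped Span<int> input)
    {
        // return input;        // rejected: a scoped parameter cannot escape the method
        return input.ToArray(); // a fresh heap array converts implicitly to Span<int>
    }

    public static Span<int> Caller()
    {
        Span<int> local = stackalloc int[] { 1, 2, 3 };
        // Without 'scoped' on the parameter, the compiler would assume that the
        // returned span may wrap the stack buffer and would reject this return.
        return CopyToHeap(local);
    }

    static void Main()
    {
        Console.WriteLine(Caller()[2]); // prints 3
    }
}
```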

Finally, the keyword scoped can also be used on a local managed pointer to prevent it from escaping:

C#11 scoped keyword on local
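A sketch of a scoped local, with hypothetical names (C# 11). Both spans view the same heap array, but only the unrestricted one may be returned:

```csharp
using System;

static class ScopedLocalDemo
{
    public static Span<int> Create()
    {
        Span<int> heap = new int[] { 1, 2, 3 };
        scoped Span<int> restricted = heap;  // same memory, but not allowed to escape
        restricted[0] = 10;                  // using it locally is fine
        // return restricted;                // rejected: a scoped local cannot escape the method
        return heap;                         // fine: this span wraps a heap array
    }

    static void Main()
    {
        Console.WriteLine(Create()[0]); // prints 10
    }
}
```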

Conclusion

Managed pointers have been here since the inception of .NET, awaiting some runtime and language improvements to unleash their flexibility and performance gain in safe user code. A few points are still being discussed by language designers, but with .NET 7.0 and C# 11 we now get the bulk of these improvements. Let’s recap the key points:

  • Managed pointers sit in between pointers and object references.
  • A managed pointer always lives on the thread stack. There is no risk that it gets relocated by the GC.
  • A managed pointer can point to a location within an object on the heap (like an element of an int[] or a char position within a string). This is known as an interior pointer. The GC updates the interior pointer when it relocates the object. Thus, unlike regular pointers, managed pointers belong to the safe world.
  • Since C# 7.0 the C# keyword ref, which is the keyword for managed pointers, has been used in an increasing number of scenarios. The primary motivation was to obtain a fast and generic memory slice implementation based on managed pointers through Span<T>. Span<T> holds a managed pointer to the pointed memory slice. This makes it faster than something like Memory<T>.
  • Since managed pointers must live on the stack, the concept of ref struct was added to C# to prevent Span<T> from living anywhere other than the stack.
  • A managed pointer can point to any kind of memory representation. This flexibility was an important point when designing a generic memory helper like Span<T>.
  • C# 11 opened up to all developers what allowed Span<T> to hold a managed pointer as a field. This is the new ref field language construct.

Now you can improve the performance of some of your low-level algorithms by re-implementing them with managed pointers.


Comments:

  1. Good article overall so far. I’m looking for something i can point less experienced engineers at. It is likely somewhat ill advised commenting before finishing the read, but I’ll take the risk for now.

    On one hand i appreciate the fact that you are going with the orthodox line that “Pinning objects is not advised as it can affect performance”… as it even says it is bad in the CLR/GC c++ source code on github.

    That’s said, as someone who has used unsafe pointers on high performance/throughput and distributed routines in .NET in anger most of his career (less though these days thanks to all the new stuff) that whilst it is possible to affect performance as stated, you really have to do something pretty stupid or generally ignorant to make that particular predicted outcome a reality – especially given how GC actually works internally (see github/dotnet/runtime in-repo doco/code and a certain 2018 Apress published book for further detail). I guess i’m stating that maybe the message should be more along the lines of ‘this is the official line, but so long as you only use unsafe pointers if all the other options have been exhausted first, then unless you are doing anything particularly esoteric or otherwise ill advised, you shouldn’t have any issues – and you can always profile your changes to verify too.’

    I guess I simply dislike the blanket “it’s bad! ‘k?”, when I know that it patently isn’t 95% of the time…
