Faster Dictionary in C#

September 2, 2024 5 minutes read

In the .NET Base Class Library, Dictionary<TKey, TValue> is an essential key-based hashtable providing constant time access to values. This means accessing dictionary[key] takes, on average, the same amount of time regardless of the number of entries in the dictionary. Constant time access is usually referred to as O(1) access.

Avoiding duplicate lookup

Rewriting an algorithm to use a hashtable can drastically boost performance. However, every access still pays for an internal lookup (hashing the key, then locating its entry), and even though this lookup is O(1), it is the main cost of a hashtable. An algorithm becomes less efficient when it looks up the same key more than once.

A common dictionary use-case is to add an entry if the key is not already there:

var dico = new Dictionary<Guid, string>();
var guid = Guid.NewGuid();
var value = guid.ToString();
if(!dico.ContainsKey(guid)) {
   dico.Add(guid, value);
}

Here the lookup for the key guid is performed twice: once in dico.ContainsKey(guid) and once in dico.Add(guid, value). To eliminate this duplicate lookup, the method TryAdd(key, value) was added to Dictionary<TKey, TValue> in .NET Core 2.0, released in 2017. It returns true if the entry was added and false if the key was already present. You can now write instead:

bool added = dico.TryAdd(guid, value);
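
To make the return value concrete, here is a minimal sketch (the values "first" and "second" are placeholders chosen for illustration). It shows that a second TryAdd() on the same key neither throws nor overwrites the existing value:

```csharp
using System;
using System.Collections.Generic;

var dico = new Dictionary<Guid, string>();
var guid = Guid.NewGuid();

// First call: the key is absent, so the entry is inserted.
bool addedFirst = dico.TryAdd(guid, "first");

// Second call: the key is present, so nothing changes and no exception is thrown.
bool addedSecond = dico.TryAdd(guid, "second");

// Prints "True False first"
Console.WriteLine($"{addedFirst} {addedSecond} {dico[guid]}");
```

If the existing value must be replaced, the indexer dico[guid] = value remains the tool to use; TryAdd() only covers the add-if-absent case.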

Let’s use BenchmarkDotNet to measure how much faster this method is. Here is the benchmark result:

| Method                    | Mean     | Allocated |
|-------------------------- |---------:|----------:|
| TryAddKeyValuePair        | 274.6 ns |     518 B |
| TryAddKeyValuePair_Faster | 211.4 ns |         - |

Here is the benchmark code:

using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkRunner.Run<Benchmarks>();

[MemoryDiagnoser]
[HideColumns("StdDev", "Median", "Job", "RatioSD", "Error", "Gen0", "Alloc Ratio")]
[Config(typeof(FastRunConfig))] // FastRunConfig is a custom config class not shown in this article
public class Benchmarks {

   static readonly Dictionary<Guid, string> s_Dico1 = new Dictionary<Guid, string>();
   const string s_Value = "value";
   [Benchmark]
   public void TryAddKeyValuePair() {
      var key = Guid.NewGuid();
      if(!s_Dico1.ContainsKey(key)) {
         s_Dico1.Add(key, s_Value);
      }
   }

   static readonly Dictionary<Guid, string> s_Dico2 = new Dictionary<Guid, string>();
   [Benchmark]
   public void TryAddKeyValuePair_Faster() {
      var key = Guid.NewGuid();
      bool added = s_Dico2.TryAdd(key, s_Value);
   }
}

To add the BenchmarkDotNet package to your project, add this to the .csproj:

  <ItemGroup>
    <PackageReference Include="BenchmarkDotNet" Version="0.14.0" />
  </ItemGroup>

Notice that the method TryAdd(key, value) is not available in the good old .NET Framework.

Accessing dictionary internal layout with managed pointers

There remain duplicate lookup scenarios not addressed by the TryAdd(key, value) method. This is why two methods were added to the class System.Runtime.InteropServices.CollectionsMarshal in .NET 6 released in 2021:

public static ref TValue GetValueRefOrNullRef<TKey,TValue>
   (System.Collections.Generic.Dictionary<TKey,TValue> dictionary, TKey key);

public static ref TValue? GetValueRefOrAddDefault<TKey,TValue>
   (System.Collections.Generic.Dictionary<TKey,TValue> dictionary, TKey key, out bool exists);

The CollectionsMarshal documentation describes it as an unsafe class that provides a set of methods to access the underlying data representations of collections.

The ref keyword and managed pointers

The ref keywords in the method signatures above, together with the terms unsafe and underlying data representations of collections, mean that we are entering the realm of managed pointers. Managed pointers are a .NET runtime feature that has existed since .NET 1.0, released in 2002. C# 1.0 relied on them to pass parameters to a method with the ref and out keywords. A managed pointer must live on the stack and can point to any kind of memory: an object reference, a slot within an array, a field within an object, or unmanaged memory. In a runtime with a garbage collector like .NET, managed pointers offer the best of both worlds:

They are as fast as a regular C or C++ pointer
They are tracked by the runtime in case the memory pointed gets moved by the garbage collector

This is why, since the release of C# 7.0 in 2017, Microsoft has steadily expanded the use of the ref keyword to a broader set of scenarios, including ref locals, ref returns, ref struct, and ref fields. This is all explained in this article: Managed pointers, Span, ref struct, C#11 ref fields and the scoped keyword
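
As a quick refresher before applying this to dictionaries, the sketch below illustrates a ref local and a ref return on a plain array. The array contents and the local function Largest are made up for the example; the point is only that a ref is an alias to a storage location, not a copy of the value stored there:

```csharp
using System;

int[] numbers = { 10, 20, 30 };

// A ref local: an alias to the second slot of the array, not a copy of its value.
ref int slot = ref numbers[1];
slot = 25;                      // writes directly into numbers[1]

// A ref return: the local function hands back an alias to the largest element.
static ref int Largest(int[] values) {
   int index = 0;
   for (int i = 1; i < values.Length; i++) {
      if (values[i] > values[index]) { index = i; }
   }
   return ref values[index];
}

ref int largest = ref Largest(numbers);
largest = 0;                    // overwrites numbers[2] in place

// Prints "10, 25, 0"
Console.WriteLine(string.Join(", ", numbers));
```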

Scenario 1: Creating the value only if the key is not present in the dictionary

Now let’s see how we can concretely improve our dictionaries’ performance. Scenario 1 is similar to the previous one with TryAdd(), but here the value is created only when the key is not already present in the dictionary. TryAdd() cannot be used anymore because it requires the value to be created before the call, whether or not the key is present:

var key = Guid.NewGuid();
if(!dico.ContainsKey(key)) {
   // Create the object to be used as key's value
   // because the key is not present in the dictionary
   string val = key.ToString(); 
   dico.Add(key, val);
}

Thanks to the method CollectionsMarshal.GetValueRefOrAddDefault() it is possible to address this scenario with only a single lookup for key:

var key = Guid.NewGuid();
ref string pointerToValLocation = 
   ref CollectionsMarshal.GetValueRefOrAddDefault(
      dico, key, out bool exists)!;
if (!exists) {
   // Create the object to be used as key's value
   // because the key is not present in the dictionary
   var val = key.ToString();
   pointerToValLocation = val;
}

GetValueRefOrAddDefault() behaves as follows: if the key is absent from the dictionary, it adds the key with a default value (default(string), which is null) and sets exists to false. In both cases it returns a reference to the value’s location within the dictionary’s layout.
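
This behavior can be checked with a short sketch. The variable names (slot, existedBefore and so on) are only illustrative; the calls are the CollectionsMarshal and Dictionary methods discussed above:

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

var dico = new Dictionary<Guid, string>();
var key = Guid.NewGuid();

ref string? slot = ref CollectionsMarshal.GetValueRefOrAddDefault(dico, key, out bool exists);

// The key was absent: it has just been added with default(string), i.e. null.
bool existedBefore = exists;
bool presentAfterCall = dico.ContainsKey(key);
bool valueIsNullAfterCall = dico[key] is null;

// Writing through the reference updates the entry without another lookup.
slot = key.ToString();
bool valueStored = dico[key] == key.ToString();

// A second call finds the entry and reports exists == true.
CollectionsMarshal.GetValueRefOrAddDefault(dico, key, out bool existsNow);

// Prints "False True True True True"
Console.WriteLine($"{existedBefore} {presentAfterCall} {valueIsNullAfterCall} {valueStored} {existsNow}");
```

Note that between the call and the assignment the dictionary really does contain a null value for the key, so the value should be assigned before any other code can observe the entry.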

You might be intrigued by the ! character at the end of:

ref string pointerToValLocation = 
   ref CollectionsMarshal.GetValueRefOrAddDefault(
      dico, key, out bool exists)!;

This is the null-forgiving operator. The method returns a ref string? because a newly added value is null, and the ! simply instructs the compiler not to emit a nullable-related warning when that reference is assigned to a ref string.

Benchmarking scenario 1

Here is the benchmark result:

| Method                         | Mean     | Gen1   | Allocated |
|------------------------------- |---------:|-------:|----------:|
| CreateValIfKeyNotInDico        | 575.5 ns | 0.0076 |      96 B |
| CreateValIfKeyNotInDico_Faster | 361.9 ns | 0.0073 |      96 B |

Here is the benchmark code:

using System.Runtime.InteropServices;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkRunner.Run<Benchmarks>();

[MemoryDiagnoser]
[HideColumns("StdDev", "Median", "Job", "RatioSD", "Error", "Gen0", "Alloc Ratio")]
public class Benchmarks {

   static readonly Dictionary<Guid, string> s_Dico1 = new();
   [Benchmark]
   public void CreateValIfKeyNotInDico() {
      var key = Guid.NewGuid();
      if(!s_Dico1.ContainsKey(key)) {
         // Create the object to be used as key's value
         // because the key is not present in the dictionary
         string val = key.ToString(); 
         s_Dico1.Add(key, val);
      }
   }

   static readonly Dictionary<Guid, string> s_Dico2 = new();
   [Benchmark]
   public void CreateValIfKeyNotInDico_Faster() {
      var key = Guid.NewGuid();
      ref string pointerToValLocation = ref CollectionsMarshal.GetValueRefOrAddDefault(s_Dico2, key, out bool exists);
      if (!exists) {
         // Create the object to be used as key's value
         // because the key is not present in the dictionary
         var val = key.ToString();
         pointerToValLocation = val;
      }
   }
}

Scenario 2: Modifying struct values in a dictionary

If the values in a dictionary are structs and you need to modify a value, a duplicate lookup is required. This happens because accessing the value through dico[key] returns a copy of the struct stored in the dictionary, so the modified copy must be written back with a second lookup.

var key = Guid.NewGuid();
var dico = new Dictionary<Guid, int>() { { key, 0 } };
dico[key]++;
// compiled to
// dico[key] = dico[key] + 1;

Again, using a managed pointer to the value’s location within the dictionary can eliminate the need for the second lookup:

using System.Runtime.InteropServices;
var key = Guid.NewGuid();
var dico = new Dictionary<Guid, int>() { { key, 0 } };
ref int pointerToValLocation = ref CollectionsMarshal.GetValueRefOrNullRef(dico, key);
//if(!Unsafe.IsNullRef(ref pointerToValLocation)) // Unsafe is in System.Runtime.CompilerServices
pointerToValLocation++;

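If it’s uncertain whether the key is present in the dictionary, you must test the reference with if (!Unsafe.IsNullRef(ref pointerToValLocation)) rather than if (pointerToValLocation != null). The latter compares the int stored at that location with null, not the reference itself, so the compiler warns that the result of the expression is always true.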
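
The sketch below puts the null-reference check to work on a struct with several fields. It is only an illustration: the scores dictionary, the tuple shape (Wins, Losses) and the helper TryAddWin are hypothetical names introduced here, not part of the benchmarks in this article.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Values are mutable structs (value tuples), keyed by a hypothetical player name.
var scores = new Dictionary<string, (int Wins, int Losses)> { ["alice"] = (0, 0) };

// scores["alice"].Wins++;   // does not compile: the indexer returns a copy (error CS1612)

static bool TryAddWin(Dictionary<string, (int Wins, int Losses)> dico, string name) {
   ref (int Wins, int Losses) entry = ref CollectionsMarshal.GetValueRefOrNullRef(dico, name);
   if (Unsafe.IsNullRef(ref entry)) {
      return false;             // key absent: the reference points nowhere, do not touch it
   }
   entry.Wins++;                // updates the struct stored inside the dictionary
   return true;
}

bool aliceUpdated = TryAddWin(scores, "alice");
bool bobUpdated = TryAddWin(scores, "bob");
int aliceWins = scores["alice"].Wins;

// Prints "True False 1"
Console.WriteLine($"{aliceUpdated} {bobUpdated} {aliceWins}");
```

Unlike GetValueRefOrAddDefault(), GetValueRefOrNullRef() never adds the missing key, which is why "bob" is still absent after the second call.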

Benchmarking scenario 2

Here is the benchmark result:

| Method                   | Mean      | Allocated |
|------------------------- |----------:|----------:|
| StructValueInDico        | 10.875 us |         - |
| StructValueInDico_Faster |  5.161 us |         - |

Here is the benchmark code:

using System.Runtime.InteropServices;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkRunner.Run<Benchmarks>();

[MemoryDiagnoser]
[HideColumns("StdDev", "Median", "Job", "RatioSD", "Error", "Gen0", "Alloc Ratio")]
public class Benchmarks {

   static readonly Guid s_Guid = Guid.NewGuid();

   static readonly Dictionary<Guid, int> s_Dico1 = new() { { s_Guid, 0 } };
   
   [Benchmark]
   public void StructValueInDico() {
      for(int i=0; i < 1000; i++) {
         s_Dico1[s_Guid] += 1;
         // Equivalent to 
         //s_Dico1[s_Guid] = s_Dico1[s_Guid] + 1;
      }
   }

   static readonly Dictionary<Guid, int> s_Dico2 = new() { { s_Guid, 0 } };
   [Benchmark]
   public void StructValueInDico_Faster() {
      for (int i = 0; i < 1000; i++) {
         ref int pointerToValLocation = ref CollectionsMarshal.GetValueRefOrNullRef(s_Dico2, s_Guid);
         //if(!Unsafe.IsNullRef(ref pointerToValLocation))
         pointerToValLocation++;
      }
   }
}

Scenario 3: Large struct values in a dictionary

In this scenario, the dictionary values are large structs. With the usual approach, accessing a value results in a copy, which incurs a performance cost proportional to the struct footprint. Instead, we can access the large struct value using a managed pointer. Below are the benchmark results:

| Method                   | Mean      | Allocated |
|------------------------- |----------:|----------:|
| LargeStructInDico        | 21.553 us |         - |
| LargeStructInDico_Faster |  5.158 us |         - |

And here is the benchmark code:

using System.Runtime.InteropServices;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;

BenchmarkRunner.Run<Benchmarks>();

[MemoryDiagnoser]
[HideColumns("StdDev", "Median", "Job", "RatioSD", "Error", "Gen0", "Alloc Ratio")]
public class Benchmarks {

   static readonly Guid s_Guid = Guid.NewGuid();

   static readonly Dictionary<Guid, LargeStruct> s_Dico = new() { { s_Guid, new LargeStruct() } };

   public struct LargeStruct {
      internal Guid f1; Guid f2; Guid f3; Guid f4; Guid f5; Guid f6; Guid f7; Guid f8;
      Guid f9; Guid f10; Guid f11; Guid f12; Guid f13; Guid f14; Guid f15; Guid f16;
      Guid f17; Guid f18; Guid f19; Guid f20; Guid f21; Guid f22; Guid f23; Guid f24;
      Guid f25; Guid f26; Guid f27; Guid f28; Guid f29; Guid f30; Guid f31; Guid f32;
      Guid f33; Guid f34; Guid f35; Guid f36; Guid f37; Guid f38; Guid f39; Guid f40;
      Guid f41; Guid f42; Guid f43; Guid f44; Guid f45; Guid f46; Guid f47; Guid f48;
      Guid f49; Guid f50; Guid f51; Guid f52; Guid f53; Guid f54; Guid f55; Guid f56;
      Guid f57; Guid f58; Guid f59; Guid f60; Guid f61; Guid f62; Guid f63; Guid f64;
   }

   [Benchmark]
   public Guid LargeStructInDico() {
      Guid guid = default;
      for(int i=0; i < 1000; i++) {
         var largeStruct = s_Dico[s_Guid];  // The large struct is copied into largeStruct
         guid = largeStruct.f1; // do something with largeStruct
      }
      return guid;
   }

   [Benchmark]
   public Guid LargeStructInDico_Faster() {
      Guid guid = default;
      for (int i = 0; i < 1000; i++) {
         ref LargeStruct pointerToValLocation = ref CollectionsMarshal.GetValueRefOrNullRef(s_Dico, s_Guid);
         //if(!Unsafe.IsNullRef(ref pointerToValLocation))
         guid = pointerToValLocation.f1; // do something with largeStruct
      }
      return guid;
   }
}
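
When the large struct only needs to be read, one possible refinement (a suggestion, not something measured in the benchmark above) is to hold the reference as ref readonly, so the struct is still not copied but cannot be modified by accident. In this sketch System.Numerics.Matrix4x4, a 64-byte struct, stands in for LargeStruct, and the key "identity" is a placeholder:

```csharp
using System;
using System.Collections.Generic;
using System.Numerics;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Matrix4x4 (16 floats, 64 bytes) stands in for a large struct value.
var transforms = new Dictionary<string, Matrix4x4> { ["identity"] = Matrix4x4.Identity };

ref Matrix4x4 slot = ref CollectionsMarshal.GetValueRefOrNullRef(transforms, "identity");

float trace = 0f;
if (!Unsafe.IsNullRef(ref slot)) {
   // ref readonly: still an alias to the dictionary's storage, but read-only.
   ref readonly Matrix4x4 m = ref slot;
   trace = m.M11 + m.M22 + m.M33 + m.M44;   // fields are read in place, no 64-byte copy
   // m.M11 = 2f;   // does not compile: m is a readonly reference
}

// Prints "4"
Console.WriteLine(trace);
```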

Caution when using managed pointers on a dictionary

Using CollectionsMarshal methods with a dictionary can be powerful but requires careful handling. They let you obtain managed pointers into the dictionary’s internal layout. However, a dictionary’s layout at runtime is complex and relies on internal arrays that hold the buckets and the entries. A managed pointer references a slot within one of these arrays. While the garbage collector safely updates such managed pointers if it moves an array, problems can arise when entries are added or removed. Such operations might cause the dictionary to reorganize its internal layout with new arrays, potentially leaving your managed pointer pointing to an orphaned array.

Don’t add or remove entries from a dictionary while holding a managed pointer obtained through a CollectionsMarshal method on that dictionary.
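
The following sketch illustrates the hazard. It is deliberately written without an expected output for the stale write, because what the dictionary returns after such a write depends on its internal implementation; the entry count of 1000 is an arbitrary number chosen to make a reorganization very likely:

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

var dico = new Dictionary<int, int> { [1] = 0 };

ref int slot = ref CollectionsMarshal.GetValueRefOrNullRef(dico, 1);

// Adding many entries forces the dictionary to grow its internal storage.
for (int i = 2; i <= 1000; i++) {
   dico[i] = i;
}

// DON'T: slot may now alias storage the dictionary no longer uses,
// so this write can be silently lost.
slot = 42;
int throughStaleRef = slot;
int throughDictionary = dico[1];   // not guaranteed to be 42

// DO: fetch a fresh reference after any addition or removal.
ref int fresh = ref CollectionsMarshal.GetValueRefOrNullRef(dico, 1);
fresh = 42;

Console.WriteLine($"{throughStaleRef} {throughDictionary} {dico[1]}");
```

The stale write does not crash, because the managed pointer keeps the old array alive, which is exactly what makes this kind of bug hard to spot.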

Conclusion

Year after year, Microsoft enhances .NET to enable developers to write even more performant code. Managed pointers, accessible via the ref keyword, have become a crucial tool in achieving this goal. The initial push for increased use of managed pointers came with the introduction of Span<T>, as explained in Improve C# code performance with Span<T>. However, they have proven valuable in various scenarios, including optimizing dictionaries, faster array and list loops, and improving performance in C# 13 params collections.

This article is brought to you by the team behind NDepend — a proven .NET static analysis tool for improving code maintainability, security, and overall quality. Whether you’re modernizing a legacy .NET application or starting fresh in C#, get started with your free full-featured trial today!
