NDepend Blog


C#9 records: immutable classes

October 12, 2020 8 minutes read


Records are a long-awaited feature introduced with C# 9. With record, an immutable type such as record Person(string FirstName, string LastName) can be declared in a single line.
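As a minimal sketch of such a declaration, here is the Person record named later in this article, written as a small C# 9 program with top-level statements (the sample values are only illustrative):

```csharp
using System;

var person = new Person("Paul", "Smacchia");
Console.WriteLine(person.FirstName);   // Paul
Console.WriteLine(person.LastName);    // Smacchia

// A positional record: this single line declares an immutable type
// with two properties, FirstName and LastName.
public record Person(string FirstName, string LastName);
```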

Isn’t it beautiful? In the NDepend code itself we have several dozen immutable classes that will be refactored into such single-line declarations!

record is somewhat similar to string: it looks easy, and all developers will use it every day, but few will realize that tricky things happen under the hood.

Actually, record is really similar to string: both are classes offering value-based semantics. Value-based semantics result from two properties:

  • Value-based equality
  • Immutability

Value-based equality

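Usually .NET classes rely on reference-based equality: two references are equal only if they point to the same object, even when two distinct objects hold identical data.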
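To make reference-based equality concrete, here is a minimal sketch. PersonClass is a hypothetical plain class introduced only for this illustration; it overrides no equality member.

```csharp
using System;

var a = new PersonClass("Paul", "Smacchia");
var b = new PersonClass("Paul", "Smacchia");
var c = a;

Console.WriteLine(a == b);        // False: two distinct objects
Console.WriteLine(a.Equals(b));   // False: Object.Equals() compares references by default
Console.WriteLine(a == c);        // True: both variables reference the same object

// Hypothetical plain class, with no equality members overridden.
public class PersonClass
{
    public PersonClass(string firstName, string lastName)
    {
        FirstName = firstName;
        LastName = lastName;
    }

    public string FirstName { get; }
    public string LastName { get; }
}
```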

string and record are classes, but they rely on value-based equality: two distinct instances that hold the same values compare as equal.
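A short sketch of this behavior, assuming the Person record used throughout this article. The strings are built at run time on purpose, so that the two variables really reference two distinct objects:

```csharp
using System;

// Two distinct string objects with the same content.
var s1 = new string('a', 3);
var s2 = new string('a', 3);
Console.WriteLine(object.ReferenceEquals(s1, s2));   // False
Console.WriteLine(s1 == s2);                         // True

// Two distinct record objects with the same content.
var r1 = new Person("Paul", "Smacchia");
var r2 = new Person("Paul", "Smacchia");
Console.WriteLine(object.ReferenceEquals(r1, r2));   // False
Console.WriteLine(r1 == r2);                         // True
Console.WriteLine(r1.Equals(r2));                    // True

public record Person(string FirstName, string LastName);
```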

int, double, bool, struct and enum types are all value types. Two instances of a value type are equal when the values they contain are equal. For a structure with several fields, the default Equals() implementation compares the fields one by one. Records and strings behave the same way.

Value-based equality feels natural for strings and, with time, it will feel natural for records as well. But the implementation of this behavior is not trivial. Later in this article we’ll show what the compiler generates for record Person(string FirstName, string LastName). You will see that the code handling equality is quite complex. But before that, let’s explain the second property of value-based semantics: immutability.
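For comparison, here is how a plain structure behaves. Point2D is a hypothetical type used only for this sketch. Note that a struct gets a value-based Equals() for free, but no == operator unless it declares one.

```csharp
using System;

var p1 = new Point2D { X = 1, Y = 2 };
var p2 = new Point2D { X = 1, Y = 2 };
var p3 = new Point2D { X = 1, Y = 3 };

Console.WriteLine(p1.Equals(p2));   // True: all fields are equal
Console.WriteLine(p1.Equals(p3));   // False: Y differs

// Hypothetical structure with two fields and no custom equality code.
public struct Point2D
{
    public int X;
    public int Y;
}
```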

Immutability

The second core property of string and record value-based semantics is immutability. Basically, an object is immutable if its state cannot change once the object has been created. Consequently, a class is immutable if it is declared in such a way that all its instances are immutable.

I remember a discussion with a developer who got nervous about immutability. It looked like an unnatural constraint to him: he wanted his object’s state to change. But he didn’t realize that something he used every day, string operations, relied on immutability. When you modify a string, a new string object actually gets created. Records behave the same way. Moreover, C#9 introduces a clean new syntax based on the keyword with: a with expression creates a modified copy of a record and leaves the original untouched.
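The parallel between string operations and the with keyword can be sketched as follows, again assuming the Person record of this article (the new first name is only an illustrative value):

```csharp
using System;

// A string operation returns a new string; the original is unchanged.
var lower = "hello";
var upper = lower.ToUpperInvariant();
Console.WriteLine(lower);   // hello
Console.WriteLine(upper);   // HELLO

// A with expression returns a new record; the original is unchanged.
var person1 = new Person("Paul", "Smacchia");
var person2 = person1 with { FirstName = "Patrick" };
Console.WriteLine(person1.FirstName);                          // Paul
Console.WriteLine(person2.FirstName);                          // Patrick
Console.WriteLine(person2.LastName);                           // Smacchia
Console.WriteLine(object.ReferenceEquals(person1, person2));   // False

public record Person(string FirstName, string LastName);
```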

Pros and Cons of Immutability

Immutability is an essential functional programming property. Here is why functional programmers love it:

  • Immutable objects are thread-safe and, as a consequence, they greatly simplify concurrent programming. Several threads can hold a reference to the same immutable object since its access is read-only.
  • Immutable objects are side-effect free: you can pass such an object to a method and be sure it won’t be modified.
  • Immutable objects can be used to optimize memory: for example, the .NET runtime interns the literal strings declared in your code. It means that the runtime holds a hash table of literal strings to make sure that a literal string doesn’t get duplicated. Notice that non-literal strings are not interned because “doing so would save a tiny amount of memory in exchange for making the common cases much slower”.
  • The flyweight design pattern relies on reusing lightweight immutable instances. A classic example is the graphical representation of characters: it is preferable to have a single immutable instance of the glyph ‘a’ with the font Times, the font size 12 and other formatting data. This single instance can then be shared massively within a large document model that contains thousands of such ‘a’.
  • Immutable types lead to fewer bugs. State mutability is error-prone: many bugs are caused by a state changing unexpectedly, even in single-threaded scenarios.
  • A hash table is one of the most powerful programming tools to improve performance, when used well. And a hash table needs its keys to be immutable.
  • Immutable objects are useful in a number of situations including relational-database access and messages within a distributed architecture. In these scenarios the notion of in-process reference is irrelevant. The identity of entities is based on a unique identifier scheme. And identifiers fit well with the value-based equality described above.
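Two of the points above, string interning and immutable hash keys, can be observed with a few lines. This is only a sketch: the variable names and the stored value are made up for the illustration.

```csharp
using System;
using System.Collections.Generic;

// Literal strings are interned: both variables reference one object.
string literal1 = "NDepend";
string literal2 = "NDepend";
// A string built at run time is a distinct, non-interned object.
string built = new string(new[] { 'N', 'D', 'e', 'p', 'e', 'n', 'd' });

Console.WriteLine(object.ReferenceEquals(literal1, literal2));               // True
Console.WriteLine(object.ReferenceEquals(literal1, built));                  // False
Console.WriteLine(object.ReferenceEquals(literal1, string.Intern(built)));   // True

// An immutable record with value-based equality is a safe dictionary key:
// an equal key built later finds the entry.
var visits = new Dictionary<Person, int>();
visits[new Person("Paul", "Smacchia")] = 1;
Console.WriteLine(visits[new Person("Paul", "Smacchia")]);                   // 1

public record Person(string FirstName, string LastName);
```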

Here at NDepend we like immutable types so much that several rules help enforce this property.

The downsides of immutability are:

  • Learning curve: We are all used to working with string immutability. Yet sometimes we get notified by an analyzer or a test that we forgot to handle the value returned by str.Replace("A","B"). Developers might need to think twice before using record immutability properly.
  • Memory and GC pressure: I remember this 2014 JetBrains post: ReSharper and Roslyn: Q&A. It explains that a fundamental difference between the ReSharper engine and the Roslyn compiler is that the Roslyn syntax tree massively relies on immutability. It means that at code editing time, modified branches of the syntax tree are modeled with other immutable objects. The post explicitly claims that “Roslyn’s immutable code model would increase memory traffic, which would in turn lead to more frequent garbage collection, negatively impacting performance”. I have no idea if this is true or not. As explained above when presenting the flyweight pattern, an immutable object can be massively shared to save memory. Also I remember the significant work on immutable collections and structures done by Microsoft engineers at that time.
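The str.Replace() pitfall mentioned in the learning-curve point looks like this in practice (a minimal sketch with a made-up input string):

```csharp
using System;

var str = "ABA";

// Pitfall: Replace() returns a new string and the result is discarded here.
str.Replace("A", "B");
Console.WriteLine(str);   // ABA

// Correct usage: keep the returned string.
str = str.Replace("A", "B");
Console.WriteLine(str);   // BBB
```

The same reflex is needed with records: the result of a with expression must be assigned to something, otherwise the modified copy is lost.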

Breaking Immutability

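It is possible to break the immutability of both record and string: a record can declare a property with a setter, and the characters of a string can be overwritten through unsafe or marshaling code.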
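Here is one possible way to do it, as a sketch only. MutablePerson is a hypothetical record, and the string is mutated through MemoryMarshal.AsMemory(), which is an assumption on my side: other techniques such as unsafe code with fixed pointers achieve the same result. The string is built at run time so that no interned literal is affected.

```csharp
using System;
using System.Runtime.InteropServices;

// 1) A record whose property declares a setter is mutable.
var person = new MutablePerson { FirstName = "Paul" };
person.FirstName = "Patrick";
Console.WriteLine(person.FirstName);   // Patrick

// 2) DANGER: obtain a writable view over the characters of a string.
string text = new string('a', 3);
Span<char> chars = MemoryMarshal.AsMemory(text.AsMemory()).Span;
chars[0] = 'X';
Console.WriteLine(text);               // Xaa

// Hypothetical record using the verbose syntax with a setter.
public record MutablePerson
{
    public string FirstName { get; set; } = "";
}
```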

In both cases doing so is not recommended because breaking immutability leads to error-prone situations. For example, we often rely on string immutability when using strings as keys in a hash table. A string mutation de facto corrupts the internal structure of the hash table, and this leads to tricky bugs. Another unexpected behavior is that mutating a string through one variable can change what another variable shows.
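The following sketch shows the kind of code that triggers such a surprise. It writes into a string literal through MemoryMarshal.AsMemory() (my choice of technique, not necessarily the one originally used), which is unsupported: what gets printed for string2 depends on the runtime, so no output is claimed for it here.

```csharp
using System;
using System.Runtime.InteropServices;

string string1 = "hello";
string string2 = "hello";   // same literal as string1

// Literals are interned, so both variables reference one object.
Console.WriteLine(object.ReferenceEquals(string1, string2));   // True

// DANGER: unsupported write into the characters of string1.
Span<char> chars = MemoryMarshal.AsMemory(string1.AsMemory()).Span;
chars[0] = 'j';

// string2 was never assigned again, yet it may no longer show "hello".
Console.WriteLine(string2);
```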

This odd behavior results from the runtime interning literal strings: when two variables string1 and string2 are initialized with the same literal, both reference the same string object.

Also note that until now we relied on the positional record syntax, record Person(string FirstName, string LastName).

However, to break the immutability of a record we used the more verbose record syntax, in which each property is declared explicitly and can therefore have a setter.

Personally I regret that this syntax with a setter is allowed. I would have preferred all record instances to be strictly immutable.
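The two syntaxes side by side, as a minimal sketch (MutablePerson is a hypothetical name chosen for the illustration):

```csharp
using System;

var p1 = new Person("Paul", "Smacchia");
// p1.FirstName = "Patrick";      // would not compile: the property is init-only
Console.WriteLine(p1.FirstName);   // Paul

var p2 = new MutablePerson { FirstName = "Paul", LastName = "Smacchia" };
p2.FirstName = "Patrick";          // compiles: the property has a setter
Console.WriteLine(p2.FirstName);   // Patrick

// Positional syntax: the generated properties are init-only.
public record Person(string FirstName, string LastName);

// Verbose syntax: properties are declared by hand, here with setters.
public record MutablePerson
{
    public string FirstName { get; set; } = "";
    public string LastName { get; set; } = "";
}
```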

What the compiler generates

Let’s look at what the C# compiler generates for the positional record Person(string FirstName, string LastName).
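The exact generated code is not reproduced here. As a rough hand-written approximation, a class with equivalent behavior could look like the sketch below. The name PersonByHand, the Clone() method name (the compiler emits the unspeakable name <Clone>$) and the use of HashCode.Combine() are simplifications of mine, not what the compiler literally emits.

```csharp
#nullable disable   // keeps this sketch free of nullable-annotation warnings
using System;
using System.Collections.Generic;

var p1 = new PersonByHand("Paul", "Smacchia");
var p2 = new PersonByHand("Paul", "Smacchia");
Console.WriteLine(p1 == p2);   // True
Console.WriteLine(p1);         // PersonByHand { FirstName = Paul, LastName = Smacchia }

public class PersonByHand : IEquatable<PersonByHand>
{
    public PersonByHand(string FirstName, string LastName)
    {
        this.FirstName = FirstName;
        this.LastName = LastName;
    }

    // Copy constructor, used by Clone() and therefore by the with syntax.
    protected PersonByHand(PersonByHand original)
    {
        FirstName = original.FirstName;
        LastName = original.LastName;
    }

    // Lets Equals() check that both objects have the same record type.
    protected virtual Type EqualityContract => typeof(PersonByHand);

    public string FirstName { get; init; }
    public string LastName { get; init; }

    public virtual bool Equals(PersonByHand other) =>
        other is not null
        && EqualityContract == other.EqualityContract
        && EqualityComparer<string>.Default.Equals(FirstName, other.FirstName)
        && EqualityComparer<string>.Default.Equals(LastName, other.LastName);

    public override bool Equals(object obj) => Equals(obj as PersonByHand);

    public override int GetHashCode() =>
        HashCode.Combine(EqualityContract, FirstName, LastName);

    public static bool operator ==(PersonByHand left, PersonByHand right) =>
        ReferenceEquals(left, right) || (left is not null && left.Equals(right));

    public static bool operator !=(PersonByHand left, PersonByHand right) =>
        !(left == right);

    public override string ToString() =>
        $"{nameof(PersonByHand)} {{ FirstName = {FirstName}, LastName = {LastName} }}";

    public void Deconstruct(out string FirstName, out string LastName)
    {
        FirstName = this.FirstName;
        LastName = this.LastName;
    }

    public virtual PersonByHand Clone() => new PersonByHand(this);
}
```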

A few remarks:

  • The properties rely on the new init C#9 keyword. This new syntax lets code outside the class set read-only properties at initialization time!

  • As mentioned already, the code generated to handle the value-based semantics is fairly complex.
  • ToString() provides a default implementation that returns a string such as "Person { FirstName = Paul, LastName = Smacchia }".
  • The Deconstruct() method supports the deconstruction syntax introduced with tuples in C#7: writing var (firstName, lastName) = person; actually calls this method.

  • A copy constructor and a <Clone>$() method are also provided to handle some scenarios described in the next section.
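Some of the remarks above can be checked with a small program. This is a sketch: Point is a hypothetical record added to show an init-only property declared by hand.

```csharp
using System;

var person = new Person("Paul", "Smacchia");

// Deconstruction calls the generated Deconstruct(out string, out string).
var (firstName, lastName) = person;
Console.WriteLine(firstName);   // Paul
Console.WriteLine(lastName);    // Smacchia

// The generated ToString() prints the type name and the properties.
Console.WriteLine(person);      // Person { FirstName = Paul, LastName = Smacchia }

// An init-only property can be set in an object initializer...
var point = new Point { X = 1, Y = 2 };
// point.X = 3;                 // ...but this assignment would not compile.
Console.WriteLine(point);       // Point { X = 1, Y = 2 }

public record Person(string FirstName, string LastName);

// Hypothetical record with hand-written init-only properties.
public record Point
{
    public int X { get; init; }
    public int Y { get; init; }
}
```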

Inheritance

All value types and also the class string are sealed types, but a record doesn’t have to be sealed: inheritance can be used with records.

This is useful but can lead to some non-trivial scenarios with the with syntax and the value-based equality.

Inheritance and the with syntax

When a with expression is applied to a Person reference that actually points to a Student, it is not obvious that the result, say person2, is also a Student. Under the hood the compiler calls the generated virtual <Clone>$() method. Student overrides this virtual method and its implementation calls the Student copy constructor.
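A minimal sketch of this scenario follows. The School property and the sample values are assumptions made for the illustration.

```csharp
using System;

// The static type is Person, the runtime type is Student.
Person person1 = new Student("Paul", "Smacchia", "Some School");

// with clones the object through the virtual clone method,
// so the copy keeps the runtime type Student.
Person person2 = person1 with { FirstName = "Patrick" };

Console.WriteLine(person2 is Student);          // True
Console.WriteLine(person2.GetType().Name);      // Student
Console.WriteLine(((Student)person2).School);   // Some School

public record Person(string FirstName, string LastName);

// Hypothetical derived record with one extra property.
public record Student(string FirstName, string LastName, string School)
    : Person(FirstName, LastName);
```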

Inheritance and the value-based equality

Consider a Person and a Student that share the same FirstName and LastName values. From the Person record perspective, both references hold the same value.

Fortunately the code generated by the compiler makes it so that these two objects are considered different. The generated Equals() implementation relies on the virtual property protected virtual Type EqualityContract => typeof(Person);. This property is used to check that the two objects compared have the same type.
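This situation can be sketched as follows. Student is declared here without any extra property, which is an assumption made to keep both objects' data strictly identical.

```csharp
using System;

Person person = new Person("Paul", "Smacchia");
Person student = new Student("Paul", "Smacchia");

// Same property values, but different record types.
Console.WriteLine(person.FirstName == student.FirstName);   // True
Console.WriteLine(person.LastName == student.LastName);     // True
Console.WriteLine(person == student);                       // False
Console.WriteLine(person.Equals(student));                  // False
Console.WriteLine(student.Equals(person));                  // False

public record Person(string FirstName, string LastName);

// Hypothetical derived record that adds no property of its own.
public record Student(string FirstName, string LastName)
    : Person(FirstName, LastName);
```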

Generic Record

Let’s mention that a record can be a generic class. This is great! Nevertheless developers will have to remember that EqualityComparer<T>.Default is used against each property typed with T in the code generated to compare states. This can lead to non-obvious situations, for instance when T is an array type, whose default comparer relies on reference equality.
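One such situation, as a sketch: Wrapper<T> is a hypothetical generic record, instantiated first with string and then with an array type.

```csharp
using System;

// T = string: the default comparer for string compares values.
var s1 = new Wrapper<string>("abc");
var s2 = new Wrapper<string>("abc");
Console.WriteLine(s1 == s2);   // True

// T = int[]: the default comparer for arrays compares references,
// so two records holding equal-content arrays are not equal.
var a1 = new Wrapper<int[]>(new[] { 1, 2, 3 });
var a2 = new Wrapper<int[]>(new[] { 1, 2, 3 });
Console.WriteLine(a1 == a2);   // False

// Sharing the same array instance makes the records equal again.
var shared = new[] { 1, 2, 3 };
Console.WriteLine(new Wrapper<int[]>(shared) == new Wrapper<int[]>(shared));   // True

// Hypothetical generic record with a single property typed with T.
public record Wrapper<T>(T Value);
```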

Record used as class

Most records will be declared with the simple positional record syntax. Certainly it will be recommended to keep records as simple as possible to handle scenarios that rely on value-based semantics. But it is possible to add methods and more members to a record. You can even override some default behavior, such as ToString().
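A sketch of a record used as a regular class. The FullName property and the Initials() method are hypothetical members added for the illustration.

```csharp
using System;

var person = new Person("Paul", "Smacchia");
Console.WriteLine(person);              // Paul Smacchia
Console.WriteLine(person.Initials());   // PS

// The extra members don't add state, so value-based equality is unchanged.
Console.WriteLine(person == new Person("Paul", "Smacchia"));   // True

public record Person(string FirstName, string LastName)
{
    // Additional computed property (no backing field).
    public string FullName => $"{FirstName} {LastName}";

    // Additional method.
    public string Initials() => $"{FirstName[0]}{LastName[0]}";

    // Overrides the ToString() that the compiler would otherwise generate.
    public override string ToString() => FullName;
}
```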

Conclusion

I wrote my first post on immutability in C# in 2008. Many of us have waited a long time to finally see C# offering such a syntax. Over time C# improved at helping with immutability: for example C#8 introduced readonly members for structures. Now C#9 offers a brilliant syntax to democratize the usage of immutability. And the concise positional record syntax will save a lot of keystrokes.

If, like us, your code is stuffed with immutable classes that are good candidates to be refactored with the record keyword, you should notice that refactoring toward this new syntax won’t be straightforward: client code of these classes relies on reference-based equality, which is very different from record value-based equality. As a consequence the move will have to be carefully thought out!

Enjoy!

Comments:

  1. Tom Tucker says:

    I have been curious if with C# 9 records you will be able to deserialize data into records?
