Records are a long-awaited feature introduced with C# 9. With a record we get a concise syntax to define immutable types:
    record Person(string FirstName, string LastName);
Isn’t it beautiful? In the NDepend code itself we have several dozen record-like classes that will be refactored into such single-line declarations!
record is somewhat similar to string. It looks easy. All developers use it every day, but few realize that tricky things happen under the hood.
Actually, record is really similar to string: both are classes offering value-based semantics. Value-based semantics result from two properties:
- Value-based equality
- Immutability
Value-based equality
Usually .NET classes rely on reference-based equality. For example:
    public class Person {
       public Person(string firstName, string lastName) {
          FirstName = firstName;
          LastName = lastName;
       }
       public string FirstName { get; init; }
       public string LastName { get; init; }
    }
    (...)
    var record1 = new Person("Paul", "Smacchia");
    var record2 = new Person("Paul", "Smacchia");
    Assert.IsFalse(record1 == record2); // not equal!!
    Assert.IsFalse(ReferenceEquals(record1, record2)); // indeed, the references are not equal!!
string and record are classes but they rely on value-based equality. This sample code illustrates the idea:
    record Person(string FirstName, string LastName);
    (...)
    var string1 = "Paul Smacchia";
    var string2 = string1.Substring(0, 8) + string1.Substring(8);
    Assert.IsTrue(string1 == string2); // Same value...
    Assert.IsFalse(ReferenceEquals(string1, string2)); // ... but different objects

    var record1 = new Person("Paul", "Smacchia");
    var record2 = new Person("Paul", "Smacchia");
    Assert.IsTrue(record1 == record2); // Same value...
    Assert.IsFalse(ReferenceEquals(record1, record2)); // ... but different objects
int, double, bool and every struct and enum are value types. Equality for two instances of a value type means that the values they contain are equal. For a structure with several fields the runtime actually calls Equals() on each field recursively. From the code above it is clear that both record and string behave the same way.
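As a reminder of how value types compare, here is a minimal sketch. The Point struct is a hypothetical type introduced only for this illustration. Note that a struct gets value-based Equals() for free, but no == operator unless you define one.

```csharp
using System;

var p1 = new Point(1, 2);
var p2 = new Point(1, 2);

// The default ValueType.Equals() compares the fields,
// so two distinct copies holding the same values are equal.
Console.WriteLine(p1.Equals(p2));               // True
Console.WriteLine(p1.Equals(new Point(1, 3)));  // False

// Hypothetical struct, not part of the original samples.
struct Point {
   public Point(int x, int y) { X = x; Y = y; }
   public int X { get; }
   public int Y { get; }
}
```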
Value-based equality feels natural for strings and, with time, it will feel just as natural for records. But the implementation of this behavior is not trivial. Later in this article we’ll show what the compiler generates for record Person(string FirstName, string LastName). You will see that the code handling equality is quite complex. But before that, let’s explain the second property of value-based semantics: immutability.
Immutability
The second core property of string and record value-based semantics is immutability. Basically, an object is immutable if its state cannot change once the object has been created. Consequently, a class is immutable if it is declared in such a way that all its instances are immutable.
I remember a discussion with a developer who got nervous about immutability. It looked like an unnatural constraint to him: he wanted his object’s state to change. But he didn’t realize that something he used every day, string operations, relied on immutability. When you modify a string, a new string object actually gets created. Records behave the same way. Moreover, a clean new syntax based on the keyword with has been introduced with C# 9. Let’s have a look at that:
    record Person(string FirstName, string LastName);
    (...)
    var string1 = "Paul Smacchia";
    var string2 = string1.Replace("Paul", "Léna");
    Assert.IsTrue(string1 == "Paul Smacchia"); // string1 is left untouched
    // A new string has been created to contain the new value
    Assert.IsTrue(string2 == "Léna Smacchia");

    var record1 = new Person("Paul", "Smacchia");
    var record2 = record1 with { FirstName = "Léna" };
    Assert.IsTrue(record1.FirstName == "Paul"); // record1 is left untouched
    // A new record has been created to contain the new value
    Assert.IsTrue(record2 == new Person("Léna", "Smacchia"));
Pros and Cons of Immutability
Immutability is an essential functional programming property. Here is why functional programmers love it:
- Immutable objects are thread-safe and as a consequence they greatly simplify concurrent programming. Several threads can hold a reference to the same immutable object since its access is read-only.
- Immutable objects are free of side effects: you can pass such an object to a method and be sure it won’t be modified.
- Immutable objects can be used to optimize memory: for example the .NET runtime interns literal strings declared in your code. It means that the runtime holds a hash table of literal strings to make sure that a string doesn’t get duplicated. Notice that non-literal strings are not interned because: “doing so would save a tiny amount of memory in exchange for making the common cases much slower”.
- The flyweight design pattern relies on reusing lightweight immutable instances. A classic example is the graphical representation of characters: it is preferable to have a single immutable instance of the glyph ‘a’ with a given font, the font size 12 and other formatting data. This single instance can then be shared massively within a large document model that contains thousands of such ‘a’.
- Immutable types lead to fewer bugs. State mutability is error-prone: many bugs are caused by a state changing unexpectedly, even in single-threaded scenarios.
- The hash table is one of the most powerful programming tools to improve performance, when used well. And a hash table needs its keys to be immutable.
- Immutable objects are useful in a number of situations including relational-database access and messages within a distributed architecture. In these scenarios the notion of in-process reference is irrelevant. The identity of entities is based on a unique identifier scheme. And identifiers fit well with the value-based equality described above.
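The string interning mentioned in the list above can be observed with a small sketch. It assumes both literals are declared in the same assembly; the variable names are only illustrative.

```csharp
using System;

string literal1 = "Paul Smacchia";
string literal2 = "Paul Smacchia";

// Built at run time from a char array: a distinct object that is not interned.
string built = new string(literal1.ToCharArray());

// Both literals point to the same interned object.
Console.WriteLine(object.ReferenceEquals(literal1, literal2)); // True

// Same value, but a different object.
Console.WriteLine(literal1 == built);                          // True
Console.WriteLine(object.ReferenceEquals(literal1, built));    // False
```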
Here at NDepend we like immutable types so much that several rules help enforce this property:
- Fields should be marked as ReadOnly when possible
- Structures should be immutable
- Avoid non-readonly static fields
- Avoid static fields with a mutable field type
- Property Getters should be immutable
- A field must not be assigned from outside its parent hierarchy types
- Don’t assign a field from many methods
- Types tagged with ImmutableAttribute must be immutable
- Types immutable should be tagged with ImmutableAttribute
- Methods tagged with PureAttribute must be pure: A method is pure if its execution doesn’t change the value of any instance or static field. This purity concept is also quite useful to avoid debugging headaches due to state changes.
- Pure methods should be tagged with PureAttribute
The downsides of immutability are:
- Learning curve: We are all used to working with string immutability. Yet sometimes an analyzer or a test notifies us that we forgot to use the value returned by str.Replace("A","B"). Developers might need to think twice before using record immutability properly.
- Memory and GC pressure: I remember this 2014 JetBrains post: ReSharper and Roslyn: Q&A. It explains that a fundamental difference between the ReSharper engine and the Roslyn compiler is that the Roslyn syntax tree relies massively on immutability. It means that at code editing time, modified branches of the syntax tree are modeled with other immutable objects. The post explicitly claims that “Roslyn’s immutable code model would increase memory traffic, which would in turn lead to more frequent garbage collection, negatively impacting performance“. I have no idea if this is true or not. As explained above when presenting the flyweight pattern, an immutable object can be massively shared to save memory. Also I remember the significant work on immutable collections and structures done by Microsoft engineers at that time.
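The learning-curve point can be illustrated with a short sketch: the same "a new object is returned" rule applies to both string operations and the with syntax. The Person record is the one used throughout this article; the variable names are illustrative.

```csharp
using System;

var str = "A-A";
str.Replace("A", "B");        // Result discarded: str is unchanged
Console.WriteLine(str);       // A-A

str = str.Replace("A", "B");  // Keep the returned string
Console.WriteLine(str);       // B-B

var person = new Person("Paul", "Smacchia");
var renamed = person with { FirstName = "Léna" }; // person itself is unchanged
Console.WriteLine(person.FirstName);              // Paul
Console.WriteLine(renamed.FirstName);             // Léna

record Person(string FirstName, string LastName);
```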
Breaking Immutability
It is possible to break the immutability of both record and string. Here is how for example:
    record Person {
       public string FirstName { get; set; }
       public string LastName { get; set; }
    }
    (...)
    // Note: the fixed statement below requires an unsafe context
    var string1 = "Paul Smacchia";
    var string2 = string1; // Keep a reference to make clear
                           // that the string value changed
                           // but no other string has been created
    fixed (char* pointer = string1) { pointer[0] = '8'; }
    Assert.IsTrue(string2 == "8aul Smacchia");

    var record1 = new Person { FirstName = "Paul", LastName = "Smacchia" };
    var record2 = record1; // Same remark as above
    record1.FirstName = "8aul";
    Assert.IsTrue(record2.FirstName == "8aul");
In both cases doing so is not recommended because breaking immutability leads to error-prone situations. For example we often rely on string immutability to use strings as keys in hash tables. A string mutation de facto corrupts the internal structure of the hash table and this leads to tricky bugs. Here is an unexpected behavior:
    var string1 = "Paul Smacchia";
    var string2 = "Paul Smacchia";
    fixed (char* pointer = string1) { pointer[0] = '8'; }
    Assert.IsTrue(string2 == "8aul Smacchia"); // Did you expect that?!
This odd behavior in the sample above results from the runtime interning strings. Thus both string1 and string2 reference the same string object.
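The hash table corruption mentioned above can also be reproduced without unsafe code, by mutating a record that has setters while it is stored in a HashSet. This is a minimal sketch; MutablePerson is a hypothetical name for the verbose record with setters.

```csharp
using System;
using System.Collections.Generic;

var person = new MutablePerson { FirstName = "Paul", LastName = "Smacchia" };
var set = new HashSet<MutablePerson> { person };
Console.WriteLine(set.Contains(person)); // True

// Mutating the record changes its hash code while it is stored in the set.
person.FirstName = "8aul";

// The set still holds the object, but the lookup now uses the new hash code,
// so in practice the entry can no longer be found.
Console.WriteLine(set.Count);            // 1
Console.WriteLine(set.Contains(person)); // False in practice

// Hypothetical mutable record, declared only for this illustration.
record MutablePerson {
   public string FirstName { get; set; }
   public string LastName { get; set; }
}
```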
Also note that until now we relied on the positional record syntax:
    public record Person(string FirstName, string LastName);
However to break the immutability of record we used this more verbose record syntax:
    record Person {
       public string FirstName { get; set; }
       public string LastName { get; set; }
    }
Personally I regret that this syntax with a setter is allowed. I would have preferred that all record instances be strictly immutable.
What the compiler generates
Let’s look at the code generated by the C# compiler for this positional record syntax:
    public record Person(string FirstName, string LastName);
    public class Person : IEquatable<Person> {

       // Primary constructor generated from the positional parameters
       public Person(string FirstName, string LastName) {
          this.firstName = FirstName;
          this.lastName = LastName;
       }

       // init properties with backing fields
       private readonly string firstName; // really named <FirstName>k__BackingField
       public string FirstName {
          get => firstName;
          init => firstName = value;
       }
       private readonly string lastName; // really named <LastName>k__BackingField
       public string LastName {
          get => lastName;
          init => lastName = value;
       }

       // Copy constructor and Clone() method
       public virtual Person <Clone>$() => new Person(this);
       protected Person(Person original) {
          this.firstName = original.firstName;
          this.lastName = original.lastName;
       }

       // Deconstructor
       public void Deconstruct(out string firstName, out string lastName) {
          firstName = this.FirstName;
          lastName = this.LastName;
       }

       // ToString() impl
       public override string ToString() {
          StringBuilder builder = new StringBuilder();
          builder.Append(nameof(Person));
          builder.Append(" { ");
          if (this.PrintMembers(builder)) {
             builder.Append(" ");
          }
          builder.Append("}");
          return builder.ToString();
       }
       protected virtual bool PrintMembers(StringBuilder builder) {
          builder.Append("FirstName");
          builder.Append(" = ");
          builder.Append((object)this.FirstName);
          builder.Append(", ");
          builder.Append("LastName");
          builder.Append(" = ");
          builder.Append((object)this.LastName);
          return true;
       }

       // Boilerplate code to handle value-based equality
       public static bool operator !=(Person? r1, Person? r2) => !(r1 == r2);
       public static bool operator ==(Person? r1, Person? r2) {
          if ((object)r1 == (object)r2) {
             return true;
          }
          return (object)r1 != null && r1.Equals(r2);
       }
       protected virtual Type EqualityContract => typeof(Person);
       public override int GetHashCode() =>
          (EqualityComparer<Type>.Default.GetHashCode(this.EqualityContract) * -1521134295
             + EqualityComparer<string>.Default.GetHashCode(this.firstName)) * -1521134295
             + EqualityComparer<string>.Default.GetHashCode(this.lastName);
       public override bool Equals(object? obj) => this.Equals(obj as Person);
       public virtual bool Equals(Person? other) =>
          (object)other != null
          && this.EqualityContract == other.EqualityContract
          && EqualityComparer<string>.Default.Equals(this.firstName, other.firstName)
          && EqualityComparer<string>.Default.Equals(this.lastName, other.lastName);
    }
A few remarks:
- The properties rely on the new C# 9 init keyword. This new syntax lets us initialize readonly fields from outside their class at object-initialization time!
    public class Person {
       private readonly string firstName;
       public string FirstName {
          get => firstName;
          init => firstName = value;
       }
       private readonly string lastName;
       public string LastName {
          get => lastName;
          init => lastName = value;
       }
    }
    (...)
    var person = new Person { FirstName = "Paul", LastName = "Smacchia" };
- As mentioned already, the code generated to handle the value-based semantics is fairly complex.
- ToString() provides a default implementation that returns the string "Person { FirstName = Paul, LastName = Smacchia }".
- The Deconstruct() method is there to be used by the tuple deconstruction syntax introduced with C# 7. In the code below this deconstructor method is actually called:
    var record1 = new Person("Paul","Smacchia");
    var (firstName, lastName) = record1; // Call Deconstruct()
    Assert.IsTrue(firstName == "Paul");
    Assert.IsTrue(lastName == "Smacchia");
- A copy constructor and a <Clone>$() method are also provided to handle some scenarios described in the next section.
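Regarding the Deconstruct() remark above: besides tuple-style deconstruction, positional patterns also go through this method. Here is a minimal sketch using the Person record from this article; the variable names are illustrative.

```csharp
using System;

var record1 = new Person("Paul", "Smacchia");

// A positional pattern calls Deconstruct() and matches on the resulting values.
if (record1 is Person("Paul", var lastName)) {
   Console.WriteLine(lastName); // Smacchia
}

// Discards can be used when only some values are needed.
var (firstName, _) = record1;
Console.WriteLine(firstName);   // Paul

record Person(string FirstName, string LastName);
```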
Inheritance
All value types and also the class string are sealed types, but a record doesn’t have to be sealed: inheritance can be used with records.
    public record Person {
       public string FirstName { get; init; }
       public string LastName { get; init; }
    }
    public sealed record Student : Person {
       public int ID { get; init; }
    }
This is useful but can lead to some non-trivial scenarios with the with syntax and the value-based equality.
Inheritance and the with syntax
In the code sample below it is not immediately obvious that person2 is a Student, since it is created from a Person reference using the with syntax. Under the hood the compiler emits a call to the generated virtual <Clone>$() method. This virtual method is overridden by Student and its implementation calls the Student copy constructor:
    Person person1 = new Student {
       FirstName = "Léna",
       LastName = "Smacchia",
       ID = 123
    };
    Person person2 = person1 with { FirstName = "Paul" };
    Assert.IsTrue(person2 is Student);
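A self-contained variant of the sample above shows that the copy keeps not only the runtime type but also the members declared on Student, even though the with expression only sees a Person reference. This is a sketch reusing the Person and Student records declared earlier.

```csharp
using System;

Person person1 = new Student { FirstName = "Léna", LastName = "Smacchia", ID = 123 };
Person person2 = person1 with { FirstName = "Paul" };

Console.WriteLine(person2 is Student);             // True
Console.WriteLine(((Student)person2).ID);          // 123 (copied by the Student copy constructor)
Console.WriteLine(person2.LastName);               // Smacchia
Console.WriteLine(person1.FirstName);              // Léna (person1 is left untouched)

public record Person {
   public string FirstName { get; init; }
   public string LastName { get; init; }
}
public sealed record Student : Person {
   public int ID { get; init; }
}
```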
Inheritance and the value-based equality
From the Person record perspective, in the code sample below both references have the same value. After all they share the same FirstName and LastName values.
    Person person1 = new Student {
       FirstName = "Léna",
       LastName = "Smacchia",
       ID = 123
    };
    Person person2 = new Person {
       FirstName = "Léna",
       LastName = "Smacchia"
    };
    Assert.IsFalse(person1 == person2);
Fortunately the code generated by the compiler makes sure that these two objects are considered different. If you look back at the code generated by the compiler you will see that the generated Equals() implementation relies on the virtual property protected virtual Type EqualityContract => typeof(Person);. This property is used to check that the two objects compared have the same type.
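Here is a small sketch of the resulting behavior, reusing the Person and Student records declared earlier. The comparison is symmetric, and members declared on Student take part in equality even when compared through Person references.

```csharp
using System;

Person a = new Student { FirstName = "Léna", LastName = "Smacchia", ID = 123 };
Person b = new Student { FirstName = "Léna", LastName = "Smacchia", ID = 123 };
Person c = new Student { FirstName = "Léna", LastName = "Smacchia", ID = 456 };
Person d = new Person { FirstName = "Léna", LastName = "Smacchia" };

Console.WriteLine(a == b); // True:  same type, same values including ID
Console.WriteLine(a == c); // False: ID differs
Console.WriteLine(a == d); // False: Student vs Person
Console.WriteLine(d == a); // False: the check is symmetric

public record Person {
   public string FirstName { get; init; }
   public string LastName { get; init; }
}
public sealed record Student : Person {
   public int ID { get; init; }
}
```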
Generic Record
Let’s mention that a record can be a generic class. This is great! Nevertheless developers will have to remember that, in the generated code that compares states, EqualityComparer<T>.Default is used for each property typed with T. This can lead to non-obvious situations:
    record Pair<T>(T Left, T Right);
    (...)
    var pair1 = new Pair<string>("1", "2");
    var pair2 = new Pair<string>("1", "2");
    Assert.IsTrue(pair1 == pair2); // equals with string

    var pair3 = new Pair<object>(new object(), new object());
    var pair4 = new Pair<object>(new object(), new object());
    Assert.IsFalse(pair3 == pair4); // different with object
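The same caveat applies to arrays and most collections, whose default comparer relies on reference equality. A minimal sketch with the Pair record from the sample above:

```csharp
using System;

// Two pairs holding distinct arrays with the same content are not equal...
var pair1 = new Pair<int[]>(new[] { 1 }, new[] { 2 });
var pair2 = new Pair<int[]>(new[] { 1 }, new[] { 2 });
Console.WriteLine(pair1 == pair2); // False

// ...but pairs sharing the same array instances are.
var left = new[] { 1 };
var right = new[] { 2 };
var pair3 = new Pair<int[]>(left, right);
var pair4 = new Pair<int[]>(left, right);
Console.WriteLine(pair3 == pair4); // True

record Pair<T>(T Left, T Right);
```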
Record used as class
Most records will be declared with the simple positional record syntax. It will certainly be recommended to keep records as simple as possible to handle scenarios that rely on value-based semantics. But it is possible to add methods and more members to a record. You can even override some default behavior as demonstrated by the code sample below:
    public record Person<T>(T FirstName, T LastName) {

       // More fields, properties, behavior, event.. authorized
       public int m_Id;
       public int ID { get; set; }
       public string FullName => FirstName + " " + LastName;
       public event Action MyEvent;

       //
       // Cannot override those
       //

       // Error CS8862 A constructor declared in a record with parameters must have 'this' constructor initializer.
       // public Person(int id) { }

       // Error CS0111 Type 'Person' already defines a member called 'Equals' with the same parameter types
       //public override bool Equals(object obj) => false;

       // Cannot be overridden since they are static
       //public static bool operator !=(Person<T>? r1, Person<T>? r2) => false;
       //public static bool operator ==(Person<T>? r1, Person<T>? r2) => false;

       // Deconstruct() is not virtual and cannot be overridden

       //
       // Override default value-based equality and ToString() behavior
       //
       protected virtual Type EqualityContract => typeof(int);
       public override int GetHashCode() => 1234;
       public virtual bool Equals(Person<T>? other) => false;
       public override string ToString() => "hello";
       protected virtual bool PrintMembers(StringBuilder builder) { return false; }
    }
Conclusion
I wrote my first post on immutability in C# in 2008. Many of us have waited a long time to finally see C# offer such a syntax. Over time C# improved at helping with immutability: for example C# 8 introduced readonly members for structs. Now C# 9 offers a brilliant syntax to democratize the usage of immutability. And the concise positional record syntax will save a lot of keystrokes.
If, like us, your code is stuffed with immutable classes that are good candidates to be refactored with the record keyword, you should notice that refactoring toward this new syntax won’t be straightforward: client code of these classes relies on reference-based equality, which is very different from record value-based equality. As a consequence the move will have to be carefully thought out!
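To make this refactoring risk concrete, here is a minimal sketch of client code whose behavior silently changes when an immutable class becomes a record. PersonClass and PersonRecord are hypothetical names standing for the "before" and "after" versions of the same type.

```csharp
using System;
using System.Collections.Generic;

// With reference-based equality both instances are kept...
var classes = new HashSet<PersonClass> {
   new PersonClass("Paul", "Smacchia"),
   new PersonClass("Paul", "Smacchia")
};
Console.WriteLine(classes.Count); // 2

// ...while with value-based equality the second one is a duplicate.
var records = new HashSet<PersonRecord> {
   new PersonRecord("Paul", "Smacchia"),
   new PersonRecord("Paul", "Smacchia")
};
Console.WriteLine(records.Count); // 1

// Hypothetical "before" version: an immutable class.
class PersonClass {
   public PersonClass(string firstName, string lastName) {
      FirstName = firstName;
      LastName = lastName;
   }
   public string FirstName { get; }
   public string LastName { get; }
}

// Hypothetical "after" version: the same type as a positional record.
record PersonRecord(string FirstName, string LastName);
```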
Enjoy!