Jeffrey Richter: Excerpt #6 from CLR via C#, Third Edition

05/27/2011

Hi, this is Jeffrey Richter, offering one more excerpt from my latest book, CLR via C#, Third Edition. If you use the “CLR” tag on this blog, you can find all the other excerpts, as well as a post called “Sample chapters: CLR via C#, by Jeffrey Richter,” which offers you two complete chapters.

Today’s excerpt is from Chapter 24, “Runtime Serialization.” It’s a section called “Controlling Serialization and Deserialization.” Enjoy, and post a comment with any feedback or questions you might have.

Controlling Serialization and Deserialization

When you apply the SerializableAttribute custom attribute to a type, all instance fields
(public, private, protected, and so on) are serialized (Footnote 1). However, a type may define some
instance fields that should not be serialized. In general, there are two reasons why you would
not want some of a type’s instance fields to be serialized:

The field contains information that would not be valid when deserialized. For example,
an object that contains a handle to a Windows kernel object (such as a file, process,
thread, mutex, event, semaphore, and so on) would have no meaning when deserialized
into another process or machine since Windows kernel handles are process-relative
values.

The field contains information that is easily calculated. In this case, you select which
fields do not need to be serialized, thus improving your application’s performance by
reducing the amount of data transferred.

Footnote 1: Do not use C#’s automatically implemented property feature to define properties inside types marked with the
[Serializable] attribute, because the compiler generates the names of the fields and the generated names can
be different each time that you recompile your code, preventing instances of your type from being deserializable.

The code below uses the System.NonSerializedAttribute custom attribute to indicate
which fields of the type should not be serialized. (Note that this attribute is also defined in
the System namespace, not the System.Runtime.Serialization namespace.)

[Serializable]
internal class Circle {
    private Double m_radius;

    [NonSerialized]
    private Double m_area;

    public Circle(Double radius) {
        m_radius = radius;
        m_area = Math.PI * m_radius * m_radius;
    }

    ...
}

In the code above, objects of Circle may be serialized. However, the formatter will serialize
only the value in the object’s m_radius field. The value in the m_area field will not be serialized
because it has the NonSerializedAttribute attribute applied to it. This attribute can
be applied only to a type’s fields, and it continues to apply to this field when inherited by
another type. Of course, you may apply the NonSerializedAttribute attribute to multiple
fields within a type.

So, let’s say that our code constructs a Circle object as follows:

Circle c = new Circle(10);

Internally, the m_area field is set to a value of approximately 314.159. When this object gets
serialized, only the value of the m_radius field (10) gets written to the stream. This is exactly
what we want, but now we have a problem when the stream is deserialized back into a
Circle object. When deserialized, the Circle object will get its m_radius field set to 10, but
its m_area field will be initialized to 0—not 314.159!
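
The problem can be reproduced with a short round trip. This is only a minimal sketch, and it makes some assumptions that are not part of the original listing: it uses BinaryFormatter, which assumes the .NET Framework (the type is obsolete and unsupported in recent .NET versions), and it adds hypothetical Radius and Area accessors and a hypothetical RoundTrip helper so that the fields can be inspected.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
internal class Circle {
    private Double m_radius;

    [NonSerialized]
    private Double m_area;

    public Circle(Double radius) {
        m_radius = radius;
        m_area = Math.PI * m_radius * m_radius;
    }

    // Hypothetical read-only accessors, added only so the demo can inspect the fields.
    // They are written out by hand rather than as automatically implemented properties.
    public Double Radius { get { return m_radius; } }
    public Double Area { get { return m_area; } }
}

internal static class Program {
    // Hypothetical helper: serialize an object to memory and deserialize it straight back.
    internal static T RoundTrip<T>(T obj) {
        BinaryFormatter formatter = new BinaryFormatter();
        using (MemoryStream stream = new MemoryStream()) {
            formatter.Serialize(stream, obj);
            stream.Position = 0;
            return (T) formatter.Deserialize(stream);
        }
    }

    private static void Main() {
        Circle c = new Circle(10);
        Circle copy = RoundTrip(c);

        Console.WriteLine(copy.Radius); // 10 (restored from the stream)
        Console.WriteLine(copy.Area);   // 0  (never written; the constructor did not run)
    }
}
```

The constructor is not called during deserialization, so nothing recomputes m_area for the copy.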

The code shown below demonstrates how to modify the Circle type to fix this problem:

[Serializable]
internal class Circle {
    private Double m_radius;

    [NonSerialized]
    private Double m_area;

    public Circle(Double radius) {
        m_radius = radius;
        m_area = Math.PI * m_radius * m_radius;
    }

    [OnDeserialized]
    private void OnDeserialized(StreamingContext context) {
        m_area = Math.PI * m_radius * m_radius;
    }
}

I’ve changed Circle so that it now contains a method marked with the
System.Runtime.Serialization.OnDeserializedAttribute custom attribute (Footnote 2). Whenever an instance of a
type is deserialized, the formatter checks whether the type defines a method with this attribute on it
and, if so, invokes this method. When this method is called, all the serializable
fields will be set correctly and they may be accessed to perform any additional work that
would be necessary to fully deserialize the object.

Footnote 2: Use of the System.Runtime.Serialization.OnDeserializedAttribute custom attribute is the preferred way
of invoking a method when an object is deserialized, as opposed to having a type implement the
System.Runtime.Serialization.IDeserializationCallback interface’s OnDeserialization method.

In the modified version of Circle above, I made the OnDeserialized method simply
calculate the area of the circle using the m_radius field and place the result in the m_area
field. Now, m_area will have the desired value of approximately 314.159.

In addition to the OnDeserializedAttribute custom attribute, the
System.Runtime.Serialization namespace also defines OnSerializingAttribute,
OnSerializedAttribute, and OnDeserializingAttribute custom attributes, which you
can apply to your type’s methods to have even more control over serialization and
deserialization. Here is a sample class that applies each of these attributes to a method:

[Serializable]
public class MyType {
    Int32 x, y;
    [NonSerialized] Int32 sum;

    public MyType(Int32 x, Int32 y) {
        this.x = x; this.y = y; sum = x + y;
    }

    [OnDeserializing]
    private void OnDeserializing(StreamingContext context) {
        // Example: Set default values for fields in a new version of this type
    }

    [OnDeserialized]
    private void OnDeserialized(StreamingContext context) {
        // Example: Initialize transient state from fields
        sum = x + y;
    }

    [OnSerializing]
    private void OnSerializing(StreamingContext context) {
        // Example: Modify any state before serializing
    }

    [OnSerialized]
    private void OnSerialized(StreamingContext context) {
        // Example: Restore any state after serializing
    }
}

Whenever you use any of these four attributes, the method you define must take a single
StreamingContext parameter (discussed in the “Streaming Contexts” section later in this
chapter) and return void. The name of the method can be anything you want it to be. Also,
you should declare the method as private to prevent it from being called by normal code;
the formatters run with enough security that they can call private methods.

Note: When you are serializing a set of objects, the formatter first calls all of the objects’ methods
that are marked with the OnSerializing attribute. Next it serializes all of the objects’ fields,
and finally it calls all of the objects’ methods marked with the OnSerialized attribute. Similarly,
when you deserialize a set of objects, the formatter calls all of the objects’ methods that are
marked with the OnDeserializing attribute, then it deserializes all of the objects’ fields, and
then it calls all of the objects’ methods marked with the OnDeserialized attribute.

Note also that during deserialization, when a formatter sees a type offering a method marked
with the OnDeserialized attribute, the formatter adds this object’s reference to an internal
list. After all the objects have been deserialized, the formatter traverses this list in reverse order
and calls each object’s OnDeserialized method. When this method is called, all the serializable
fields will be set correctly, and they may be accessed to perform any additional work that would
be necessary to fully deserialize the object. Invoking these methods in reverse order is important
because it allows inner objects to finish their deserialization before the outer objects that contain
them finish their deserialization.

For example, imagine a collection object (like Hashtable or Dictionary) that internally uses a
hash table to maintain its sets of items. The collection object type would implement a method
marked with the OnDeserialized attribute. Even though the collection object would start
being deserialized first (before its items), its OnDeserialized method would be called last (after
any of its items’ OnDeserialized methods). This allows the items to complete deserialization
so that all their fields are initialized properly, allowing a good hash code value to be calculated.
Then, the collection object creates its internal buckets and uses the items’ hash codes to place
the items into the buckets. I show an example of how the Dictionary class uses this in the
upcoming “Controlling the Serialized/Deserialized Data” section of this chapter.
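
For a single object, the calling sequence described in this note can be observed by recording each callback as it runs. The following is a minimal sketch rather than code from the chapter: the Traced and CallbackOrderDemo types are hypothetical, and BinaryFormatter again assumes the .NET Framework. It does not attempt to show the reverse-order behavior for object graphs.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
internal class Traced {
    // Static, so the log is shared by all instances and is not part of any object's serialized state
    internal static readonly List<String> Log = new List<String>();

    [OnSerializing]
    private void BeforeSerialize(StreamingContext context) { Log.Add("OnSerializing"); }

    [OnSerialized]
    private void AfterSerialize(StreamingContext context) { Log.Add("OnSerialized"); }

    [OnDeserializing]
    private void BeforeDeserialize(StreamingContext context) { Log.Add("OnDeserializing"); }

    [OnDeserialized]
    private void AfterDeserialize(StreamingContext context) { Log.Add("OnDeserialized"); }
}

internal static class CallbackOrderDemo {
    // Serializes one Traced object, deserializes it, and returns the recorded callback names
    internal static List<String> Run() {
        Traced.Log.Clear();
        BinaryFormatter formatter = new BinaryFormatter();
        using (MemoryStream stream = new MemoryStream()) {
            formatter.Serialize(stream, new Traced());
            stream.Position = 0;
            formatter.Deserialize(stream);
        }
        return Traced.Log;
    }

    private static void Main() {
        Console.WriteLine(String.Join(", ", Run()));
        // OnSerializing, OnSerialized, OnDeserializing, OnDeserialized
    }
}
```

The method names are deliberately different from the attribute names to show that only the attribute and the signature matter.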

If you serialize an instance of a type, add a new field to the type, and then try to
deserialize the object that was serialized without the new field, the formatter throws a
SerializationException with a message indicating that the data in the stream being
deserialized has the wrong number of members. This is very problematic in versioning scenarios
where it is common to add new fields to a type in a newer version. Fortunately, you can use
the System.Runtime.Serialization.OptionalFieldAttribute attribute to help you.

You apply the OptionalFieldAttribute attribute to each new field you add to a type. Now,
when the formatters see this attribute applied to a field, the formatters will not throw the
SerializationException if the data in the stream does not contain the field.
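
As an illustration, here is a minimal sketch of a type whose second version adds a field. The Customer type, its field names, and the "Unknown" default are all hypothetical; the sketch combines OptionalFieldAttribute with an OnDeserializing method so that the new field gets a sensible value when the stream does not supply one. As before, BinaryFormatter assumes the .NET Framework.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Formatters.Binary;

[Serializable]
internal class Customer {
    private String m_name;        // Present since the first version of the type

    [OptionalField]
    private String m_country;     // Hypothetical field added in a later version

    public Customer(String name, String country) {
        m_name = name;
        m_country = country;
    }

    [OnDeserializing]
    private void OnDeserializing(StreamingContext context) {
        // Runs before the fields are populated. If the stream has no value
        // for m_country, this default is what the object ends up with.
        m_country = "Unknown";
    }

    public String Name { get { return m_name; } }
    public String Country { get { return m_country; } }
}

internal static class OptionalFieldDemo {
    // Hypothetical helper: serialize a Customer to memory and deserialize it straight back
    internal static Customer RoundTrip(Customer customer) {
        BinaryFormatter formatter = new BinaryFormatter();
        using (MemoryStream stream = new MemoryStream()) {
            formatter.Serialize(stream, customer);
            stream.Position = 0;
            return (Customer) formatter.Deserialize(stream);
        }
    }

    private static void Main() {
        Customer copy = RoundTrip(new Customer("Ann", "Norway"));
        Console.WriteLine(copy.Name);    // Ann
        Console.WriteLine(copy.Country); // Norway (the stream's value replaces the default)
    }
}
```

This round trip only exercises the case where the stream contains the new field. Exercising the missing-field case would require a stream written by the earlier version of the type, which a single program cannot easily produce.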
