More on Hashcodes

One of the devs on the BCL team just added a bit to my recent post on hashcodes.

Enjoy!

Brad's comment above applies to Object's GetHashCode
implementation, which most interesting classes override to provide their own
hash function. We believe GetHashCode should be treated as a hash function that
returns a seemingly random value, one that may be negative and may be the same
for different values. In V1, Object's GetHashCode unfortunately gave some stronger
guarantees than this, and a few people wanted to depend on them, but they were
never part of the method's contract. Code that relied on them is already broken
on version 2 (we think the only people who depended on this were internal to the
company, and they would have long since found their bug and corrected it).
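
To make the "negative or duplicated" point concrete, here is a minimal sketch of how calling code might turn a hash code into a bucket index without assuming anything about its sign. The HashDemo class and the BucketIndex helper are hypothetical names made up for this illustration; they are not part of the BCL.

```csharp
using System;

static class HashDemo
{
    // Hypothetical helper: map any object's hash code to a bucket index
    // in the range [0, bucketCount).
    public static int BucketIndex(object key, int bucketCount)
    {
        // GetHashCode may return a negative value. Clear the sign bit
        // instead of calling Math.Abs, which throws for int.MinValue.
        int nonNegative = key.GetHashCode() & 0x7FFFFFFF;
        return nonNegative % bucketCount;
    }
}
```

Two different keys can still land in the same bucket, or even return the same hash code, so a lookup has to finish with an Equals check on the candidates it finds there.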

Note that we also don't want user code taking a dependency on
our existing hash function implementations for any type - ideally we could
change them every time we build the product. To elaborate on that, let's look at
String.

String uses a different hash
function that looks at each character, XOR'ing in the new character with a
(presumably prime) number. We'll change String's hash function in a future
version so that it both executes faster and produces a better distribution. This
will improve lookups in hash tables that use strings as keys. But because we'll
change the hash function, it is important not to depend on any one version's
implementation of GetHashCode. That is, never write the values you get back
from GetHashCode to disk and read them back later, and never sort values by their
hash codes and then persist that data to a file or send it over a
network.
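
If you do need a hash value that survives on disk or over the wire, one option is to compute it with a function you own, so that its definition cannot change underneath you. The sketch below uses 32-bit FNV-1a over the string's UTF-8 bytes. That choice is an assumption made purely for illustration; it is not what String.GetHashCode does, and StableHash and Fnv1a32 are hypothetical names.

```csharp
using System;
using System.Text;

static class StableHash
{
    // 32-bit FNV-1a over the UTF-8 bytes of the string. The result depends
    // only on the input, not on the runtime version, so it is safe to persist.
    public static uint Fnv1a32(string s)
    {
        const uint offsetBasis = 2166136261; // standard FNV-1a 32-bit offset basis
        const uint prime = 16777619;         // standard FNV 32-bit prime

        uint hash = offsetBasis;
        foreach (byte b in Encoding.UTF8.GetBytes(s))
        {
            hash ^= b;
            // Wrap-around on overflow is intended here.
            unchecked { hash *= prime; }
        }
        return hash;
    }
}
```

Keep using GetHashCode for in-memory hash tables, and reserve something like StableHash.Fnv1a32 for values that leave the process.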

Brian Grunkemeyer
MS CLR Base Class Library team