[gclist] object hash codes and copying collection

Eliot Moss moss@rhea.cs.umass.edu
Wed, 10 Dec 1997 11:40:28 -0500 (EST)


My understanding of the Java hashCode feature is that it must meet these hard
requirements:
  - Every object has a UNIQUE hash code
  - An object's hash code never changes during the object's lifetime

In addition, I would add a third, softer requirement:
  - Object hash codes are distributed such that one can find hash functions
    over the hash codes that distribute the objects into hash bins uniformly
    (with high probability).

One approach I suggest for this is to assign hash codes from a simple counter,
incremented each time you assign a hash code. This does have a significant
problem, though: in a long running program, the counter can wrap around. I
have not thought of any good ways to fix this. It would not be an issue on a
64 bit wide design, since one could not cycle through the values before the
system would crash (or be retired :-), but on a 32 bit system, one could cycle
around. At that point, one could scan the heap to find hash values that can be
reused. There are almost certainly clever background ways to do this that
avoid doing a scan-the-world all at once, and that maintain the needed
information relatively compactly -- I just have yet to work out the details.

Regards --							Eliot