[gclist] ref-counting performance cost

Andrew Chen chenandr@cse.msu.edu
Wed, 11 Oct 2000 15:44:42 -0400

At 9:53 AM -0700 10/11/00, Boehm, Hans wrote:
>>  -----Original Message-----
>>  From: Bob Kerns [mailto:Bob.Kerns@brightware.com]
>>  Perhaps your technique has relevance to non-traditional GC
>>  environments,
>>  such as distributed GC, or other non-flat memory models,
>>  where the cost
>>  tradeoffs are distinctly different?
>Or embedded systems, where the space overhead of a tracing collector may not
>be acceptable?
>It seems to me that whether or not an improved reference counting algorithm
>is interesting depends entirely on the characteristics of the algorithm.  If
>you can reclaim cycles immediately without increasing the cost of pointer
>assignments by more than a constant factor over a traditional reference
>counter, I would find that astonishing, and almost certainly worthy of
>publication, probably even as a purely theoretical result.  (In fact, I
>would expect that to be provably impossible, though I don't recall such a
>proof.  So the real question becomes what you lose in the process.)

I can't guarantee that it will always be a constant factor. Most of 
the time it should be. Sometimes it will incur heavy costs associated 
with certain types of cyclic data, but never more than the cost of 
doing an atomic mark/sweep. In certain cases, though, it might incur 
this cost on every assignment. Very minor changes to the programmer's 
style can avoid this cost in every case that I can think of. (And I 
do mean very minor - the programmer will not need to designate any 
pointers as "strong" or anything like that.)

Given your encouragement, and that your name is (at least in my 
mind) well respected, I will continue with this idea. (If you think 
the above caveats are too big, please let me know. Thanks.)

BTW, if this is an appropriate context to ask: I'm getting weird 
results when I try to fork() a new process while using the BDW 
collector. I want to run the leak detector in a separate process 
(because my leak reporter might be buggy and I don't want it to 
crash the main program), so I thought I could fork, invoke a 
non-incremental collection of everything in the child, and have the 
leak detector do its work there. Since the child finds the leaks, it 
could send this info back to the original process, which could then 
free the leaked objects. Thus the original process need not spend 
time garbage collecting at all - just occasionally fork.

Do you know of any reasons this might cause problems? I suspect 
signal handling (SIGCHLD?), or perhaps the mmap of /dev/zero causing 
the memory to be shared between processes.

I like the collector a lot and am glad it exists.