[gclist] When to collect.

Jeremy Fitzhardinge jeremy@zip.com.au
Tue, 25 Nov 1997 13:10:42 +1100

Bill Burcham wrote:
>         Does it seem to anyone else that "scheduling" and "memory
> management/GC"  and "persistence/transactions" are deeply
> intertwingled?!?  

Yes.  One of the conclusions I reached after dinking around with Java
runtimes is that threads and GC are deeply intermingled, and it was
pretty much impossible to disconnect the two by having a general GC
interface and a general threads interface.  It wasn't helped by the fact
that, as you mention, most threads interfaces suck, and they're at the
wrong level, or don't let you get at the information you really need.  I
was quite tempted to write my own threads package under Solaris to
replace pthreads, so I could do things like get control as threads were
multiplexed onto lwps, and be able to have per-lwp heaps and freelists;
lwps in this case being an approximation of real CPUs.  The downside to
this is that it's a lot more work than it should be, and not at all
portable.  Perhaps scheduler activations would have been a better way
for Sun to have organised Solaris's kernel threads - I should really
look it up.

Also, while it's easy to tell the kernel "this page isn't useful anymore"
(mmap from /dev/zero over the now free page), there needs to be a
feedback channel so a GC can tell what the kernel's doing.  The
traditional "ask for as much as you want and the kernel will accommodate
you" model doesn't cut it these days.  I looked at putting a GC under an
embedded microkernel, and once you can change the kernel to get feedback
about the overall memory pressure you can make much more sensible
decisions about when to collect.  All these "collect after X bytes
allocated" heuristics only make sense when you can't tell what's
happening globally; it seems to me the most useful time to collect is
"the system is about to run out of real memory soon, given the current
rate of allocation".  If the system has 32MB of real memory free, and
we're allocating at 1MB/hour, there's not much need to GC right now,
even if our heap is 120MB and mostly garbage.
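
As a rough sketch of that trigger (not code from any real collector):
estimate how long the free real memory will last at the current
allocation rate, and collect only when that falls inside some planning
horizon.  The names should_collect, free_bytes, alloc_rate_bytes_per_s
and horizon_s are made up for illustration, and the sketch assumes the
kernel can report free real memory, which is exactly the feedback
channel that is usually missing.

```python
MB = 1024 * 1024


def should_collect(free_bytes, alloc_rate_bytes_per_s, horizon_s):
    """Return True if free real memory would run out within horizon_s.

    free_bytes is assumed to come from some kernel feedback channel;
    all names here are hypothetical.
    """
    if alloc_rate_bytes_per_s <= 0:
        # Not allocating, so memory will not run out on our account.
        return False
    seconds_until_exhaustion = free_bytes / alloc_rate_bytes_per_s
    return seconds_until_exhaustion <= horizon_s


# The case above: 32MB free, allocating 1MB/hour, one-minute horizon.
# Heap size and garbage fraction do not enter into the decision at all.
relaxed = should_collect(32 * MB, MB / 3600, 60)

# Same free memory, but allocating 16MB/second: about two seconds left.
urgent = should_collect(32 * MB, 16 * MB, 60)
```

With these inputs relaxed is False and urgent is True.  A "collect after
X bytes allocated" rule would treat both cases identically once X bytes
had gone by.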

And of course, using the GC machinery to do transparent persistence like
Texas (and ObjectStore, I presume) is very elegant.