[gclist] zillion objects

Andrew Chen chenandr@cse.msu.edu
Fri, 24 Aug 2001 08:30:21 -0400

At 3:00 AM -0400 8/24/01, Ji-Yong D. Chung wrote:
>     In my preceding email,
>>  [I wrote]  This question is a general one. In brief:
>>if you have a large number of objects, a small
>>fraction of which die (very slowly), is there a
>>method of automatic memory management
>>(i.e., gc), which would not (1) try to copy most of
>>those objects or (2) trace most of those pointers?
>     Perhaps some background material on this would
>be helpful.
>     Basically the question asks for a strategy for
>implementing an automatic memory manager
>(perhaps GC) for a high-performance server that
>does lots of caching.

I'm not sure why you seem convinced that a small fraction of the 
objects die very slowly on such servers. I suspect it depends highly 
on the nature of the server. Do you have a specific one in mind?

>     With today's decline in memory prices, servers like
>to put *everything* in memory (a few gigs?).  Garbage
>collectors or automatic memory managers for such
>servers need to be able to handle large
>data without significant collection pauses.  The
>emphasis on caching will increase as 64-bit machines
>become more prevalent.

If you just want something with low pause times, there's the train 
algorithm, or the Sapphire technique out of Intel (JavaGrande 2001).

>     I don't know if a generational collector would be
>good for the above-described large cache-oriented servers.
>A generational collector will push all stable data to
>its oldest generation arena.  And eventually, it will
>still need to perform garbage collection on that
>generation.
That's where partitioning the oldest generation, and the techniques 
mentioned above, do well.
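To make the partitioning idea concrete, here's a rough sketch (all class and method names are invented for illustration, not taken from any real collector; real schemes such as the train algorithm add evacuation and cross-partition cycle handling on top of this):

```python
# Sketch: collect one partition of a large old generation at a time.
# The key point is that each increment traces only one partition plus
# its remembered set, never the whole multi-gigabyte old generation.

class Obj:
    def __init__(self, name, partition):
        self.name = name
        self.partition = partition   # which old-gen partition holds this object
        self.refs = []               # outgoing pointers

class PartitionedOldGen:
    def __init__(self, num_partitions):
        self.partitions = [set() for _ in range(num_partitions)]
        # remembered_sets[p] = objects outside p that point into p
        self.remembered_sets = [set() for _ in range(num_partitions)]

    def allocate(self, name, p):
        obj = Obj(name, p)
        self.partitions[p].add(obj)
        return obj

    def write_ref(self, src, dst):
        """Write barrier: record cross-partition pointers as they are created."""
        src.refs.append(dst)
        if src.partition != dst.partition:
            self.remembered_sets[dst.partition].add(src)

    def collect_partition(self, p, roots):
        """Trace only partition p. The work (and so the pause) is bounded
        by the partition size, not by the whole heap."""
        marked = set()
        # Roots into p: global roots plus remembered-set sources.
        pending = [o for o in roots if o.partition == p]
        for src in self.remembered_sets[p]:
            pending.extend(d for d in src.refs if d.partition == p)
        while pending:
            obj = pending.pop()
            if obj in marked or obj.partition != p:
                continue             # never trace outside the partition
            marked.add(obj)
            pending.extend(obj.refs)
        dead = self.partitions[p] - marked
        self.partitions[p] = marked
        return dead
```

With stable cached data, most partitions rarely need collecting at all; the collector can pick the partition most likely to contain garbage and leave the rest alone.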

>     When it does, it has to scan perhaps up to a few gigabytes
>of data (perhaps tens or hundreds of gigabytes in
>extreme cases).  I am guessing that this has to
>result in significant GC pauses.

If you did the whole thing, sure.

>     When one has so much data in RAM, cache
>misses caused by client requests will be relatively
>rare.  This means the rate of garbage generation
>(garbage generated by replacing the pointer to an old
>object with the pointer to a new object read in from disk
>due to a cache miss) would be slow as well.

Alternatively, you could just not do garbage collection (or do it in 
a manual and buggy manner) and have replicated (physical) servers with 
a load balancer in front that directs requests to the servers that 
are still running (i.e., haven't crashed due to a memory leak or bus 
error).

Yeah, it's a less-than-elegant solution, but it would work, uses 
off-the-shelf components (assuming you can run multiple instances of 
the server, and either that the "state" between client requests is 
easily shared or that the load balancer will always direct a client 
to the same server it went to before), and doesn't require any coding.
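The replicate-and-restart approach amounts to something like the following toy sketch (all names here are invented for illustration; a real deployment would use an off-the-shelf balancer with proper health checks, not hand-written code):

```python
# Toy round-robin load balancer that skips dead backends, illustrating
# the "replicate and let them crash" approach described above.
import itertools

class Backend:
    def __init__(self, name):
        self.name = name
        self.alive = True          # flips to False when the process crashes

    def handle(self, request):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} served {request}"

class LoadBalancer:
    def __init__(self, backends):
        self.backends = backends
        self._rr = itertools.cycle(backends)

    def dispatch(self, request):
        # Try each backend at most once per request; skip any that have
        # crashed (e.g. from an unmanaged memory leak) until one responds.
        for _ in range(len(self.backends)):
            backend = next(self._rr)
            if backend.alive:
                return backend.handle(request)
        raise RuntimeError("no live backends")
```

Note this sketch ignores the "state" caveat above: it does nothing to pin a client to the server it used before, so it only works if inter-request state is shared or nonexistent.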

There's a small chance that such a solution might result in certain 
client requests not being satisfied, or being only half-satisfied. 
Depending on the application, this may or may not be acceptable - but 
due to other external factors (network difficulties, client crashes, 
power outages, OS crashes, disk failures, etc.) it may happen 
anyway. Ask yourself - with that much memory, how long before I run 
out of memory (depending on the server, it may be a while)? And is it 
likely that I'll have a power outage or other event that will require 
restarting the process or rebooting the machine during that time?
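For the "how long before I run out of memory" question, a back-of-envelope estimate is easy; the leak rate and request rate below are purely hypothetical numbers chosen for illustration:

```python
# Back-of-envelope: time until an unmanaged leak exhausts memory.
# All numbers below are hypothetical, for illustration only.
free_bytes = 4 * 2**30        # suppose 4 GiB of headroom
leak_per_request = 1024       # suppose each request leaks ~1 KiB
requests_per_second = 100

seconds_to_oom = free_bytes / (leak_per_request * requests_per_second)
hours_to_oom = seconds_to_oom / 3600
print(f"~{hours_to_oom:.1f} hours until out of memory")
```

With these invented numbers the process survives roughly half a day before needing a restart - which is exactly the figure to compare against how often the machine reboots for other reasons anyway.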