[gclist] When to collect.

Charles Fiterman cef@geodesic.com
Fri, 14 Nov 1997 11:41:27 -0600


The problem is what I call the collect or expand
decision. Given a request that can't be filled
without expand or collect which do I do.

I don't like any of the current algorithms and
are trying to rethink from scratch.

First the idea is not to get by with as little
memory as you can but to use as much as won't
trigger paging. This simple program shows what
I mean.

#include <strstrea.h>
#include <iomanip.h>
#include <duration.h>
#include <gct.h>

main()
{
	duration t;
	unsigned long e = 32 * 1024, col = 0, count = 100000;

	cout.setf(ios::fixed);
	cout.precision(2);

	gcPrintStats = 1;

	cout << "Show how Great Circle trades space for speed." 
		 << endl
		 << endl;

	for (;;) {
		t.start();

		for (unsigned long i = 0; i < count; i++)
			new char[1024];

		cout << count 
			 << " allocations took "
			 << setw(6) << t.end()
			 << " seconds and "
			 << setw(4) << (gcCollections - col)
			 << " collections"
			 << endl;
		col = gcCollections;

		if ((e <<= 1) > (32 * 1024 * 1024))
			break;

		cout << (gcExpandHeap(e) ?
				 "Expanded heap by " : "Failed to expand heap by ")
			 << setw(7) << e
			 << endl;
	}
}

It runs faster and faster until is runs into paging and then it slows 
down very violently.

	Initial heapsize	Elapsed time	Number of collections

	 64 kB         		8.6 sec		366
	128 kB         		6.7 sec		295
	256 kB         		5.2 sec		213
	512 kB         		4.8 sec		138
	  1 MB         		3.7 sec		 81
	  2 MB         		3.1 sec		 44
	  4 MB         		2.8 sec		 22
	  8 MB         		2.4 sec		 12
	 16 MB         		2.3 sec		  6
	 32 MB         	      129.8 sec        	  3

This suggests that the system administrator suggest a top size.
Collect or expand would prefer expand until this was reached.
Then timeing would say if we are spending more than X% of our
time in collect, expand otherwise collect. X should be substantial
as exceeding the limit triggers paging which will be worse.

This has another advantage. Overhead in a conservative collector
divides into three parts. Scaning, zeroing and allocation. All
these tend to be tightly optomized. Keeping statistics on what
was allocated since the last collection slows down allocation.

			-  
Charles Fiterman		Geodesic Systems
414 North Orleans Suite 410	Phone 312 832 1221 x223
Chicago IL 60610-4418		FAX   312 832 1230
				http://www.geodesic.com

A young man saw a product advertised in a magazine that said
"Guaranteed 100% effective against bugs when properly used."
He sent away and got two blocks of wood with the instructions
"Place bug on block A, strike sharply with block B." Debuggers
are like that, and you can get stung while using them. Great
Circle tries to be something better than that.