[gclist] Finalizers & Reference counting.

Dave Hudson dave@cyclicode.net
Thu, 29 Aug 2002 16:38:01 +0100

Hi Greg,

Greg Hudson wrote:
> On Wed, 2002-08-28 at 16:47, Reedy,Christopher L. (Chris) wrote:
> Linus probably knows more about gcc than I do, but I think spending lots
> of time in the kernel may have skewed his judgment.  Visible performance
> problems in the real world almost always stem from algorithms with poor
> scaling characteristics (O(n^2) or worse, where n is actually growing
> large).  Not cache locality, not "hot spots" which need to be
> hand-optimized, not anything like that.

The issue has arisen because gcc compile times have got radically worse 
between 2.95.3 and 3.2 - in quite a number of cases compiles are now 
taking hours (or even days) longer than they used to.  There have 
already been quite a number of improvements but a very common theme has 
been that the newer version performs a lot more memory allocations (each 
version in fact has been getting slightly worse in this respect as more 
intermediate optimizations have been added).

The discussion about gc has arisen because the change from 2.95 to 3.2 
corresponds with the introduction of gc.  While Linus has been quite 
scathing about the use of gc I think a number of other people have 
pointed out that gc per se is not necessarily the problem, but that the
way in which internal data structures are created in the presence of the
new allocator/collector has caused a lot of the problems.

gcc's internal data representations use a lot of pointers (it's all 
rather LISPy in the RTL :-)) and chasing these pointers around when the 
objects being followed are scattered throughout the memory map has 
caused serious performance losses because of the lack of data locality.
Not only is this causing cache misses but TLB-related problems too.  It
seems that the obstacks that were in use before had an (apparently
unplanned) side effect of helping to keep related objects local to each
other, and thus these problems were much less obvious.
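
To make the locality point concrete, here is a minimal sketch of the 
general idea, not gcc's actual obstack code.  The names (struct arena, 
arena_alloc, build_list and so on) are made up for illustration, and the 
16-byte rounding is an assumption that suits this small node type.  
Objects allocated one after another from a single block end up adjacent 
in memory, so walking the list touches consecutive cache lines and few 
pages:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* A cons-like cell, loosely in the spirit of a pointer-heavy IR node. */
struct node {
    int value;
    struct node *next;
};

/* Hypothetical bump-pointer arena: one block, handed out sequentially. */
struct arena {
    unsigned char *base;
    size_t used;
    size_t cap;
};

static int arena_init(struct arena *a, size_t cap)
{
    a->base = malloc(cap);
    a->used = 0;
    a->cap = a->base ? cap : 0;
    return a->base != NULL;
}

static void *arena_alloc(struct arena *a, size_t size)
{
    /* Assumed alignment: round every request up to 16 bytes. */
    const size_t align = 16;
    size_t rounded = (size + align - 1) / align * align;

    if (rounded > a->cap - a->used)
        return NULL;            /* arena exhausted */

    void *p = a->base + a->used;
    a->used += rounded;
    return p;
}

/* Build an n-element list; consecutive nodes are adjacent in memory. */
static struct node *build_list(struct arena *a, int n)
{
    struct node *head = NULL;
    struct node **tail = &head;

    for (int i = 0; i < n; i++) {
        struct node *nd = arena_alloc(a, sizeof *nd);
        if (nd == NULL)
            return NULL;        /* the arena still owns what was handed out */
        nd->value = i;
        nd->next = NULL;
        *tail = nd;
        tail = &nd->next;
    }
    return head;
}

/* Pointer chase over the list; cheap when the nodes sit side by side. */
static long sum_list(const struct node *n)
{
    long total = 0;
    for (; n != NULL; n = n->next)
        total += n->value;
    return total;
}

/* Free everything at once, much as releasing an obstack would. */
static void arena_release(struct arena *a)
{
    free(a->base);
    a->base = NULL;
    a->used = a->cap = 0;
}
```

With a general-purpose allocator or a collector that places objects by 
size class or age, the same list could end up spread over many pages, 
and the same walk would then pay for cache and TLB misses that the 
sequential layout avoids.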

I think the other question that has been raised is what sort of cache 
and TLB-related performance problems the gc itself suffers or causes 
when it runs.  I don't think anyone has actually answered this yet though.