[gclist] Finalizers & Reference counting.
Thu, 29 Aug 2002 16:38:01 +0100
Greg Hudson wrote:
> On Wed, 2002-08-28 at 16:47, Reedy,Christopher L. (Chris) wrote:
> Linus probably knows more about gcc than I do, but I think spending lots
> of time in the kernel may have skewed his judgment. Visible performance
> problems in the real world almost always stem from algorithms with poor
> scaling characteristics (O(n^2) or worse, where n is actually growing
> large). Not cache locality, not "hot spots" which need to be
> hand-optimized, not anything like that.
The issue has arisen because gcc compile times have got radically worse
between 2.95.3 and 3.2 - in quite a number of cases compiles are now
taking hours (or even days) longer than they used to. There have
already been quite a number of improvements but a very common theme has
been that the newer version performs a lot more memory allocations (each
version in fact has been getting slightly worse in this respect as more
intermediate optimizations have been added).
The discussion about gc has arisen because the change from 2.95 to 3.2
corresponds with the introduction of gc. While Linus has been quite
scathing about the use of gc I think a number of other people have
pointed out that gc per se is not necessarily the problem, but that the
way in which internal data structures are created in the presence of the
new allocator/collector has caused a lot of the problems.
gcc's internal data representations use a lot of pointers (it's all
rather LISPy in the RTL :-)) and chasing these pointers around when the
objects being followed are scattered throughout the memory map has
caused serious performance losses because of the lack of data locality.
This is causing not only cache misses but also TLB-related problems.
It seems that the obstacks that were in use before had an (apparently
unplanned) side effect of helping to keep related objects local to each
other, and thus these problems were much less obvious.
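To illustrate the locality point, here is a minimal sketch in C of an
obstack-like "bump" arena. It is not gcc's code and not the real obstack
API; every name in it (arena, arena_alloc, build_list and so on) is
made up for the example, and the 16-byte rounding is an assumption
chosen for simplicity. Nodes allocated one after another end up next to
each other in memory, so walking the list touches few cache lines and
few pages.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout; this is NOT the obstack API. */
#define ARENA_ALIGN 16u  /* assumed alignment, a power of two */

typedef struct arena {
    unsigned char *base;  /* start of the single backing block */
    size_t used;          /* bytes handed out so far */
    size_t cap;           /* total size of the backing block */
} arena;

typedef struct node {
    struct node *next;
    int value;
} node;

/* Grab one contiguous block up front; returns 1 on success, 0 on failure. */
int arena_init(arena *a, size_t cap) {
    a->base = malloc(cap);
    a->used = 0;
    a->cap = a->base ? cap : 0;
    return a->base != NULL;
}

/* Hand out the next chunk of the block; NULL when the block is exhausted. */
void *arena_alloc(arena *a, size_t size) {
    assert(a->base != NULL);
    size_t rounded = (size + ARENA_ALIGN - 1) & ~(size_t)(ARENA_ALIGN - 1);
    if (rounded < size || rounded > a->cap - a->used)
        return NULL;
    void *p = a->base + a->used;
    a->used += rounded;
    return p;
}

/* Everything in the arena is freed at once, as with an obstack. */
void arena_release(arena *a) {
    free(a->base);
    a->base = NULL;
    a->used = a->cap = 0;
}

/* Build a list holding 0..n-1; consecutive nodes are adjacent in memory. */
node *build_list(arena *a, int n) {
    node *head = NULL;
    node **tail = &head;
    for (int i = 0; i < n; i++) {
        node *nd = arena_alloc(a, sizeof *nd);
        if (nd == NULL)
            return NULL;
        nd->value = i;
        nd->next = NULL;
        *tail = nd;
        tail = &nd->next;
    }
    return head;
}

/* Pointer-chasing traversal of the kind described above. */
long sum_list(const node *n) {
    long total = 0;
    for (; n != NULL; n = n->next)
        total += n->value;
    return total;
}
```

With a general-purpose allocator or a collector there is no such
guarantee: two nodes created back to back may land in different pages,
and the same sum_list() walk then pays a cache miss, and possibly a TLB
miss, on almost every step.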
I think the other question that has been raised is what sort of cache
and TLB-related performance problems the gc itself suffers or causes
when it runs. I don't think anyone has actually answered this yet, though.