From pekka@harlequin.co.uk Tue, 2 Jan 2001 17:27:20 GMT
Date: Tue, 2 Jan 2001 17:27:20 GMT
From: Pekka P. Pirinen pekka@harlequin.co.uk
Subject: [gclist] Java vs. ML, particularly GC
[Incidentally, is there an online archive for this list?]
Yes, it's archived at IECC and TUNES. The latter is threaded.
--
Pekka P. Pirinen
Adaptive Memory Management Group, Harlequin Limited
From pekka@harlequin.co.uk Wed, 3 Jan 2001 19:29:16 GMT
Date: Wed, 3 Jan 2001 19:29:16 GMT
From: Pekka P. Pirinen pekka@harlequin.co.uk
Subject: [gclist] Java vs. ML, particularly GC
> My question is how can you deduce the object overhead from doing
> experiments with the JVM.
You probably can't deduce it exactly, as there could be some off-line
tables that get allocated in large steps (page tables, mark bits,
etc.). However, such things contribute a small fraction of the
overhead, and are probably not even counted by the interfaces provided
to ask about memory consumption. The best way might be to ask your
JVM vendor.
> While i agree with Eliot's model of 2 words
> of overhead for Objects and 3 words for Object[]'s for JDK 1.3., on NT
> my estimates are different.
2/3 words is reasonable, but implementors often find other
considerations more pressing and increase the overhead.
> Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.0-C)
> Java HotSpot(TM) Client VM (build 1.3.0-C, mixed mode)
> I measure 8 bytes per Object and 16 bytes per Object[].
That could just be a consequence of wanting 8-byte alignment (although
undoubtedly some use could then be found for the extra bits).
> For java version "1.2" (build )
> Classic VM (build 1.2.2, build Linux_JDK_1.2.2_RC4, native threads, nojit)
> I get 16.24 bytes per Object and 16.24 bytes per Object[].
This is a handle-based system, so you immediately get one more word
for the object pointer in the handle. Since Sun put the class info in
the handle (makes sense for dispatch speed), the object blocks need a
separate length field, so it's two words of handle and two words of
header.
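
The arithmetic behind these figures can be sketched roughly as follows.
This is only an illustration: it assumes 4-byte words (a 32-bit JVM) and
8-byte alignment, and the function names are made up for the example, not
taken from any JVM.

```python
WORD = 4  # bytes per word; assumes a 32-bit JVM


def align_up(size, alignment=8):
    """Round size up to the next multiple of alignment."""
    return (size + alignment - 1) // alignment * alignment


def direct_overhead(header_words, alignment=8):
    """Per-object overhead when the header sits directly in front of the fields."""
    return align_up(header_words * WORD, alignment)


def handle_overhead(handle_words=2, header_words=2):
    """Per-object overhead when every object is reached through a handle."""
    return (handle_words + header_words) * WORD


object_bytes = direct_overhead(2)    # 2-word header, already 8-byte aligned
array_bytes = direct_overhead(3)     # 3-word header, padded up to 16 bytes
classic_bytes = handle_overhead()    # 2 words of handle + 2 words of header
print(object_bytes, array_bytes, classic_bytes)
```

Under those assumptions the model gives 8 and 16 bytes for the HotSpot
measurements and 16 bytes for the handle-based Classic VM.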
--
Pekka P. Pirinen
Adaptive Memory Management Group, Harlequin Limited
From ken@bitsko.slc.ut.us 04 Jan 2001 13:27:13 -0600
Date: 04 Jan 2001 13:27:13 -0600
From: Ken MacLeod ken@bitsko.slc.ut.us
Subject: [gclist] autoconf for Boehm-Demers-Weiser gc?
Has autoconf ever been discussed for use in the Boehm-Demers-Weiser GC
source distribution?
The project I'm working on has a dependency on GC (naturally ;-) and,
for the most common case, will require the .so to be installed in a
searchable path before use.
While GC has an excellent make setup, one that I found relatively
easy, my project is geared towards users who may not find it quite so
easy. The "next level" that I'd like to see, and that I think the users
of my project would be more comfortable with, is the standard autoconf
style install: ./configure; make test; make install
Minimal usage and support of autoconf would require only minor,
non-user visible*, changes in the Makefile at this point. Automation
of certain aspects of the various README.*s could evolve later.
I can create the initial patch if that's a direction that can be
supported.
-- Ken
(* non-user-visible means that current users of GC will not need to
change their existing install/update procedures.)
From hans_boehm@hp.com Thu, 4 Jan 2001 11:53:10 -0800
Date: Thu, 4 Jan 2001 11:53:10 -0800
From: Boehm, Hans hans_boehm@hp.com
Subject: [gclist] autoconf for Boehm-Demers-Weiser gc?
I would like to see this happen, ideally in a way that makes it transparent
to build procedures that just invoke "make", and definitely in a way that
doesn't break the non-Unix build procedures (at least not any more than they
already are broken). It has been done partially as part of libgcj, which is
now part of gcc (see http://gcc.gnu.org/cgi-bin/cvsweb.cgi/egcs/boehm-gc/ ).
This could potentially solve a number of problems, e.g. automatically
locating a working compiler and assembler, and providing a cleaner way to
deal with at least the more important configuration options. It probably
can't replace gcconfig.h, since that requires knowledge about whether
certain OS features are present on all versions, etc.
Unfortunately, it looks like previous volunteers deserted this project when
they started looking at the details. I haven't yet found the time, and I'm
not an autoconf/automake expert. (I think Tom Tromey did the gcj work.)
Hans
From fjh@cs.mu.oz.au Fri, 5 Jan 2001 13:00:56 +1100
Date: Fri, 5 Jan 2001 13:00:56 +1100
From: Fergus Henderson fjh@cs.mu.oz.au
Subject: [gclist] autoconf for Boehm-Demers-Weiser gc?
On 04-Jan-2001, Boehm, Hans wrote:
> I would like to see this happen, ideally in a way that makes it transparent
> to build procedures that just invoke "make", and definitely in a way that
> doesn't break the non-Unix build procedures (at least not any more than they
> already are broken). It has been done partially as part of libgcj, which is
> now part of gcc (see http://gcc.gnu.org/cgi-bin/cvsweb.cgi/egcs/boehm-gc/ ).
I think http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/boehm-gc/
(with "gcc" in place of "egcs")
is probably a more appropriate URL for that now.
I too would like to see this happen.
--
Fergus Henderson | "I have always known that the pursuit
| of excellence is a lethal habit"
WWW: | -- the last words of T. S. Garp.
From dsl@tepkom.ru Fri, 05 Jan 2001 18:17:07 +0300
Date: Fri, 05 Jan 2001 18:17:07 +0300
From: Dmitry Lomov dsl@tepkom.ru
Subject: [gclist] Java vs. ML, particularly GC
"Krishnaswami, Neel" wrote:
>
> Dave Mason [mailto:dmason@sarg.Ryerson.CA] wrote:
> >
> > I have 10-20K lines of coding experience in SML, Caml, Scheme (, and
> > [skip]
> > 1) opinions on the accuracy of this perception?
>
> True, but keep in mind that ML and Ocaml have highly-optimizing
> native code compilers, and you are comparing it to a bytecode
> system. Of course, if native MLs exist on all the platforms you
> care about then Java still loses.
I suppose Dave compares to bytecode versions of OCaml and SML...
As to the original question, I think the most important difference between
OCaml and SML on one side and Java on the other is that the former two,
being functional, make fewer reference updates, which allows better
performance from a generational collector.
The OCaml implementation uses a two-generation collector:
the young (minor) heap is collected by a copying collector, and
the major heap by incremental mark&sweep. This is very fast,
as you understand - I suppose, the fastest possible.
Of course, as usual, the big problem is pointers from the
major heap to the minor heap, but those are rare in functional
languages, so they just keep a table of references from the major
to the young heap (not very big) and include it in the roots for
minor collections. At every minor collection, the table is cleared.
Whenever the table overflows, a minor collection is performed.
For Java, this is unacceptable, because pointers from the major
to the minor heap will appear all the time.
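
A toy model of the scheme described above might look like the sketch
below. It is not OCaml's runtime: the Heap class, its field names and the
integer object ids are all invented for illustration, and promotion is
simplified to "every survivor of a minor collection moves to the major heap".

```python
class Heap:
    """Toy two-generation heap with a bounded major-to-minor reference table."""

    def __init__(self, table_capacity=4):
        self.minor = set()        # ids of young objects
        self.major = set()        # ids of old objects
        self.roots = []           # ids referenced from outside the heap
        self.fields = {}          # (object id, field name) -> target id
        self.ref_table = []       # slots in major objects that point at young objects
        self.table_capacity = table_capacity
        self.minor_collections = 0
        self._next_id = 0

    def alloc(self):
        obj = self._next_id
        self._next_id += 1
        self.minor.add(obj)
        return obj

    def write(self, obj, field, target):
        """Write barrier: record every store that creates a major-to-minor pointer."""
        self.fields[(obj, field)] = target
        if obj in self.major and target in self.minor:
            self.ref_table.append((obj, field))
            if len(self.ref_table) >= self.table_capacity:
                self.minor_collect()   # table is full: force a minor collection

    def minor_collect(self):
        """Trace the young heap from the roots plus the table, promote survivors."""
        pending = [r for r in self.roots if r in self.minor]
        pending += [self.fields[slot] for slot in self.ref_table
                    if self.fields[slot] in self.minor]
        survivors = set()
        while pending:
            obj = pending.pop()
            if obj in survivors or obj not in self.minor:
                continue
            survivors.add(obj)
            pending.extend(t for (o, _), t in self.fields.items() if o == obj)
        dead = self.minor - survivors
        self.fields = {k: v for k, v in self.fields.items() if k[0] not in dead}
        self.major |= survivors
        self.minor = set()
        self.ref_table.clear()         # the table is cleared at every minor collection
        self.minor_collections += 1


heap = Heap(table_capacity=2)
old = heap.alloc()
heap.roots.append(old)
heap.minor_collect()                   # old survives and is promoted
young = heap.alloc()
garbage = heap.alloc()
heap.write(old, "next", young)         # major-to-minor store goes into the table
heap.minor_collect()                   # young is kept alive only via the table
```

The point of the sketch is that the minor collection never scans the major
heap; it only looks at the roots and the (small) table. A program that
stores young pointers into old objects all the time keeps filling the table
and forcing minor collections.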
Dmitry
_________________________________________________________________
Dmitry S. Lomov
mailto:dsl@tepkom.ru ICQ#: 20524819 (Rusty)
+7 (812) 428-46-57 (b) +7 (812) 295-94-15 (h)
http://oops.tepkom.ru
From ken@bitsko.slc.ut.us 25 Jan 2001 11:43:21 -0600
Date: 25 Jan 2001 11:43:21 -0600
From: Ken MacLeod ken@bitsko.slc.ut.us
Subject: [gclist] MacOS 9 and recent Boehm-Demers-Weiser GC
I've just started trying to compile GC on MacOS 9, using MPW and
gc6.0alpha5. Everything compiles cleanly, but running 'test.c'
crashes (thank heavens for MacOSX!).
MzScheme uses gc5.0alpha2, so I know that something more recent than
what README.Mac describes works.
Before I dig deep into tracking down why it crashes (and I'm a
newbie to debugging on the Mac), I was hoping to find out whether anyone
has already gone this route and can give me some help.
Mozilla appears to be using GC for leak detection (as of late '00),
and I think they have done some Mac work, tracking that down...
Any pointers would help, thanks!
-- Ken
From fare@tunes.org Tue, 30 Jan 2001 15:19:47 +0100
Date: Tue, 30 Jan 2001 15:19:47 +0100
From: Francois-Rene Rideau fare@tunes.org
Subject: [gclist] Real-Time GC for high-level languages
Dear GC Listers,
I'm looking for real-time memory management solutions so as to develop
embedded software using high-level languages. Is there a compendium listing
whatever language implementations, GC libraries, etc, provide real-time
behavior, preferably hard real-time? My employer is mostly interested
in free software, although we're also interested in knowing what
proprietary is available. We're also possibly interested in real-time
malloc/free.
Notably, the Boehm GC, the RScheme GC (one of them), the Erlang GC (which?)
are said to be "soft real-time". What does that mean? When do they fail?
What if paging is disabled: will that make them "hard real-time"?
What high-level language implementation can boast about real-time latency?
I have a feeling this is a FAQ. The FAQ doesn't seem to be maintained,
however, and its links are out of date.
Best regards,
[ François-René ÐVB Rideau | Reflection&Cybernethics | http://fare.tunes.org ]
[ TUNES project for a Free Reflective Computing System | http://tunes.org ]
A programming language is low level
when its programs require attention to the irrelevant.
-- Alan Perlis
From neelk@cswcasa.com Tue, 30 Jan 2001 09:44:10 -0500
Date: Tue, 30 Jan 2001 09:44:10 -0500
From: Krishnaswami, Neel neelk@cswcasa.com
Subject: [gclist] Real-Time GC for high-level languages
Francois-Rene Rideau [mailto:fare@tunes.org] wrote:
>
> Dear GC Listers,
> I'm looking for real-time memory management solutions so
> as to develop embedded software using high-level languages. Is
> there a compendium listing whatever language implementations,
> GC libraries, etc, provide real-time behavior, preferably hard
> real-time? My employer is mostly interested in free software,
> although we're also interested in knowing what proprietary is
> available. We're also possibly interested in real-time malloc/free.
The MLKit Standard ML compiler has a region-based memory allocator
that does no pointer tracing, so there are no unbounded pauses. (I
think regions are typically called "arenas" by C programmers.) You
can find the compiler at:
http://www.it-c.dk/research/mlkit/
and there are also a number of papers describing the technique. The
developers advertise realtime programming as a possible application
domain for their compiler. However, I've never done any realtime
programming so I can't really evaluate the claim.
--
Neel Krishnaswami
neelk@cswcasa.com
From amc4@doc.ic.ac.uk Tue, 30 Jan 2001 15:20:47 +0000 (GMT)
Date: Tue, 30 Jan 2001 15:20:47 +0000 (GMT)
From: Andrew Cheadle amc4@doc.ic.ac.uk
Subject: [gclist] Real-Time GC for high-level languages
Hi
I would direct you to Richard Jones' garbage collection pages:
http://www.cs.ukc.ac.uk/people/staff/rej/gc.html - it's probably the
best resource around for all types of GC info.
> On Tue, 30 Jan 2001, Francois-Rene Rideau wrote:
> "soft real-time". What does that mean? When do they fail?
A hard realtime system is one which can guarantee that a certain action
will always be carried out in less than a certain time.
... in the context of gc:
My experience comes from ongoing work to implement an incremental
garbage collector in GHC, the Glasgow Haskell Compiler.
Whilst it is theoretically possible to bound gc pause times (the amount of
time the user program is paused while the garbage collector runs), it is
actually very difficult to achieve in practice. Hard real-time guarantees the
pause time to be no more than a specified time. Soft real-time, whilst
attempting in the best case to achieve the bounded pause times equated
with hard real-time, cannot make the same guarantees in the worst case.
Our prototype compiler gave us pause times on average of the order of
microseconds... but we couldn't guarantee the pause times for collecting
large objects such as arrays (there is a trade-off between the efficiency of
the array access implementation and how it is gc'd).
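
To make the distinction concrete, here is a small sketch. The pause
figures and the 2000-microsecond bound are invented for illustration (they
are not GHC measurements), and treating "soft" as a bound on the average is
just one simple reading of the term.

```python
def meets_hard_bound(pauses_us, bound_us):
    """Hard real-time: the worst pause must stay within the bound."""
    return max(pauses_us) <= bound_us


def meets_soft_bound(pauses_us, bound_us):
    """Soft real-time (one reading): the average pause stays within the bound."""
    return sum(pauses_us) / len(pauses_us) <= bound_us


# Invented pause times in microseconds; the last one stands for a collection
# that had to deal with one large array in a single step.
pauses_us = [40, 55, 38, 61, 47, 9000]
bound_us = 2000

print(meets_soft_bound(pauses_us, bound_us))   # average is within the bound
print(meets_hard_bound(pauses_us, bound_us))   # the single long pause breaks it
```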
I seem to remember a theoretical paper:
Guy E. Blelloch, Perry Cheng: On Bounding Time and Space for
Multiprocessor Garbage Collection. PLDI 1999: 104-117
which makes claims of bounded pause times. I believe, but I'm not sure,
that Perry Cheng was looking at implementing the techniques mentioned in
the above paper in the TILT ML compiler:
http://www.cs.cornell.edu/Info/People/jgm/tilt.html
I don't know how far he has got.
Apart from that I know that the Open-source Erlang provides only
soft real-time behaviour. I'm not sure I'm aware of any hard real-time
systems out there that use 'standard' gc techniques. I think I'd agree with
Neel Krishnaswami's suggestion that a region-based approach, or
something closer to the techniques used in embedded systems, may be what you
want, but I really don't know much about them, I'm just guessing!
Hope that helps!
Cheers
Andy
*********************************************************************
* Andrew Cheadle email: a.cheadle@doc.ic.ac.uk *
* Department of Computing http://www.doc.ic.ac.uk/~amc4/ *
* Imperial College *
* University of London *
*********************************************************************
From chase@world.std.com Tue, 30 Jan 2001 12:56:37 -0500
Date: Tue, 30 Jan 2001 12:56:37 -0500
From: David Chase chase@world.std.com
Subject: [gclist] Real-Time GC for high-level languages
At 03:19 PM 1/30/2001 +0100, Francois-Rene Rideau wrote:
>Notably, the Boehm GC, the RScheme GC (one of them), the Erlang GC (which?)
>are said to be "soft real-time". What does that mean? When do they fail?
>What if paging is disabled: will that make them "hard real-time"?
>What high-level language implementation can boast about real-time latency?
I think the Erlang people may have attained hard-real-time latency.
Another thing to consider is whether or not there are any constraints
on the memory-related behavior of the programs that you are running.
If, for instance, you can tell me that it never has more than X Mb
of live memory, then I can tell you how long a straightforward
copying-compacting collection will take (for small sets of data,
they are fast) after making some measurements.
On the other hand, if you've got no constraints on your running
program, you will find your hard-real-time GC choices exceedingly
limited.
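
As a back-of-the-envelope sketch of that estimate: the model below assumes
that the pause of a copying collection is proportional to the amount of live
data and that the copy rate has been measured on the target machine. The
100 Mb/s rate is a made-up placeholder, not a measurement.

```python
def estimated_pause_ms(live_mb, copy_rate_mb_per_s):
    """Pause of one full copying collection, assuming time is linear in live data."""
    return live_mb * 1000.0 / copy_rate_mb_per_s


def max_live_mb(deadline_ms, copy_rate_mb_per_s):
    """Largest live set that still fits a given pause deadline under the same model."""
    return deadline_ms * copy_rate_mb_per_s / 1000.0


copy_rate = 100.0                               # Mb/s, hypothetical measured value
pause = estimated_pause_ms(4, copy_rate)        # program never has more than 4 Mb live
budget = max_live_mb(10, copy_rate)             # live data allowed for a 10 ms deadline
print(pause, budget)
```

Without a bound on live data the first function has no finite input, which
is the reason the choices become so limited.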
>I have a feeling this is a FAQ. The FAQ doesn't seem to be maintained,
>however, and its links are out of date.
The FAQ is only slightly maintained. I will see what I can do about its
links. Are you, perhaps, volunteering to help? :-) I've been working on
getting more facile with cvs over the net (been using it in a canned way
for the last four years) so it might be possible to do something slightly
more organized.
David Chase
From fare@tunes.org Tue, 30 Jan 2001 19:48:29 +0100
Date: Tue, 30 Jan 2001 19:48:29 +0100
From: Francois-Rene Rideau fare@tunes.org
Subject: [gclist] Real-Time GC for high-level languages
On Tue, Jan 30, 2001 at 09:48:26AM -0800, Jim Larson wrote:
> The Open-Source Erlang implementation achieves soft real-time GC [...]
> The Erlang GC may exhibit chronic pauses if too much data is live
> in a single process. However, the Erlang programming style
> discourages this from happening.
Ok. Thanks for the whole explanation (including the snipped part).
In practice, what kind of "soft-real-time" constraints are met by
Erlang-deployed systems? And what impact does management of the shared
data (bulk as well as system management information about processes)
have on the real-time response?
On Tue, Jan 30, 2001 at 12:56:37PM -0500, David Chase wrote:
> Another thing to consider is whether or not there are any constraints
> on the memory-related behavior of the programs that you are running.
> If, for instance, you can tell me that it never has more than X Mb
> of live memory, then I can tell you how long a straightforward
> copying-compacting collection will take (for small sets of data,
> they are fast) after making some measurements.
So this property is actually not specific to Erlang;
it is just that the Erlang programming style specifically takes advantage
of it in achieving real-time behavior.
> On the other hand, if you've got no constraints on your running
> program, you will find your hard-real-time GC choices exceedingly
> limited.
Assuming this is the case, what does that amount to?
Also, are there any high-level language compilers that do automatic
memory management across several arenas, as opposed to just static memory
management across several arenas (as is done in the ML-Kit with regions)?
I.e. each arena having a different read/write/lock/allocate/free protocol,
maybe programmably so? I suppose that once again, Erlang, with message
passing across separate processes, is mostly in that case.
> The FAQ is only slightly maintained. I will see what I can do about its
> links. Are you, perhaps, volunteering to help? :-)
Sure, especially since you say you're using CVS.
Thanks to all those who responded and will respond.
Yours freely,
[ François-René ÐVB Rideau | Reflection&Cybernethics | http://fare.tunes.org ]
[ TUNES project for a Free Reflective Computing System | http://tunes.org ]
They laughed at Columbus, they laughed at Fulton, they laughed at the
Wright brothers. But they also laughed at Bozo the Clown.
-- Carl Sagan
From jim@sendmail.com Tue, 30 Jan 2001 11:52:20 -0800
Date: Tue, 30 Jan 2001 11:52:20 -0800
From: Jim Larson jim@sendmail.com
Subject: [gclist] Real-Time GC for high-level languages
In message <20010130151947.A21052@Samaris.tunes.org> Francois-Rene Rideau
writes:
>Notably, the Boehm GC, the RScheme GC (one of them), the Erlang GC (which?)
>are said to be "soft real-time". What does that mean? When do they fail?
The Open-Source Erlang implementation achieves soft real-time GC
since its runtime is composed of many small "processes" (threads)
exchanging messages. Each process has its own heap and is GC'ed
independently of the others, and message-passing makes a copy of
any data sent (except for bulk data, which is held in global
reference-counted storage and passed by reference). Since each
process's heap is small, the GC delay tends to meet soft real-time
needs.
The Erlang GC may exhibit chronic pauses if too much data is live
in a single process. However, the Erlang programming style
discourages this from happening.
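
A very rough model of this arrangement, for illustration only: the Process
class and its methods are invented here and have nothing to do with the
actual Erlang runtime, and the "collection" simply drops heap entries that
the caller no longer lists as live.

```python
import copy


class Process:
    """Toy process with a private heap; messages are copied, never shared."""

    def __init__(self, name):
        self.name = name
        self.heap = []       # objects owned by this process only
        self.mailbox = []

    def send(self, other, message):
        # Copying on send means no pointer ever crosses between two heaps.
        other.mailbox.append(copy.deepcopy(message))

    def receive(self):
        message = self.mailbox.pop(0)
        self.heap.append(message)
        return message

    def collect(self, live):
        """Collect this heap alone; the cost depends only on this heap's size."""
        before = len(self.heap)
        self.heap = [o for o in self.heap if any(o is l for l in live)]
        return before - len(self.heap)


a = Process("a")
b = Process("b")
msg = {"numbers": [1, 2, 3]}
a.heap.append(msg)
a.send(b, msg)
got = b.receive()
freed = a.collect(live=[])     # a drops its copy; b's heap is not touched
```

Because each collection touches one small heap, its pause is small as long
as no single process holds a lot of live data.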
Jim
From jim@sendmail.com Tue, 30 Jan 2001 14:36:56 -0800
Date: Tue, 30 Jan 2001 14:36:56 -0800
From: Jim Larson jim@sendmail.com
Subject: [gclist] Real-Time GC for high-level languages
In message <20010130194828.A553@ZhengHe.augustin.thierry> you write:
>In practice, what kind of "soft-real-time" constraints are met by
>Erlang-deployed systems? And what impact does management of the shared
>data (bulk as well as system management information about processes)
>have on the real-time response?
I'm not sure what kinds of GC pauses are typical - I think a few
milliseconds, but I don't know how much that can change in unusual
environments. Since the GC is only run when the Erlang process
(thread) is scheduled (I think), the GC, as well as ordinary
execution, is sensitive to the scheduling algorithm, which could
be the tall tentpole in any real-time concerns.
As far as the GC is concerned, the shared data is much like a remote
reference in distributed GC - you need to know when the last
reference to the remote object in your local heap goes away so that
you can take some corrective action - in this case, grab a
lightly-contested lock and decrement a reference count. Several
straightforward methods can be used to track these references, and
none should show any significant overhead in this context.
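
One of the straightforward methods could be sketched like this. The class
names are invented for the example and do not come from the Erlang runtime;
each local heap keeps a list of the shared objects it references and
releases the dead ones after its own collection.

```python
import threading


class SharedBinary:
    """Bulk data held outside the process heaps, freed when the count hits zero."""

    def __init__(self, data):
        self.data = data
        self.refcount = 0
        self.freed = False
        self.lock = threading.Lock()   # lightly contested: held only for the update

    def incref(self):
        with self.lock:
            self.refcount += 1

    def decref(self):
        with self.lock:
            self.refcount -= 1
            if self.refcount == 0:
                self.data = None
                self.freed = True


class LocalHeap:
    """Per-process heap that remembers which shared objects it refers to."""

    def __init__(self):
        self.remote_refs = []

    def add_ref(self, shared):
        shared.incref()
        self.remote_refs.append(shared)

    def collect(self, live):
        """After a local collection, release shared objects this heap no longer uses."""
        kept = []
        for shared in self.remote_refs:
            if any(shared is l for l in live):
                kept.append(shared)
            else:
                shared.decref()
        self.remote_refs = kept


blob = SharedBinary(b"x" * 1024)
h1, h2 = LocalHeap(), LocalHeap()
h1.add_ref(blob)
h2.add_ref(blob)
h1.collect(live=[])
freed_after_first = blob.freed     # h2 still refers to the data
h2.collect(live=[])
```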
Jim
From ok@atlas.otago.ac.nz Wed, 31 Jan 2001 15:06:09 +1300 (NZDT)
Date: Wed, 31 Jan 2001 15:06:09 +1300 (NZDT)
From: Richard A. O'Keefe ok@atlas.otago.ac.nz
Subject: [gclist] Real-Time GC for high-level languages
Andrew Cheadle wrote:
> Apart from that I know that the Open-source Erlang provides only
> soft real-time behaviour.
Erlang's limitation to "soft real-time" doesn't really have a lot to do
with garbage collection. Most garbage collections in Erlang are
thread-local. The basic problem is Erlang IPC:
- threads can communicate with threads on other nodes over a network,
so there are unpredictable network delays
- you can't tell the Erlang system everything it would need to know to
do hard real-time scheduling.
I believe that the time to fully collect a thread is less than typical
LAN network delays.
From fjh@cs.mu.oz.au Wed, 31 Jan 2001 20:02:01 +1100
Date: Wed, 31 Jan 2001 20:02:01 +1100
From: Fergus Henderson fjh@cs.mu.oz.au
Subject: [gclist] garbage collection of dynamically loaded objects
Hi,
Has anyone tried garbage collecting dynamically loaded objects that
were loaded with dlopen()?
A student here who is working on a SOAP interface for Mercury,
Ina Cheng, has been using Mercury's dynamic loading support.
But she ran into a problem where her code was calling dl__close,
which is the Mercury interface to dlclose(), while there were still
live references to static data defined in the dynamically loaded
module. Part of the problem was that we hadn't documented the dangers
of dl__close. But figuring out when you have references to static
data in the dynamically loaded module is tricky, and may require some
understanding of implementation details of the Mercury implementation.
In an ideal world, you wouldn't need to call dl__close at all,
because the system would garbage collect the dynamically loaded
module when it was no longer referenced.
The Boehm (et al) conservative collector, which we're using, has code
to trace references *from* (writable) static data in dynamically loaded
modules. But it doesn't have any code to trace references *to*
code or static data in dynamically loaded modules. Has anyone
considered doing that?
P.S. I guess it only makes sense to garbage collect objects that were
loaded without the RTLD_GLOBAL option, because if that option is enabled,
the object file might be used to resolve references in other dynamically
loaded objects, so the garbage collector can never determine when it
is safe to collect it.
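
The behaviour being asked for can be illustrated with a small simulation.
This is not how the Boehm collector or Mercury works: LoadedModule and
StaticData are invented stand-ins, and the back-reference from the data to
its module plays the role that "this pointer falls inside the module's
address range" would play in a conservative collector.

```python
import gc
import weakref


class StaticData:
    """Stand-in for static data that lives inside a loaded module."""

    def __init__(self, owner, value):
        self.owner = owner     # a reference into the module keeps the module alive
        self.value = value


class LoadedModule:
    """Stand-in for a dlopen() handle; on_unload plays the role of dlclose()."""

    def __init__(self, path, on_unload):
        self.path = path
        self.greeting = StaticData(self, "hello from " + path)
        # Run on_unload(path) once the module object has been collected.
        weakref.finalize(self, on_unload, path)


unloaded = []
module = LoadedModule("plugin.so", unloaded.append)
data = module.greeting

del module
gc.collect()
still_loaded = (unloaded == [])    # data still points into the module

del data
gc.collect()                       # last reference gone: the module is unloaded
print(still_loaded, unloaded)
```

The explicit close call disappears; the unload happens only after the last
reference to anything inside the module has gone.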
--
Fergus Henderson | "I have always known that the pursuit
| of excellence is a lethal habit"
WWW: | -- the last words of T. S. Garp.
From rv@bluetail.com Wed, 31 Jan 2001 13:35:44 +0100
Date: Wed, 31 Jan 2001 13:35:44 +0100
From: Robert Virding rv@bluetail.com
Subject: [gclist] Real-Time GC for high-level languages
Francois-Rene Rideau writes:
>Dear GC Listers,
> I'm looking for real-time memory management solutions so as to develop
>embedded software using high-level languages. Is there a compendium listing
>whatever language implementations, GC libraries, etc, provide real-time
>behavior, preferably hard real-time? My employer is mostly interested
>in free software, although we're also interested in knowing what
>proprietary is available. We're also possibly interested in real-time
>malloc/free.
>
>Notably, the Boehm GC, the RScheme GC (one of them), the Erlang GC (which?)
>are said to be "soft real-time". What does that mean? When do they fail?
>What if paging is disabled: will that make them "hard real-time"?
>What high-level language implementation can boast about real-time latency?
I have some references to real-time garbage collection papers:
Mats Bengtsson: Real-Time Compacting Garbage Collection Algorithms
Lund University, LUTEDX/(TECS-3028)/1-118(1990) & LU-CS-TR:90-61
Roger Henriksson: Scheduling Garbage Collection in Embedded Systems
Lund University, LUTEDX/(TECS-1008)/1-164/(1998)
Roger.Hendriksson@dna.lth.se, http://www.dna.lth.se/~roger
As someone has already pointed out, "hard real-time" is when you really
worry about the worst case and soft real-time is when you worry about
the average case. As the second paper (a PhD thesis) points out, a
collector is part of a complete hard real-time system, so it is not
enough for the collector to have hard bounds; the system as a whole must
meet its bounds.
This means that the memory system must not only be able to supply enough
memory when needed, but also be able to delay collecting until it is
suitable for the system. Typically there will be bursts of high-priority
real-time activity, when no latency is allowed and sufficient
memory must exist, interspersed with low-priority activity which can be
interrupted for collection. The collector must fit in.
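
A minimal sketch of that constraint, under strong simplifying assumptions
that are mine and not taken from the thesis: time is divided into phases, a
collection may only run at the start of a low-priority phase, and it always
brings memory use back down to a fixed amount of live data.

```python
def schedule_is_feasible(phases, heap_size, live_floor):
    """Return True if no phase runs out of memory.

    phases is a list of (priority, amount_allocated) pairs. Collection is
    allowed only in "low" phases, where it resets usage to live_floor.
    """
    used = live_floor
    for priority, allocated in phases:
        if priority == "low":
            used = live_floor      # collector may interrupt low-priority work
        used += allocated
        if used > heap_size:
            return False           # a collection would be needed at a bad time
    return True


# Invented numbers: a 600-unit heap with 100 units of permanently live data.
interleaved = [("high", 300), ("low", 100), ("high", 400)]
back_to_back = [("high", 300), ("high", 400)]
print(schedule_is_feasible(interleaved, 600, 100))
print(schedule_is_feasible(back_to_back, 600, 100))
```

The same two bursts fit when a low-priority phase separates them and fail
when they run back to back, so the headroom has to be sized for the longest
stretch in which the collector is not allowed to run.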
Robert
--
Robert Virding Tel: +46 (0)8 545 55 017
Alteon Web Systems Email: rv@bluetail.com
S:t Eriksgatan 44 WWW: http://www.bluetail.com/~rv
SE-112 34 Stockholm, SWEDEN
"Folk säger att jag inte bryr mig om någonting, men det skiter jag i".