From pekka@harlequin.co.uk Tue, 2 Jan 2001 17:27:20 GMT
Date: Tue, 2 Jan 2001 17:27:20 GMT
From: Pekka P. Pirinen pekka@harlequin.co.uk
Subject: [gclist] Java vs. ML, particularly GC

[Incidentally, is there an online archive for this list?]

Yes, it's archived at IECC and TUNES. The latter is threaded.
--
Pekka P. Pirinen
Adaptive Memory Management Group, Harlequin Limited

From pekka@harlequin.co.uk Wed, 3 Jan 2001 19:29:16 GMT
Date: Wed, 3 Jan 2001 19:29:16 GMT
From: Pekka P. Pirinen pekka@harlequin.co.uk
Subject: [gclist] Java vs. ML, particularly GC

> My question is how can you deduce the object overhead from doing
> experiments with the JVM.

You probably can't deduce it exactly, as there could be some off-line
tables that get allocated in large steps (page tables, mark bits,
etc.). However, such things contribute a small fraction of the
overhead, and are probably not even counted by the interfaces provided
to ask about memory consumption. The best way might be to ask your JVM
vendor.

> While I agree with Eliot's model of 2 words
> of overhead for Objects and 3 words for Object[]'s for JDK 1.3, on NT
> my estimates are different.

2/3 words is reasonable, but implementors often find other
considerations more pressing and increase the overhead.

> Java(TM) 2 Runtime Environment, Standard Edition (build 1.3.0-C)
> Java HotSpot(TM) Client VM (build 1.3.0-C, mixed mode)
> I measure 8 bytes per Object and 16 bytes per Object[].

That could just be a consequence of wanting 8-byte alignment (although
undoubtedly some use could then be found for the extra bits).

> For java version "1.2" (build )
> Classic VM (build 1.2.2, build Linux_JDK_1.2.2_RC4, native threads, nojit)
> I get 16.24 bytes per Object and 16.24 bytes per Object[].

This is a handle-based system, so you immediately get one more word for
the object pointer in the handle.
Since Sun put the class info in the handle (makes sense for dispatch
speed), the object blocks need a separate length field, so it's two
words of handle and two words of header.
--
Pekka P. Pirinen
Adaptive Memory Management Group, Harlequin Limited

From ken@bitsko.slc.ut.us 04 Jan 2001 13:27:13 -0600
Date: 04 Jan 2001 13:27:13 -0600
From: Ken MacLeod ken@bitsko.slc.ut.us
Subject: [gclist] autoconf for Boehm-Demers-Weiser gc?

Has autoconf ever been discussed for use in the Boehm-Demers-Weiser GC
source distribution?

The project I'm working on has a dependency on GC (naturally ;-) and,
for the most common case, will require the .so to be installed in a
searchable path before use.

While GC has an excellent make setup, one that I found relatively
easy, my project is geared towards users who may not find it quite so
easy. The "next level" that I'd like to see, that I think the users
of my project would be more comfortable with, is the standard autoconf
style install: ./configure; make test; make install

Minimal usage and support of autoconf would require only minor,
non-user visible*, changes in the Makefile at this point. Automation
of certain aspects of the various README.*s could evolve later.

I can create the initial patch if that's a direction that can be
supported.

  -- Ken

(* non-user-visible means that current users of GC will not need to
change their existing install/update procedures.)

From hans_boehm@hp.com Thu, 4 Jan 2001 11:53:10 -0800
Date: Thu, 4 Jan 2001 11:53:10 -0800
From: Boehm, Hans hans_boehm@hp.com
Subject: [gclist] autoconf for Boehm-Demers-Weiser gc?

I would like to see this happen, ideally in a way that makes it
transparent to build procedures that just invoke "make", and definitely
in a way that doesn't break the non-Unix build procedures (at least not
any more than they already are broken). It has been done partially as
part of libgcj, which is now part of gcc (see
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/egcs/boehm-gc/ ).
This could potentially solve a number of problems, e.g. automatically
locating a working compiler and assembler, and providing a cleaner way
to deal with at least the more important configuration options. It
probably can't replace gcconfig.h, since that requires knowledge about
whether certain OS features are present on all versions, etc.

Unfortunately, it looks like previous volunteers deserted this project
when they started looking at the details. I haven't yet found the time,
and I'm not an autoconf/automake expert. (I think Tom Tromey did the gcj
work.)

Hans

> -----Original Message-----
> From: Ken MacLeod [mailto:ken@bitsko.slc.ut.us]
> Sent: Thursday, January 04, 2001 11:27 AM
> To: gclist@iecc.com
> Subject: [gclist] autoconf for Boehm-Demers-Weiser gc?
>
>
> Has autoconf ever been discussed for use in the Boehm-Demers-Weiser GC
> source distribution?
>
> The project I'm working on has a dependency on GC (naturally ;-) and,
> for the most common case, will require the .so to be installed in a
> searchable path before use.
>
> While GC has an excellent make setup, one that I found relatively
> easy, my project is geared towards users who may not find it quite so
> easy. The "next level" that I'd like to see, that I think the users
> of my project would be more comfortable with, is the standard autoconf
> style install: ./configure; make test; make install
>
> Minimal usage and support of autoconf would require only minor,
> non-user visible*, changes in the Makefile at this point. Automation
> of certain aspects of the various README.*s could evolve later.
>
> I can create the initial patch if that's a direction that can be
> supported.
>
> -- Ken
>
> (* non-user-visible means that current users of GC will not need to
> change their existing install/update procedures.)
>

From fjh@cs.mu.oz.au Fri, 5 Jan 2001 13:00:56 +1100
Date: Fri, 5 Jan 2001 13:00:56 +1100
From: Fergus Henderson fjh@cs.mu.oz.au
Subject: [gclist] autoconf for Boehm-Demers-Weiser gc?
On 04-Jan-2001, Boehm, Hans wrote:
> I would like to see this happen, ideally in a way that makes it transparent
> to build procedures that just invoke "make", and definitely in a way that
> doesn't break the non-Unix build procedures (at least not any more than they
> already are broken). It has been done partially as part of libgcj, which is
> now part of gcc (see http://gcc.gnu.org/cgi-bin/cvsweb.cgi/egcs/boehm-gc/ ).

I think
    http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/boehm-gc/
                                          ^^^
is probably a more appropriate URL for that now.

I too would like to see this happen.

--
Fergus Henderson  |  "I have always known that the pursuit
                  |  of excellence is a lethal habit"
WWW:              |     -- the last words of T. S. Garp.

From dsl@tepkom.ru Fri, 05 Jan 2001 18:17:07 +0300
Date: Fri, 05 Jan 2001 18:17:07 +0300
From: Dmitry Lomov dsl@tepkom.ru
Subject: [gclist] Java vs. ML, particularly GC

"Krishnaswami, Neel" wrote:
>
> Dave Mason [mailto:dmason@sarg.Ryerson.CA] wrote:
> >
> > I have 10-20K lines of coding experience in SML, Caml, Scheme (, and
> > [skip]
> > 1) opinions on the accuracy of this perception?
>
> True, but keep in mind that ML and Ocaml have highly-optimizing
> native code compilers, and you are comparing it to a bytecode
> system. Of course, if native MLs exists on all the platforms you
> care about then Java still loses.

I suppose Dave compares to bytecode versions of OCaml and SML...

As to the original question, I think the most important difference
between OCaml and SML on the one hand and Java on the other is that the
former two, being functional, make fewer reference updates, thus
allowing better performance of a generational collector.

The OCaml implementation uses a 2-age generational collector: the young
heap is collected by a copying collector, and the major heap by
incremental mark&sweep. This is very fast, as you understand - I
suppose, the fastest possible.
Of course, as usual the big problem is pointers from the major heap to
the minor heap, but those are rare in functional languages, so they just
have a table of references from the major to the young heap (not very
big) and include it in the roots for minor collections. At every minor
collection, the table is cleared. Whenever the table overflows, a minor
collection is performed.

For Java, this is unacceptable, because pointers from the major to the
minor heap will appear all the time.

Dmitry
_________________________________________________________________
Dmitry S. Lomov          mailto:dsl@tepkom.ru
ICQ#: 20524819 (Rusty)
+7 (812) 428-46-57 (b)
+7 (812) 295-94-15 (h)
http://oops.tepkom.ru

From ken@bitsko.slc.ut.us 25 Jan 2001 11:43:21 -0600
Date: 25 Jan 2001 11:43:21 -0600
From: Ken MacLeod ken@bitsko.slc.ut.us
Subject: [gclist] MacOS 9 and recent Boehm-Demers-Weiser GC

I've just started trying to compile GC on MacOS 9, using MPW and
gc6.0alpha5. Everything compiles cleanly, but running 'test.c' crashes
(thank heavens for MacOSX!).

MzScheme uses gc5.0alpha2, so I know that something more recent than
what README.Mac describes works.

Before I get heavily into tracking down why it crashes (and I'm a newbie
to debugging on the Mac), I was hoping to find out if anyone has already
gone this route before and can give me some help.

Mozilla appears to be using GC for leak detection (as of late '00), and
I think they have done some Mac work, tracking that down...

Any pointers would help, thanks!

  -- Ken

From fare@tunes.org Tue, 30 Jan 2001 15:19:47 +0100
Date: Tue, 30 Jan 2001 15:19:47 +0100
From: Francois-Rene Rideau fare@tunes.org
Subject: [gclist] Real-Time GC for high-level languages

Dear GC Listers,
   I'm looking for real-time memory management solutions so as to
develop embedded software using high-level languages. Is there a
compendium listing whatever language implementations, GC libraries, etc,
provide real-time behavior, preferably hard real-time?
My employer is mostly interested in free software, although we're also
interested in knowing what proprietary is available. We're also possibly
interested in real-time malloc/free.

Notably, the Boehm GC, the RScheme GC (one of them), the Erlang GC
(which?) are said to be "soft real-time". What does that mean? When do
they fail? What if paging is disabled: will that make them "hard
real-time"? What high-level language implementation can boast about
real-time latency?

I have a feeling this is a FAQ. The FAQ doesn't seem to be maintained,
however and its links are out of date.

Best regards,

[ François-René ÐVB Rideau | Reflection&Cybernethics | http://fare.tunes.org ]
[ TUNES project for a Free Reflective Computing System | http://tunes.org ]
A programming language is low level when its programs require attention
to the irrelevant. -- Alan Perlis

From neelk@cswcasa.com Tue, 30 Jan 2001 09:44:10 -0500
Date: Tue, 30 Jan 2001 09:44:10 -0500
From: Krishnaswami, Neel neelk@cswcasa.com
Subject: [gclist] Real-Time GC for high-level languages

Francois-Rene Rideau [mailto:fare@tunes.org] wrote:
>
> Dear GC Listers,
> I'm looking for real-time memory management solutions so
> as to develop embedded software using high-level languages. Is
> there a compendium listing whatever language implementations,
> GC libraries, etc, provide real-time behavior, preferably hard
> real-time? My employer is mostly interested in free software,
> although we're also interested in knowing what proprietary is
> available. We're also possibly interested in real-time malloc/free.

The MLKit Standard ML compiler has a region-based memory allocator that
does no pointer tracing, so there are no unbounded pauses. (I think
regions are typically called "arenas" by C programmers.) You can find
the compiler at:

    http://www.it-c.dk/research/mlkit/

and there are also a number of papers describing the technique. The
developers advertise realtime programming as a possible application
domain for their compiler.
However, I've never done any realtime programming so I can't really
evaluate the claim.

--
Neel Krishnaswami
neelk@cswcasa.com

From amc4@doc.ic.ac.uk Tue, 30 Jan 2001 15:20:47 +0000 (GMT)
Date: Tue, 30 Jan 2001 15:20:47 +0000 (GMT)
From: Andrew Cheadle amc4@doc.ic.ac.uk
Subject: [gclist] Real-Time GC for high-level languages

Hi

I would direct you to Richard Jones' garbage collection pages:
http://www.cs.ukc.ac.uk/people/staff/rej/gc.html - it's probably the
best resource around for all types of GC info.

> On Tue, 30 Jan 2001, Francois-Rene Rideau wrote:
> "soft real-time". What does that mean? When do they fail?

A hard realtime system is one which can guarantee that a certain action
will always be carried out in less than a certain time.

... in the context of gc:

My experiences result from ongoing work to implement an incremental
garbage collector into GHC, The Glasgow Haskell Compiler. Whilst it is
theoretically possible to bound gc pause times (the amount of time the
user program is paused while the garbage collector runs), it is actually
very difficult to achieve in practice. Hard real-time guarantees the
pause time to be no more than a specified time. Soft real-time, whilst
attempting in the best case to achieve the bounded pause times equated
with hard real-time, cannot make the same guarantees in the worst case.

Our prototype compiler gave us pause times on average of the order of
microseconds... but we couldn't guarantee the pause times for collecting
large objects such as arrays (there is a trade off in the efficiency of
the array access implementation versus how it is gc'd).

I seem to remember a theoretical paper:

Guy E. Blelloch, Perry Cheng: On Bounding Time and Space for
Multiprocessor Garbage Collection. PLDI 1999: 104-117

which makes claims of bounded pause times.
I believe, but I'm not sure, that Perry Cheng was looking at
implementing the techniques mentioned in the above paper in the TILT ML
compiler: http://www.cs.cornell.edu/Info/People/jgm/tilt.html
I don't know how far he has got.

Apart from that I know that the Open-source Erlang provides only soft
real-time behaviour.

I'm not sure I'm aware of any hard real-time systems out there that use
'standard' gc techniques. I think I'd agree with Neel Krishnaswami's
suggestion that maybe a region based approach or something closer to
techniques used in embedded systems may be what you want, but I really
don't know much about them, I'm just guessing!

Hope that helps!

Cheers

Andy

*********************************************************************
*  Andrew Cheadle                    email: a.cheadle@doc.ic.ac.uk  *
*  Department of Computing           http://www.doc.ic.ac.uk/~amc4/ *
*  Imperial College                                                 *
*  University of London                                             *
*********************************************************************

From chase@world.std.com Tue, 30 Jan 2001 12:56:37 -0500
Date: Tue, 30 Jan 2001 12:56:37 -0500
From: David Chase chase@world.std.com
Subject: [gclist] Real-Time GC for high-level languages

At 03:19 PM 1/30/2001 +0100, Francois-Rene Rideau wrote:
>Notably, the Boehm GC, the RScheme GC (one of them), the Erlang GC (which?)
>are said to be "soft real-time". What does that mean? When do they fail?
>What if paging is disabled: will that make them "hard real-time"?
>What high-level language implementation can boast about real-time latency?

I think the Erlang people may have attained hard-real-time latency.

Another thing to consider is whether or not there are any constraints
on the memory-related behavior of the programs that you are running.
If, for instance, you can tell me that it never has more than X Mb
of live memory, then I can tell you how long a straightforward
copying-compacting collection will take (for small sets of data,
they are fast) after making some measurements.
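
As a rough illustration of the estimate described above: a copying
collector does work roughly proportional to the amount of live data, so
a bound on live memory plus a measured copy rate gives a pause estimate.
The sketch below is only that. The function names are hypothetical, and
using a flat buffer copy as a stand-in for the collector's copy rate is
an assumption; a real collector also scans pointers and would have to
be measured directly.

```python
import time


def measure_copy_rate(sample_bytes=8 * 1024 * 1024, repeats=5):
    """Return a crude copy rate in bytes per millisecond.

    Assumption: copying a flat buffer stands in for the collector's
    copying work. Real collectors do more per byte, so they are slower.
    """
    buf = bytearray(sample_bytes)
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        bytes(buf)  # one full copy of the sample buffer
        best = min(best, time.perf_counter() - start)
    best = max(best, 1e-9)  # guard against a zero reading from the timer
    return sample_bytes / (best * 1000.0)


def estimated_pause_ms(max_live_bytes, bytes_per_ms):
    """Estimate one full copying collection from a live-memory bound."""
    if bytes_per_ms <= 0:
        raise ValueError("copy rate must be positive")
    return max_live_bytes / bytes_per_ms
```

For a program known never to hold more than 4 Mb live, the estimate
would be estimated_pause_ms(4 * 1024 * 1024, measure_copy_rate()).
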
On the other hand, if you've got no constraints on your running
program, you will find your hard-real-time GC choices exceedingly
limited.

>I have a feeling this is a FAQ. The FAQ doesn't seem to be maintained,
>however and its links are out of date.

The FAQ is only slightly maintained. I will see what I can do about its
links. Are you, perhaps, volunteering to help? :-) I've been working on
getting more facile with cvs over the net (been using it in a canned way
for the last four years) so it might be possible to do something
slightly more organized.

David Chase

From fare@tunes.org Tue, 30 Jan 2001 19:48:29 +0100
Date: Tue, 30 Jan 2001 19:48:29 +0100
From: Francois-Rene Rideau fare@tunes.org
Subject: [gclist] Real-Time GC for high-level languages

On Tue, Jan 30, 2001 at 09:48:26AM -0800, Jim Larson wrote:
> The Open-Source Erlang implementation achieves soft real-time GC [...]
> The Erlang GC may exhibit chronic pauses if too much data is live
> in a single process. However, the Erlang programming style
> discourages this from happening.

Ok. Thanks for the whole explanation (including the snipped part).
In practice, what kind of "soft-real-time" constraints are met by
Erlang-deployed systems? And what impact does management of the shared
data (bulk as well as system management information about processes)
have on the real-time response?

On Tue, Jan 30, 2001 at 12:56:37PM -0500, David Chase wrote:
> Another thing to consider is whether or not there are any constraints
> on the memory-related behavior of the programs that you are running.
> If, for instance, you can tell me that it never has more than X Mb
> of live memory, then I can tell you how long a straightforward
> copying-compacting collection will take (for small sets of data,
> they are fast) after making some measurements.

So that actually this property is not specific to Erlang, only the
Erlang programming style specifically takes advantage of it in achieving
real-time behavior.
> On the other hand, if you've got no constraints on your running
> program, you will find your hard-real-time GC choices exceedingly
> limited.

Assuming this is the case, what does that amount to?

Also are there any high-level language compilers that do automatic
memory management across several arenas, as opposed to just static
memory management across several arenas (as is done in the ML-Kit with
regions)? I.e. each arena having a different
read/write/lock/allocate/free protocol, maybe programmably so? I suppose
that once again, Erlang, with message passing across separate processes,
is mostly in that case.

> The FAQ is only slightly maintained. I will see what I can do about its
> links. Are you, perhaps, volunteering to help? :-)

Sure, especially since you say you're using CVS.

Thanks to all those who responded and will respond.

Yours freely,

[ François-René ÐVB Rideau | Reflection&Cybernethics | http://fare.tunes.org ]
[ TUNES project for a Free Reflective Computing System | http://tunes.org ]
They laughed at Columbus, they laughed at Fulton, they laughed at the
Wright brothers. But they also laughed at Bozo the Clown. -- Carl Sagan

From jim@sendmail.com Tue, 30 Jan 2001 11:52:20 -0800
Date: Tue, 30 Jan 2001 11:52:20 -0800
From: Jim Larson jim@sendmail.com
Subject: [gclist] Real-Time GC for high-level languages

In message <20010130151947.A21052@Samaris.tunes.org> Francois-Rene Rideau writes:
>Notably, the Boehm GC, the RScheme GC (one of them), the Erlang GC (which?)
>are said to be "soft real-time". What does that mean? When do they fail?

The Open-Source Erlang implementation achieves soft real-time GC
since its runtime is composed of many small "processes" (threads)
exchanging messages. Each process has its own heap and is GC'ed
independently of the others, and message-passing makes a copy of
any data sent (except for bulk data, which is held in global
reference-counted storage and passed by reference).
Since each process's heap is small, the GC delay tends to meet soft
real-time needs.

The Erlang GC may exhibit chronic pauses if too much data is live
in a single process. However, the Erlang programming style
discourages this from happening.

Jim

From jim@sendmail.com Tue, 30 Jan 2001 14:36:56 -0800
Date: Tue, 30 Jan 2001 14:36:56 -0800
From: Jim Larson jim@sendmail.com
Subject: [gclist] Real-Time GC for high-level languages

In message <20010130194828.A553@ZhengHe.augustin.thierry> you write:
>In practice, what kind of "soft-real-time" constraints are met by
>Erlang-deployed systems? And what impact does management of the shared
>data (bulk as well as system management information about processes)
>have on the real-time response?

I'm not sure what kinds of GC pauses are typical - I think a few
milliseconds, but I don't know how much that can change in unusual
environments. Since the GC is only run when the Erlang process (thread)
is scheduled (I think), the GC, as well as ordinary execution, is
sensitive to the scheduling algorithm, which could be the tall tentpole
in any real-time concerns.

As far as the GC is concerned, the shared data is much like a remote
reference in distributed GC - you need to know when the last reference
to the remote object in your local heap goes away so that you can take
some corrective action - in this case, grab a lightly-contested lock and
decrement a reference count. Several straightforward methods can be used
to track these references, and none should show any significant overhead
in this context.

Jim

From ok@atlas.otago.ac.nz Wed, 31 Jan 2001 15:06:09 +1300 (NZDT)
Date: Wed, 31 Jan 2001 15:06:09 +1300 (NZDT)
From: Richard A. O'Keefe ok@atlas.otago.ac.nz
Subject: [gclist] Real-Time GC for high-level languages

Andrew Cheadle wrote:
    Apart from that I know that the Open-source Erlang provides only
    soft real-time behaviour.

Erlang's limitation to "soft real-time" doesn't really have a lot to do
with garbage collection.
Most garbage collections in Erlang are thread-local.
The basic problem is Erlang IPC:
- threads can communicate with threads on other nodes over a network,
  so there are unpredictable network delays
- you can't tell the Erlang system everything it would need to know
  to do hard real-time scheduling.

I believe that the time to fully collect a thread is less than typical
LAN network delays.

From fjh@cs.mu.oz.au Wed, 31 Jan 2001 20:02:01 +1100
Date: Wed, 31 Jan 2001 20:02:01 +1100
From: Fergus Henderson fjh@cs.mu.oz.au
Subject: [gclist] garbage collection of dynamically loaded objects

Hi,

Has anyone tried garbage collecting dynamically loaded objects
that were loaded with dlopen()?

A student here who is working on a SOAP interface for Mercury, Ina
Cheng, has been using Mercury's dynamic loading support. But she ran
into a problem where her code was calling dl__close, which is the
Mercury interface to dlclose(), while there were still live references
to static data defined in the dynamically loaded module.

Part of the problem was that we hadn't documented the dangers of
dl__close. But figuring out when you have references to static data in
the dynamically loaded module is tricky, and may require some
understanding of implementation details of the Mercury implementation.

In an ideal world, you wouldn't need to call dl__close at all, because
the system would garbage collect the dynamically loaded module when it
was no longer referenced.

The Boehm (et al) conservative collector, which we're using, has code to
trace references *from* (writable) static data in dynamically loaded
modules. But it doesn't have any code to trace references *to* code or
static data in dynamically loaded modules. Has anyone considered doing
that?

P.S.
I guess it only makes sense to garbage collect objects that were loaded
without the RTLD_GLOBAL option, because if that option is enabled, the
object file might be used to resolve references in other dynamically
loaded objects, so the garbage collector can never determine when it is
safe to collect it.

--
Fergus Henderson  |  "I have always known that the pursuit
                  |  of excellence is a lethal habit"
WWW:              |     -- the last words of T. S. Garp.

From rv@bluetail.com Wed, 31 Jan 2001 13:35:44 +0100
Date: Wed, 31 Jan 2001 13:35:44 +0100
From: Robert Virding rv@bluetail.com
Subject: [gclist] Real-Time GC for high-level languages

Francois-Rene Rideau writes:
>Dear GC Listers,
> I'm looking for real-time memory management solutions so as to develop
>embedded software using high-level languages. Is there a compendium listing
>whatever language implementations, GC libraries, etc, provide real-time
>behavior, preferably hard real-time? My employer is mostly interested
>in free software, although we're also interested in knowing what
>proprietary is available. We're also possibly interested in real-time
>malloc/free.
>
>Notably, the Boehm GC, the RScheme GC (one of them), the Erlang GC (which?)
>are said to be "soft real-time". What does that mean? When do they fail?
>What if paging is disabled: will that make them "hard real-time"?
>What high-level language implementation can boast about real-time latency?

I have some references to real-time garbage collection papers:

Mats Bengtsson: Real-Time Compacting Garbage Collection Algorithms
Lund University, LUTEDX/(TECS-3028)/1-118(1990) & LU-CS-TR:90-61

Roger Henriksson: Scheduling Garbage Collection in Embedded Systems
Lund University, LUTEDX/(TECS-1008)/1-164/(1998)
Roger.Hendriksson@dna.lth.se,http://www.dna.lth.se/~roger

As someone has already pointed out, "hard real-time" is when you really
worry about worst-case and soft real-time is when you worry about
average-case.
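
The worst-case versus average-case distinction can be illustrated with
a set of measured pauses. This is a sketch with made-up numbers and
hypothetical function names. Note that passing the worst-case check on
observed samples does not establish a hard real-time guarantee, which
needs a bound that holds for every execution.

```python
from statistics import mean


def meets_average_case(pauses_ms, deadline_ms):
    """Soft real-time view: the typical pause fits the deadline."""
    return mean(pauses_ms) <= deadline_ms


def meets_worst_case(pauses_ms, deadline_ms):
    """Hard real-time view: no observed pause may exceed the deadline."""
    return max(pauses_ms) <= deadline_ms


# Made-up measurements: three short pauses and one long one.
pauses_ms = [1.0, 1.0, 1.0, 50.0]
print(meets_average_case(pauses_ms, 20.0))  # True: the mean is 13.25 ms
print(meets_worst_case(pauses_ms, 20.0))    # False: one pause took 50 ms
```
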
As the second paper (a PhD thesis) points out, a collector is part of a
complete hard real-time system, so it is not enough for the collector to
have hard bounds; the system as a whole must meet its bounds. This means
that the memory system must not only be able to supply enough memory
when needed, but also be able to delay collecting to when it is suitable
for the system. Typically there will be bursts of high priority
real-time activity, when no latency is allowed and sufficient memory
must exist, interspersed with low-priority activity which can be
interrupted for collection. The collector must fit in.

Robert

--
Robert Virding                      Tel: +46 (0)8 545 55 017
Alteon Web Systems                  Email: rv@bluetail.com
S:t Eriksgatan 44                   WWW: http://www.bluetail.com/~rv
SE-112 34 Stockholm, SWEDEN
"Folk säger att jag inte bryr mig om någonting, men det skiter jag i".
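
Virding's point about fitting the collector around bursts can be
sketched as a toy simulation: during a high-priority burst the program
only allocates, and collection work runs only in low-priority phases.
Everything here is hypothetical and chosen for illustration, including
the phase names, the per-tick allocation and reclaim amounts, and the
assumption that everything allocated in a burst later becomes
reclaimable garbage.

```python
def min_free_memory(phases, heap_bytes, alloc_per_tick, reclaim_per_tick):
    """Simulate the phases and return the lowest free-memory level reached.

    phases is a list of (kind, ticks) pairs, where kind is "burst" or
    "idle". During a burst the program allocates and the collector may
    not run. During idle time the collector reclaims memory, up to the
    heap size. Raises MemoryError if a burst exhausts the heap.
    """
    free = heap_bytes
    lowest = free
    for kind, ticks in phases:
        for _ in range(ticks):
            if kind == "burst":
                free -= alloc_per_tick
                if free < 0:
                    raise MemoryError("burst outran the memory reserve")
            elif kind == "idle":
                free = min(heap_bytes, free + reclaim_per_tick)
            else:
                raise ValueError(f"unknown phase kind: {kind!r}")
            lowest = min(lowest, free)
    return lowest
```

With a 1000-byte heap, 40 bytes allocated per burst tick and 60 bytes
reclaimed per idle tick, the schedule
[("burst", 10), ("idle", 5), ("burst", 10)] never drops below 500 bytes
free, whereas a 300-byte heap is exhausted during the first burst.
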