On memory management

Matthew Tuck matty@box.net.au
Tue, 01 Dec 1998 20:27:01 +1030


Ursula Dreier wrote:

> I'm relieved to read this (yesterday I had a discussion with a C++
> programmer who thought that GC would be a bad thing, inefficient and
> non-deterministic... ;)

I had always supported GC, but after a little research I became a firm
proponent.  Today's GCs can be faster than manual deallocation, and some
improve locality as a side effect.

Incremental GC is important for interactive programs, but ideally both
incremental and stopping GC should be options.

As for non-determinism, there's already some in today's languages. 
Parameter evaluation order is one example I can think of.  Have you ever
worried about that?  Most programmers probably wouldn't even know
whether the language they're programming in uses deterministic or
non-deterministic ordering for that matter.

As for finalisation ordering issues, I believe that can be a problem. 
I've got to get around to going through my GC mailing list folder, which
I haven't done since I subscribed about a year ago; I believe they talk
about this a bit.

> I mean a pointer to a class object: An object of class "class",
> containing everything that is needed to access the class' features.
> If you call that a metaclass object, that's what I mean. But why
> "metaclass"?

That's just the terminology often used for a class of classes.  The
"meta" prefix just means talking one level up.  It's probably not as
applicable here, but in Smalltalk you have meta-meta-classes and so on. 
An instance of a metaclass is a class.  The class called "Class" is the
metaclass.
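For a concrete illustration (mine, not from the discussion above),
Python makes the same relationship visible: classes are themselves
objects, and the class of a class is the metaclass, the built-in `type`:

```python
# In Python every class is itself an object, and the class of a class
# is its metaclass (the built-in 'type' by default).
class Point:
    pass

p = Point()

print(type(p))                  # the class of the instance: Point
print(type(Point))              # the class of the class: the metaclass 'type'
print(isinstance(Point, type))  # True -- the class object Point is an instance of type
```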

> Since the GC (or persistence manager, or heap walker...) has to
> inspect the whole object space, there has to be a means to identify
> links between objects.

Yeah I understand that, I just got the impression that you were
referring to something else as well.  After the message, I thought "hmm,
nodes, atoms, ... d'oh!".  I was not thinking about the words
themselves, just your description of them.  =)

BTW, what is a heap walker?

> short unsigned UnusedStoreAtEndOfObject;

What is this line referring to?

> Or we might consider another layout where there are no direct pointers
> but rather offsets into an object table that contains the memory
> addresses (allowing objects to be moved without having to fix up
> references, but at a performance penalty).

If the system knows what is a pointer, it is still possible to move
things in memory by changing all the pointers afterwards.  This is what
a compacting garbage collector does.

This would probably be more of an issue in the virtual machine rather
than anything the compiler would worry about.
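Here is a toy sketch of that fix-up step (my own illustration, not any
real collector): the heap is a list of slots, "pointers" are slot
indices, and after sliding live objects to the front every pointer field
is rewritten through a forwarding table:

```python
# Toy compacting collector (illustrative sketch, not a real implementation).
# The "heap" is a list of slots; "pointers" are slot indices; each object
# records which of its fields are pointers.  Because the system knows what
# is a pointer, it can move objects and then fix up every reference.

def compact(heap, roots):
    # 1. Mark: trace everything reachable from the roots.
    live = set()
    stack = list(roots)
    while stack:
        i = stack.pop()
        if i not in live:
            live.add(i)
            stack.extend(heap[i]["ptrs"])

    # 2. Forward: assign each live object a new slot at the front.
    forward = {}
    for i in range(len(heap)):
        if i in live:
            forward[i] = len(forward)

    # 3. Move and fix up: copy survivors, rewriting every pointer field.
    new_heap = [None] * len(forward)
    for old, new in forward.items():
        obj = heap[old]
        obj["ptrs"] = [forward[p] for p in obj["ptrs"]]
        new_heap[new] = obj
    new_roots = [forward[r] for r in roots]
    return new_heap, new_roots

heap = [
    {"data": "a", "ptrs": [2]},  # slot 0: root, points at slot 2
    {"data": "b", "ptrs": []},   # slot 1: garbage
    {"data": "c", "ptrs": []},   # slot 2: live
]
new_heap, new_roots = compact(heap, [0])
# "b" is gone, "c" moved from slot 2 to slot 1, and the pointer
# inside "a" was rewritten from 2 to 1.
```

With an object table instead, a move would only change the table entry,
trading the fix-up pass for an extra indirection on every access.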

> Whatever layout we choose, we should insulate these memory management
> related issues inside a class as to be able to try out different
> implementations without having to rewrite it all.

That sounds a bit hard but is worth trying.

> Partial dump is correct. Maybe it would be easier to temporarily save
> the current memory image, chop off the parts that are not to be exported
> (by removing their references from the root name), to let GC remove
> everything that is now no longer referenced, save that image as the
> export file and restore the temp image.

That's certainly an interesting idea, but I would think persistence dump
is made for that sort of thing.  I guess that might be one
implementation of persistence for all I know.  It sounds a bit slower,
although it would depend on how many classes you're dumping.  A GC would
still have to traverse most of the object store.
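A minimal sketch of the trace-from-roots step behind that "chop off and
collect" idea (the structure and names here are invented for
illustration): whatever is unreachable from the chosen export roots is
exactly what the GC would reclaim, so it stays out of the export file:

```python
# Sketch of the partial-dump idea (structure invented for illustration):
# trace from the chosen export roots; anything unreached is what the GC
# would reclaim, so it never makes it into the export image.

class Node:
    def __init__(self, name, refs=()):
        self.name = name
        self.refs = list(refs)   # links to other objects

def export_set(roots):
    """Return the objects that would survive into the export image."""
    kept = {}
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if id(obj) not in kept:
            kept[id(obj)] = obj
            stack.extend(obj.refs)
    return set(kept.values())

lib = Node("used_library")
app = Node("application", refs=[lib])
extra = Node("superfluous_library")   # loaded but never referenced

exported = export_set([app])          # contains app and lib, not extra
```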
 
> This allows retaining parts of the development environment in the
> released end customer image (say, for dynamic expression evaluation by
> on-the-fly compilation). It also means that the saved image will only
> contain those objects that are actually needed, regardless of how many
> superfluous libraries had been included in the first place.

I'm not convinced supplying the development environment is the best way
to do this.  Couldn't supplying compilation facilities for run-time code
generation by the program itself achieve the same thing?
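As a sketch of what I mean (using Python's built-in compile()/eval()
purely for illustration, with invented names): a program can accept and
compile user expressions at run time without carrying a full IDE:

```python
# Illustrative sketch only: a program offering run-time expression
# compilation via Python's built-in compile()/eval(), instead of
# shipping the whole development environment.

def make_evaluator(expression, variables):
    """Compile an expression string once; return a callable over 'variables'."""
    code = compile(expression, "<user input>", "eval")
    def evaluate(**env):
        missing = set(variables) - set(env)
        if missing:
            raise ValueError(f"missing variables: {sorted(missing)}")
        # Empty __builtins__ keeps the expression from reaching arbitrary names.
        return eval(code, {"__builtins__": {}}, env)
    return evaluate

area = make_evaluator("width * height", ["width", "height"])
print(area(width=3, height=4))  # 12
```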

I can see the programmer can break in and generate things, which could
be more of a debugger environment.  Then again maybe what you're getting
at is that you want this debugger/environment to be compiled in if
desired.

When you said before that you want to pack the IDE together with the
program, were you speaking logically or physically?

> Here's an example what dynamic adaptation can be done to the editor:

OK, I understand the example.  What I think you get with this is,
firstly, the ability for the editor to show you the information you need
to enter.  So it is a bit like passing parameters by keyword, e.g.
put(text => "Hello", pos => 3, colour => red) in Ada's syntax.  And
secondly, you get to enter a string which is validated and translated. 
The validation could be done in the normal way with preconditions and a
compile-time warning: "this will fail a precondition at runtime".
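Both points can be sketched together in Python (the 'put' function and
its contract are invented for illustration): keyword arguments make the
required information explicit at the call site, and a precondition
validates the value up front:

```python
# Illustrative sketch: keyword arguments make the required information
# explicit at the call site, and a precondition validates the value.
# 'put' and its parameters are invented for this example.

def put(*, text, pos, colour):
    # Precondition: a contract-aware compiler could warn at compile time
    # when a call is statically known to violate this at runtime.
    assert pos >= 0, "precondition failed: pos must be >= 0"
    return f"{colour}@{pos}: {text}"

print(put(text="Hello", pos=3, colour="red"))  # red@3: Hello
```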

The point is taken, though.  You want the editor to be extendible in
certain places in that it can actually call the program to find out what
to do.  That's a lot like the extendibility spirit of allowing the
writing of views.  I'll have to wait and see what examples of program
integration people come up with before commenting further.

> BTW, even the outline of a function declaration can be done using this
> device:
> 
> Function: myFunction
>     Description:
>     Parameters
> ...
> 
> In this example "Description: " and "Parameters" would be properties
> of the class "FunctionDefinition" that the editor refers to when a
> function has to be defined.

I'm not sure what letting the program determine the valid set of
parameters would achieve; these belong in the specification, which is
totally separate from the implementation.

> In fact, the user can completely define the behaviour of the editor.

Under what you're saying it is possible for a class to have something to
say about how to show itself.  I think the defining view is a more
understandable way than a huge number of hooks though.

> If we implement incremental compilation at least on a
> one-method-at-a-time level, I don't think this could be a problem as
> long as we allow optimised and non-optimised methods to be mixed
> freely. It also would primarily affect the editor where top
> performance shouldn't be that important. And of course, it only
> depends on the user's preferences: Quick compile for immediate
> availability or slow compile for maximum performance.

Yeah, I was referring to inter-component optimisation, but it can quite
happily be left off for debuggable mode.

> Yes. It's primarily thought of as a security feature to keep the IDE
> intact. But it could also be used for other purposes:

I was thinking about a general sandbox facility in the language. 
You'll see that in my nesting hierarchy proposal from a while back.  I
think
it would be a useful facility for writing extendible software.
 
> And there are "wizards" as well. But because those operate on internal
> structures rather than produce source code that is subject to (hard to
> parse) changes by the user, there is a chance that they need not be
> limited to the "generate once" approach often found in IDEs.

Yes, this is what I was thinking about when you were referring to
resource editors and stuff.  They have the same problem in that they are
one level of abstraction up from what I was trying to do.  Extendible
parse trees, like I have been talking about, can help, in that they have
rules for generating the lower level code if needed.  An alternative for
complicated generation mechanisms like wizards is to store both the
shorthand and the full form.

> Both. But I wouldn't go as far as to advocate self-modifying programs
> in the sense that this possibility is used as a standard device; this
> should remain under the user's control.

Yeah.  If you'll notice, in most systems there is always a kernel which
doesn't itself change but which changes other things.  Modifying code is
useful providing it is understood and necessary.  The best example of
this is run-time compilation, in which you are really extending the
system.  The definition of modifying code is really blurry anyway.

I would be really interested in seeing examples of optimisations that
take non-self-modifying high-level code and transform it into
self-modifying machine code that is more efficient than normal code
could be.

-- 
     Matthew Tuck - Software Developer & All-Round Nice Guy
                              ***
       Check out the Ultra programming language project!
              http://www.box.net.au/~matty/ultra/