On memory management

Ursula Dreier Ursula.Dreier@ruhr-uni-bochum.de
Sat, 28 Nov 1998 23:22:14 +0100



Matthew Tuck wrote:

> Automatic memory management should be the norm in any language by now.
> What I'm not sure of is whether we should allow for a dynamic delete if
> the programmer really wants to.
>

I'm relieved to read this (yesterday I had a discussion with a C++
programmer who thought that GC would be a bad thing, inefficient and
non-deterministic... ;)

> General persistence also needs to be able to recognise pointers in an
> arbitrary memory block.
>

That's true.

> > Here's my suggestion for a memory layout:
>
> > You have storage allocated from the OS in big "chunks". It will be
> > divided into smaller blocks, each containing one object. Each object
> > has a length, a type (which is a pointer to another object), some
>
> By "type", do you mean pointer to metaclass object?
>

I mean a pointer to a class object: An object of class "class", containing
everything that is needed to access the class' features. If you call that a
metaclass object, that's what I mean. But why "metaclass"?

By the way, if the object in question is an instance of some parameterized
class, the actual class parameter is accessible through the "type" pointer
as well.

> > the rule) binary stuff, marked by some prefix and (if of unknown
> > length) accompanied by a length.
>
> Are these exceptions integers, characters, etc.?  Why is there the need
> for this distinction?

Since the GC (or persistence manager, or heap walker...) has to inspect the
whole object space, there has to be a means to identify links between
objects. Everything else can be regarded as "binary stuff" if there is no
need to examine it in more detail (which involves interpreting the
description of the class). Most often an object just contains links OR
binary stuff, but I think it might be a good idea to remain as flexible as
possible, while having to provide some general framework for objects.

Let me sketch a possible memory layout as an example:

struct Object {
    Object* ClassPointer;                     // the "type" (see above)
    long TotalLength;                         // total length of the object
    unsigned short UnusedStoreAtEndOfObject;
    unsigned short Flags;                     // bit flags for GC, reference count,
                                              // distinction between node & atom
    // followed by the actual contents
};
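
As a rough illustration of why the node/atom distinction matters to a GC or heap walker, here is a minimal sketch. The names HeapObject, kIsNode and outgoingLinks are made up for this example, and the contents are modelled with std::vector rather than the raw in-memory layout above, so treat it as an assumption-laden sketch and not as the proposed implementation.

```cpp
#include <cassert>
#include <vector>

// Assumed flag bit: set for "node" objects (contain links), clear for "atoms".
constexpr unsigned short kIsNode = 0x1;

// Simplified stand-in for the header plus contents of one object.
struct HeapObject {
    HeapObject* classPointer;            // the "type" pointer
    unsigned short flags;                // GC bits, node/atom distinction
    std::vector<HeapObject*> links;      // meaningful only if kIsNode is set
    std::vector<unsigned char> bytes;    // opaque "binary stuff"
};

// Return the pointers a walker has to follow from one object.
// Binary stuff is never inspected; only the class pointer and,
// for nodes, the link slots are reported.
std::vector<HeapObject*> outgoingLinks(const HeapObject& o) {
    std::vector<HeapObject*> out;
    if (o.classPointer != nullptr) {
        out.push_back(o.classPointer);
    }
    if (o.flags & kIsNode) {
        out.insert(out.end(), o.links.begin(), o.links.end());
    }
    return out;
}
```

The walker never needs to interpret the class description unless something wants to look inside the binary part.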

Or we might consider another layout where there are no direct pointers but
rather offsets into an object table that contains the memory addresses
(allowing objects to be moved without having to fix up references, but at a
performance penalty).
Whatever layout we choose, we should encapsulate these memory-management
issues inside a class, so that we can try out different
implementations without having to rewrite everything.
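
The object-table variant could look roughly like the following sketch. Handle, ObjectHeader and ObjectTable are hypothetical names chosen for illustration; the point is only that references are table indices, so moving an object means updating one slot.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A reference is an index into the object table, not a memory address.
using Handle = std::size_t;

struct ObjectHeader {
    Handle classHandle;        // "type": handle of the class object
    std::size_t totalLength;   // size of the block in bytes
};

// Hides the indirection so callers never depend on the layout chosen.
class ObjectTable {
public:
    // Register an object and return the handle other objects will store.
    Handle add(ObjectHeader* address) {
        slots_.push_back(address);
        return slots_.size() - 1;
    }

    // Every access pays for one extra lookup: the performance penalty.
    ObjectHeader* resolve(Handle h) const {
        assert(h < slots_.size());
        return slots_[h];
    }

    // Moving an object (say, during compaction) touches a single slot;
    // handles held elsewhere remain valid without any fix-up.
    void relocate(Handle h, ObjectHeader* newAddress) {
        assert(h < slots_.size());
        slots_[h] = newAddress;
    }

private:
    std::vector<ObjectHeader*> slots_;
};
```

Because all access goes through resolve(), a direct-pointer implementation could later replace this class without the rest of the system noticing.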


>
>
> > When you produce a version for shipment to the customer, the objects
> > containing the source code and the development environment will be
> > discarded as far as they are not referenced by the compiled code
> > (their references being chopped off by deleting the names of their name
> > spaces and the memory being recycled by GC).
>
> This sounds like you're doing a partial persistent dump.  I'm not
> exactly sure what you mean here though.

Partial dump is correct. Maybe it would be easier to temporarily save the
current memory image, chop off the parts that are not to be exported (by
removing their references from the root name space), let GC remove everything
that is no longer referenced, save the result as the export file, and then
restore the temporary image.
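
A minimal sketch of the "chop off and let GC decide" step is given below. Node, NameSpace, reachable and exportSet are illustrative names only. Note one deliberate simplification: instead of saving and restoring a whole image, the sketch prunes a copy of the root name space and computes what a collector would keep.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

struct Node {
    std::vector<Node*> links;   // outgoing references to other objects
};

// Root name space: the only entry points into the object graph.
using NameSpace = std::map<std::string, Node*>;

// Mark everything reachable from the roots (iterative depth-first walk).
std::set<const Node*> reachable(const NameSpace& roots) {
    std::set<const Node*> marked;
    std::vector<const Node*> stack;
    for (const auto& entry : roots) {
        if (entry.second != nullptr) stack.push_back(entry.second);
    }
    while (!stack.empty()) {
        const Node* n = stack.back();
        stack.pop_back();
        if (!marked.insert(n).second) continue;   // already visited
        for (const Node* child : n->links) {
            if (child != nullptr) stack.push_back(child);
        }
    }
    return marked;
}

// 'roots' is taken by value, so the live name space stays untouched.
// Names listed in 'dropped' are removed before the reachability pass.
std::set<const Node*> exportSet(NameSpace roots,
                                const std::vector<std::string>& dropped) {
    for (const auto& name : dropped) roots.erase(name);
    return reachable(roots);
}
```

Objects shared between the application and the dropped parts survive automatically, because they are still reachable from a remaining root.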

This allows retaining parts of the development environment in the released
end customer image (say, for dynamic expression evaluation by on-the-fly
compilation). It also means that the saved image will only contain those
objects that are actually needed, regardless of how many superfluous
libraries had been included in the first place.

> > This means that you have a truly INTEGRATED environment where your
> > application and your development tools can run together in the same
> > process at the same time. It means that you would even be able to view
> > and modify the development environment while it is running, if you
> > choose. Of course, normally your application project, the runtime and
> > the development environment would be separated by means of different name
> > space entries as well as access restrictions to avoid accidental
> > damage.
>
> I guess I've missed the plot of this.  I know Smalltalk does something
> with keeping the program and development environment together.  The main
> advantage of the run program with development environment I can see is
> for debugger and profiler modes.  And you can change code at run-time,
> but I'd call this a debugger feature.  Is that basically what it
> allows?


Here's an example of the kind of dynamic adaptation that can be done to the editor:

Say our classes (or at least some subclass of class) allow for a special
kind of instance items, called PROPERTIES. They would behave like ordinary
instance items, except that they can be individually preset for
each object that is created by a declaration (rather than by using the
"new" method), using some built-in (invisible) constructor that accepts the
initialisation list generated by the editor. Whenever you declare such an
object, the editor would automatically insert corresponding subitems for
that declaration into the edit outline. Let's have a class "aPoint" with two
properties called X and Y. You declare an item of class aPoint like this:

aPoint: Origin

When you hit return and thus remove focus from that tree item, subitems will
be generated automatically:

aPoint: Origin
    X:
    Y:

and you can enter values for them. Maybe this example sounds a little bit
remote, but I assure you, it's exactly what is needed if you declare
graphical items.

BTW, even the outline of a function declaration can be done using this
device:

Function: myFunction
    Description:
    Parameters
...

In this example "Description: " and "Parameters" would be properties of the
class "FunctionDefinition" that the editor refers to when a function has to
be defined.

To be really useful, the definition of a property should also contain the
following:
- the (non editable) heading text of the property
- methods for displaying, entering and validating  the property's value.

This lets the user define how he wants to customise the objects he
defines, and that customisation is available immediately after the class has
been compiled.

In fact, the user can completely define the behaviour of the editor.
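
To make the idea a bit more concrete, here is a small sketch of how a class might describe its properties to the editor. Property, ClassInfo, outlineFor and isInteger are invented for this example, and only the heading and the validation method are modelled; display and entry methods would be further members of the same kind.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// One editor-visible property of a class.
struct Property {
    std::string heading;                                // non-editable heading text
    std::function<bool(const std::string&)> validate;   // accepts or rejects input
};

// The part of a class object the editor consults.
struct ClassInfo {
    std::string name;
    std::vector<Property> properties;
};

// Outline lines the editor would generate when an object is declared:
// the declaration itself, followed by one indented subitem per property.
std::vector<std::string> outlineFor(const ClassInfo& cls,
                                    const std::string& objectName) {
    std::vector<std::string> lines;
    lines.push_back(cls.name + ": " + objectName);
    for (const Property& p : cls.properties) {
        lines.push_back("    " + p.heading + ":");
    }
    return lines;
}

// Example validator: optional leading minus sign followed by digits only.
bool isInteger(const std::string& text) {
    std::size_t i = (!text.empty() && text[0] == '-') ? 1 : 0;
    if (i == text.size()) return false;   // empty, or just "-"
    for (; i < text.size(); ++i) {
        if (text[i] < '0' || text[i] > '9') return false;
    }
    return true;
}
```

With a ClassInfo for "aPoint" holding properties X and Y, outlineFor(aPoint, "Origin") yields the three outline lines shown earlier, and the editor can call validate on whatever the user types into each subitem.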

> I'm not too sure how this would work with heavily optimised
> code, but I guess you could turn off the feature or the optimisation.
>

If we implement incremental compilation at least on a one-method-at-a-time
level, I don't think this would be a problem as long as we allow optimised
and non-optimised methods to be mixed freely. It would also primarily affect
the editor, where top performance shouldn't be that important. And of course,
it depends only on the user's preferences: quick compile for immediate
availability or slow compile for maximum performance.



> I assume the "access restrictions" you're referring to are a form of
> sandboxing - I'd use a similar box for the editor views as I mentioned
> before.

Yes. It's primarily intended as a security feature to keep the IDE
intact. But it could also be used for other purposes:
Say we only allow access to those parts of a name space that are flagged as
"public" (in the sense of C++). This would be a way to let the user see just
the interface, while still being able to compile and use the implementation
(which is present but invisible).
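
A sketch of such a restricted name space lookup follows. Visibility, Entry and the two lookup functions are hypothetical names, and objectId is merely a stand-in for a real object reference.

```cpp
#include <cassert>
#include <map>
#include <string>

enum class Visibility { Public, Private };

struct Entry {
    Visibility visibility;
    int objectId;   // stand-in for a reference to the actual object
};

class NameSpace {
public:
    void define(const std::string& name, Visibility v, int objectId) {
        entries_[name] = Entry{v, objectId};
    }

    // View offered to the user: private entries behave as if absent.
    const Entry* lookupFromOutside(const std::string& name) const {
        const Entry* e = lookupInternal(name);
        if (e == nullptr || e->visibility != Visibility::Public) {
            return nullptr;
        }
        return e;
    }

    // View used by the compiler and runtime: everything is reachable,
    // so the hidden implementation can still be compiled against.
    const Entry* lookupInternal(const std::string& name) const {
        auto it = entries_.find(name);
        return it == entries_.end() ? nullptr : &it->second;
    }

private:
    std::map<std::string, Entry> entries_;
};
```

The implementation objects stay in the image; only the path by which user code could name them is closed.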

And there are "wizards" as well. But because those operate on internal
structures rather than produce source code that is subject to (hard to parse)
changes by the user, there is a chance that they need not be limited to the
"generate once" approach often found in IDEs.

> > It's all very similar to LISP, maybe except for the fact that a node
> > object in LISP usually consists of only two pointers (car & cdr)
> > whereas node objects here are intended to contain any number of
> > pointers, their semantics depending on the object type.
>
> By resemblance are you referring to the memory model or the ability to
> change itself?
>

Both. But I wouldn't go as far as to advocate self-modifying programs in the
sense that this possibility is used as a standard device; this should remain
under the user's control.