[gclist] Reference counting on the fingers of one thumb.

Amit J Patel amitp@Theory.Stanford.EDU
Fri, 31 Jan 1997 15:01:54 -0800


[To the GCList folk:  Not all of this is about GC, but some stuff in
 the middle is, so I thought I'd leave it here -- let me know if you'd
 like to see this thread move to e-mail.  -- Amit]

John Carter wrote to me on January 30:
> My first response was to sputter and fume, my second response is to
> say yes, you are quite right. I should make := mean handover, and have
> "deepcopy" as a function call or something. I knew that asking the net
> for comment would save me from many fundamental blunders.

To make ':=' handover would be quite confusing to people who are used
to other languages; I hope you'll use a different syntax.  :)

> > I view names (variables) as references to objects, not the objects
> > themselves.  
> 
> Yes, that is the normal view of them. I wish to move to a view of "If
> I have its name, I have it in my hand. If I have it in my hand, I can
> do something with it." In the spatial world, we can reach out and pick
> up an object and do something with it. In the computer world, the
> variable names are the "hands" in which we hold the objects we operate
> on.

You say:

	If I have it in my hand, I can do something with it.

I agree with that, but not inverse (converse?):

	If I do not have it in my hand, I can not do something with it.

I want to talk to you without having to know where you are.  I don't
want to have you in my hand (!) before I can talk to you.  You said
this can be done by keeping a 'container' and 'index' pair, but:

	1.  You may move.  Do you notify everyone that knows you every
	    time you move from room to room?  How do they update their
	    information?

	2.  How do I refer to the container?  Don't I have to be
	    HOLDING it to get to it?  (If not, you will have to
	    perform garbage collection on everyone referring to
	    containers!)

	3.  Even if you wanted to notify everyone that you've moved,
	    how could you?  You can't tell them anything because you
	    don't have them in your hand.

	4.  How do you handle circular structures?

> The normal situation, and the reason for much of the need for GC, is
> that we can have arbitarily many "hands" holding the same object.

A "real-world" need for garbage collection at my house is:

	The newspaper arrives.  It is separated into sections A
	through F.  I want to read section E; my wife wants to
	read section B.

	Sections A, C, D, and F are not being 'held' by either
	one of us?  Are they garbage collected?

	* If so, then I am unhappy because I cannot later read
	  section C.

	* If not, then how do they ever get garbage collected?
	  Their 'container' is the dining table.  I do NOT want
	  to throw away the dining table in order to get rid
	  of the newspaper.

	There is a need for GC here -- I need to throw away a
	section of the newspaper only after nobody "wants" it,
	not after nobody "holds" it.  In a normal programming
	language with garbage collection, I would express an 
	interest in a section by keeping a variable that refers
	to that section.  Until I no longer want to read any
	of those sections, the sections cannot be garbage 
	collected.


>From this I conclude that there is something similar to garbage
collection in the real world, so if I was trying to make a programming
language that worked like the real world, I would have to have garbage
collection in my programming language.



[Disclaimer:  I do not own a house, I am not married, I do not 
              receive the newspaper, and our newspaper has more
	      sections than I listed.  All references in the above
              story are purely fictional and any resemblance to
              persons or places, actual or fictional, is purely
              accidental.]


> > I would like to page you without knowing which container you are in.
> > I want an alias for you that doesn't involve knowing where you are.
> 
> There is nothing to stop you putting everyone in the container called
> "global_heap", and working with integer indices thereafter. This is
> exactly back to ye olde C pointers. The point is the default behaviour
> in C is unsafe, but you can make is safe. I propose to make the
> default behaviour safe, but you can explicitly make it unsafe.

I don't want to use C pointers, and I want to page you without knowing
which container you are in.  In your language proposal, in order to do
what I want (which I consider safe), I have to do unsafe things.  IMO,
I should never have to do something unsafe in the language to do
something that's conceptually safe.

> > Another example is a program accessing some file on the disk.  If I
> > move the file into a different folder, I want that program to still
> > access the file; it shouldn't have to know that the file has 'moved'.
> > (Most OSes don't give you this, unfortunately.)
> In Linux "mv old.name new/place;ln -s new/place old.name". But I
> really wouldn't like that to be the default behaviour of move.

This isn't a great solution; it is going to leave behind thousands of
links in the old locations that point to the new locations.  What I
really want is all references to the old file to be changed to refer
to the new file.  

If I have a folder called 'Bills to be Paid' and a program that keeps
a list of all my bills, then when I move a bill (file) from 'Bills to
be Paid' to 'Bills Paid', the program's data structure should be
updated (somehow) so that it doesn't lose track of this bill.  I'm not
sure if that makes sense unless you've seen an OS that does it.  :)

> > What fun thing would you like to do with this integer?  :)  I can't
> > think of anything really useful to do with it.  If I have a reference
> > to you (your room + your integer), does it help to multiply your
> > integer index by four?  
> 
> Yes, if you know that every fourth person in the room, (as numbered),
> is an elligible person of your sexual orientation.

And if someone leaves the room, and someone new comes in, what happens
to the numbering?  Is it still valid?

> In C you can do fun things like trot up and down arrays with pointers
> in arbitary steps. Pascal pointers are boring. You can only follow
> them. 

So .. why can't you do this 'every fourth person' thing in Pascal, by
using arrays and indices?  I don't understand what the C pointers give
you.

> If you have two structures "in sync" with each other, you can transfer
> the integer offset of the one pointer to access the corresponding
> object in the other structure. The notion of correspondence is quite
> universal. 

If you just kept the integer index in the first place, you'd be able
to keep the correspondence.

> Admittedly these practices are unsafe, that is why I have doubts
> about it.

> Yes. It is good at creating complex webs. But complex webs are unsafe,
> and can be represented in other ways. Look to the database
> world. Databases once were hairy things that created complex webs of
> pointers. Now no one will even admit that their database isn't
> relational. (Even if it isn't). The relational model has won
> overwhelmingly. Think of your example above in terms of building a 4th
> normal form database. A notable feature of a 4th normal form database
> vs the old all-in-one-with-pointers-everywhere is the proliferation of
> smaller tables (relations), with entries that are indices of some form
> into another small table.

Are you trying to build a database programming language?  :)  I must
admit that I am not too familiar with relational databases, but it
seems that they do not deal with the object identity problem well.
Two objects may have the same data in them; how do you tell them
apart?  For example, it's possible (but unlikely) that there are two
people named 'JOHN CARTER' in the world, with the same height, hair
color, weight, etc.  In the relational model, these two objects would
be indistinguishable.  However, in the 'real world', there is a
difference between TWO John Carters and ONE John Carter.  Databases
sometimes have an artificial solution called the "ID" but that too is
not satisfying -- what does it represent in the real world?  I need it
only to get around this problem with relational databases -- because
they deal with "values" and the essense of John Carter is not merely a
set of numbers (values).



	- Amit


-- 
Amit J Patel, Computer Science Department, Stanford University 
       ``A language based on point and click is like a 
         human language based on point and grunt.'' -- me