Renaming newEmpty/newSize: to new &capacity:

Mon Jan 31 20:20:46 PST 2005

On Jan 31, 2005, at 8:03 PM, Shaping wrote:

>> Capacity refers to a "hint" to the
>> collection that it needs space for so-and-so many elements. The 
>> collection can ignore it at will,
>
> Why would it do so?  I would think that the collection would remember 
> it, and refer to it before growing.

The question is why /wouldn't/ it, first. Slate's memory allocation is 
based on fixed-size Arrays, so a certain size of memory has to be 
allocated by every Set, ExtensibleArray, Dictionary (two arrays), Bag 
(again, two), etc., and then /totally replaced/ if the collection needs 
to grow. So if the initial size is too small, you perform two or 
possibly more allocations before adding in all your elements. That's 
why there's a #capacity concept in the first place. (And to pre-empt 
your question, we don't allow "stretchy" arrays because this 
complicates the memory manager considerably, and in fact pretty much 
just hides the very same allocation.)
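
To make the allocation cost concrete, here is a rough sketch in Python 
(not Slate, and not the actual Slate implementation). The class name, 
the doubling growth policy, the default capacity of 4, and the 
`allocations` counter are all assumptions made up for illustration.

```python
class ExtensibleArray:
    """Growable sequence backed by a fixed-size array (hypothetical sketch)."""

    def __init__(self, capacity=4):
        # The capacity hint sizes the initial fixed-size backing store.
        self._slots = [None] * capacity
        self._size = 0
        self.allocations = 1  # counts backing-store allocations so far

    @property
    def size(self):
        """Number of elements actually contained."""
        return self._size

    @property
    def capacity(self):
        """Number of elements that fit without growing."""
        return len(self._slots)

    def add(self, element):
        if self._size == len(self._slots):
            # Backing store is full: it cannot stretch, so allocate a
            # bigger one and copy everything over (assumed doubling policy).
            bigger = [None] * max(1, 2 * len(self._slots))
            bigger[:self._size] = self._slots
            self._slots = bigger
            self.allocations += 1
        self._slots[self._size] = element
        self._size += 1


def fill(collection, n):
    """Add the integers 0..n-1 to the collection and return it."""
    for i in range(n):
        collection.add(i)
    return collection


default = fill(ExtensibleArray(), 100)
hinted = fill(ExtensibleArray(capacity=100), 100)
print(default.allocations, default.capacity)
print(hinted.allocations, hinted.capacity)
```

Under these assumptions the unhinted collection replaces its backing 
store several times on the way to 100 elements, while the hinted one 
allocates exactly once. Size and capacity stay distinct throughout.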

Now, as for ignoring #capacity, trees allocate one node at a time, so 
space allocation is incremental and proportional to the elements 
added. You /could/ "pre-allocate" tree nodes, but this /also/ requires 
building a cache of such nodes into the tree object and then clearing 
them when they are "recycled". That has potentially nasty problems 
with memory management, and is not recommended unless specific 
profiling shows it to be necessary.
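
A small Python sketch of a tree-backed collection that accepts the 
hint and ignores it. This is an illustration only: `TreeSet`, the 
unbalanced binary search tree, the `nodes_allocated` counter, and the 
use of `math.inf` as a stand-in for an unbounded capacity are all 
assumptions, not Slate code.

```python
import math


class _Node:
    __slots__ = ("key", "left", "right")

    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None


class TreeSet:
    """Unbalanced binary search tree of unique keys (hypothetical sketch)."""

    def __init__(self, capacity=None):
        # The hint is accepted so callers can use the same protocol as
        # array-backed collections, but it is deliberately ignored.
        self._root = None
        self._size = 0
        self.nodes_allocated = 0

    @property
    def size(self):
        return self._size

    @property
    def capacity(self):
        # A tree never has to "grow" a backing store, so report unbounded.
        return math.inf

    def _new_node(self, key):
        # Exactly one node is allocated per element actually stored.
        self.nodes_allocated += 1
        self._size += 1
        return _Node(key)

    def add(self, key):
        if self._root is None:
            self._root = self._new_node(key)
            return
        node = self._root
        while True:
            if key == node.key:
                return  # already present, nothing to allocate
            side = "left" if key < node.key else "right"
            child = getattr(node, side)
            if child is None:
                setattr(node, side, self._new_node(key))
                return
            node = child


tree = TreeSet(capacity=1000)  # the hint changes nothing
for key in (5, 3, 8, 3):
    tree.add(key)
print(tree.size, tree.nodes_allocated, tree.capacity)
```

In this sketch the number of nodes allocated always equals the size, 
whatever hint was passed in.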

>> which is what makes it a perfect
>> candidate for optional parameters, and capacity is already a distinct 
>> variable from size. A collection's size is the number of elements it 
>> contains. A collection's capacity is the number it /can/ contain 
>> without growing (for trees, it is LargeUnbounded, for obvious 
>> reasons).
>
> Then capacity and size are distinct ideas.  The one does not replace 
> the other, as I thought you were suggesting.  I did not understand 
> your original statement below.  You seem to be developing two distinct 
> protocols.  If that is so, then the choice is good.

Right. Okay, cool, it's clear to you then. (This is also commented in 
the collections code.)

>> We have a unit library (src/lib/dimensioned.slate) which will not be 
>> used for something this basic. "Units" are just indexable slots in 
>> this case. #sizeInBytes just shows that you are thinking at totally 
>> the wrong abstraction level.
>
> The example was given to demonstrate a more specific mode of counting, 
> the form of measurement.
>
> Is your capacity protocol only quantic (Integer-based)?  Will it be 
> extended to continuous ideas (namely Floats)?  Slots are always 
> counted, of course.  But the idea may be more flexible, without 
> creating comprehension problems.

Well, first of all, Slate is dynamically-typed by default, so whatever 
object is passed in is accepted until an error is thrown or a message 
sent to it is not found. But if you try allocating arrays of size 5 / 2 
or 3.4, right now you will get an uncontrollable cascade of errors, 
which will immediately crash your image. Squeak, I note, raises 
primitiveFailed on the same request. There's just no logical response 
for it (although the image crashing is a bug we need to fix, basically 
by ensuring the argument isSmallInt).
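
The kind of up-front check meant here might look like the following 
Python sketch. It is not the Slate fix: `checked_capacity` and 
`new_array` are hypothetical names, and Python's unbounded `int` stands 
in for a small-integer test.

```python
from fractions import Fraction


def checked_capacity(value):
    """Return value if it is a usable array size, else raise immediately."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError(f"capacity must be an integer, got {value!r}")
    if value < 0:
        raise ValueError(f"capacity must be non-negative, got {value!r}")
    return value


def new_array(capacity):
    """Allocate a fixed-size array of empty slots after validating the size."""
    return [None] * checked_capacity(capacity)


print(len(new_array(3)))
for bad in (Fraction(5, 2), 3.4):
    try:
        new_array(bad)
    except TypeError as error:
        # One clear error at the call site instead of a cascade later on.
        print("rejected:", error)
```

The check costs one type test per allocation, which is the overhead 
side of the trade-off.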

Give me a use-case that matters, and I'll find a way to make the 
protocol more robust, but remember that it'd add some overhead, so 
there's a cost/benefit balance here.

--
Brian T. Rice
LOGOS Research and Development
http://tunes.org/~water/