release and thoughts
Mon, 7 Aug 1995 20:52:16 -0400 (EDT)

  I'll get to what you wrote in a bit, but first, I do want to discuss
the multitasking model you have in mind.  I know that you want cooperative
and think that preemptive is either evil or lame (or both, I've forgotten),
but I'm here to show you that preemptive is not quite as lame or evil as you

  One reason you state that preemptive is bad is because of the high
overhead of saving state information, and that with cooperative, there isn't
nearly as much overhead.  This is true.

  But the other side of the coin is that in a cooperative system, a single
task can hog the entire CPU without relinquishing control.  Your way around
this is to forbid the use of Assembly (which I think is a bad idea, but I'll
get to that in a bit) and use a compiler, which will "insert" yield calls

  The downside of this is you've just thrown speed out the window.  

  "How?" you may ask.  "The yield() call does nothing until the quantum
(timeslice) is up, then it actually does something".  So, let's say that
saving the state will take X cycles for preemptive, Y cycles for cooperative,
where X > Y.  And further, let's say that the call to yield(), until the
quantum is up, is equivalent to the following (I'll be using the 386 as an

		call		yield
yield:		retd

Given the following timing for the 386 in protected mode:

		CALL		7+m	(m = extra components)
		RETD		10+m 

  Since there is an extra component in the CALL (offset) that's a total of
18 cycles overhead for each call to yield().  Preemptive is a bit more

		INT		99	(via Interrupt Gate)
		IRETD		82

  So the overhead of the CPU responding to the timer interrupt is 181
cycles.  Now, the total overhead for each can be expressed as:

Cooperative:		18 * yc + Y
Preemptive:		181 + X

X:			Overhead of preemptive context save + timer code*
Y:			Overhead of cooperative context save + timer code*
yc:			Number of times yield function is called per quantum

* Assume the timer code is the same for both.

  The only way cooperative is going to be faster than preemptive is when

                            18 * yc + Y < 181 + X

  Now, discounting the overhead of context saving (the X and Y portions) for
a second, and just talking about the overhead of the CPU, we get:

                                18 * yc < 181

  From there, I hope you can see that if the yield() function is called more
than 10 times per quantum, then cooperative overhead overtakes preemptive.
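
  To put numbers on that inequality, here is a minimal sketch in Python.
It is only an illustration: the function names are made up for this
example, and it uses nothing but the cycle counts quoted above (18 per
yield() call, 181 per timer interrupt), ignoring X and Y entirely.

```python
# Cycle counts taken from the 386 timings quoted above.
YIELD_CALL_CYCLES = 8 + 10   # CALL (7+m, m = 1) plus RETD, as counted above
TIMER_INT_CYCLES = 99 + 82   # INT via interrupt gate plus IRETD


def cooperative_cpu_overhead(yield_calls):
    """CPU overhead per quantum for cooperative, ignoring the context save (Y)."""
    return YIELD_CALL_CYCLES * yield_calls


def preemptive_cpu_overhead():
    """CPU overhead per quantum for preemptive, ignoring the context save (X)."""
    return TIMER_INT_CYCLES


def max_yields_before_losing():
    """Largest yc for which 18 * yc < 181 still holds."""
    return (TIMER_INT_CYCLES - 1) // YIELD_CALL_CYCLES


if __name__ == "__main__":
    # One row per yc: yc, cooperative cycles, preemptive cycles.
    for yc in (1, 5, 10, 11, 100):
        print(yc, cooperative_cpu_overhead(yc), preemptive_cpu_overhead())
```

  With these figures, cooperative stays ahead up to 10 calls per quantum
(180 cycles against 181) and falls behind at the 11th (198 cycles).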

  Or, if you want it theoretical, preemptive is O(1) and cooperative is
O(n) in the number of yield() calls.  The more calls you make to yield() per
quantum, the better preemptive is, unless X is vastly larger than Y.
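
  The effect of X and Y on the break-even point can be sketched the same
way.  This is a rough model, not a measurement: the function names are
invented for the example, and the X and Y values used in the usage note are
hypothetical, chosen only to show how the threshold moves.

```python
def cooperative_total(yc, y_save):
    """Total overhead per quantum for cooperative: 18 * yc + Y."""
    return 18 * yc + y_save


def preemptive_total(x_save):
    """Total overhead per quantum for preemptive: 181 + X."""
    return 181 + x_save


def break_even_yields(x_save, y_save):
    """Smallest yc at which cooperative is no longer cheaper than preemptive.

    Cooperative is cheaper only while 18 * yc + Y < 181 + X, so the
    threshold is the ceiling of (181 + X - Y) / 18, floored at zero.
    """
    gap = 181 + x_save - y_save
    return max(0, -(-gap // 18))   # integer ceiling division
```

  With X equal to Y the threshold is 11 calls per quantum.  With a
hypothetical X of 500 and Y of 100 it only moves out to 33, so a large gap
in context-save cost buys cooperative a few dozen yield() calls, not an
unlimited number.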

  But then, in another message, you go:

>    Because the function is not actually called !
> Instead, code is modified at
> some point by the interrupt-driven routine that decides of a schedule...

  This is better than preemptive?  It sounds preemptive to me, and while it IS
possible, it sounds like more work than just having the CPU save the state
and move on to another task.  Because now you have to save the old code that
is overwritten with the call to yield, which then probably writes the
old code back.  
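
  To make the extra bookkeeping concrete, here is a toy model in Python of
the patch-and-restore cycle described above.  It is a sketch under heavy
assumptions: the class and method names are hypothetical, the "code" is
just a byte buffer, and the patch is a 5-byte near CALL (opcode E8) with a
zeroed placeholder displacement.  It is not how any real scheduler is
known to work.

```python
# 5-byte near CALL: opcode E8 plus a zeroed placeholder displacement.
CALL_YIELD = bytes([0xE8, 0x00, 0x00, 0x00, 0x00])


class PatchedTask:
    """Toy task whose code gets patched by the scheduler (hypothetical)."""

    def __init__(self, code):
        self.code = bytearray(code)
        self.saved = None        # (offset, original bytes) while a patch is live
        self.bytes_copied = 0    # running count of bytes moved around

    def arm(self, offset):
        """Timer side: save the old code, then overwrite it with the call."""
        end = offset + len(CALL_YIELD)
        if end > len(self.code):
            raise ValueError("patch does not fit in the code buffer")
        self.saved = (offset, bytes(self.code[offset:end]))
        self.code[offset:end] = CALL_YIELD
        self.bytes_copied += 2 * len(CALL_YIELD)   # one save + one overwrite

    def on_yield(self):
        """yield() side: write the old code back before switching tasks."""
        offset, original = self.saved
        self.code[offset:offset + len(original)] = original
        self.saved = None
        self.bytes_copied += len(original)
```

  Even in this toy version, each switch costs three copies of the patched
region on top of whatever state the scheduler still has to save.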

  What, exactly, do you want?

On some network somewhere in cyberspace, Francois-Rene Rideau transmitted:
>    I have received no help at all, but in a few available source files;
> books about the ix86 and the PCs are crap; docs on the internet are better,
> but do not tackle the specific problems of the kind of software we're writing.

  As far as the Intel stuff goes, you'd probably be better off getting
reference materials directly from Intel.  As far as the PC goes, get several
different references.  They all help.

>    As a result, the code yield is poor for so much work-time. But my time was
> not lost for that: by munging with assembly, I have learnt (and crystallized)
> a lot, not only about the lame PC hardware, but about programming in general.
>    Firstly, doing low-level code is very difficult and annoying, because the
> hardware is complex and not well documented, that is, it has evil semantics.

  Please explain this.  What is evil about hardware semantics?  Hardware,
unfortunately, has to deal with the real world.  Which means that the
software that deals with the hardware has to deal with the real world. 
Granted, some of it isn't easy, but to call it evil is a bit harsh I think.

> The only reasonable way to develop is at abstract over the hardware to
> obtain objects with better, *cleaner semantics* (abstracting without cleaning
> the semantics, as in C, is lame); else you get too many bugs. 

  Could you explain this as well?  I'm lost now.

> When things run
> correctly, it is time to make them quicker, but not before. So I wrote
> *lots* of macros, that made development possible. I also saw how no good
> macro language exists. CPP is pure shit; I chose m4, but it has
> evil reflectivity semantics (I had to systematically remove reflectivity
> from my code, but for some trivial macros). 

  I'm sorry, but I still don't quite understand reflectivity.  Could you
please post some example of what you were trying to do but couldn't?  Maybe
then I can understand it.

> Anyway, with as86 (a simple, but
> robust assembler), and m4 (a powerful macro system), I could do much more
> than anyone can do with lame DOS-based macro-assemblers, or ad-hoc C code.

  Maybe it's just me, but I found that MASM is one of the better macro
processors around (that just happens to be inside an assembler).  m4 may be
powerful, but I've had to fight it enough times that I don't really like it,
and I feel that MASM is much better than m4 (no, really).

  And not all C code is ad-hoc.  Avoid the Unix-based stuff then.

>    The second point is that most of the hassle (and a great source of nasty
> stupid bugs) with low-level code is what I call "calling conventions",
> which is commonly solved in HLLs by "typing": 1) you have to give explicitly
> implementation details of functions, and 2) even when you've successfully
> defined your function, you must manually conform to these details when
> using the function further in your code. Note that calling conventions
> are not just immediate arguments, but all contextual dependencies and
> all the background required for a function to work properly (e.g. such
> pointers are of such "type", are not aliased; such structure is uniquely
> referenced, such list is injective, those two functions are inverse one of
> the other, etc). Actually, this is the more general problem of language
> *expressivity*: no existing low-level languages can't abstract over calling
> conventions; conversely no high-level language can't talk about low-level
> details; and all existing general purpose languages have a deterministic type
> system (when not a trivial one), and thus very little expressive.

  Are you SURE you don't want DWIM?  (Do What I Mean)  No matter what
level you are at, there IS a calling convention.  Now, compilers can help
with the "calling convention", but there still is one.

>    The work-around for using languages that lack abstraction power,
> is that calling conventions should always be particularly well documented,
> and great caution is taken when invoking a function; for that reason also,
> people often choose arbitrary global conventions that reduce performance
> greatly to reduce the amount of bugs. 

  Really?  I tend to avoid global variables as much as possible myself.

> All that is stupid. A language just
> *should* have enough abstraction power so that one could define arbitrary
> conventions or meta-conventions, and let the computer do all the dirty
> work (programmable calling conventions).

  I would disagree.  Maintaining code is bad enough with static calling
conventions.  Letting programmers working on different aspects of a program
pick their own calling convention is suicide.  One might pick left to
right evaluation, another right to left and a third prefix.  In the same

  Thanks, but no thanks.
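
  A small sketch of the kind of breakage meant here, in Python.  It is an
illustration only: the stack is a plain list, the two conventions are
reduced to argument push order, and the function names are made up.

```python
def push_args(args, order):
    """Caller side: return the stack contents, first-pushed item first."""
    if order == "left-to-right":
        return list(args)
    if order == "right-to-left":
        return list(reversed(args))
    raise ValueError("unknown convention: " + order)


def call_subtract(stack, order):
    """Callee side: compute a - b, assuming the caller pushed in `order`."""
    if order == "left-to-right":
        a, b = stack[0], stack[1]
    elif order == "right-to-left":
        b, a = stack[0], stack[1]
    else:
        raise ValueError("unknown convention: " + order)
    return a - b


if __name__ == "__main__":
    # Caller and callee agree on the convention.
    print(call_subtract(push_args((7, 2), "left-to-right"), "left-to-right"))
    # Caller and callee were written by people who picked differently.
    print(call_subtract(push_args((7, 2), "right-to-left"), "left-to-right"))
```

  When both sides agree, 7 - 2 comes out as 5.  When they disagree, the
same call quietly computes 2 - 7, and nothing in the code flags it.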

>    I will be satisfied of the HLL compiler only the day when I can achieve
> better and more reliable code than I can currently do with hand-coded
> assembly, and in much less time, while having a high-level prototype
> immediately; 

  Have you considered using CASE?

> and this can be done easily by programmably combining
> higher-order code transformations on high-level code, and generic
> meta-implementation operator that map high-level objects to low-level
> implementations (which is the generalization of calling convention),
> with a simply customizable syntax, so I can adapt the input tool to the
> input data. *This* is the trend where we should go.

  But who writes the actual CODE?

  -spc (I'll have to double check the glossary of TUNES ... )