release 0.0.0.20 and thoughts

spc@gate.net
Thu, 10 Aug 1995 02:21:41 -0400 (EDT)


On some network somewhere in cyberspace, Francois-Rene Rideau transmitted:
> 
> >   I'll get to what you wrote to in a bit, but first, I do want to discuss
> > the multitasking model you have in mind.  I know that you want cooperative
> > and think that preemptive is either evil or lame (or both, I've forgotten),
> > but I'm here to show you that preemtive is not quite as lame or evil as you
> > think.
>    I don't think that preemptive is lame or evil per se. Actually, if you
> followed the discussion between Billy and I (and other people) on the list,
> you could see that he showed how my proposal could very well be called
> preemptive multithreading, too !

  Well, I am on the list, so you don't need to CC: me or the list.

>    What I think and prove is lame and evil is the no-information paradigm
> from traditional system design, by which no information is statically or
> dynamically assumed or enforced about running threads, leading to a paranoid
> protection mechanism that is inefficient in time space and security alike.

  Well, my own view is not to assume anything, or at best, to assume the
worst unless proven otherwise.  And as for verifying code to make sure it'll
do what it says it'll do, I feel that's an NP-complete problem, but that's
just me.

  There is a real concern over security, and if you have two people using a
single machine, you've cut your effective security in half.  Same goes for
computers on a network (and I hope you realize that in these enlightened
days, the network IS the computer).  And it's in a network-based distributed
system that micro-kernels (well, ones designed around certain paradigms)
really shine.  They also work better on multi-processor (MIMD) systems.

  That is, unless you can show me a way that cooperative multitasking will
work across a network, on a multiprocessor computer, or both.

>    Of course, at the time when computer memories were tiny, one couldn't
> afford complicated systems, and would prefer make it tiny, and let people
> hack their way with it. But this is no more tolerable on nowadays' computers,
> that can handle much more memory than what the user can input.
> 
  And it's that kind of thinking, where memory is infinite, that led to the
current state of systems, like Windows or modern Unix systems.  You may find
it hard to believe, but I have, sitting next to me right now, a Unix-based
box with only 512K of physical RAM (probably a few meg total in virtual
memory) and a 15M hard drive.  With a kernel size of 69.4K.

  Now imagine what you could do with 16M of RAM and a 70K kernel.  That
leaves more room to run more processes.  One of my favorite OSes had a
kernel of only 10K.  That's on a Pentium running at 100MHz.

> >   One reason you state that preemptive is bad is because of the high
> > overhead of saving state information, and that with cooperative, there isn't
> > nearly has much overhead.  This is true.
>    The problem is *not* only the registerwise information state, but also and
> mostly all the rest, semaphores, etc, things that make a shared GCed
> address space impossible 

  Count me stupid, but I sure don't see how you have to save semaphores as
part of the "state" information of a task.  Sure, semaphores are owned by
tasks, but I've yet to come across an implementation where they had to be
"saved" across context switches.
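
  If anything, I'd expect the picture to look roughly like this (a sketch
off the top of my head, not from any real kernel): the semaphore is a
kernel object that tasks merely point at, and the only thing the switch
itself saves is the register file.

    struct task;                      /* forward declaration */

    struct cpu_state {                /* what a context switch saves/restores */
        unsigned long eax, ebx, ecx, edx;
        unsigned long esi, edi, ebp, esp;
        unsigned long eip, eflags;
    };

    struct semaphore {                /* lives in one place, shared by everyone */
        volatile int count;
        struct task *wait_queue;      /* tasks blocked on it */
    };

    struct task {
        struct cpu_state regs;        /* saved on every context switch */
        struct semaphore *held;       /* just a pointer; nothing to copy */
    };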

> (remember that Tunes must be resiliant and
> persistent hence supporting garbage collection, even if this could be disabled
> on specific statically linked applications). In a non-informative system,
> cost to resynchronize is wondrous. 
                           ^^^^^^^^ 
  I think you meant to say "enormous".  "Wondrous" has a different
connotation, as in "something that is wonderful".

> In a cooperatively informative system,
> synchronization is enforced at compile time when possible, or with the
> help of compile-time calling conventions.
> 
  I fail to see how that helps.  I keep thinking of starving philosophers
myself.

> >   But the other side of the coin is that in a cooperative system, a single
> > task can hog the entire CPU without reliquishing control.  Your way around
> > this is to forbid the use of Assembly (which I think is a bad idea, but I'll
> > get to that in a bit) and use a compiler, which will "insert" yield calls
> > periodically.
>    Why forbid assembly ? Sure forbid unchecked assembly to novices,
> or at least not outside of a paranoid isolation/emulation box. 

  But this goes against what TUNES stands for.  Freedom of programming and
all that, unless I'm missing something again 8-)

> But
> that's not worse than what exists on current systems. The problem with
> those systems is that you *cannot* use assembly at all or anything to
> extend the system itself. 

  I've yet to come across a system where I can't use assembly.  Even on
Unix, the assembler (usually called 'as') is available for anyone to use
(assuming the C compiler is available for anyone to use).
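
  And if dropping into assembly from C is the issue, even that is no
problem with the GNU compiler's inline assembly (a sketch, syntax from
memory):

    /* a do-nothing example of inline assembly under GNU cc on Unix */
    int add_one(int x)
    {
        int result;

        __asm__ ("incl %0" : "=r" (result) : "0" (x));  /* result = x + 1 */
        return result;
    }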

  Now, as to extending the system: under Unix it's nearly impossible unless
you have access to modify the kernel.  But under QNX it is possible to
extend the system, and even though QNX is a microkernel, it was one of the
fastest MULTIUSER multitasking OSes I've encountered.  You want TCP/IP under
QNX?  Not a problem.  Run a process.  Instant TCP/IP.  No mucking around
with changing the kernel at all.
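
  The shape of such a service, as best I can sketch it (the names
msg_receive() and msg_reply() below are made up for illustration, NOT the
actual QNX calls), is just a process in a receive/work/reply loop:

    struct request  { int op;     char data[256]; };
    struct response { int status; char data[256]; };

    /* hypothetical message-passing primitives */
    extern int  msg_receive(struct request *req);    /* blocks for a client */
    extern void msg_reply(int sender, struct response *resp);

    int main(void)
    {
        struct request  req;
        struct response resp;

        for (;;) {
            int sender = msg_receive(&req);  /* sleep until someone asks */
            /* ... do the protocol work (e.g. TCP/IP) here ... */
            resp.status = 0;
            msg_reply(sender, &resp);        /* wake the client back up */
        }
    }

Start the process and the service exists; kill it and it's gone.  That's
the whole "kernel modification".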

  Not that I'm trying to sell QNX here.  Just pointing out that there are
systems out there that let you extend the system itself.

> Just *everything* is done in those damn paranoid
> inefficient coarse-grained "processes".

  Well, if you're talking about Unix, then yes, the unit of execution is
quite coarse-grained.  But most OSes written after 1980 have the concept of
a thread (or task), which carries less context than a process.
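
  To make the difference concrete, here's a tiny POSIX-threads sketch
(compile it with the threads library): two threads, one address space, so
the per-thread "context" is little more than a register set and a stack.

    #include <pthread.h>
    #include <stdio.h>

    static int shared = 0;        /* visible to both threads, no IPC needed */

    static void *worker(void *arg)
    {
        (void)arg;
        shared = 42;              /* same address space as main() */
        return NULL;
    }

    int main(void)
    {
        pthread_t t;

        pthread_create(&t, NULL, worker, NULL);
        pthread_join(t, NULL);
        printf("shared = %d\n", shared);
        return 0;
    }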

>    Also see how yield could be just a sti() cli() pair, or simply having
> a way for the interrupt routine to find a interrupted-position-dependent
> routine to do the interrupt cleanly. 

  I don't see how you can make yield() be a sti()/cli() pair.  And what is
an "interrupted-position-dependent routine"?  And if it's what I think it
is, how does the system "know" a routine is
"interrupted-position-dependent"?  I thought the whole idea of interrupts
was that the interrupted routine is unaware of being interrupted.
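
  The closest I can come to making sense of the sti()/cli() idea is
something like this (a sketch of my guess, using inline interrupt
enable/disable): the thread would have to run with interrupts disabled the
whole time, and a "yield" is just a window in which a pending timer
interrupt is allowed to fire and run the scheduler.

    #define sti()  __asm__ __volatile__ ("sti")  /* allow interrupts */
    #define cli()  __asm__ __volatile__ ("cli")  /* forbid them again */

    void yield_point(void)
    {
        sti();   /* if the timer interrupt is pending, it fires here... */
        cli();   /* ...and the scheduler can switch us out in between */
    }

Whether that buys anything over just taking the interrupt, I can't say.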

> So even though I quite agree that
> a naive stubborn cooperative implementation, or even a slightly enhanced
> CALL/RET based implementation, is bad unless we have a way for the overall
> software to pause often enough but not too often, still I'm convinced that
> cooperatively informative systems are the most reliable way to do things.

  Reliable, maybe.  Fast?  No.

  But I think you still don't understand preemptive multitasking that well. 
Behind me, my Amiga is currently on.  I just checked and there are 16
processes/tasks in the system.  Even though the system is clocked at 7.1MHz
it's still very responsive (for comparison, on my Unix box, there are only
six processes running).  Now, why is the Amiga so responsive, even though
there are 16 processes/tasks?  Because most of them are waiting (and no,
it's not a busy loop) for some event to occur.  In a system that is I/O
bound, even though the system is preemptive in nature, most processes
probably don't use their entire timeslice.  Most interactive processes don't
consume the CPU at all, but spend most of their time blocked on I/O.  So in
effect, when they call the system for input/output, that IS a form of
cooperative multitasking.  They yield the CPU (or rather, the I/O routines
yield).  And in this case, all 16 tasks are waiting for I/O.
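
  In code terms, a typical interactive task spends its life doing something
like this (a sketch), and the blocking read() is where the "cooperation"
happens:

    #include <unistd.h>

    int main(void)
    {
        char    buf[128];
        ssize_t n;

        /* read() blocks until input arrives; while it blocks, the
         * scheduler runs everybody else.  No timeslice is consumed. */
        while ((n = read(0, buf, sizeof buf)) > 0)
            write(1, buf, (size_t)n);     /* echo it, then block again */

        return 0;
    }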

  And in the rare case where a process is CPU bound, I adjust the priority
such that even though I'm calculating e to 100,000 digits (which is
definitely CPU bound) I don't notice it, because the e engine will be
preempted for more critical jobs, like dialing up and checking my email
(that is, if I used the Amiga to dial up).
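
  On Unix that adjustment is one call before the grinding starts (a sketch;
the Amiga does the same sort of thing through its own SetTaskPri(), if
memory serves):

    #include <unistd.h>

    int main(void)
    {
        nice(10);   /* lower our priority: run only when nothing better can */
        /* ... compute e to 100,000 digits here ... */
        return 0;
    }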

  And when I pause a moment to gather my thoughts while replying, the e
engine will get the CPU, and less overhead (in general) will be lost in the
rescheduling (say I pause for a minute) than would be lost by the e engine
having to yield() every so often.

  And how, in a cooperative system, do you give the e engine less CPU time
if something more critical comes along?

>    The interrupt takes place in both systems. The difference is not there.
> The difference is address space, page table, stack, and process info swapping;

  Well, on the 386 (assuming you are using paging and not the segments at
all) just reloading CR3 remaps the address space, and reloading ESP
readjusts the stack.  All that's really left is the rest of the general
registers (a popad will work wonders here) and that's about it.  I've also
checked the VAX architecture.  Pretty much the same thing (just reloading a
few key registers).
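
  Put another way, the whole per-task hardware state amounts to something
like this (a sketch, not lifted from any real kernel):

    struct i386_context {
        unsigned long edi, esi, ebp, esp;   /* popad restores most of these */
        unsigned long ebx, edx, ecx, eax;   /*   (it skips the saved ESP)   */
        unsigned long eip, eflags;
        unsigned long cr3;                  /* page directory base: the whole
                                               address space in one reload */
    };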

> the problem is synchronization for garbage collection and/or checkpointing.
> This is several hundreds of cycles (to the least) at each context switch,
> while a cooperative way would cost only tens of cycles, so we already gain
> at least one order of magnitude in such overheads.
> 
  I still don't see this.
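
  The best sense I can make of it is something like the following (a sketch
of my reading, nothing more): the compiler-inserted yield point doubles as
a garbage-collection safe point, where every live pointer is in a known
place, so the check itself is cheap.

    extern volatile int gc_pending;     /* set by the allocator when a
                                           collection is due */
    extern void gc_collect_here(void);  /* all roots are in known
                                           locations at this point */

    void yield_point(void)
    {
        if (gc_pending)
            gc_collect_here();
        /* ... and reschedule here if our time is up ... */
    }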

> >>    Because the function is not actually called !
> >> Instead, code is modified at
> > > some point by the interrupt-driven routine that decides of a schedule...
> > 
> >   This is better than preemtive?  It sounds preemtive to me, and while it IS
> > possible, it sounds like more work than just having the CPU save the state
> > and move on to another task.
>    You can call it preemptive. And it does save time as compared to stubborn
> state save in that you can customze it to feet your needs.
> 
  I'm sorry, but unless I see actual code, I'm not buying this argument.
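
  For the record, the only concrete scheme I can imagine behind "code is
modified by the interrupt-driven routine" is something like this (a sketch
of my guess, not your design): the timer handler doesn't switch tasks
itself, it just swings a hook over to the scheduler, and the running code
takes the hand-off the next time it passes through that hook.

    extern void schedule(void);          /* assumed to reset the hook itself */

    static void do_nothing(void) { }

    void (* volatile resume_hook)(void) = do_nothing;

    void timer_interrupt(void)           /* runs on every clock tick */
    {
        resume_hook = schedule;          /* "patch" the next check point */
    }

    void yield_check(void)               /* inserted by the compiler, you say */
    {
        (*resume_hook)();                /* usually a no-op; sometimes a switch */
    }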

> >>   Firstly, doing low-level code is very difficult and annoying, because the
> >> hardware is complex and not well documented, that is, it has evil semantics.
> >   Please explain this.  What is evil about hardware semantics?  Hardware,
> > unfortunately, has to deal with the real world.  Which means that the
> > software that deals with the hardware has to deal with the real world. 
> > Granted, some of it isn't easy, but to call it evil is a bit harsh I think.
>   It is not its being "hardware", but its being uselessly complex (because of
> compatibility with score years old technology) and desperately badly
> documented (because of lack of compatibility ;) that makes PC hardware
> programming a pain. Also the lack of clean interfaces like the AmigaDOS,
> RISCOS or even MacOS systems have.
> 
  Well, I can tell you that programming the hardware on the Amiga is a pain
as well.  If you think the floppy drive interface on a PC is bad, it's worse
on the Amiga (that is, if you hit the hardware directly).  Do you realize
that unless you really want to suck up CPU usage, you HAVE to use the
graphics blitter to prepare the data before you can write it out to the
floppy?

  This is hardware.  It's nasty regardless of the computer.

  And the interfaces you're talking about are software interfaces.  Which
may or may not be available if you switch OSes (or roll your own).

> >>> The only reasonable way to develop is at abstract over the hardware to
> >>> obtain objects with better, *cleaner semantics* (abstracting without
> >>> cleaning the semantics, as in C, is lame); else you get too many bugs. 
> >>  Could you explain this as well?  I'm lost now.
>   Well, setting such bit to spinlock, wait for the 8052 to answer, then
> set a bit in such io register, here are weird semantics for enabling some
> hardware feature (enabling line A20) itself very had to express cleanly.

  It ain't no picnic on the Amiga either.
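
  For anyone who hasn't had the pleasure, the classic A20 dance on the PC
looks roughly like this (from memory, so treat it as a sketch; outb()/inb()
stand for the usual port I/O helpers):

    #define KBC_STATUS  0x64
    #define KBC_COMMAND 0x64
    #define KBC_DATA    0x60

    extern unsigned char inb(unsigned short port);
    extern void outb(unsigned short port, unsigned char value);

    static void kbc_wait(void)     /* spin until the input buffer empties */
    {
        while (inb(KBC_STATUS) & 0x02)
            ;
    }

    void enable_a20(void)
    {
        kbc_wait();
        outb(KBC_COMMAND, 0xD1);   /* "write output port" command */
        kbc_wait();
        outb(KBC_DATA, 0xDF);      /* output port value with the A20 bit set */
        kbc_wait();
    }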

> >> I wrote
> >> *lots* of macros, that made development possible. I also saw how no good
> >> macro language exists. CPP is pure shit; I chose m4, but it has
> >> evil reflectivity semantics (I had to systematically remove reflectivity
> >> from my code, but for some trivial macros). 
> >   I'm sorry, but I still don't quite understand reflectivity.  Could you
> > please post some example of what you were trying to do but couldn't?  Maybe
> > then I can understand it.
>   I could not define functions that defined new functions (except for
> trivial ones), because the binding mechanism with insert-arguments-immediately
> and evaluate-macros-at-the-last-moment, with horrible quoting, stubborn 
> syntax, lack of local contexts (that I emulate with CPS style macros), etc.
> The macro system being Turing-equivalent (btw, have you ever programmed a
> Turing machine ? ;), there's nothing that you really "cannot do". The problem
> is the way you have to do it. The ugly m4 sources from the tunes src
> distribution talks by itself.
> 
  Here's a hint:  Forth.  Dr. Dobb's Journal presented a full 386 assembler
written in Forth a few years ago.  60 or so screens (which is about 20
pages).  I can supply the reference, and as far as I can tell, you can
define functions that define functions that define functions in Forth.  PFE
is available for Unix, MS-DOS and OS/2 and supports ANS and F83, so the
code I saw should be able to run.

  And yes, I've written a Turing Machine.  In BASIC, lo these many years
ago.

> >>    The work-around for using languages that lack abstraction power,
> >> is that calling conventions should always be particularly well documented,
> >> and great caution is taken when invoking a function; for that reason also,
> >> people often choose arbitrary global conventions that reduce performance
> >> greatly to reduce the amount of bugs. 
> > 
> >   Really?  I tend to avoid global variables as much as possible myself.
> Global conventions=a one fixed convention for all routines. e.g. C argument
> passing.
> 
  I would think using a single calling convention would speed things up, if
only to avoid the gyrations (or glue code) needed to massage the parameters
around.

  -spc (Okay, I got a handle on reflectivity now ... )