Hardware references.

John Morrison jm@mak.com
Fri, 09 Oct 1998 10:30:04 -0400

Hello Ray (et al);

Ray Dillinger wrote:
> As an assembly-language "prefix" that the compiler must emit
> whenever it's outputting a bootable kernel image?  If it's
> suitable for that, then yes, that would be a wonderful thing.

Actually, "yes" -- at the link stage (vs. the compile stage), the
linker would put these object files first, so that the actual first
instruction (entry point) is at the zeroth byte (or some other
easy-to-find offset) of the kernel image. The code would then figure out
how much memory we've got, put the i386+ into protected mode, set up the
native stack, etc.  Under UNIX, the code that does all this (and is
linked in almost invisibly to you by the default linker settings) and
then calls your "main" procedure typically lives in a file called
something like "crt0."  (BTW, the image must be in a format expected by
your bootstrap loader -- the most typical format being a "raw binary"
with no object-file headers at all.)
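
To make the "entry point at the zeroth byte" idea concrete, here is a
minimal GNU ld linker script sketch; the file name "head.o" and the
0x100000 load address are my assumptions, not anything from Ray's setup:

```
ENTRY(start)
SECTIONS
{
    . = 0x100000;          /* assumed protected-mode load address; adjust to taste */
    .text : {
        head.o(.text)      /* bootstrap stub linked first, so its first
                              instruction sits at the start of the image */
        *(.text)
    }
    .data : { *(.data) }
    .bss  : { *(.bss) }
}
```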

> When you use the word "toolchain" do you mean the Editor/
> Compiler /Linker /Loader /Assembler /Disassembler /Debugger
> combination that is needed for system-level development?  That's

Yes, to include the bootstrap loader.  The i386+ real-mode to
protected-mode transition, the PC boot sequence, and the PC memory map
are, well, *interesting*.  For example, I first tried to do this all
under WindowsNT, and had, er, limited success.  I then moved to Linux as
the development environment, and the Linux kernel as my model-to-follow,
but punted on its multi-stage bootstrap sequence as way too complicated
(it seems to use two entirely different, incompatible assembler/linker
pairs depending upon whether one is doing real-mode bootstrap stuff or
protected-mode stuff -- as86/ld86 vs. gas/ld; and it blits the kernel
image around, etc.).  I also had trouble with floppies going bad on me
after repeated rewriting of kernel bootstrap images, without the benefit
of any ECC or detection (when I finally figured out that it wasn't an
obscure kernel bug, but instead was a media error, I was, well, quite
put out).  Finally, I settled upon a NASM, gcc, objcopy, and Etherboot
cross-development combination for more reasons than I feel like typing
in right now.  Seems to work great.  No bizarre hard-coded offsets,
runtime bss initialization gyrations, media errors, or anything.
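
For flavor, the NASM/gcc/objcopy pipeline might look something like the
following Makefile fragment; the file names and flags are hypothetical,
just a sketch of how the pieces plug together:

```
# Hypothetical build: real-mode stub assembled with NASM, kernel proper
# compiled with gcc, glued by ld, then flattened to a raw binary image.
kernel.bin: head.o main.o
	ld -Ttext 0x100000 -e start -o kernel.elf head.o main.o
	objcopy -O binary kernel.elf kernel.bin   # strip headers -> raw binary

head.o: head.asm
	nasm -f elf -o head.o head.asm            # NASM emits objects ld understands

main.o: main.c
	gcc -c -ffreestanding -o main.o main.c    # no hosted C library assumed
```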

> where I've started work, and it looks like a fairly big job.  I
> had hoped that, when the major work on the tools was nearer
> completion, getting the hardware with no OS to behave the way
> my VM does (and retargeting my compiler at actual bare wires
> hardware) would be more a matter of discovering a few hidden
> mysteries than a conquering of a thousand nit-picking details -
> I haven't really been there yet, though, so if you say it's
> really grotty I believe you.

Well, it's more like the Death Of a Thousand Cuts.  If you need
something more functional (e.g., you actually need native-code drivers
for lots of devices), then you might want to either hack the Linux
kernel (good luck to you) or use the FluxOS Toolkit (I think they pulled
their most recent distribution back -- I forget why).

> Right now I have a Virtual machine which runs in any r4rs/r5rs
> compliant scheme. It slowly runs an extremely bulky, low-level
> simulated machine code called FACE.  FACE is not designed to be
> an efficient "bytecode", but is designed to be sufficiently
> low-level that machine code style optimizations can be done
> directly on it, and, like RTL, the optimized code is then
> ridiculously easy to translate into particular types of Actual
> machine code.  (I picked FACE instead of RTL because FACE is a
> semantic-retention code -- dynamic type information and other
> optimization information is more available in later stages of

Can you not just have the box interpret FACE right off?  It would be
most expedient, and enable other people to help you at the earliest
possible time.  You could get to self-hosting sooner, and it's always
instructive to "eat your own dog food" (to quote David Cutler).

> compilation than with RTL).  I plan to build an assembler,
> debugger, linker, and global optimization utilities around this
> VM, but right now those are secondary goals to finishing the
> scheme - to - FACE compiler.

Frankly, building such a complete toolchain could take years, and it's
entirely machine-specific.  (By that time, nobody might remember Lisp!)

> I am absolutely going native-code.  Bytecode doesn't perform
> as well.  LISPs in general suffer from a reputation of having
> inferior performance. The reputation is no longer as deserved as
> it once was, but political realities being what they are, unless
> the OS and applications can be observably **FASTER** than
> Windows (an easy target, but still) we will have to cope with
> those prejudices.  So -- no shortcuts.

Understood -- however, trying to implement a Java box, I don't have much
choice about whether to use bytecode, eh?  Hmmm..., what's wrong with
JIT compilation for FACE?

> And more importantly, why bother doing it at all if we don't do
> the best we can do?  If you've ever had a programming job where
> features and schedule were more important than quality and
> correctness, you've had to swallow a lot of bile and take a lot
> of shortcuts (including more than a few which were simply "The
> Wrong Thing To Do", like Visual Basic <shudder>). Recreational

Yeah, well, I absolutely positively don't want to start a flame war, but
doing the expedient thing has real advantages -- sometimes enough to
make it more of the Right Thing than any amount of pre-prototype,
clairvoyant speculation.

For example, I was thinking about doing some extensive generational
copying GC using all the hardware support available for barriers (e.g.,
read and/or write-protecting pages using the on-chip i386 virtual memory
hardware).  What a ton of work.  Then, I read about a school of thought
that says it almost doesn't matter how many CPU cycles the GC takes,
what matters is how often you go to slow mass-storage for any reason
whatsoever -- say because your memory space is twice as big due to
having both tospace and fromspace.  (Ironically, I was reading about
this on an airplane after the Windows95 laptop I was using had gone
through both sets of batteries due to thrashing its disk despite the
fact that I had every power management option I could find enabled.) 
Then, I read about an alternative in which an I/O-averse machine can
*compress* pages rather than send them out, and thus seem to have a much
larger (albeit a little slower) DRAM array than it really has.  So, I'll
do the minimum necessary to get the box working enough so that I can:
instrument the running prototype, learn something, and do the right
thing after I figure out what it is.

> projects like this represent a chance to do things *right* for
> once; we are under no schedule constraints and free to implement
> things as we believe they ought to be implemented.  And for me,
> at this stage, that means compiling to raw machine code.  It

Well, there is a lot to be said for compiling to C/C++ code, because
then you could use existing tools, which have hundreds if not thousands
of FTE-years invested in them, and some of which are pretty damned good
(some of the newer C/C++ compilers can do some optimizations with
respect to instruction scheduling that even expert assembly-language
programmers would find challenging).  Furthermore, it's more portable so
it'll run on lots of different boxes (Intel's StrongArm initiative
sounds exciting from a MIPS/Watt and a MIPS/$$$ perspective, Alpha is
fast enough to run Genera in emulation mode, and compiling directly to
Merced kinda-sorta VLIW machine code is going to be unbelievably hard).

> means handling dynamic libraries and time-sharing and security
> and all the other awful details of a "real" operating system,
> eventually.

I'm not sure I agree there.  I see DLLs/DSOs as poor substitutes for
dynamic Lisp (or Java) bindings.  And who cares about time-sharing in an
era of $340 PCs? (MicroCenter is shipping a line of $399 sans-monitor
PCs, and you KNOW that $60-$80 goes straight to Redmond for the
pre-installed Windows98.)  Run Linux on your servers and run
LispOS/LispUX (catchy, eh?) on your clients (until LispOS becomes more
mature, e.g., has 4mm backup tape drivers and software).

> It's not a masochistic streak, I don't think.  Doing things
> *right* actually feels very good.

Well, somehow, time-to-market must figure into "right."  Doesn't
somebody have a URL to that "Worse Is Better" (or something like that)
paper?  I mean, the GNU HURD has lost out to Linux even if it may have
been (technically speaking) more of the Right Thing (and also even if
Linux user-mode code is substantially GNU code).

Sorry for the length, but you raised some very good points.  (And I hope
I didn't alienate anyone with my "get it running ASAP" philosophy.  I
must be betraying my raw-hardware and UNIX roots.)


==== John Morrison ==== jm@mak.com == http://www.mak.com/welcome.html
==== MaK Technologies Inc., 185 Alewife Brook Parkway, Cambridge, MA
==== vox:617-876-8085 x115
==== fax:617-876-9208