lispOS and persistent store

David Gadbois gadbois@cyc.com
Tue, 29 Apr 1997 23:48:16 -0500 (CDT)


   Date: Tue, 29 Apr 1997 18:02:24 -0700
   From: Luca Pisati <pisati@nichimen.com>

   [...]

   Can you really build a pure LispOs without a tagged architecture,
   and then be prone to Lisp Objects inefficiency ?

   Can you write efficient code not only for A.I.sh kind of software,
   but also for numerically-heavy applications ... ?

   [...]

   Isn't it Open Genera emulating a 40bits (8 tag + 32 data) hardware
   on top of a 64 bits machine (then wasting 24bits per word) ?

Last I heard, it was using a 32-bit pointer model and sticking the
tag bits elsewhere in the address space so that there was no memory
waste.

This reminds me of a story that I would like to relate:

A few years ago, when Open Genera 1.0 was still in beta, Scott McKay
came down to Austin to visit.  Over a very nice dinner, I pressed for
info on the VLM, and Scott allowed that the emulation of the
SYS:%MEMORY-READ instruction (essentially, a pointer dereference) took
seven Alpha AXP instructions.  I (almost literally) fell out of my
chair:  it was totally outrageous.  Scott managed to convince me that,
with this new-fangled superscalar stuff, it wasn't all *that* bad.

More recently, I was writing a new garbage collector for our C
run-time system.  For debugging purposes, I had all memory references
go through a function that checked the validity of the source and
contents, verified whether the type of each was appropriate for the
kind of access, and stripped off tags explicitly.  (I.e., the
SYS:%MEMORY-READ equivalent took dozens of instructions.)  This
approach caught bugs as they occurred rather than letting them cause a
difficult-to-diagnose problem billions of instructions later on.
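
A minimal sketch of that kind of checked access, in C.  Everything
here is an assumption made for illustration and not the actual
run-time system's code: the two-bit low tag layout, the single static
heap, and the names check_ref, checked_read, MEMORY_READ, and
UNCHECKED_READS are all hypothetical.  It checks only the source tag
and the heap bounds; the debugging function described above also
validated the contents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

typedef uintptr_t word_t;

/* Assumed layout: the low two bits of a reference hold a type tag. */
#define TAG_MASK   ((word_t)3)
#define HEAP_WORDS 1024

enum tag { TAG_FIXNUM = 0, TAG_CONS = 1, TAG_VECTOR = 2 };
enum check_result { CHECK_OK = 0, CHECK_BAD_TAG, CHECK_OUT_OF_HEAP };

/* Hypothetical heap: one static array of words. */
static word_t heap[HEAP_WORDS];

/* Build a tagged reference; the tag bits of the address must be free. */
static word_t make_ref(word_t *p, enum tag t)
{
    assert(((word_t)p & TAG_MASK) == 0);
    return (word_t)p | (word_t)t;
}

/* Validate the tag and check that word `index` lies inside the heap. */
static enum check_result check_ref(word_t ref, enum tag expected,
                                   size_t index)
{
    uintptr_t lo = (uintptr_t)heap;
    uintptr_t hi = (uintptr_t)(heap + HEAP_WORDS);
    uintptr_t addr = ref & ~TAG_MASK;

    if ((ref & TAG_MASK) != (word_t)expected)
        return CHECK_BAD_TAG;
    if (addr < lo || addr >= hi || index >= (hi - addr) / sizeof(word_t))
        return CHECK_OUT_OF_HEAP;
    return CHECK_OK;
}

/* Debugging path: stop at the faulty access, not billions of
   instructions later. */
static word_t checked_read(word_t ref, enum tag expected, size_t index)
{
    enum check_result r = check_ref(ref, expected, index);
    if (r != CHECK_OK) {
        fprintf(stderr, "bad memory read: error %d\n", (int)r);
        abort();
    }
    return ((word_t *)(ref & ~TAG_MASK))[index];
}

#ifdef UNCHECKED_READS
/* Inline path: strip the tag and load, with no validation. */
#define MEMORY_READ(ref, tag, i) (((word_t *)((ref) & ~TAG_MASK))[(i)])
#else
#define MEMORY_READ(ref, tag, i) checked_read((ref), (tag), (i))
#endif
```

With this arrangement, a read through a reference with the wrong tag
aborts on the spot in the checked build, while building with
-DUNCHECKED_READS compiles every MEMORY_READ down to a mask and a
load.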

The production version was set up to do the pointer dereferences
inline instead of going through the function call.  When things
appeared to be working smoothly (mind you, this turned out to be about
10 bugs away from actual correctness), I switched over to the
production mode and ran some performance tests.  The difference: about
25%, which was pretty inconsequential in terms of our usual
multi-minute or multi-hour turnaround time.  And I wasted a massive
amount of time flipping the debugging switch back on and recreating
multiple hours' worth of state to track down those last 10 bugs.  It
took a month of debugging, had a high opportunity cost, and
jeopardized a multi-million-dollar contract.  I don't want to repeat
that mistake.

(Disclaimer: This applies to the kind of memory-intensive,
locality-busting codes I run.  (Like, e.g., an OS.)  If you are doing
highly local and vectorizable 3D rendering type stuff, then your
mileage will certainly vary.)

The moral of this story is that hardware changes a lot faster than we
programmers can keep up with, and the conventional wisdom about what
is possible, or at least what is reasonable to do, isn't always
correct.
And at least part of the excitement of this Silk (I am still working
on getting used to that name) project is challenging that wisdom.

But that is not a particularly good moral, insofar as it merely tells
us that this project isn't completely loony instead of giving reasons
why it is a good thing to do.  Here's an attempt at a better one:

A while back, I was extremely disappointed to learn that my new
486DX-2/66 PC was four times as fast at running Lisp as my Symbolics
XL400.  Nowadays, that PC is relegated to X-terminal status, and I'm
still hacking on the XL400.  Here, the moral is that it is worth it to
put up with the snail's pace of 10-year-old hardware to get actual
work done more quickly and to be able to do things that would
otherwise require too much time to get done at all.

My latest PC is many times as fast as that old (ha!) 486.  There are
lots of kewl games to play on it, my finances might be better if I
ever spent the time to use Quicken right, and those Microsoft Word
documents people keep sending me sure look pretty, but the keyboard
doesn't get nearly so grungy as the XL400's.

(And I'm still trying to figure out how you do a couple of loads along
with address manipulations, indexing, and software read-barrier
checking and trapping in seven instructions.)

--David Gadbois