MMU question

Ashley Winters jql@accessone.com
Mon, 6 Mar 2000 19:03:32 -0800 (PST)


On Mon, 6 Mar 2000, Tom Novelli wrote:
> > Tom Novelli wrote:
> 
> > when i save objects in brix i check the dirty bits for each page of the
> > object and save only those pages that are 'dirty'. VM is handled by
> > clearing the present bit when a page is swapped out to disk, and by
> > setting the bit and reloading the data into ram when it is accessed.
> > 
> > without VM hardware you would need to save the entire object. you also
> > don't know which objects have changed, so to be truly persistent you
> > have to save any object that gets loaded, even if none of them get
> > modified. the only objects you wouldn't have to save are the ones that
> > have been write-locked.
> 
> We can get around that.  Break large objects up into blocks, and mark each
> block 'dirty' when you change it... like a traditional Forth system with
> raw disk blocks.

I would argue it's even simpler than that. The only 'dirty' data is input
altering the state machine. Keyboard/network/RTC/mouse/voice input are
what need to be recorded in persistent storage immediately. Objects are an
abstraction created from the inputs, and are cached in persistent storage
because that is more efficient than rebuilding them.
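
A minimal sketch of that idea, assuming a toy state machine whose only
"object" is the text typed so far. The names InputJournal, apply_input and
rebuild, and the one-JSON-object-per-line log format, are made up for
illustration. Inputs are written through to disk as they arrive, and the
object is rebuilt by replaying them.

```python
import json
import os
import tempfile


class InputJournal:
    """Append-only log of external inputs (hypothetical format: one JSON object per line)."""

    def __init__(self, path):
        self.path = path

    def record(self, event):
        # Write the input through to disk before anything acts on it.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(event) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def replay(self):
        # Yield every recorded input in its original order.
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                yield json.loads(line)


def apply_input(state, event):
    """Toy state machine: keystrokes extend the text, other inputs are ignored."""
    if event["kind"] == "key":
        return state + event["char"]
    return state


def rebuild(journal):
    # Objects are derived data: fold the recorded inputs to get them back.
    state = ""
    for event in journal.replay():
        state = apply_input(state, event)
    return state


def demo_journal():
    path = os.path.join(tempfile.mkdtemp(), "inputs.log")
    journal = InputJournal(path)
    for ch in "hi":
        journal.record({"kind": "key", "char": ch})
    journal.record({"kind": "mouse", "x": 205, "y": 113})
    # A fresh instance on the same file, as after a restart.
    return rebuild(InputJournal(path))
```

Replaying from scratch gets slower as the log grows, which is why a cached
copy of the object is still worth keeping.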

When a non-lexical object is changed, immediately commit that change to
the persistent storage manager at lowest priority. If we really have
control over the compiler, that's not a problem. The persistent storage
manager can gather and analyze all the data and write it out efficiently.
The more information we give it, the more efficient it can be.
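
Roughly, and only as a sketch: assume the compiler emits a call to a
hypothetical notify() after every store, and that flush() is what runs at
lowest priority. StorageManager and its methods are invented names, and a
dict stands in for the disk. The point is that the manager sees every
change but only writes the latest value of each field.

```python
class StorageManager:
    """Collects change notices and writes them out in batches (hypothetical API)."""

    def __init__(self, backing):
        self.backing = backing   # stands in for the disk: (object_id, field) -> value
        self.pending = {}        # latest uncommitted value per (object_id, field)
        self.writes = 0          # number of values actually written to backing

    def notify(self, object_id, field, value):
        # Called by compiler-generated code right after each store.
        # A later change to the same field overwrites the earlier one.
        self.pending[(object_id, field)] = value

    def flush(self):
        # Run when the system is idle; sorted order stands in for a
        # disk-friendly layout.
        for key in sorted(self.pending):
            self.backing[key] = self.pending[key]
            self.writes += 1
        self.pending.clear()


def demo_coalescing():
    disk = {}
    manager = StorageManager(disk)
    for x in range(100):
        manager.notify("cursor", "x", x)   # 100 changes to one field
    manager.notify("cursor", "y", 7)
    manager.flush()
    return disk, manager.writes
```

Here 101 notifications turn into two writes. Anything still pending at a
crash is lost in this sketch, so it would have to sit on top of an input
log to be truly persistent.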

As for MMU functionality:
- Protection
  Unnecessary

- Demand Paging
  Nice feature, but we will have to know which parts of an object are in
  ram anyway, at a finer grain than the Present bit can provide. What
  happens to objects smaller than PAGE_SIZE? If we are good little
  programmers, we would store multiple objects per page, and keep some
  external record of what's in ram. If we're already doing that, what's
  the point?

- Dirty/Accessed/Present bits
  Dirty and Accessed can be predicted at compile time or signaled at runtime.
  Present is unnecessary since everything, everywhere is kinda present so
  long as you're hooked up to the internet. Hard drive is a cache for the
  internet, ram is a cache for the hard drive. At least as far as programs
  are concerned.

- Virtual memory (swapping)
  If we don't guarantee the physical layout or structure of an object a
  program is using, we need not keep entire objects in ram, linearly.
  Instead of using the hard-drive as a memory cache (swapping), we
  interpret all accesses to a huge object as requests to access the
  hard-drive. If it happens to already be cached in ram, that's great.
  Basically, we mmap() without the MMU.

- Convenience
  Flat memory-space is nice, but we don't use pointers directly. That lets
  us relocate objects in ram, and that too can be convenient. It's a
  trade-off, but not a bad one.
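
To make the bullets above concrete, here is a sketch of "mmap() without
the MMU", under the assumption that programs only ever pass offsets into an
object and never hold raw pointers. BlockCache, BLOCK_SIZE and the LRU
policy are illustrative choices, and a bytearray stands in for the
hard-drive. The table of cached blocks is the external record of what is
in ram, and the dirty flag is set by the write path itself.

```python
from collections import OrderedDict

BLOCK_SIZE = 4  # bytes per block; tiny so the example is easy to follow


class BlockCache:
    """RAM cache in front of a 'disk' (a bytearray here), with LRU eviction."""

    def __init__(self, disk, capacity):
        self.disk = disk
        self.capacity = capacity     # max blocks held in ram
        self.blocks = OrderedDict()  # block number -> [bytearray, dirty flag]
        self.loads = 0               # how many times we went to disk

    def _get(self, n):
        if n in self.blocks:
            self.blocks.move_to_end(n)  # mark as most recently used
        else:
            if len(self.blocks) >= self.capacity:
                self._evict()
            start = n * BLOCK_SIZE
            self.blocks[n] = [bytearray(self.disk[start:start + BLOCK_SIZE]), False]
            self.loads += 1
        return self.blocks[n]

    def _evict(self):
        # Drop the least recently used block.
        n, (data, dirty) = self.blocks.popitem(last=False)
        if dirty:  # only modified blocks are written back
            self.disk[n * BLOCK_SIZE:(n + 1) * BLOCK_SIZE] = data

    def read(self, offset):
        return self._get(offset // BLOCK_SIZE)[0][offset % BLOCK_SIZE]

    def write(self, offset, value):
        entry = self._get(offset // BLOCK_SIZE)
        entry[0][offset % BLOCK_SIZE] = value
        entry[1] = True  # software 'dirty bit'

    def flush(self):
        while self.blocks:
            self._evict()


def demo_cache():
    disk = bytearray(16)        # a 16-byte object, 4 blocks
    cache = BlockCache(disk, capacity=2)
    cache.write(0, 1)           # loads block 0
    cache.write(5, 2)           # loads block 1
    cache.read(1)               # hit on block 0, no disk access
    cache.write(13, 3)          # loads block 3, evicts block 1 (least recent)
    cache.flush()
    return bytes(disk), cache.loads
```

Only two blocks of the object are ever in ram at once, and since callers
hold offsets rather than addresses, the cache is free to put a block
anywhere.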

The network may eventually become the computer, but I'd rather keep stuff
closer to home. From a latency perspective, the CPU and L1 cache are a
little building in Oregon, the L2 cache is in Paris, RAM is on the moon,
the hard-drive is orbiting Saturn, and the network starts around
Proxima Centauri.

Where does garbage collection fit in? Well, certainly not in RAM. Garbage
collection is an efficient means by which we can take all those keystrokes
and network packets and either store them or discard them.

Garbage collection of inputs can be implemented with an optimizing
compiler. If you convert every input into a line of code corresponding to
the response to the input, the compiler can look at the entire set of
actions performed in response, and determine whether any interesting
side-effect occurred that we need to concern ourselves with. If so, the
compiler can generate the code necessary to perform that action, or it can
store the data resulting from that action. Otherwise, the input is
discarded as unnecessary.

For example, take mouse inputs (please). Record 2000 mouse inputs, and
create a line of code, like window->mouseMove(205, 113) for each one.
Since every argument is a *constant* after the fact, a compiler can
determine exactly what code would run in response. Chances are, nothing
happens when you move your mouse across a window, and the compiler will
discover that and discard them all.
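
A rough sketch of the effect, with one substitution stated plainly: instead
of an optimizing compiler folding constants, this replays each recorded
input and compares the state before and after. The Window class, its
button rectangle and prune_inputs are hypothetical, and mouse_move stands
in for window->mouseMove.

```python
class Window:
    """Hypothetical window with one button occupying a rectangle."""

    def __init__(self):
        self.button = (200, 100, 220, 120)  # left, top, right, bottom
        self.hovered = False

    def mouse_move(self, x, y):
        left, top, right, bottom = self.button
        self.hovered = left <= x <= right and top <= y <= bottom

    def snapshot(self):
        # Everything we consider an "interesting" side effect.
        return (self.hovered,)


def prune_inputs(events):
    """Keep only the events whose replay changes the window's state."""
    window = Window()
    kept = []
    for x, y in events:
        before = window.snapshot()
        window.mouse_move(x, y)
        if window.snapshot() != before:
            kept.append((x, y))
    return kept


def demo_prune():
    # 2000 moves along the top edge of the screen, then one onto the
    # button, then one more that stays inside the button.
    events = [(i, 0) for i in range(2000)] + [(205, 113), (206, 113)]
    return prune_inputs(events)
```

Of the 2002 recorded moves, only the one that first lands on the button
survives. A real compiler could reach the same conclusion without running
anything, which is the cheaper route suggested above.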

I'll stop there before I ramble on to insanity. Tunes probably won't have
to bite the bullet and go into paged mode, even for compatibility reasons.
Since all foreign applications will be quarantined in their own task, they
can have their paging bit flipped on, and Tunes can just turn it back off.
No more overhead than a standard task switch on an Intel chip.

Ashley Winters

--
Feel free to apply these regular expressions to my messages.
s/we/I/
s/will/should/
s/are/could be/