Unicode (Was: mikemac's proposal)

Pierpaolo Bernardi bernardp@cli.di.unipi.it
Fri, 9 May 1997 18:20:36 +0200 (MET DST)


> > How do you plan to make readtables work with Unicode?  Or are you
> > considering Unicode only for characters and strings, and not for
> > source code?
> Huh?
> 
> As the Lisp compiler is written in Lisp, I can't see why it should lack
> support. The right approach is to redefine what we mean by character,
> right? (I should read up in the HyperSpec on this)

I'm not familiar with the internals of CMUCL, but a reasonable
implementation of readtables for 8-bit characters is a 256-element
vector.  With Unicode the same approach needs a 65536-element vector,
i.e. 256 KB assuming one 4-byte word per entry, which is not a
reasonable implementation anymore.  So readtables must be implemented
with a more sophisticated data structure, I suppose.
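
To make the point concrete, here is a minimal sketch (in Python, for
brevity) of one such data structure: a dense vector for the first 256
code points plus a hash table for everything above.  The class name,
method names and syntax-type strings are all made up for illustration;
this is not how CMUCL or any other implementation actually does it.

```python
# Hypothetical sketch: names and syntax types are illustrative only.
CONSTITUENT = "constituent"  # default syntax type for unlisted characters


class SparseReadtable:
    """Dense vector below code point 256, hash table for the rest."""

    DENSE_LIMIT = 256

    def __init__(self):
        # One slot per 8-bit character, as in a classic readtable.
        self.dense = [CONSTITUENT] * self.DENSE_LIMIT
        # Only characters with non-default syntax are stored here.
        self.sparse = {}

    def set_syntax(self, char, syntax):
        code = ord(char)
        if code < self.DENSE_LIMIT:
            self.dense[code] = syntax
        else:
            self.sparse[code] = syntax

    def get_syntax(self, char):
        code = ord(char)
        if code < self.DENSE_LIMIT:
            return self.dense[code]
        return self.sparse.get(code, CONSTITUENT)

    def entry_count(self):
        # Number of stored entries, to compare against a flat 65536 vector.
        return self.DENSE_LIMIT + len(self.sparse)
```

Lookups for the common 8-bit characters stay a plain vector index,
and the table grows only with the number of non-default entries
outside that range.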

> The only problem I
> still have is whether all characters should be Unicode or whether we
> should support a "compressed" character that's ASCII only. (and 7 bits
> only to avoid problems)

One way is described in the Plan 9 paper.

For a more lispy example you may want to look at Gambit-C.

In short, in Gambit-C there's a command-line switch to tell the reader
whether it is reading from a Latin-1, Unicode or UTF-8 stream; the
file-opening functions take an additional argument telling whether the
file is Latin-1, Unicode or UTF-8. In all cases characters are
represented internally as Unicode.
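
The general idea, decoding at the stream boundary and using a single
internal representation, can be sketched as follows (in Python, for
brevity).  This is not Gambit-C's actual interface: the function name
and the encoding keywords are invented, and I'm assuming that
"unicode" means 16-bit big-endian code units.

```python
# Hypothetical sketch; not Gambit-C's API.
# Assumption: "unicode" is taken to mean 16-bit big-endian code units.
CODECS = {
    "latin-1": "latin-1",
    "unicode": "utf-16-be",
    "utf8": "utf-8",
}


def read_code_points(data, encoding):
    """Decode raw bytes into a list of Unicode code points.

    Whatever the external encoding, the caller only ever sees
    code points, so the rest of the system needs one character type.
    """
    if encoding not in CODECS:
        raise ValueError("unknown encoding: " + repr(encoding))
    return [ord(ch) for ch in data.decode(CODECS[encoding])]
```

For instance, the byte E9 read as "latin-1", the bytes 00 E9 read as
"unicode" and the bytes C3 A9 read as "utf8" all yield the same code
point, U+00E9.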

Hope this helps.

Poka,
Pierpaolo.