Unicode (Was: mikemac's proposal)
Pierpaolo Bernardi
bernardp@cli.di.unipi.it
Fri, 9 May 1997 18:20:36 +0200 (MET DST)
> > How do you plan to make readtables in Unicode work? or are you
> > considering unicode only for characters and strings, and not for
> > source code?
> Huh?
>
> As the lisp compiler is written in lisp I can't see how it should lack
> support. The right approach is to redefine what we mean with character,
> right? (I should read up in the HyperSpec on this)
I'm not familiar with the internals of cmucl; however, with 8-bit
characters a reasonable implementation of readtables would be a
256-element vector. With Unicode, a 65536-element vector, i.e. 256 KB
at 4 bytes per entry, is no longer a reasonable implementation. So
readtables must be implemented with a more sophisticated data
structure, I suppose.
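To make "a more sophisticated data structure" concrete, here is a minimal sketch of one option: a two-level table that allocates a 256-entry page only when some character in that page gets a non-default syntax. It is written in Python purely for illustration. The class name, the method names, and the syntax-type strings are all hypothetical, and nothing here reflects how cmucl actually implements readtables.

```python
class SparseReadtable:
    """Two-level readtable sketch: pages of 256 entries, created on demand."""

    PAGE_BITS = 8
    PAGE_SIZE = 1 << PAGE_BITS      # 256 entries per page
    PAGE_MASK = PAGE_SIZE - 1

    def __init__(self, default="constituent"):
        # Syntax type returned for every character never explicitly set.
        self.default = default
        # Maps page number (code point >> 8) to a list of 256 syntax entries.
        self.pages = {}

    def set_syntax(self, char, syntax):
        code = ord(char)
        page_no = code >> self.PAGE_BITS
        # Allocate the page lazily, filled with the default syntax.
        page = self.pages.setdefault(page_no, [self.default] * self.PAGE_SIZE)
        page[code & self.PAGE_MASK] = syntax

    def get_syntax(self, char):
        code = ord(char)
        page = self.pages.get(code >> self.PAGE_BITS)
        if page is None:
            return self.default
        return page[code & self.PAGE_MASK]
```

A readtable that only customises ASCII characters then costs a single 256-entry page, the same as the 8-bit case, and each further page is paid for only when a character in it is actually given special syntax.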
> The only problem I
> still have is if all characters should be Unicode or if we should support
> a "compressed" character that's ascii only. (and 7 bits only to avoid
> problems)
One way is described in the Plan 9 paper.
For a more lispy example you may want to look at Gambit-C.
In short, in Gambit-C there's a command-line switch to tell the reader
whether it is reading from a Latin-1, Unicode or UTF-8 stream; the
file-opening function has an additional argument telling whether the
file is Latin-1, Unicode or UTF-8. In all cases characters are
represented internally as Unicode.
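The idea of "declare the external encoding at the boundary, use one internal representation everywhere" can be sketched as follows. This is Python for illustration only, not Gambit-C's actual interface: the function name and the encoding keywords are hypothetical, and treating "unicode" as big-endian 16-bit units is an assumption made for the sketch.

```python
# Hypothetical encoding keywords mapped to Python codec names.
# Assumption: "unicode" means 16-bit big-endian code units.
ENCODINGS = {
    "latin-1": "latin-1",
    "unicode": "utf-16-be",
    "utf8": "utf-8",
}

def read_code_points(data, encoding="latin-1"):
    """Decode external bytes into a list of Unicode code points.

    Whatever the external encoding, the result uses the same
    internal representation: one integer code point per character.
    """
    text = data.decode(ENCODINGS[encoding])
    return [ord(ch) for ch in text]
```

For example, the bytes `b"\xe9"` read as "latin-1" and the bytes `b"\xc3\xa9"` read as "utf8" both yield the single code point 0xE9, so the rest of the system never needs to know which encoding the file used.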
Hope this helps.
Poka,
Pierpaolo.