Strings and C side primitives
Olli Pietiläinen
ollip at freeshell.org
Mon Aug 16 07:32:35 PDT 2004
Hello.
Disclaimer: I don't know mobius or pidgin too well, so there might be
errors/misconceptions here.
> Hi,
>
> Recently we had some discussion about vm.h not being includable, and then we ended up that basically it shouldn't be included at all
> because C side primitives shouldn't really use anything from the vm.
>
> It brings up a question how should C prims be written platform independently that use strings. File prims are an example, that return
> unicode file names on windoze.
>
> We could fix the interface to communicate through a utf16[], but on some platforms it would only be a hassle, etc...
I think UTF-8 would be easier, since we already have ByteArrays but no
16-bit arrays, IIRC. We could also use arrays of integers, and then there
would be no need for UTF encoding/decoding on most platforms. Some sort of
conversion would be needed anyway (except on platforms where file
names are communicated in 32-bit arrays), but that would be simpler. Not
that UTF-8 is too complicated, either.
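To give an idea of how small the UTF-8 side is, here is a rough sketch of
turning an array of integers (one code point per element) into UTF-8 bytes
that would fit in a ByteArray. The function name codepoints_to_utf8 and its
signature are made up for illustration; nothing like this is claimed to exist
in the VM, and the real interface might look quite different.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the Slate VM.
 * Encodes n code points from cps into UTF-8 bytes in out (capacity cap).
 * Returns the number of bytes written, or (size_t)-1 if a code point is
 * invalid (surrogate or above U+10FFFF) or the output buffer is too small. */
size_t codepoints_to_utf8(const uint32_t *cps, size_t n,
                          uint8_t *out, size_t cap)
{
    size_t used = 0;

    assert(cps != NULL && out != NULL);

    for (size_t i = 0; i < n; i++) {
        uint32_t c = cps[i];
        size_t need;

        /* Work out how many bytes this code point takes. */
        if (c < 0x80) {
            need = 1;
        } else if (c < 0x800) {
            need = 2;
        } else if (c < 0x10000) {
            if (c >= 0xD800 && c <= 0xDFFF)
                return (size_t)-1;      /* surrogates are not characters */
            need = 3;
        } else if (c <= 0x10FFFF) {
            need = 4;
        } else {
            return (size_t)-1;          /* outside the Unicode range */
        }

        if (cap - used < need)
            return (size_t)-1;          /* output buffer too small */

        switch (need) {
        case 1:
            out[used++] = (uint8_t)c;
            break;
        case 2:
            out[used++] = (uint8_t)(0xC0 | (c >> 6));
            out[used++] = (uint8_t)(0x80 | (c & 0x3F));
            break;
        case 3:
            out[used++] = (uint8_t)(0xE0 | (c >> 12));
            out[used++] = (uint8_t)(0x80 | ((c >> 6) & 0x3F));
            out[used++] = (uint8_t)(0x80 | (c & 0x3F));
            break;
        default:
            out[used++] = (uint8_t)(0xF0 | (c >> 18));
            out[used++] = (uint8_t)(0x80 | ((c >> 12) & 0x3F));
            out[used++] = (uint8_t)(0x80 | ((c >> 6) & 0x3F));
            out[used++] = (uint8_t)(0x80 | (c & 0x3F));
            break;
        }
    }
    return used;
}
```

With the integer-array interface, a platform that wants UTF-8 file names
would call something like this once at the boundary, and a platform that
takes 32-bit arrays natively would skip it entirely.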
> But we could also extend vm generation so that it contains all the platform specific code in pidgin reduced to one-liner "directly"
> entries. What do you think about this?
>
> And if it's not the way to go, then what should be the interface between VM and C prims with strings?
I'd say either UTF-8 or an array of integers. Also, the interface for
characters should be decided; probably just integers.
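As an illustration of "characters are just integers" on the C side, here is
a rough sketch of reading one character out of a UTF-8 buffer and handing it
back as a plain integer code point. The name utf8_next and the -1 error
convention are assumptions made up for this example, not an existing or
proposed VM interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the Slate VM.
 * Decodes the first character of the n-byte UTF-8 buffer s and returns it
 * as an integer code point, storing the number of bytes consumed in *len.
 * Returns -1 on empty, truncated, or malformed input (*len is then left
 * unchanged). */
int32_t utf8_next(const uint8_t *s, size_t n, size_t *len)
{
    /* Smallest code point that legitimately needs 2, 3, or 4 bytes. */
    static const uint32_t min_for_len[5] = { 0, 0, 0x80, 0x800, 0x10000 };
    uint32_t c;
    size_t need;

    assert(s != NULL && len != NULL);

    if (n == 0)
        return -1;

    c = s[0];
    if (c < 0x80) {                     /* plain ASCII byte */
        *len = 1;
        return (int32_t)c;
    } else if ((c & 0xE0) == 0xC0) {
        need = 2; c &= 0x1F;
    } else if ((c & 0xF0) == 0xE0) {
        need = 3; c &= 0x0F;
    } else if ((c & 0xF8) == 0xF0) {
        need = 4; c &= 0x07;
    } else {
        return -1;                      /* stray continuation or bad lead byte */
    }

    if (n < need)
        return -1;                      /* sequence is cut off */

    for (size_t i = 1; i < need; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return -1;                  /* not a continuation byte */
        c = (c << 6) | (s[i] & 0x3F);
    }

    if (c < min_for_len[need])
        return -1;                      /* overlong encoding */
    if (c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return -1;                      /* out of range or surrogate */

    *len = need;
    return (int32_t)c;
}
```

Note that a Latin-1 byte such as 0xE4 on its own is rejected here, which is
one concrete reason not to treat a byte and a character as the same thing.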
The Unicode library is nearing the point where you can do everything with
Unicode strings/characters that you could with the old strings and
characters. This means that all references to String and Character (or
StringProto/CharacterProto) used internally in the VM should be converted to
ByteArrays and Bytes (or whatever that is called, ASCIICharacter?). The
stuff that interacts with the image should be converted to use the proper
interface, whatever that is decided to be. Just remember that no assumption
that String == ByteArray should be made anywhere, or that Character == Byte.
Olli