String character range
luigi at lambrate.inaf.it
Mon Aug 4 07:28:59 PDT 2008
I think Unicode chars would rarely be sent, and control chars never.
In probably 99% of cases the ASCII charset, with the limitations you
indicated, is enough, so I don't have a strong position on Unicode
support.
Anyway, I've thought about Doug's suggestion regarding UTF-8 and I've
looked around to see which string encoding other RPC systems adopt,
such as ZeroC's Ice and DBus (I also looked for the CORBA encoding, but
didn't find it). Well, DBus and Ice both use UTF-8 (with no
limitations). I haven't looked at the other RPC systems (there are a
plethora of them), but those are my favourites (along with XML-RPC and
SOAP, of course), so that is where I looked.
Now, suppose that in the far, far future some perverse guy decides to
implement SAMP using a different profile, for instance using Ice as the
wire protocol instead of XML-RPC (in principle it should be possible).
It would be a shame if such an implementation inherited limitations
that come from XML. In my opinion the limits should be set at the
implementation and language level, not at the protocol level... the
protocol should be as general (and flexible) as possible.
So, why not follow Doug's suggestion and specify at the SAMP protocol
definition level that string serialization is UTF-8 (in general), and
specify at the Standard Profile level that not all UTF-8 chars are
allowed, but only those supported by XML?
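
To make the two levels concrete, here is a rough Python sketch of how
they could be layered. The function names are made up for illustration
and are not part of any SAMP document. The only thing taken from a
standard is the set of code points, which follows the "Char" production
of XML 1.0.

```python
def is_xml_char(cp: int) -> bool:
    """True if code point cp matches the XML 1.0 'Char' production."""
    return (
        cp in (0x09, 0x0A, 0x0D)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def encode_samp_string(s: str) -> bytes:
    """Protocol level (hypothetical): serialize any string as UTF-8."""
    # Python raises UnicodeEncodeError here for lone surrogates.
    return s.encode("utf-8")


def allowed_in_standard_profile(s: str) -> bool:
    """Standard Profile level (hypothetical): XML-legal chars only."""
    return all(is_xml_char(ord(ch)) for ch in s)
```

With this split, a string containing 0x00 would still be a legal string
at the protocol level, but a client using the XML-RPC based profile
would have to refuse to send it.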
> My feeling is it would be better to restrict what can be sent in a
> SAMP string to something that is going to be easy to implement in all
> sensible languages/transports (probably 0x09, 0x0a, 0x0d, 0x20-0x7f),
> so that both the standard, and the requirements on clients, stay as
> simple as possible. If specific requirements for sending full Unicode
> strings arise, we could mark these on a per-MType basis
> and come up with a convention along the lines of the SAMP int and
> SAMP float already defined in Section 3.4.
> Which of these is best depends on how important the requirement to
> be able to send Unicode and control characters is. My vote is not
> very. Can we have a show of hands?
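
For comparison, the restricted range proposed in the quoted text above
(0x09, 0x0a, 0x0d, 0x20-0x7f) could be checked along these lines. This
is only a sketch, the function name is made up, and the range is copied
from the quote as written.

```python
def is_restricted_samp_string(s: str) -> bool:
    """True if every char is 0x09, 0x0a, 0x0d or in 0x20-0x7f."""
    allowed_controls = {0x09, 0x0A, 0x0D}
    return all(
        ord(ch) in allowed_controls or 0x20 <= ord(ch) <= 0x7F
        for ch in s
    )
```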