String character range
luigi at lambrate.inaf.it
Fri Aug 1 09:25:36 PDT 2008
I think your suggestion below is a good compromise. I would split it
into two points:
1. At the SAMP protocol definition level we might define that a "string" can
accept any sequence of 0x01-0x7f characters, adding the escape convention
for any printable Unicode char outside the specified range (so it is
2. At the Standard Profile level I would add more constraints, limiting the
charset to the XML range and introducing the escape convention for the
other unsupported chars.
Is it reasonable?
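
To make the two points above concrete, here is a rough Python sketch of what
the Standard Profile rule might look like, using the \uXXXX form from
suggestion (3) quoted below. The names escape() and unescape() are
hypothetical, and so are two choices that nobody has proposed on the list:
escaping the backslash itself as \u005c so that decoding is unambiguous, and
rejecting code points above 0xffff because they do not fit in four hex digits.
It is an illustration only, not agreed SAMP behaviour.

```python
import re

# Among the control characters below 0x20, XML 1.0 permits only TAB, LF and CR.
XML_SAFE_CONTROLS = {0x09, 0x0A, 0x0D}


def is_plain(cp: int) -> bool:
    """True if code point cp may be sent unescaped (assumed profile rule)."""
    if cp == 0x5C:
        # Assumption: the backslash is always escaped, so a literal "\u"
        # in the input can never be mistaken for an escape sequence.
        return False
    return 0x20 <= cp <= 0x7F or cp in XML_SAFE_CONTROLS


def escape(text: str) -> str:
    """Replace every character that is not XML-safe 7-bit with \\uXXXX."""
    out = []
    for ch in text:
        cp = ord(ch)
        if is_plain(cp):
            out.append(ch)
        elif cp <= 0xFFFF:
            out.append("\\u%04x" % cp)
        else:
            # Assumption: the four-digit form cannot hold these code points.
            raise ValueError("code point U+%X does not fit in \\uXXXX" % cp)
    return "".join(out)


_ESCAPE = re.compile(r"\\u([0-9a-fA-F]{4})")


def unescape(text: str) -> str:
    """Inverse of escape(): turn each \\uXXXX back into its character."""
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)
```

With this sketch, character 31 is sent as \u001f and the 0xf1 character
discussed below is sent as \u00f1, so the encoded string contains only
XML-safe 7-bit characters; unescape() restores the original on the receiving
side.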
> As far as SAMP goes: that character looks to me like code point 0xf1,
> from the Latin-1 Supplement code block. So you could not send it using
> either the existing definition for a SAMP string or the proposal (4)
> that I am suggesting. If we used a variant of my suggestion (3):
> 3. Define some escaping convention for un-XML characters, e.g. \u001f
> for character 31.
> with the intention that this escaping mechanism could be used for
> any 8-bit character it would be possible to transmit this kind of
> non-7-bit Latin character. However, characters with the 8th bit set
> might cause problems for certain other transports and language
> environments. I must admit apart from RFC-822 mail-type contexts I
> can't think of what these might be, but I'd be inclined to steer clear
> of non-7-bit characters just in case. However, if others (e.g. with
> less Anglo-Saxon prejudices) think that it's an important requirement to
> permit transmission of characters like this within
> SAMP we could take that on board. We could even in principle say that
> this escaping mechanism could be used to specify any Unicode character -
> but I think that would definitely be a bad idea as it would effectively
> restrict use of the protocol to languages with Unicode support, which
> excludes quite a lot.