String representations of numeric values
tam at lheapop.gsfc.nasa.gov
Thu Apr 20 11:44:15 PDT 2006
Mike Fitzpatrick wrote:
> Well, the USNO-B1 service from Flagstaff adds another twist and writes the
> RA/DEC as a 3x1 array of doubles (e.g. "12 34 56.7") and not with the
> colon-delimiters. I can't find another example quickly just now, but
> I know I've seen sexagesimal in other data sets. Note the registry
> validation level is 2 and the ConeSearch validator only issues a warning
> that the 'units' should be in degrees (they're listed as "hh mm ss",
> is/would that be different than "hh:mm:ss"?).
The validator doesn't check the format of the numbers, but I
don't think this should be considered a valid VOTable.
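To make the mismatch concrete, here is a minimal Python sketch of what a reader has to do with the value "12 34 56.7". It assumes the three fields are hours, minutes and seconds of right ascension, which is my reading of the 'hh mm ss' units; the function name is hypothetical and is not part of any VOTable library.

```python
def sexagesimal_hours_to_degrees(text):
    """Convert 'hh mm ss.s' (space- or colon-delimited) to decimal degrees.

    Assumes the fields are hours, minutes, seconds of right ascension.
    """
    hours, minutes, seconds = (float(f) for f in text.replace(":", " ").split())
    # 1 hour of right ascension corresponds to 15 degrees.
    return (hours + minutes / 60.0 + seconds / 3600.0) * 15.0

# Read naively as an array of doubles, the same text is three unrelated numbers.
as_array = [float(f) for f in "12 34 56.7".split()]
as_degrees = sexagesimal_hours_to_degrees("12 34 56.7")
print(as_array, as_degrees)
```

A generic client that sees an array of three doubles has no way to know that the second interpretation was intended.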
It does point out a serious error/omission in the VOTable definition though,
in Section 7. For the normal <TABLEDATA> representations that are almost
exclusively used, the definitions of the primitives for types short, long,
float, double and probably bit are all incorrect. We tried to define
how the data would be stored on the computer in terms of what it gets
parsed into, but if you don't know how to parse it, that's not very
helpful. The data are not binary; they are an ASCII representation, and
there is very little to specify how the transformation to binary is to
be done.
E.g., is prefixed white space legal? It is not for character data
(i.e., I believe it's supposed to be part of the string). But we
generally permit it for numeric data and I'm not at all sure that
we are consistent with white space before strings.
If we have a bit value in TABLEDATA do we encode it as a string of 0's and 1's
or do we follow the spec and encode bits into bytes -- even though that might
give us invalid XML if we happened to generate a byte that had the bit pattern
How is a NaN to be written in table data? Using the string 'NaN'? Are NAN or nan allowed?
What about the infinities?
Are spaces allowed between signs and digits? What exponential notations
are permitted? E.g., is the Fortran 1.D10 allowed or only 1.E10? What about
1.e10? Do we allow exponents at all? Do we use Java syntax?
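As an illustration of how much is left to the implementation, the following Python sketch probes what one concrete parser, Python's built-in float(), does with the spellings asked about above. This shows one language's choices only; it says nothing about what the VOTable definition requires, and other languages answer several of these differently.

```python
# Probe how Python's built-in float() treats the spellings in question.
# This is one implementation's behaviour, not a statement of the standard.
candidates = ["NaN", "NAN", "nan", "Inf", "-inf", " 1.5", "1.5 ",
              "- 1.5", "1.E10", "1.e10", "1.D10", "1 5"]

def probe(text):
    """Return the parsed float, or None if float() rejects the text."""
    try:
        return float(text)
    except ValueError:
        return None

results = {text: probe(text) for text in candidates}
for text, value in results.items():
    print(repr(text), "->", "rejected" if value is None else value)
```

Python accepts any capitalization of NaN and Inf, surrounding white space, and both E and e exponents, but rejects the Fortran D exponent, a space between sign and digits, and a space inside the number.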
What about numbers that go outside the range supported by singles and doubles?
How are they supposed to be represented? NaN? Inf?
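For comparison, this is what happens to out-of-range text in Python, again as a sketch of one implementation's behaviour rather than a proposal. The helper fits_single is a hypothetical name used only for this example.

```python
import struct

# A decimal literal too large for a double: float() returns infinity
# instead of raising an error.
too_big_for_double = float("1e400")

# A literal too small for a double underflows to zero.
too_small_for_double = float("1e-400")

def fits_single(value):
    """True if value can be packed as an IEEE single without overflow."""
    try:
        struct.pack("<f", value)
        return True
    except OverflowError:
        return False

print(too_big_for_double, too_small_for_double,
      fits_single(1e39), fits_single(1e38))
```

So the same text can yield Inf, zero, or an error depending on the declared datatype and the reader, which is exactly the ambiguity at issue.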
What about spaces inside numbers? Fortran allowed those to be treated
as 0's. Do VOTables support that?
If X is a double are the two values
1. and 1.000000000000000000000000000001
the same? IEEE doubles can't distinguish them (assuming I put
in enough 0's). Is a VOTable reader
that enables a user to distinguish these non-compliant?
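The two values can be compared directly. This short Python sketch parses the same two strings once as IEEE doubles and once as arbitrary-precision decimals; the decimal module stands in here for any reader that keeps more precision than a double.

```python
from decimal import Decimal

a_text = "1."
b_text = "1.000000000000000000000000000001"

# Parsed as IEEE doubles, the two strings are indistinguishable.
same_as_double = float(a_text) == float(b_text)

# Parsed as arbitrary-precision decimals, the difference survives.
same_as_decimal = Decimal(a_text) == Decimal(b_text)

print(same_as_double, same_as_decimal)
```

A reader built on the second approach would report the values as different, even though the declared datatype is double.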
Even though I'm nominally one of the VOTable authors, I'm not sure I know
what the answers to these questions are.
Since the TABLEDATA values are clearly not IEEE data, we probably shouldn't say that
they are, but we need to have some rules for how numbers are represented.
As a byproduct this would have made it clear that the USNO VOTable is wrong, but
this is a more general failing of the standard.