Empty TD element wording
m.b.taylor at bristol.ac.uk
Wed Jun 21 00:34:09 PDT 2006
On Tue, 20 Jun 2006, Edward J. Sabol wrote:
> Mark Taylor wrote:
> > I've had the following misleading wording in the VOTable specification
> > pointed out to me.
> > In section 4.6 it says:
> > In the TABLEDATA data representation, the default representation of
> > a ``null'' value is an empty column (i.e. <TD></TD>); for fields
> > containing arrays, individual ``null'' elements of the array can
> > be specified either by the value specified in the null attribute,
> > or by the "NaN" or "nan" text in place of the expected numeric value.
> > This sentence only applies to certain datatypes; in particular it is
> > not true of integer types (unsignedByte, short, int, long), which fact
> > is made explicit in the relevant paragraphs of section 6.
> > I suggest that (at such time as the next version of the VOTable
> > document is released) the sentence above is withdrawn, or modified to
> > make it clear that it applies only to certain datatypes.
> I have actually been meaning to propose a change to the VOTable specification
> here that would extend the validity of "<TD></TD>" to include integers, so I
> am in agreement with Mark and would strongly favor withdrawing this
> restriction on integer nulls entirely. The only alternative method for
> specifying null integers is to determine some integer value that does not
> exist in the range of valid values for that column and specify that as the
> null value. This is rather impractical and makes it impossible to dump the
> table data a row at a time (such as when querying from a database) in a
> stream-like fashion. Basically, you have to read or scan the whole table (or
> at least just the integer columns) before being able to dump the table in
> VOTable format. Also, I am aware of at least two VOTable implementations that
> ignore this restriction, so perhaps a case could be made that the VOTable
> standard should reflect the reality of existing VOTable implementations. When
> I first came across this restriction on how to encode integer nulls in
> TABLEDATA representation some months ago, I spent some time searching the
> mailing list archive and wiki trying to determine why this restriction was
> written into the standard. I could not find any such discussion, though
> perhaps I just did not look hard enough. Could someone here explain or
> make a positive case for it or point me to some historical discussion of the
I don't know if the rationale for this decision is written down anywhere,
but I have always assumed it to be as follows: there is no way to
represent a null integer value in the BINARY or FITS variants of
VOTable, since every bit pattern represents a valid integer value.
If you allow empty TD elements for integers then you can't necessarily
transform any TABLEDATA-format VOTable into an equivalent BINARY-
or FITS-format one. Furthermore, it may no longer be possible to
perform a given FITS->VOTable->FITS round trip without loss of
information (since FITS BINTABLE has no way to do the equivalent
of an empty TD), which is an explicit goal of the standard
(see sec 2.3 of the spec).
This argument does not apply to floating point types, since the IEEE
representation specifies certain bit patterns which represent
Not-A-Number values. The empty TD element for floating point values
is just a convenience notation equivalent to <TD>NaN</TD>.
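
The asymmetry between integer and floating point types can be checked directly. The following is only an illustrative sketch in Python, not code from any VOTable library: the function names are hypothetical, and the big-endian 16-bit "short" and 32-bit "float" encodings are assumed here to stand in for the BINARY/FITS cell formats. It shows that all 65536 bit patterns of a short decode to distinct valid integers, leaving none spare for "null", whereas a float has NaN patterns available.

```python
import math
import struct


def decode_short(raw: bytes) -> int:
    """Decode two bytes as a big-endian signed 16-bit integer."""
    return struct.unpack(">h", raw)[0]


def all_short_values() -> set:
    """Decode every possible 16-bit pattern and collect the resulting integers."""
    return {decode_short(struct.pack(">H", pattern)) for pattern in range(1 << 16)}


def float_nan_survives_binary() -> bool:
    """Pack NaN into four bytes and check that it decodes back to NaN."""
    raw = struct.pack(">f", float("nan"))
    return math.isnan(struct.unpack(">f", raw)[0])


if __name__ == "__main__":
    values = all_short_values()
    # Every bit pattern is a distinct legal value: nothing is left over for null.
    print(len(values), min(values), max(values))
    print(float_nan_survives_binary())
```

The same counting argument holds for the other integer widths; the short is used only because its pattern space is small enough to enumerate.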
Having said all that, I'm well aware of the fact that it would
often be much easier to be able to use empty TD elements, especially
as you say when streaming out TABLEDATA-format VOTables.
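
To make the streaming point concrete, here is a minimal sketch in Python (hypothetical helper names, a 16-bit range assumed for the column, and no claim that any real writer works this way). Choosing a sentinel for the null attribute needs every value of the column up front, while the empty-TD convention, which the current spec does not permit for integers, lets each row be written as soon as it arrives.

```python
def pick_null_sentinel(values, lo=-32768, hi=32767):
    """Return an integer in [lo, hi] that does not occur in the column.

    This needs the complete column, so the table cannot be written
    until all rows have been read.
    """
    used = {v for v in values if v is not None}
    for candidate in range(lo, hi + 1):
        if candidate not in used:
            return candidate
    raise ValueError("every value in the range is in use; no sentinel available")


def stream_rows_empty_td(rows):
    """Yield one <TR> per input row, writing None as an empty TD.

    Rows are consumed one at a time; nothing is buffered.
    """
    for row in rows:
        cells = "".join("<TD>%s</TD>" % ("" if v is None else v) for v in row)
        yield "<TR>%s</TR>" % cells


if __name__ == "__main__":
    column = [-32768, -32767, None, 5]
    print(pick_null_sentinel(column))
    for line in stream_rows_empty_td(iter([(1, None), (None, 7)])):
        print(line)
```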
My STIL parser, like many (all?) others, will accept empty TD elements
as nulls if it comes across them, and certainly many VOTables out there
use this convention despite the fact that it is not legal.
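
For illustration, a lenient reader of this kind might look like the sketch below. This is not STIL's code (STIL is Java); it is a hypothetical Python function using only the standard library, handling a single un-namespaced TR of integer cells, and accepting both the declared null value and an empty TD as null.

```python
import xml.etree.ElementTree as ET


def read_int_row(tr_xml, null_literal=None):
    """Parse one <TR> of integer cells, returning None for null cells.

    A cell counts as null if it is empty (the technically illegal
    convention) or if its text equals null_literal (the value a FIELD
    would declare through its null attribute).
    """
    row = ET.fromstring(tr_xml)
    values = []
    for td in row.findall("TD"):
        text = (td.text or "").strip()
        if text == "" or (null_literal is not None and text == null_literal):
            values.append(None)
        else:
            values.append(int(text))
    return values


if __name__ == "__main__":
    print(read_int_row("<TR><TD>1</TD><TD></TD><TD>-999</TD></TR>", "-999"))
```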
So there is an argument for relaxing the current ideologically
respectable position and allowing empty TD elements for all datatypes
(though it would probably have the effect of marginalising the
BINARY and FITS variants even more than at present).
I'll leave the discussion, if any, of whether that would be a
good thing to this list - I don't have a strong opinion either way.
Mark Taylor Astronomical Programmer Physics, Bristol University, UK
m.b.taylor at bris.ac.uk +44-117-928-8776 http://www.star.bris.ac.uk/~mbt/