Comments on VOTable draft

From: tam <tam-at-lheapop.gsfc.nasa.gov>
Date: Tue, 12 Mar 2002 16:38:05 -0500


Roy and Francois,

Here are some further comments on the current draft (0.94) of the VOTable. Most of these are relatively minor, editorial comments or explanations that seem to be missing. Numbers 1, 22, and 24 reflect more serious concerns for me.

        Tom                  

  1. Concern. 5.1.2. A single physical FITS file can contain many extensions including multiple tables. If we are going to use FITS to serialize the data, then there must be a keyword in the table description that indicates which FITS extension is being used within a file to serialize the data. I believe it would be a serious error to have VOTables only able to represent the first extension of a FITS file. I would suggest an 'extension' attribute in the STREAM element but I don't really care how it's done.
  2. Nit. Section 1. Is Astrores really based on FITS binary tables? It seems much more like FITS ASCII tables.
  3. Editorial. Section 2. This seems like the natural place to include the information in section 8, which otherwise seems out of place. I.e., "Here's the data model" flows naturally into "and here's the overall organization of the file".
  4. Editorial. Section 2. The table in section 2 should be near the descriptive information in section 9 rather than separated by most of the document.
  5. Editorial. 2.1. "Except for the Bit type, each primitive has the fixed length in bytes given in the table. Bit scalars and arrays are stored in the minimum number of bytes feasible."
  6. Question. 3.2 It is unclear if this syntax policy is relevant to this document or a general mandate on VOTables. This should be clarified.
  7. Editorial. 3.3. The last sentence is not really precise. I.e., one could presumably build a non-validating parser that used the referencing mechanism properly (but failed in some other aspect). I think the previous sentence says all that need be said.
  8. Nit. 4. Why do we make <ASTRO> the outermost block? If we ever want to expand beyond the astronomy domain this seems a gratuitous barrier. Could we use something more neutral?
  9. Error. 5. There is a reference to a section 4.5 that does not exist (5.5?).
  10. Editorial. 5. The discussion of strings is unclear. I'd suggest something like:

"Strings are not a primitive type: characters are. To simulate variable-length strings users can use variable-length arrays of characters. VOTables support two kinds of characters: ASCII 1-byte characters and Unicode 2-byte characters...."
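
To make the suggested wording concrete, here is a rough sketch of a string treated as a variable-length array of fixed-width characters. The function names are hypothetical, and the big-endian byte order for the 2-byte characters is my assumption; the draft text quoted here does not say.

```python
def encode_chars(text: str, unicode_chars: bool = False) -> bytes:
    """Serialize a string as an array of fixed-width characters."""
    if unicode_chars:
        # Assumption: 2-byte characters are big-endian, one unit per character.
        if any(ord(c) > 0xFFFF for c in text):
            raise ValueError("character does not fit in 2 bytes")
        return text.encode("utf-16-be")
    # 1-byte characters: plain ASCII, anything else is rejected.
    return text.encode("ascii")


def decode_chars(data: bytes, unicode_chars: bool = False) -> str:
    """Recover the string; its length is just the byte count over the width."""
    width = 2 if unicode_chars else 1
    if len(data) % width:
        raise ValueError("byte count is not a multiple of the character width")
    return data.decode("utf-16-be" if unicode_chars else "ascii")
```

The point is only that the element count of the array, not a terminator, determines the string length: "M31" is 3 bytes as ASCII and 6 bytes as 2-byte characters.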

  11. Question/Nit. 5. Do we really want to support Unicode?
  12. Error. 5. There is a reference to section 8 that is probably meant to be a reference to section 9. (Of course above I suggest moving all of section 8 into section 2 which would fix that!)
  13. Question. 5.1. I'm a little confused by the discussion of the width attribute. Why do we retain this hint and not other display information from Astrores? Should we just get rid of it? What does it gain us?
  14. Question. 5.1. Why are we creating a new 'magnitude' scale with our precision field? Why not just make it the linear error, rather than the -log(error)? Writing precision=".00001" seems a lot clearer to me than precision=5. Also, why not use A and R for absolute and relative rather than E and F for ?? and ??. [Surely we weren't hearkening back to Fortran E and F formats?]
  15. Missing. 5.5. The 'content-role' attribute is mentioned but I can't see that it is ever explained.
  16. Missing. 5.5.1. The 'action' attribute is mentioned but no example is given. If it is not used at all why is it included? If we are going to need it in future releases then we should give some hint of the syntax.
  17. Editorial. 5.6. The discussion of the 'TRIGGER' attribute is unclear to me. It probably belongs in section 7 with an example. [Of course I want to get rid of section 7 altogether]
  18. Missing. 6.1.1. How does a user specify Unicode characters in TABLEDATA? Do we need a mechanism for specifying a Locale (a la Java)? Is there an escape sequence for specifying a Unicode character?
  19. Missing. 6.1.1. Is it an error to specify fewer/more <TD> elements than there are fields?
  20. Editorial. 6.1.1. I believe it is an error to specify an invalid set of characters for a field in the TABLEDATA. Thus the example given of a reader not succeeding in reading the data is inappropriate. What happens when a parser tries to read a malformed VOTable is probably beyond the specification. We should only describe how to properly form VOTables, not how to handle errors.
  21. Missing!!!!. 6.1.2. Just reiterating comment 1, that there needs to be a way to specify the FITS extension.
  22. Question. 6.1.2. In the example we have 'encoding=gzip', and I'm a little concerned about the need to put the encoding in the VOTable. Shouldn't the encoding be something that is specified in the header that we get when we begin to download the URL (perhaps with defaults associated with certain file extensions)? Putting the encoding in the VOTable seems counter to the philosophy of the Web. What we should mandate in the document is the encodings that a reader should be prepared to handle.

If the data is present in the file, then an encoding attribute might be appropriate but not for linked data.
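
As a sketch of what I mean for linked data: the reader could pick the decoder from the standard HTTP Content-Encoding header of the response rather than from anything written in the VOTable. The function name and the set of supported encodings below are illustrative assumptions, not something the draft specifies.

```python
import gzip


def decode_body(body: bytes, headers: dict) -> bytes:
    """Decode a downloaded payload using the HTTP Content-Encoding header."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    encoding = normalized.get("content-encoding", "identity").strip().lower()
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(body)
    if encoding == "identity":
        return body
    # Assumption: a reader only promises a fixed list of encodings.
    raise ValueError(f"unsupported encoding: {encoding}")
```

With this approach the specification would only need to list the encodings a reader must accept.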

23. Inconsistency. 6.1.3. There is a problem with variable Bit format in the binary mode. One cannot determine how many bits were encoded from the number of bytes in the array. E.g., one byte could correspond to 1-8 bits. One could address this problem, and be closer to FITS usage, by having the user give either the total number of elements of the specified type, or the number that was given as a '*' in the field length specification.
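
A small sketch of the ambiguity, assuming bits are packed most significant bit first and padded with zeros (the FITS convention; the helper names are mine). A 1-bit array and an 8-bit array can produce the same single byte, so the reader needs the element count from somewhere else.

```python
def pack_bits(bits):
    """Pack 0/1 values into the fewest bytes, MSB first, zero padded."""
    out = bytearray((len(bits) + 7) // 8)
    for i, bit in enumerate(bits):
        if bit:
            out[i // 8] |= 0x80 >> (i % 8)
    return bytes(out)


def unpack_bits(data, count):
    """Recover `count` bits; the count cannot be inferred from len(data)."""
    if count > 8 * len(data):
        raise ValueError("count exceeds the number of stored bits")
    return [(data[i // 8] >> (7 - i % 8)) & 1 for i in range(count)]


# Both arrays serialize to the same byte, so the byte length alone
# does not tell a reader whether 1 bit or 8 bits were written.
one_bit = pack_bits([1])
eight_bits = pack_bits([1, 0, 0, 0, 0, 0, 0, 0])
```

Giving the element count explicitly, as suggested above, is what lets unpack_bits return the right number of values.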

24. Concern. 7. I'm not really enthusiastic about using the VOTable to specify the query as well as the results. E.g., note how much less clear the example is in the VOTable format than it was in the original. I'd prefer to use VOTables for results and not encumber the specification with issues that address queries -- at least until we've gotten the former under control.

25. Editorial. 8. I don't think this little fragment section is very useful. Either it should be moved into section 2 or deleted.

26. Editorial. 9. The title is wrong. We shouldn't care what the origin is here. This is an important section describing the characteristics of the primitive types. Maybe we want a sentence at the beginning saying that these types are mostly derived from FITS, but the title should be 'VOTable Primitive Types' or some such. As mentioned earlier this should be joined with the table in section 2. [But I don't care which one moves.]

27. Editorial. 9. The description of Bit arrays is a little unclear. Maybe the last sentence should be "A bit field shall be composed of the smallest number of bytes that can accommodate the number of elements in the field. Padding bits shall be 0."

28. Editorial. 11. There are lots of elements/attributes that are not really explained in the text. I'd suggest that the DTD be given in an annotated style so that we had at least a short explanation for every element/attribute.

Received on 2002-03-13Z08:58:23