VOTable alternative? On Flexibility
Martin Hill
mchill at dial.pipex.com
Mon Jan 26 10:22:31 PST 2004
A few comments have been made about how handy VOTable is for drawing plots and
for generic use. We don't need to lose this capability - I don't see why any
schema we agree on cannot also have a transformation to VOTable, or indeed to
other formats suitable for commonly available feature-rich plotters such as
PtPlot.
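
As a rough sketch of what such a transformation might look like (in Python,
using only the standard library): the 'rich' document below, its element names
(catalogue, star, name, vmag) and the COLUMNS mapping are all invented for
illustration and are not part of any agreed schema. The output follows the
TABLE/FIELD/DATA/TABLEDATA layout of the VOTable fragment quoted further down.

```python
import xml.etree.ElementTree as ET

# Hypothetical 'rich' document; the element names are assumptions.
RICH = """
<catalogue>
  <star><name>M31</name><vmag>3.4</vmag></star>
  <star><name>Fomalhaut</name><vmag>1.23</vmag></star>
</catalogue>
"""

# Hypothetical mapping: child element name -> attributes of the FIELD
# that will describe that column in the generated table.
COLUMNS = [
    ("name", {"name": "Object name", "ID": "NAME",
              "datatype": "char", "arraysize": "*"}),
    ("vmag", {"name": "V Magnitude", "ID": "VMAG", "datatype": "double"}),
]


def to_votable_table(rich_xml, row_tag, columns):
    """Flatten each <row_tag> element into one TR of a VOTable-style TABLE."""
    root = ET.fromstring(rich_xml.strip())
    table = ET.Element("TABLE")
    # One FIELD per column, in column order.
    for _, attrs in columns:
        ET.SubElement(table, "FIELD", attrs)
    tabledata = ET.SubElement(ET.SubElement(table, "DATA"), "TABLEDATA")
    for row in root.iter(row_tag):
        tr = ET.SubElement(tabledata, "TR")
        for child_tag, _ in columns:
            # A missing child becomes an empty cell.
            ET.SubElement(tr, "TD").text = row.findtext(child_tag, default="")
    return table


table = to_votable_table(RICH, "star", COLUMNS)
print(ET.tostring(table, encoding="unicode"))
```

The same job could equally be done with an XSLT stylesheet; the point is only
that the mapping from named elements to columns is mechanical.
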
But I am curious - I would have thought that a Spectral Energy Distribution
plotter would have different characteristics to, say, a Hertzsprung-Russell
diagram plotter, at least for anything more than a trivial plot of one column
against another?
Also... though this is fairly off the top of my head - I don't see why we can't
plot using 'rich' XML. Instead of defining in the plotter which columns you
want to display, you define parent elements, and then plot child elements
against other child elements. Which makes me think there must be XML plotters
out there already, but I can't see any on the web... Perhaps we should write
one, sell it and make millions!
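
To make the parent/child idea a bit more concrete, here is a minimal sketch in
Python (standard library only). The document, its element names (sed, point,
wavelength, flux) and the numbers in it are all made up for illustration. The
function just collects (x, y) pairs, which could then be handed to whatever
plotter is available.

```python
import xml.etree.ElementTree as ET

# Hypothetical 'rich' document with made-up values.
SED = """
<sed>
  <point><wavelength>0.44</wavelength><flux>1.2</flux></point>
  <point><wavelength>0.55</wavelength><flux>1.5</flux></point>
  <point><wavelength>0.70</wavelength></point>
</sed>
"""


def child_pairs(xml_text, parent_tag, x_tag, y_tag):
    """For every <parent_tag>, pair the value of its x child with its y child."""
    root = ET.fromstring(xml_text.strip())
    pairs = []
    for parent in root.iter(parent_tag):
        x = parent.findtext(x_tag)
        y = parent.findtext(y_tag)
        if x is None or y is None:
            continue  # skip parents that lack one of the two children
        pairs.append((float(x), float(y)))
    return pairs


points = child_pairs(SED, "point", "wavelength", "flux")
print(points)
```

Here the 'plotter configuration' is just three element names rather than two
column indices, and incomplete parents are skipped instead of shifting columns.
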
Cheers
Martin
I also wrote:
>
> I won't argue that VOTable is a good way to describe generic tables, and
> I don't see any point in replacing VOTable as a way to store & transmit
> generic tables. But *we* shouldn't be storing generic tables in XML; we
> get the worst of both tables & XML then: very very limited structure and
> semantics, and very very large files. Instead we should be looking at
> richly-described XML documents for metadata and small-medium datasets,
> and suitable binary formats to adopt for images, large catalogues, etc.
>
> To answer your points directly:
>
> 1a) We've talked about not being able to use XML tools directly on
> VOTable. You are quite right, we can build large/serial XPath queries
> to extract our information, but this involves 'programming' and makes a
> simple task more difficult for all the wrong reasons. The fact that
> VOTable does not lend itself to doing *any* XML-common task nicely using
> standard techniques should raise warning bells that something is not
> right about it, not that we should slavishly adopt each and every XML
> technology that appears.
>
> 1b) Don't follow why schema people = SQL-database people? I like
> schemas because I can check nobody's put silly text/numbers into the
> wrong bit. And schemas offer a lot more than just typing elements -
> though this is assuming you want more than just a generic *table*...
>
> 3) The thing is it's not a few days' code we save. If we start off now
> in the community with an uncommon-XML-standard way of presenting our
> data, it means all future users of the VO are going to have to write
> their own libraries to cope with it, rather than any standard
> XML-handling libraries that come along. The example you've given is
> XPath - someone is going to have to write their own code (in FORTRAN, C,
> Java, Perl, etc etc) to extract the FIELD info and then the correct
> column. We've seen how XPath has come along only recently - we can
> expect all manner of other inventions to appear over the next few years.
>
> 3+) We need to argue over the schemas for a very important reason -
> because we are arguing over how we share our information. VOTable does
> *not* do this! It's a cop out that lets us pass information around, but
> without having to agree what's in it. This is sometimes presented as a
> good thing, but actually it's not - it just means we have deferred the
> problem, and we can continue to defer it while we pat ourselves on the
> back for having produced something. For example, how do we use it to
> transport spectra? Aha, we need to discuss this and agree it. How do
> we use it to describe datacenter metadata? Aha, we need to discuss this
> and agree it. So in fact we still have the original problem, and have
> solved nothing. Indeed, we've made it worse, because now we have no way
> of checking our agreements. Agreeing and publishing a schema means
> everyone everywhere has something to develop against, and something to
> validate against both as they publish data and receive it. You can be
> sure you're not getting a spectrum when you expect a catalogue, etc.
>
> You're quite right, there is nothing (very) wrong with VOTable as a way
> of describing *generic tables* in XML. But there is everything wrong
> with using it to pass information between our services...
>
> Cheers,
>
> Martin
>
> (Thanks for reviving it BTW!)
>
>
>
> Norman Gray wrote:
>
>> Hmm, this list has been a bit quiet for several hours now, methinks it
>> needs a bit of gingering up.
>>
>>
>>
>> The more I read of this discussion, the more I am persuaded that VOTable
>> is a well-designed and thoughtful application of XML.
>>
>> Dictum no. 1: XML does not become purer, better, more proper, or
>> `harder' in direct proportion to the size of your schema.
>>
>> Less is more.
>>
>> ----------
>>
>> I follow the argument that says that, in a case such as the following:
>>
>>
>>> <TABLE>
>>> <FIELD name="Object name" ID="NAME" datatype="char" arraysize="*"/>
>>> <FIELD name="V Magnitude" ID="VMAG" datatype="double"/>
>>> <DATA>
>>> <TABLEDATA>
>>> <TR><TD>M31</TD><TD>3.4</TD></TR>
>>> <TR><TD>Fomalhaut</TD><TD>1.23</TD></TR>
>>> </TABLEDATA>
>>> </DATA>
>>> </TABLE>
>>
>>
>>
>> ...this is arguably ugly because you (a) can't write an XPath
>> expression to find the magnitudes, and (b) can't write a constraining
>> schema for <TD> (it has to be simply ANY). It does not follow that this
>> is just CSV, since you do have all the metadata you need in the sequence
>> of <FIELD> elements, in a way which is straightforwardly accessible.
>>
>> (a) There isn't an element named `MAG', so you can't find it using XPath.
>> Nonsense: /TABLE/DATA/TABLEDATA/TR/TD[2] gives all the elements in
>> column 2 (the VMAG column) once you know (from <FIELD>) that's the column
>> you're after. Extracting that using XPath and XSLT is not _completely_
>> trivial, but it's not exactly hard. You also know the column's type
>> (FIELD/@datatype) and potentially semantic content (FIELD/@ucd).
>>
>> (b) If you believe in XSchema, then you believe in your soul that element
>> contents are typed (column 1 here is a string, column 2 a numeric, for
>> example), and you will be upset that these don't have specified types.
>>
>> If you don't believe in XSchema (and you possibly shouldn't, unless you
>> are a database person who wants to suck all XML into SQL databases), then
>> you don't care that there isn't a predefined type for each element --
>> the associated validation and processing lies very firmly on the
>> application side of this particular technology boundary.
>>
>> The main thing you get from XSchemas (as opposed to DTDs) is the type
>> system, and that's where _all_ the complication lies.
>>
>> Dictum no. 2: If you do not NEED XSchema's type system, then you
>> do not want XSchema.
>>
>>
>> ----------
>>
>>
>> VOTable defines reasonably general tables (ahem!), therefore you
>> don't know a priori what types of elements are going in which columns.
>> So what? You've still got to process the table with an application
>> which is smart enough to parse the <FIELD> elements and thus devise
>> something clever to do with the columns. This means that you might have
>> difficulty processing a VOTable with standard XML tools; but so what?
>> Echoing Mark[1] ``In short, I think that VOTable is primarily a storage
>> and transmission format, and we shouldn't be afraid of parsing it before
>> doing more complicated things with the data it contains.''
>>
>> Dictum no. 3: Spending months arguing over schemas in order to
>> generate a few days' worth of code is not a good
>> tradeoff.
>>
>> XSchema types and schema-generated code are necessary _if_ what you
>> want to do is suck XML files into SQL databases, so that parsing and
>> validation are the bulk of what you need to code. It is _not_ the
>> bulk of what _we_ need to code, and obsessing about code-generation is
>> allowing the tail to wag the dog, and counter-productive.
>>
>> XSchemas and code-generation are Good Things -- I know that as well
>> as anyone else. However they also have costs, and I can recall no
>> arguments that these costs are worth _us_ paying, in the production of
>> _our_ data analysis applications.
>>
>>
>> ----------
>>
>> Thus:
>> There's nothing wrong with VOTable. It ain't broke. Discuss.
>>
>>
>>
>> Norman
>>
>>
>> [1] http://www.ivoa.net/forum/votable/0418.htm
>>
>> --
>> ---------------------------------------------------------------------------
>>
>> Norman Gray
>> http://www.astro.gla.ac.uk/users/norman/
>> Physics and Astronomy, University of Glasgow, UK
>> norman at astro.gla.ac.uk
>>
>>
>
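
For reference, the two-step lookup discussed in the quoted messages (read the
FIELD elements to find which column you want, then pull out that column's TD
cells) might look something like this in Python. It is only a sketch: the
function name column_by_id is my own invention, and it relies on the limited
XPath subset in the standard library's ElementTree, which does support a
positional predicate such as TD[2].

```python
import xml.etree.ElementTree as ET

# The VOTable fragment from the quoted message.
VOTABLE = """
<TABLE>
<FIELD name="Object name" ID="NAME" datatype="char" arraysize="*"/>
<FIELD name="V Magnitude" ID="VMAG" datatype="double"/>
<DATA>
<TABLEDATA>
<TR><TD>M31</TD><TD>3.4</TD></TR>
<TR><TD>Fomalhaut</TD><TD>1.23</TD></TR>
</TABLEDATA>
</DATA>
</TABLE>
"""


def column_by_id(table, field_id):
    """Return (datatype, cell texts) for the column whose FIELD has this ID."""
    fields = table.findall("FIELD")
    ids = [f.get("ID") for f in fields]
    # Step 1: find the column position from the FIELD metadata.
    # XPath positions are 1-based, so add one to the list index.
    pos = ids.index(field_id) + 1
    # Step 2: select the TD at that position in every TR.
    cells = table.findall(f"DATA/TABLEDATA/TR/TD[{pos}]")
    return fields[pos - 1].get("datatype"), [c.text for c in cells]


table = ET.fromstring(VOTABLE.strip())
datatype, values = column_by_id(table, "VMAG")
print(datatype, values)
```

Note that the column position has to be computed in code before the path can
be written, which is the extra step both sides of the argument are describing.
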
--
Martin Hill
Software Engineer
AstroGrid @ ROE
Tel: +44 7901 55 24 66
www.astrogrid.org