VOTable alternative? Rich Astronomical metadata

Martin Hill mchill at dial.pipex.com
Mon Jan 26 12:09:04 PST 2004


In fact, with a good schema we can add much deeper, more useful information than we 
can with VOTable.  Consider the following, and look in particular at the <Brightness 
filter="peach"> and <Filter ID="peach"> elements:

<StellarCatalogue>
   <Galaxy name="NGC 300">
     <Altname>PGC 3238</Altname>
     <Position>
       <WCS epoxinoxthingy="J2000">
         <RA unit="sexagesimal">00:54:52.6</RA>
         <DEC unit="sexagesimal">-37:40:57</DEC>
       </WCS>
       <Galactic>
         <Long unit="degree">299.2306</Long>
         <Lat unit="degree">-79.4210</Lat>
       </Galactic>
     </Position>
     <Brightness band="B" unit="johnsonmag">8.95</Brightness>
     <Brightness filter="peach" unit="johnsonmag">6.95</Brightness>
     <Extinction band="I" unit="johnsonmag">0.02</Extinction>
     <Extinction band="B" unit="johnsonmag">0.055</Extinction>
     <Shape>
       <Ratio>0.73</Ratio>
       <MorphologyT>7</MorphologyT>
       <MorphologyM>SA(s)d</MorphologyM>
     </Shape>
   :
   </Galaxy>
   <Filter ID="peach">
      <CenterWlen unit="nm">650</CenterWlen>
      <FreqWidth unit="millihertzteehee">12</FreqWidth>
   </Filter>
</StellarCatalogue>

We can extend the schema snippet describing the <Filter> element to include all 
kinds of bandwidth information and characteristics.  And as I understand it, we 
can also use this technique to refer to other XML documents, such as 
astronomical metadata held at the datacenter.  Whether we want to allow metadata 
to be 'distributed' like this is another matter...
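
As a rough sketch of how a consumer might follow the filter="peach" reference 
to its <Filter> definition, the Python below uses only the standard library's 
xml.etree.ElementTree.  It assumes a trimmed, well-formed version of the 
example; the element names are illustrative rather than an agreed schema, and 
resolve_filter is a made-up helper name.  A real schema would more likely 
declare the attributes as ID/IDREF types and let a validating parser check 
the link.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed version of the example catalogue (assumption:
# element and attribute names are illustrative, not an agreed schema).
CATALOGUE = """
<StellarCatalogue>
  <Galaxy name="NGC 300">
    <Brightness band="B" unit="johnsonmag">8.95</Brightness>
    <Brightness filter="peach" unit="johnsonmag">6.95</Brightness>
  </Galaxy>
  <Filter ID="peach">
    <CenterWlen unit="nm">650</CenterWlen>
  </Filter>
</StellarCatalogue>
"""


def resolve_filter(root, brightness):
    """Return the <Filter> element a <Brightness> points at, or None."""
    filter_id = brightness.get("filter")
    if filter_id is None:
        return None  # this measurement names a standard band instead
    for candidate in root.iter("Filter"):
        if candidate.get("ID") == filter_id:
            return candidate
    return None  # dangling reference


root = ET.fromstring(CATALOGUE)
resolved = {}
for brightness in root.iter("Brightness"):
    filt = resolve_filter(root, brightness)
    if filt is not None:
        wlen = filt.find("CenterWlen")
        # Map the magnitude text to the filter's centre wavelength and unit.
        resolved[brightness.text] = (float(wlen.text), wlen.get("unit"))

print(resolved)
```

The same lookup could in principle point outside the document, which is where 
the 'distributed' metadata question comes in.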

Er, and I forgot to include UCDs in the above.  There's no reason why they can't 
also be included as attributes.
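
For instance, a tool could then pick values out by UCD rather than by element 
name.  This is only a sketch: the ucd attribute name is an assumption, and the 
UCD strings are chosen for illustration rather than taken from any agreed 
schema.

```python
import xml.etree.ElementTree as ET

# Assumption: UCDs are carried in a hypothetical "ucd" attribute.
SNIPPET = """
<Galaxy name="NGC 300">
  <Position>
    <WCS>
      <RA unit="sexagesimal" ucd="pos.eq.ra;meta.main">00:54:52.6</RA>
      <DEC unit="sexagesimal" ucd="pos.eq.dec;meta.main">-37:40:57</DEC>
    </WCS>
  </Position>
</Galaxy>
"""


def values_by_ucd(root, ucd):
    """Collect the text of every element carrying exactly this UCD."""
    return [el.text for el in root.iter() if el.get("ucd") == ucd]


galaxy = ET.fromstring(SNIPPET)
ra_values = values_by_ucd(galaxy, "pos.eq.ra;meta.main")
print(ra_values)
```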

Cheers,

Martin

Martin Hill wrote:

> 
> 
> Mark Taylor wrote:
> 
>> On Mon, 26 Jan 2004, Martin Hill wrote:
>>
>>
>>> 3+) We need to argue over the schemas for a very important reason - 
>>> because we are arguing over how we share our information. VOTable 
>>> does *not* do this!  It's a cop out that lets us pass information 
>>> around, but without having to agree what's in it.  This is sometimes 
>>> presented as a good thing, but actually it's not - it just means we 
>>> have deferred the problem, and we can continue to defer it while we 
>>> pat ourselves on the back for having produced something.  For 
>>> example, how do we use it to transport spectra?  Aha, we need to 
>>> discuss this and agree it.  How do we use it to describe datacenter 
>>> metadata?  Aha, we need to discuss this and agree it.  So in fact we 
>>> still have the original problem, and have solved nothing.  Indeed, 
>>> we've made it worse, because now we have no way of checking our 
>>> agreements.  Agreeing and publishing a schema means everyone 
>>> everywhere has something to develop against, and something to 
>>> validate against both as they publish data and receive it.  You can 
>>> be sure you're not getting a spectrum when you expect a catalogue, etc.
>>
>>
>>
>> I think this is the confusion about what astronomers mean by metadata
>> and what computing science people mean by metadata cropping up again
>> (fourth time in this mailstorm by my count - well done Clive Page
>> for spotting it the first time!).
>>
>> It seems to me (for the purposes of this discussion an astronomer 
>> rather than a computing person) that trying to interpret a spectrum 
>> as a catalogue is not in practice the kind of problem that your 
>> working astronomer is going to encounter.  You submit a catalogue 
>> request to a federated catalogue-serving service, and you expect to 
>> get a catalogue back.  If you submit a catalogue request to a 
>> federated spectrum-serving service and expect to get a catalogue back, 
>> well you probably don't have any business using the service.
> 
> 
> That's fair, and *perhaps* this is a bad example - it's just the two 
> things that we seem to be using at the moment.  However, I can think of 
> tools that might take spectra and catalogues, or make spectra out of 
> catalogues, or spectra out of catalogues and other spectra.  And getting 
> the right information to the right point is a Good Thing.
> 
>> The kind of problem which astronomers really do need to solve is
>> comparing two catalogues and working out what column in one can
>> sensibly be compared to what column in the other.  Metadata in the
>> sense that it appears in VOTable FIELD elements (UCDs, utypes, units)
>> is the way to do this; I don't believe that you can make much or
>> any contribution to it by using schemas.
> 
> 
> Quite - here is a specific use-case on catalogues.  But careful - your 
> comment is based on the fact that most astronomy is done by thinking in 
> terms of columns in a database.  1) I expect that this will continue - 
> most joins/comparisons will continue to happen in a database.  2) It can 
> be done just as easily using 'rich' XML as using VOTable, e.g. find all 
> elements of type "StellarObject" and plot their child elements named 
> "Mag" against the child elements named "Freq".
> 
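
To make that query concrete, here is a minimal sketch using Python's standard 
xml.etree.ElementTree.  The document and its values are made up for 
illustration, "StellarObject" is matched as an element name rather than a 
schema type, and the (Freq, Mag) pairs are collected rather than plotted, so 
any plotting package could take it from there.

```python
import xml.etree.ElementTree as ET

# Made-up sample data; element names follow the query described above.
DOC = """
<Catalogue>
  <StellarObject name="A"><Mag>8.95</Mag><Freq>4.6e14</Freq></StellarObject>
  <StellarObject name="B"><Mag>6.95</Mag><Freq>6.8e14</Freq></StellarObject>
  <Filter ID="peach"/>
</Catalogue>
"""

root = ET.fromstring(DOC)
points = []
for obj in root.iter("StellarObject"):
    mag, freq = obj.find("Mag"), obj.find("Freq")
    # Skip objects that lack either child rather than guessing a value.
    if mag is not None and freq is not None:
        points.append((float(freq.text), float(mag.text)))

print(points)
```
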
> Yes we definitely need all the extra information - let's call it the 
> astronomical metadata.  Yes we *must* provide it (whether as part of the 
> same document/file or separately is a different debate).  But there is 
> no reason why it can't be included in 'rich' XML documents using more 
> common XML techniques (such as XPointer) that other (common) tools can 
> make use of.
> 
>> I'm not saying that the computing science problems here are
>> unimportant, but if we solve those problems at the expense of
>> the astronomical ones, then our software and standards, however 
>> beautiful and XML-friendly they are, are not going to be useful for or 
>> used by our intended customers.
> 
> 
> I agree hugely largely etc.  In fact the reason I am pushing for 
> better-described schemas is because it will make interpreting the data 
> *so much easier*.  I am not after pure XML, but after something that is 
> straightforward to use for people with some IT skills but an interest in 
> a different *specific* discipline: astronomers.
> 
> Cheers,
> 
> Martin
> 
> 

-- 
Martin Hill
Software Engineer
AstroGrid @ ROE
Tel: +44 7901 55 24 66
www.astrogrid.org



