Column Groups in VOTable
Francois Ochsenbein
francois at vizir.u-strasbg.fr
Mon Apr 28 11:51:09 PDT 2003
Dear All,
The forthcoming meeting in Cambridge could be a good opportunity
to discuss how "column groups" can be introduced in VOTable.
The need for such functionality has already been expressed (maybe not explicitly);
I feel that it also has implications for UCDs, and I'm therefore
posting this message to both VOTable and UCD groups. I apologize to
those who will receive this message twice.
-- Francois
================================================================================
Francois Ochsenbein ------ Observatoire Astronomique de Strasbourg
11, rue de l'Universite F-67000 STRASBOURG Phone: +33-(0)390 24 24 29
Email: francois at astro.u-strasbg.fr (France) Fax: +33-(0)390 24 24 17
================================================================================
================================================================================
The "column groups" proposal tries to answer questions frequently
asked about column associations, typically:
--> error (or standard deviation) associated with a column, e.g. a flux
consists of two numbers: the measured value + the mean error
--> qualities or weights associated with values
--> source or origin (e.g. telescope, or bibliographical reference) of a value
--> individual components, e.g. the x,y position on a CCD
--> etc.
This "column grouping" obviously plays the same role as defining structures
made of columns; defining structures made of structures can likewise be viewed
as grouping groups of columns.
I see essentially two ways of defining such "column groups" in VOTable:
a) generalize the <COOSYS> method currently used to describe the coordinate
systems. This kind of "by reference" method defines a structure,
and any <FIELD> can declare itself (via the "ref" attribute) a member of
that structure.
As an illustration of a group of columns containing a flux value
and its error, the XML code could look like:
1. within the <DEFINITIONS> element, define a structure as e.g.:
<STRUCTURE ID="Flux1" name="FluxParameters">
<PARAMETER ID="Freq1" name="Frequency" value="8.6" datatype="float"
unit="GHz" ucd="OBS_FREQUENCY" />
</STRUCTURE>
2. within the <TABLE> definition, columns belonging to this structure
refer to it:
<FIELD name="flux" datatype="float" ref="Flux1" unit="mJy" />
<FIELD name="e_flux" datatype="float" ref="Flux1" unit="mJy" />
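As a rough sketch of how an application might recover the members of such a
by-reference structure, the Python fragment below groups the FIELDs by their
"ref" attribute. The enclosing <VOTABLE> wrapper, the helper name
members_by_structure, and the use of Python's xml.etree.ElementTree module are
assumptions made for illustration only; they are not part of the proposal.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Steps 1 and 2 above, placed in one hypothetical wrapper document.
DOC = """
<VOTABLE>
  <DEFINITIONS>
    <STRUCTURE ID="Flux1" name="FluxParameters">
      <PARAMETER ID="Freq1" name="Frequency" value="8.6" datatype="float"
                 unit="GHz" ucd="OBS_FREQUENCY" />
    </STRUCTURE>
  </DEFINITIONS>
  <TABLE>
    <FIELD name="flux" datatype="float" ref="Flux1" unit="mJy" />
    <FIELD name="e_flux" datatype="float" ref="Flux1" unit="mJy" />
  </TABLE>
</VOTABLE>
"""

def members_by_structure(xml_text):
    """Map each STRUCTURE ID to the names of the FIELDs that refer to it."""
    root = ET.fromstring(xml_text)
    # Only references to a declared STRUCTURE count as membership.
    known = {s.get("ID") for s in root.iter("STRUCTURE")}
    members = defaultdict(list)
    for field in root.iter("FIELD"):
        ref = field.get("ref")
        if ref in known:
            members[ref].append(field.get("name"))
    return dict(members)

print(members_by_structure(DOC))  # {'Flux1': ['flux', 'e_flux']}
```

Note that the reader has to scan all FIELDs to find the members of one
structure, because the membership is declared on the column side only.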
b) introduce a new element e.g. <GROUP> in the <TABLE> description which
would contain the fields. The same example of a flux + its associated error
would be coded as:
<GROUP name="Flux" ucd="PHOT_FLUX_RADIO_8.4G">
<FIELD ID="Flux1" name="fluxValue" datatype="float" unit="mJy">
<DESCRIPTION>Value of the flux at 8.4GHz</DESCRIPTION>
</FIELD>
<FIELD ID="e_Flux1" name="errFlux1" ucd="ERROR" datatype="float" unit="mJy">
<DESCRIPTION>Error on flux value</DESCRIPTION>
</FIELD>
<PARAMETER ID="Freq1" name="Frequency" value="8.6" datatype="float"
unit="GHz" ucd="OBS_FREQUENCY" />
</GROUP>
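A minimal sketch of how a reader could still see a flat list of columns with
this layout, assuming Python and its standard xml.etree.ElementTree module.
The ungrouped "source" column and the helper name flat_columns are
hypothetical, added only to show grouped and ungrouped columns side by side.

```python
import xml.etree.ElementTree as ET

GROUPED = """
<TABLE>
  <FIELD name="source" datatype="char" arraysize="*" />
  <GROUP name="Flux" ucd="PHOT_FLUX_RADIO_8.4G">
    <FIELD ID="Flux1" name="fluxValue" datatype="float" unit="mJy" />
    <FIELD ID="e_Flux1" name="errFlux1" ucd="ERROR" datatype="float" unit="mJy" />
    <PARAMETER ID="Freq1" name="Frequency" value="8.6" datatype="float"
               unit="GHz" ucd="OBS_FREQUENCY" />
  </GROUP>
</TABLE>
"""

def flat_columns(table):
    """Return (group name or None, column name) pairs in document order.

    Handles a single level of <GROUP>; a PARAMETER is skipped because it
    carries one fixed value and not a per-row column.
    """
    columns = []
    for child in table:
        if child.tag == "FIELD":
            columns.append((None, child.get("name")))
        elif child.tag == "GROUP":
            for field in child.findall("FIELD"):
                columns.append((child.get("name"), field.get("name")))
    return columns

print(flat_columns(ET.fromstring(GROUPED)))
# [(None, 'source'), ('Flux', 'fluxValue'), ('Flux', 'errFlux1')]
```

The column order is unchanged by the grouping, so the data part of the table
can be read exactly as before.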
A third way would be to introduce new tags within each table cell,
e.g. <VAL> and <ERR>, to give
<TD><VAL>11.35</VAL><ERR>1.12</ERR></TD>
but this would go against the current philosophy of VOTable, which defines
all metadata first, followed by the data alone, in order to
keep the efficiency and the FITS compatibility; this third method would
also require frequent modifications of the schema (XMLSchema), which is
generally disruptive for working applications.
The <GROUP> defined in b) above seems to me to be a good framework for
this definition. I see several advantages:
=> the basic tabular scheme remains: VOTable can still be viewed as a
relational database, and keeps full compatibility with existing
FITS binary tables;
=> groups of groups (i.e. recursive <GROUP> tags) enable the definition
of arbitrarily complex structures;
=> the UCDs become more accurate when defined in a group:
-- adding <PARAMETER> tags within a <GROUP> nicely introduces a way
of parametrizing a UCD
-- FIELDs defined in a group can acquire the UCD of the group,
e.g. the error part of the flux group of fields just needs the
"ERROR" UCD.
Using the "ref" attribute in <FIELD> also permits one column to be a
member of several groups: for example, an error common to two fluxes
measured at different frequencies can be defined as:
<GROUP name="Flux" ucd="PHOT_FLUX_RADIO_8.4G">
<FIELD ID="Flux1" name="fluxValue" datatype="float" unit="mJy">
<DESCRIPTION>Value of the flux at 8.4GHz</DESCRIPTION>
</FIELD>
<FIELD ID="e_Flux1" name="errFlux1" ucd="ERROR" datatype="float" unit="mJy">
<DESCRIPTION>Error on flux values, both at 8.4 and 7.5GHz</DESCRIPTION>
</FIELD>
<PARAMETER ID="Freq1" name="Frequency" value="8.6" datatype="float"
unit="GHz" />
</GROUP>
<GROUP name="Flux" ucd="PHOT_FLUX_RADIO_7.5G">
<FIELD ID="Flux2" name="fluxValue" datatype="float" unit="mJy">
<DESCRIPTION>Value of the flux at 7.5GHz</DESCRIPTION>
</FIELD>
<FIELD ref="e_Flux1" />
<PARAMETER ID="Freq2" name="Frequency" value="7.5" datatype="float"
unit="GHz" />
</GROUP>
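The shared error column above could be resolved by a reader along the
following lines. This is only a sketch, assuming Python's standard
xml.etree.ElementTree module, a hypothetical <TABLE> wrapper around the two
groups, and that a "ref" on a <FIELD> inside a <GROUP> always points to the ID
of a FIELD defined elsewhere in the same table. Groups are keyed by UCD
because both groups carry the same name.

```python
import xml.etree.ElementTree as ET

SHARED = """
<TABLE>
  <GROUP name="Flux" ucd="PHOT_FLUX_RADIO_8.4G">
    <FIELD ID="Flux1" name="fluxValue" datatype="float" unit="mJy" />
    <FIELD ID="e_Flux1" name="errFlux1" ucd="ERROR" datatype="float" unit="mJy" />
  </GROUP>
  <GROUP name="Flux" ucd="PHOT_FLUX_RADIO_7.5G">
    <FIELD ID="Flux2" name="fluxValue" datatype="float" unit="mJy" />
    <FIELD ref="e_Flux1" />
  </GROUP>
</TABLE>
"""

def group_members(xml_text):
    """Map each group's UCD to the IDs of its columns, following references."""
    table = ET.fromstring(xml_text)
    # Index every FIELD that defines a column, so references can be resolved.
    by_id = {f.get("ID"): f for f in table.iter("FIELD") if f.get("ID")}
    result = {}
    for group in table.iter("GROUP"):
        ids = []
        for field in group.findall("FIELD"):
            ref = field.get("ref")
            # A dangling reference raises KeyError in this sketch.
            target = by_id[ref] if ref else field
            ids.append(target.get("ID"))
        result[group.get("ucd")] = ids
    return result

print(group_members(SHARED))
# {'PHOT_FLUX_RADIO_8.4G': ['Flux1', 'e_Flux1'],
#  'PHOT_FLUX_RADIO_7.5G': ['Flux2', 'e_Flux1']}
```

The column e_Flux1 is defined once, so it appears once in the data, but it is
reported as a member of both groups.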
================================================================================