VOTable alternative? Mapping Rich Data to DBMSs
Martin Hill
mchill at dial.pipex.com
Tue Jan 27 04:02:25 PST 2004
Yes, I think we're talking about different things. What I mean by 'flat' is
two-dimensional, not curved, but not necessarily level (a bit like my apartment floor :-)
You can represent pretty much any structural data in a single 2d table in
exactly the way that SQL query results do - the more structure the data has, the
sparser the table is likely to be. If we consider a typical catalogue
datacenter with an RDBMS of sky objects, it is likely to consist of many
different tables - some with galaxy-specific data, some with quasar-specific
stuff, some with cross-indexing for grouping objects, and so on. When we query
such a database we get a 2d table back that loses all this structure.
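
As a minimal sketch of that flattening (the table names, column names and values
here are made up for illustration, not taken from any real datacenter), a LEFT
JOIN over a generic object table and a galaxy-specific table gives one 2d result
in which objects without shape data simply get an empty cell:

```python
import sqlite3

# Hypothetical schema: one generic table plus one galaxy-specific table.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE objects (id INTEGER PRIMARY KEY, name TEXT, ra REAL, decl REAL);
    CREATE TABLE galaxy_shape (object_id INTEGER, ellipticity REAL);
    INSERT INTO objects VALUES (1, 'galaxy-a', 10.5, -3.2);
    INSERT INTO objects VALUES (2, 'quasar-b', 187.3, 2.1);
    INSERT INTO galaxy_shape VALUES (1, 0.35);
""")

# The join returns a single flat table; the quasar has no shape row,
# so its ellipticity cell comes back as NULL (None in Python).
rows = conn.execute("""
    SELECT o.name, o.ra, o.decl, s.ellipticity
    FROM objects AS o
    LEFT JOIN galaxy_shape AS s ON s.object_id = o.id
    ORDER BY o.id
""").fetchall()

for row in rows:
    print(row)
```

The more object-specific tables are joined in, the more of these empty cells
appear, and nothing in the result says which table each column came from.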
I was only talking about the data, not about how it relates to the astronomical
or schema-like metadata. But again, when we come to describing our astronomical
metadata (UCDs, passband characteristics, etc) a deep structure is much more
useful and flexible, while still being standard, than the 2d layout that
VOTable's metadata is limited to.
On your lastish comment, as you say there is no satisfactory method yet for
storing or extracting a datacenter's metadata. For datacenters we are going to
need a new and specific way of describing their astronomical metadata, and
I believe this is what the registry group are working on, but I'm not sure how
far they have got.
Cheers,
Martin
Clive Page wrote:
> On Tue, 27 Jan 2004, Martin Hill wrote:
>
>
>>In fact few (if any?) of our databases are flat. However the results of
>>our queries usually return a 'joined' table that is, so we would need to
>>unflatten it to recreate structure. This is a way of re-introducing the
>>natural structure of the information - with the normal query result we
>>get, for example, shape cells for objects that don't have shape.
>
>
> I'm not sure I agree with that, or maybe we disagree on the meaning of the
> word "flat" (or the word "is" like Bill Clinton did).
>
> A FITS table is a good way of representing a table, a huge improvement on
> the way that the relational DBMS does this, as it includes a lot of
> important metadata. These are metadata in both senses, structural
> metadata such as the number of columns and data type, and astronomical
> metadata such as details of the telescope which produced the dataset and the
> reference frame for the coordinates.
>
> FITS has a rather crude way of representing metadata, essentially only
> what you can get on an 80-column punched card, but even so it's still
> way ahead of any commercial DBMS. The columnar metadata isn't represented
> like this in FITS, but actually it is best set out as another table, for
> example like this:
>
> 1  RA    real*8  degrees     F8.4  not null
> 2  DECL  real*8  degrees     F8.4  not null
> 3  VMAG  real*4  magnitudes  F5.2  nullable
> 4  XPOS  int*2   pixels      I4    nullable
>
> and so on. (of course even FITS can handle more metadata than this, such
> as scale factors, and a text description, but this is just an example).
>
> So a tabular dataset could be represented quite well as two tables (one of
> actual data, one of columnar metadata) plus some other overall metadata
> which applies to the whole table (name of observer, equinox, epoch, etc).
> The last are mostly simple and scalar (but not quite all, suppose you want
> seeing conditions as a function of time?) - would VOTable cope with this?
>
> This isn't quite a flat structure, but pretty nearly. Even if you join
> one such table with another, you don't make the resulting structure any
> more or less flat than the input structures (which is where I fail to
> understand Martin's remarks).
>
> As far as the columnar metadata goes:
>
> * FITS stores them but not as a table and all mixed up with other
> metadata;
>
> * VOTable puts them in an XML structure the tabular nature of which isn't
> at all evident;
>
> * an RDBMS holds a tiny bit of columnar metadata in internal tables which
> you can't get at easily, and refuses to store anything else (such as
> physical units).
>
> None of these solutions is very good; they mean that we have no obvious
> way of passing columnar metadata to the Registry, for example, except by
> inventing yet another format for them (which existing tools like TOPCAT
> then cannot handle).
>
> On the whole, however, I think I agree with Guy and Norman: the case for
> VOTable2 is not yet proven.
>
>
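
For reference, the layout described in the quoted message (one table of data,
one table of columnar metadata, plus some whole-table metadata) can be sketched
roughly as below. The column names, types and formats come from the example in
the quoted message; the data values, the observer and equinox entries, and the
helper names check_nulls and describe are invented for illustration and are not
part of FITS, VOTable or any existing tool.

```python
# Whole-table metadata: simple scalars (values here are illustrative only).
table_metadata = {"observer": "A. N. Other", "equinox": 2000.0}

# Columnar metadata held as a table in its own right, one row per column.
column_metadata = [
    {"name": "RA",   "dtype": "real*8", "unit": "degrees",    "format": "F8.4", "nullable": False},
    {"name": "DECL", "dtype": "real*8", "unit": "degrees",    "format": "F8.4", "nullable": False},
    {"name": "VMAG", "dtype": "real*4", "unit": "magnitudes", "format": "F5.2", "nullable": True},
    {"name": "XPOS", "dtype": "int*2",  "unit": "pixels",     "format": "I4",   "nullable": True},
]

# The actual data table (invented values); None marks a null cell.
data = [
    (10.6847, 41.2690, 3.44, 512),
    (83.8221, -5.3911, None, None),
]


def check_nulls(rows, columns):
    """Return (row_index, column_name) pairs where a 'not null' column holds None."""
    problems = []
    for i, row in enumerate(rows):
        for value, col in zip(row, columns):
            if value is None and not col["nullable"]:
                problems.append((i, col["name"]))
    return problems


def describe(columns):
    """Build a 'NAME [unit]' label for each column from the metadata table."""
    return [f'{c["name"]} [{c["unit"]}]' for c in columns]


print(describe(column_metadata))
print(check_nulls(data, column_metadata))
```

Because the columnar metadata is itself just rows and columns, it could be
passed around with the same tools as the data table.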
--
Martin Hill
Software Engineer
AstroGrid @ ROE
Tel: +44 7901 55 24 66
www.astrogrid.org