VOTable alternative? Mapping Rich Data to DBMSs

Martin Hill mchill at dial.pipex.com
Tue Jan 27 04:02:25 PST 2004


Yes I think we're talking about different things.  What I mean by 'flat' is 
2-dimensional, not curved, but not necessarily level (bit like my apartment floor :-)

You can represent pretty much any structural data in a single 2d table in 
exactly the way that SQL query results do - the more structure the data has, the 
sparser the table is likely to be.  If we consider a typical catalogue 
datacenter with a RDBMS of sky objects, it is likely to consist of many 
different tables - some with galaxy-specific data, some with quasar-specific 
stuff, some with cross-indexing for grouping objects, and so on.  When we query 
such a database we get a 2d table back that loses all this structure.
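
As a minimal sketch of what I mean, assuming a made-up two-table schema (the 
table and column names `objects` and `galaxy_shape` are hypothetical, not taken 
from any real datacenter), an outer join in SQLite returns a single 2d table in 
which the quasar row simply carries empty shape cells:

```python
import sqlite3

# Hypothetical schema: one table common to all sky objects,
# plus a table holding shape data that only galaxies have.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE objects (id INTEGER PRIMARY KEY, kind TEXT, ra REAL, decl REAL);
    CREATE TABLE galaxy_shape (object_id INTEGER, ellipticity REAL, position_angle REAL);
    INSERT INTO objects VALUES (1, 'galaxy', 10.5, -3.2);
    INSERT INTO objects VALUES (2, 'quasar', 200.1, 45.0);
    INSERT INTO galaxy_shape VALUES (1, 0.3, 75.0);
""")

# The joined result is one flat table; the link between the two
# source tables is no longer visible in it.
rows = conn.execute("""
    SELECT o.id, o.kind, o.ra, o.decl, s.ellipticity, s.position_angle
    FROM objects AS o
    LEFT JOIN galaxy_shape AS s ON s.object_id = o.id
    ORDER BY o.id
""").fetchall()

for row in rows:
    print(row)  # shape columns come back as None for the quasar
```

The more object-specific tables are joined in this way, the more empty cells 
each row carries, which is the sparseness described above.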

I was only talking about the data, not about how it relates to the astronomical 
or schema-like metadata.  But again, when we come to describing our astronomical 
metadata (UCDs, passband characteristics, etc) a deep structure is much more 
useful and flexible, while still being standard, than VOTable's 2d-limited metadata.

On your lastish comment, as you say there is no satisfactory method yet for 
storing or extracting a datacenter's metadata.  Datacenters are going to need a 
new and specific way of describing their astronomical metadata, and I believe 
this is what the registry group are working on, but I'm not sure how far they 
go.

Cheers,

Martin

Clive Page wrote:

> On Tue, 27 Jan 2004, Martin Hill wrote:
> 
> 
>>In fact few (if any?) of our databases are flat.  However the results of
>>our queries usually return a 'joined' table that is, so we would need to
>>unflatten it to recreate structure.  This is a way of re-introducing the
>>natural structure of the information - with the normal query result we
>>get, for example, shape cells for objects that don't have shape.
> 
> 
> I'm not sure I agree with that, or maybe we disagree on the meaning of the
> word "flat" (or the word "is" like Bill Clinton did).
> 
> A FITS table is a good way of representing a table, a huge improvement on
> the way that the relational DBMS does this, as it includes a lot of
> important metadata.  These are metadata in both senses, structural
> metadata such as the number of columns and data type, and astronomical
> metadata such as details of the telescope which produced the dataset and the
> reference frame for the coordinates.
> 
> FITS has a rather crude way of representing metadata, essentially only
> what you can get on an 80-column punched-card, but even so it's still
> way ahead of any commercial DBMS.  The columnar metadata isn't represented
> like this in FITS, but actually it is best set out as another table, for
> example like this:
> 
> 1  RA    real*8  degrees     F8.4  not null
> 2  DECL  real*8  degrees     F8.4  not null
> 3  VMAG  real*4  magnitudes  F5.2  nullable
> 4  XPOS  int*2   pixels      I4    nullable
> 
> and so on.  (of course even FITS can handle more metadata than this, such
> as scale factors, and a text description, but this is just an example).
> 
> So a tabular dataset could be represented quite well as two tables (one of
> actual data, one of columnar metadata) plus some other overall metadata
> which applies to the whole table (name of observer, equinox, epoch, etc).
> The last are mostly simple and scalar (but not quite all, suppose you want
> seeing conditions as a function of time?) - would VOTable cope with this?
> 
> This isn't quite a flat structure, but pretty nearly.  Even if you join
> one such table with another, you don't make the resulting structure any
> more or less flat than the input structures (which is where I fail to
> understand Martin's remarks).
> 
> As far as the columnar metadata goes:
> 
> * FITS stores them but not as a table and all mixed up with other
> metadata;
> 
> * VOTable puts them in an XML structure the tabular nature of which isn't
> at all evident;
> 
> * an RDBMS holds a tiny bit of columnar metadata in internal tables which
> you can't get at easily, and refuses to store anything else (such as
> physical units).
> 
> None of these solutions is very good; they mean that we have no obvious
> way of passing columnar metadata to the Registry, for example, except by
> inventing yet another format for them (which existing tools like TOPCAT
> then cannot handle).
> 
> On the whole, however, I think I agree with Guy and Norman: the case for
> VOTable2 is not yet proven.
> 
> 

-- 
Martin Hill
Software Engineer
AstroGrid @ ROE
Tel: +44 7901 55 24 66
www.astrogrid.org



