VOTable session @ Interop.Moscow

Mark Taylor m.b.taylor at bristol.ac.uk
Thu Sep 21 07:14:58 PDT 2006


On Thu, 21 Sep 2006, Bob Mann wrote:

> On Thu, 21 Sep 2006, Mark Taylor wrote:
> > I'd like to see a concrete use case for this kind of thing
> 
>  	How about the following use case, which is fairly concrete
> in that I had a PhD student doing essentially this a year or so ago.
> 
>  	An astronomer wants to cross-match sources in catalogue
> A with objects in catalogue B using an algorithm which cannot be
> executed inside either database or as part of an ADQL query. She
> runs a conesearch (or equivalent) on each database, to extract the
> entries from each lying in the area of sky in which she is interested,
> and obtains two VOTables, votA and votB. She passes these both to
> her cross-matching service, which returns a VOTable recording pairs
> of entries in votA and votB which her algorithm judges to be safe
> matches.
> 
>  	As Mark points out, it would be perfectly possible for the
> final VOTable to have entries of the form (row N in votA, row M
> in votB), but this would limit the future utility of the cross-matches.
> If votA and votB have been extracted by simple conesearches (or any
> equivalent query that does not have an "order by" clause) then the
> ordering of the rows in votA and votB is arbitrary, in the sense that
> re-running those queries will not necessarily yield the same ordering.
> 
>  	This means that the cross-match pairs can only be used by people
> having access to votA and votB, not their functional equivalents
> generated by running the same conesearch queries at a later date. It
> would be much nicer if the IDs for the rows in votA and votB referred
> to some (assumed persistent) identifier for rows in A and B, since then
> the cross-match pairs could be re-used with any data extracted from A
> and B, and not just the particular VOTables votA and votB.
> 
>  	Now, Mark may counter - and I apologise for putting words into
> his mouth - that the VOTable format should not be influenced by wider
> concerns like that, and should only be concerned with how these
> particular files votA and votB are used... and he may be right in
> saying that, but as, in practice, VOTables often (usually?) contain 
> subsets of larger datasets contained in databases, I think it would be 
> very useful for there to be a mechanism whereby rows in VOTables could
> be identified using references to rows in their parent database.

Bob,

your use case is certainly a reasonable one, but I'd say the way to
tackle it is simply to insert an identifier column in each table,
containing the same tag as you would have put in the TR ID attribute.
In this way the server can attach a unique persistent identity to each 
row, which application software can use without requiring any
new special-purpose machinery to keep track of per-row TR ID attributes.
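
As a rough illustration of the identifier-column approach, here is a
minimal Python sketch. Everything in it is an assumption made up for the
example: the tables are plain lists of (source_id, ra, dec) tuples rather
than real VOTables, the identifier values are invented, and the matching
rule is a naive coordinate tolerance standing in for whatever algorithm
the cross-matching service actually uses.

```python
# Hypothetical extracts: each row is (source_id, ra, dec).
# The source_id column plays the role of the persistent row identifier;
# all identifiers and coordinates here are invented for illustration.
vot_a = [("A-101", 10.0, 20.0), ("A-102", 10.5, 20.5)]
vot_b = [("B-7", 10.0001, 20.0001), ("B-9", 10.5001, 20.4999)]


def match_by_id(table_a, table_b, tol=0.001):
    """Naive positional match that records identifier pairs, not row numbers."""
    pairs = []
    for id_a, ra_a, dec_a in table_a:
        for id_b, ra_b, dec_b in table_b:
            if abs(ra_a - ra_b) < tol and abs(dec_a - dec_b) < tol:
                pairs.append((id_a, id_b))
    return pairs


def lookup(table, source_id):
    """Find a row by its identifier column, whatever its position in the table."""
    return next(row for row in table if row[0] == source_id)


pairs = match_by_id(vot_a, vot_b)

# A later extraction of the same region may return the rows in another order.
vot_a_later = list(reversed(vot_a))

# The stored pairs still resolve, because they refer to identifiers
# carried in an ordinary column rather than to row positions.
row = lookup(vot_a_later, pairs[0][0])
```

Because the pairs refer to values in an ordinary column, they can be
resolved against any later extract that carries the same column.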

Mark

-- 
Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
m.b.taylor at bris.ac.uk +44-117-928-8776 http://www.star.bris.ac.uk/~mbt/


