Registry science metadata - Brown Dwarf example
amsr at jb.man.ac.uk
Wed Apr 30 05:43:02 PDT 2003
Here is an example of the science metadata which would be pulled out of
the registry to answer one of the AstroGrid Science Cases.
I think that we need to be very clear about the difference between the
metadata needed to enable dataset selection (automatic, using VO
algorithms, or just returning a list to the human), which can be handled
in one or more separate stages within the registry alone, and the metadata
used in actually evaluating the query; in the latter case we need to go and
fetch the dataset, or send an agent to extract values from it in situ, etc.
The material below can also be found at
with links to explanations, the original case etc.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Dr. Anita M. S. Richards, AVO Astronomer
MERLIN/VLBI National Facility, University of Manchester,
Jodrell Bank Observatory, Macclesfield, Cheshire SK11 9DL, U.K.
tel +44 (0)1477 572683 (direct); 571321 (switchboard); 571618 (fax).
BrownDwarf Registry Requirements - Science content
Where values are strings, I have stated the options if specific values are
needed. Otherwise, I have just given the value type, and the query will
select values within a
coverage "format" value = "ascii", = "VOTable"
coverage "decmin" value = decimal
coverage "sourcedensity" value = decimal
coverage "tablenrows" value = decimal
coverage "tablesize" value = decimal
coverage "startdate" value = decimal
dataquality "astrometryerror" value = decimal
dataquality "photometryerror" value = decimal
dataquality "timingerror" value = decimal
subjectkeyword value = "Null", = "Milky Way", = "Stars"
"type" value = "catalogue", = "survey"
wavelengthrange value = "ir", value = "optical"
wavelengthshort value = decimal
wavelengthlong value = decimal
Additional metadata if nDim data as well as catalogues are used:
coverage "angularfraction" value = decimal
coverage "format" value = "FITS"
resolution "angularresolution" value = decimal
resolution "spectralresolution" value = decimal
"type" value = "archive"
wavelengthrange value = "uv", value = "mm"
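
As a rough illustration only, the catalogue-level items listed above could be held in a record like the one sketched below (Python is used for all sketches in this note). The class name ResourceMetadata, the "name" field, the "has" helper and the use of NaN for "no value supplied" are assumptions made for the example, not an agreed registry schema; the extra nDim items are left out for brevity.

```python
import math
from dataclasses import dataclass

NAN = float("nan")  # assumed marker for "no value supplied"

@dataclass
class ResourceMetadata:
    """Hypothetical record for one registry entry; not an agreed schema."""
    name: str                      # illustrative identifier, not in the list above
    format: str = ""               # "ascii", "VOTable" or "FITS"
    type: str = ""                 # "catalogue", "survey" or "archive"
    subjectkeyword: tuple = ()     # empty tuple stands in for "Null"
    wavelengthrange: tuple = ()    # e.g. ("ir", "optical")
    decmin: float = NAN            # coverage
    sourcedensity: float = NAN     # coverage
    tablenrows: float = NAN        # coverage
    tablesize: float = NAN         # coverage
    startdate: float = NAN         # coverage
    astrometryerror: float = NAN   # dataquality
    photometryerror: float = NAN   # dataquality
    timingerror: float = NAN       # dataquality
    wavelengthshort: float = NAN
    wavelengthlong: float = NAN

    def has(self, item: str) -> bool:
        """True if the named numeric item holds a real value rather than NaN."""
        return not math.isnan(getattr(self, item))

example = ResourceMetadata(name="example_catalogue", format="VOTable",
                           type="catalogue", subjectkeyword=("Stars",),
                           decmin=-30.0)
print(example.has("decmin"), example.has("startdate"))
```

Keeping "not supplied" as NaN rather than zero means a test such as value != NaN can be expressed as a single check.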
Note these are not 'proper' UCDs as we do not yet have a final convention,
but show the sort of things which need UCDs. I am not differentiating
between UCDs in the header, e.g. to describe a single overall positional
uncertainty, and UCDs for each column.
I hope we adopt some sort of modular or atomic UCD structure, and a
capacity to cross-reference columns, so that UCDs below which are not
single words would be built up in a logical way but we do not have to have
e.g. PHOT_ABC where ABC is every possible frequency or filter. The exact
order of conditional queries (e.g. look for Optical first? or Photometry
first?) depends on what UCD convention we adopt. There may be more than one
error per quantity, e.g. systematic errors in the catalogue header plus
random errors per entry.
Membership of StellarCluster
AngularPosition (RA, Dec)
AngularPosition Error (RA, Dec)
Optical Colour, Photometry (I_Band, R_Band, numerical band spec.)
IR Colour, Photometry (K_Band, numerical band spec.)
Photometry FilterBandpass (for the above, e.g. Cousins, Johnson)
Photometry Error (per band)
ChemicalAbundance (Li, CH4 etc.)
Additional UCDs describing nDim image and spectral DataSets, if used.
Examples of searching the Registry:
For this example I consider only catalogues, not extraction of data from
images, spectra etc. If we
were doing the latter we would need additional ResourceMetadata about
I have used what I hope are Java arithmetical and logical operators in
most places for brevity; occasionally I have spelt out operators for
clarity.
1. RegistryQuery for potential Brown Dwarfs located in Galactic Clusters
1.1 Query ResourceMetadata
(format value == "ascii" || "VOTable") &&
(type value == "catalogue" || "survey") &&
(subjectkeywords value == "Null" ^ ("Milky Way" || "Stars")) &&
(coverage "decmin" value != NaN)
This should select DataSets which are tabular (not images or the like in
this iteration) and which contain measurements of sources (as distinct
from a list of instrument pointings); the suggested values may not be a
complete list. Each DataSet either has no subjectkeywords, i.e. the things
it lists are unclassified, or they are explicitly classified as Milky Way
or Stars.
The DataSets should have meaningful values of decmin, implying that they
information. This means that we must fill in angular coverage for any data
set containing celestial
coordinates or object names which can be resolved by SIMBAD into
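
The selection in 1.1 can be sketched as an ordinary filter over metadata records. This is a minimal sketch, not a registry interface: the records are plain dicts, the function names has_value and matches_1_1 are invented here, an empty keyword list stands in for "Null", and the sample entries are made up.

```python
import math

def has_value(x):
    """True if x is a real number rather than missing or NaN."""
    return x is not None and not math.isnan(x)

def matches_1_1(res):
    """Apply the step 1.1 selection to one metadata record (a dict)."""
    tabular = res.get("format") in ("ascii", "VOTable")
    lists_sources = res.get("type") in ("catalogue", "survey")
    keywords = res.get("subjectkeyword") or []   # empty list stands in for "Null"
    unclassified = len(keywords) == 0
    stellar = any(k in ("Milky Way", "Stars") for k in keywords)
    # Java's ^ is exclusive-or; for booleans that is the same as != here.
    subject_ok = unclassified != stellar
    positioned = has_value(res.get("decmin"))
    return tabular and lists_sources and subject_ok and positioned

# Made-up registry entries, for illustration only.
registry = [
    {"name": "cat_a", "format": "VOTable", "type": "catalogue",
     "subjectkeyword": ["Stars"], "decmin": -30.0},
    {"name": "img_b", "format": "FITS", "type": "archive",
     "subjectkeyword": ["Stars"], "decmin": 10.0},
    {"name": "cat_c", "format": "ascii", "type": "survey",
     "subjectkeyword": [], "decmin": float("nan")},
    {"name": "cat_d", "format": "ascii", "type": "survey",
     "subjectkeyword": ["Galaxies"], "decmin": 5.0},
]
selected = [r["name"] for r in registry if matches_1_1(r)]
print(selected)
```

Only cat_a passes: img_b is not tabular, cat_c has no usable decmin, and cat_d is classified as something other than Milky Way or Stars.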
1.2 From the DataSets meeting criteria in 1.1, query ResourceMetadata and
accordingly (perform steps in order)
coverage "sourcedensity" descending order (ie high=good)
coverage "tablenrows" descending order for completeness or ascending order
coverage "tablesize" ascending order
dataquality "astrometryerror" ascending order
dataquality "timingerror" ascending order
This is an optional prioritisation step to allow only the DataSets which
are more likely to be useful to be selected, and/or to choose the order in
which DataSets are queried or (the interim results) moved. Prioritisation
could be applied at any later step. Other criteria like sensitivity could
also be used. This implies that, for the errors, NaN counts as very large.
It would be simplest if Null values were not allowed for sourcedensity and
the size etc. of the DataSets.
1.3 From the DataSets meeting criteria in 1.1, query UCD list
UCD == (StellarCluster || (Star && MembershipofStellarCluster?))
This should select DataSets which either explicitly list stellar clusters,
or which list stars and state whether they are members of a cluster.
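
If each DataSet carried a set of atomic UCD-like tags, the test in 1.3 would reduce to a few membership checks. The sketch below assumes exactly that representation; the tag spellings are copied from the query above and the function name matches_1_3 is invented here.

```python
def matches_1_3(ucds):
    """ucds: set of UCD-like tags attached to one DataSet."""
    lists_clusters = "StellarCluster" in ucds
    stars_with_membership = ("Star" in ucds
                             and "MembershipofStellarCluster" in ucds)
    return lists_clusters or stars_with_membership

print(matches_1_3({"StellarCluster", "AngularPosition"}))
print(matches_1_3({"Star", "MembershipofStellarCluster"}))
print(matches_1_3({"Star", "AngularPosition"}))
```

A catalogue that lists stars without any membership information is rejected at this step, as in the third call.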
3. RegistryQuery for existing proper motion and distance measurements
3.1 From the DataSets meeting criteria in 1.1, query UCD list
UCD == (ProperMotion)
3.2 From the DataSets meeting criteria in 1.1, query ResourceMetadata
coverage "startdate" value != NaN
This should select DataSets with meaningful values of the epoch of
observation (so proper motions could be calculated if not already
explicitly catalogued).
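
Steps 3.1 and 3.2 together distinguish DataSets that already list proper motions from those that only record an epoch. A small sketch of that distinction follows; the function name, the three status labels and the sample epoch value are all invented for illustration.

```python
import math

def proper_motion_status(ucds, startdate):
    """Classify one DataSet for step 3 (labels are illustrative only)."""
    if "ProperMotion" in ucds:
        return "catalogued"       # step 3.1: proper motions listed explicitly
    if startdate is not None and not math.isnan(startdate):
        return "epoch only"       # step 3.2: epoch known, motions could be derived
    return "unusable"

print(proper_motion_status({"AngularPosition", "ProperMotion"}, float("nan")))
print(proper_motion_status({"AngularPosition"}, 1998.5))
print(proper_motion_status({"AngularPosition"}, float("nan")))
```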
5. RegistryQuery to enable colour-colour selection:
5.1 From the DataSets meeting criteria in 1.1, query ResourceMetadata
wavelengthrange (value == "ir" || value == "optical")
This should select DataSets which explicitly contain optical or IR
measurements.
5.2 From the DataSets meeting criteria in 5.1, query ResourceMetadata
((wavelengthshort (value x))
(wavelengthshort (value y)))
This should select DataSets which explicitly contain optical or IR
measurements, or which cover at
least part of the wavelength range x-y which covers the I, R and K bands
as defined in the optical/IR.
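
"Covers at least part of the wavelength range x-y" is an interval-overlap test, which can be sketched as below. The comparison used (short end no greater than y, long end no less than x) is an assumption about the intended query; the function names and the placeholder limits X and Y, taken to be in microns, are likewise invented for the example.

```python
import math

def covers_part_of(short, long_, x, y):
    """True if the interval [short, long_] shares any part of [x, y]."""
    if any(v is None or math.isnan(v) for v in (short, long_)):
        return False              # no numerical coverage recorded
    return short <= y and long_ >= x

def matches_5_2(res, x, y):
    """Accept an explicit optical/IR label or a numerical overlap with x-y."""
    labelled = bool({"ir", "optical"} & set(res.get("wavelengthrange", ())))
    return labelled or covers_part_of(res.get("wavelengthshort"),
                                      res.get("wavelengthlong"), x, y)

# Placeholder limits in microns, chosen only so that R, I and K fall inside.
X, Y = 0.5, 2.5

print(matches_5_2({"wavelengthrange": ["ir"]}, X, Y))
print(matches_5_2({"wavelengthshort": 2.0, "wavelengthlong": 2.4}, X, Y))
print(matches_5_2({"wavelengthshort": 3.0, "wavelengthlong": 5.0}, X, Y))
```

The first record is accepted on its label alone, the second on numerical overlap, and the third is rejected because it lies wholly longward of the range.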
5.3 From the DataSets meeting criteria in 5.1, query UCD list
UCD = (Optical && (Photometry && ((I_Band || R_Band || numerical band
(IR && (Photometry && ((K_Band || numerical band spec.)))) ||
((Optical || IR) && (Colour && (I_Band || R_Band || K_Band || numerical
This should select suitable DataSets whether or not they have a detailed
numerical description of the wavelength coverage (ideally all should have
but this may take a while to implement) - or which already contain Colour
information. In the case of e.g. K/R colour, I presume this would be
classified as both IR and Optical.
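
With atomic UCD-like tags, the test in 5.3 could be written roughly as follows. The sketch follows only the readable structure of the expression above and the accompanying explanation; the "numerical band spec." alternative is left out because no form for it is given, and the function name matches_5_3 and the set-of-tags representation are assumptions.

```python
def matches_5_3(ucds):
    """ucds: set of atomic UCD-like tags for one column or DataSet."""
    optical_phot = ("Optical" in ucds and "Photometry" in ucds
                    and bool({"I_Band", "R_Band"} & ucds))
    ir_phot = "IR" in ucds and "Photometry" in ucds and "K_Band" in ucds
    colour = (bool({"Optical", "IR"} & ucds) and "Colour" in ucds
              and bool({"I_Band", "R_Band", "K_Band"} & ucds))
    return optical_phot or ir_phot or colour

print(matches_5_3({"Optical", "Photometry", "I_Band"}))
print(matches_5_3({"IR", "Optical", "Colour", "K_Band", "R_Band"}))
print(matches_5_3({"Optical", "Photometry", "K_Band"}))
```

The second call shows the K/R colour case tagged as both IR and Optical; the third is rejected because K_Band is not accepted for Optical photometry in this reading.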