Observation data model comments
jcm at head.cfa.harvard.edu
Mon May 10 23:20:56 PDT 2004
Subject: Re: Observation data model comments
I am now replying to Anita's message on Saturday in her reply
: > > e.g. spatial coverage in objects/deg^2
: > I would say that objects/deg^2 does not belong to "spatial coverage", it's
: > instead a way to characterise the observation.
: I don't quite understand what you mean - what class should objects/deg^2
: be under? (in either case it is under the Characterisation superclass!)
I agree with Alberto. To me, the source density is a measure
not so much of spatial coverage, but of flux (observable) coverage.
It pretty much converts to a limiting flux (you need the luminosity
function, but if that is position-independent you don't need any
spatial information). But in fact it's not even that, it's
its own UCD: source density. So I would just add another axis
to the Characterization of this particular observation,
call it SourceDensity, and tag it with an appropriate UCD.
Now if you do a query to the VO on SourceDensity (give me all catalogs
covering SourceDensity more than this..) you are all set when the
VO gets to this Observation. If the VO gets to an observation
which has no SourceDensity entered in its characterization, but
does have a Flux characterization, then you may still be in luck
*if* the query specifies an assumed luminosity function and also
refers to, or is able to know via some default set of VO
resources, how to link the UCDs for SourceDensity and Flux.
(I realize I'm skipping crucial details like the fact that it
depends what band the flux is in!)
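A minimal Python sketch of the query logic described above. Everything in it is an illustrative assumption rather than part of the data model: the Axis and Observation classes, the UCD strings (placeholders only), and the power-law number counts N(>S) = k * S**(-slope), where the Euclidean slope of 1.5 stands in for "an assumed luminosity function". The band dependence of the flux is ignored here, as it is in the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Axis:
    """One characterization axis. All names here are hypothetical."""
    name: str
    ucd: str      # placeholder UCD string, not checked against any vocabulary
    lower: float  # lower bound on this axis
    upper: float  # upper bound on this axis


@dataclass
class Observation:
    ident: str
    axes: Dict[str, Axis] = field(default_factory=dict)

    def add_axis(self, axis: Axis) -> None:
        self.axes[axis.name] = axis


def density_from_flux_limit(flux_limit: float, k: float, slope: float = 1.5) -> float:
    """Assumed power-law counts: sources per deg^2 brighter than flux_limit."""
    return k * flux_limit ** (-slope)


def source_density(obs: Observation, counts_k: Optional[float] = None) -> Optional[float]:
    """Use the SourceDensity axis if present, else fall back to the Flux axis."""
    if "SourceDensity" in obs.axes:
        return obs.axes["SourceDensity"].upper
    if counts_k is not None and "Flux" in obs.axes:
        # Fallback only works if the query supplies the assumed counts.
        return density_from_flux_limit(obs.axes["Flux"].lower, counts_k)
    return None  # this observation cannot answer the query


def query_min_density(observations: List[Observation], threshold: float,
                      counts_k: Optional[float] = None) -> List[str]:
    """Identifiers of observations reaching at least `threshold` objects/deg^2."""
    hits = []
    for obs in observations:
        density = source_density(obs, counts_k)
        if density is not None and density >= threshold:
            hits.append(obs.ident)
    return hits


cat_a = Observation("a")
cat_a.add_axis(Axis("SourceDensity", "src.density", 0.0, 5000.0))
cat_b = Observation("b")
cat_b.add_axis(Axis("Flux", "phot.flux", 1e-3, 10.0))
cat_c = Observation("c")
cat_c.add_axis(Axis("Flux", "phot.flux", 1.0, 10.0))
catalogues = [cat_a, cat_b, cat_c]

without_counts = query_min_density(catalogues, 1000.0)
with_counts = query_min_density(catalogues, 1000.0, counts_k=1.0)
```

Without the assumed counts only the catalogue that records SourceDensity directly can match; with them, a catalogue that records only a limiting flux becomes searchable too.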
To me this is a nice example to show why specifying that
Characterization should be a fixed set of 5 axes would not work.
The axes given in our example table are just the common examples,
to provide a familiar context for the generalized characterization
variables. They are not meant to represent all the possible axes.
Instead, we provide a framework for data providers and users to describe
whatever constraints make sense to them. The standard axes of position,
time, spectral coordinate and flux will often be the ones wanted, but
not always. Forcing something subtly different like `source density'
into these standard axes will cause trouble and lack of
The source density is not the same as a filling factor -
the filling factor is what fraction of the region we looked at,
not what fraction of the region is filled with sources.
It does potentially imply a confusion/saturation factor, I guess.
: Maybe that is before I put up my latest plot. I have now put Processing
: back as a part of Provenance but I am still a bit concerned as different
: processing methods will give data with different characterisation e.g.
: different synthesised beam size - for interferometry data you can
I think that's fine. The Provenance should apply to a particular
realization of a dataset, and will be different for different
runs of MEMSYS (:-). It may indeed decompose into a part which is
fixed for the observation and a part which varies from realization
to realization, but that needn't concern us at this level of the model.
So if I understand your concern about versioning correctly, it's for the
data discovery part of the problem. If an archive has different
versions of the observation, they can all have different
Characterization and Provenance, and that seems fine. But if the archive
can run the mapping software on the fly, there may be a range of
possible characterizations that could be generated. We'll have
to be careful how to represent that in a query response, but I think
it can be done. For the analysis problem, once you have selected
a certain realization of the data to download, everything is well
defined and not problematic.
: > In the last table:
: > Bounds: for Flux more than the noise rms I would use the limiting flux
: > for the lower limit, and I would add the saturation level for the upper limit.
: That is the difference between models for different situations, in radio
: interferometry the limiting flux is partly dependent on how you process
: the data and is normally taken as 3sigma noise (or more if noise is
Here the different things are somewhat interconvertible and can
perhaps be distinguished via UCDs.
: or visibility domain. Similarly there is not usually a saturation upper
: limit (unless you are observing the Sun or very bad RFI or something else
: which makes a receiver go non-linear). There is a limit on the
: signal-to-noise ratio which you can achieve. However in Bounds I put
: quantities which are fixed limits for any one data product
The Observable upper bound here is generic - there's no assumption in
characterization that it's due to saturation or dynamic range limitations,
`saturation' was just an example of something that can set an upper bound.
The details can go in Provenance.
: > Sensitivity: for Temporal I would say "exposure map"
: This is hard to generalise for radio interferometry. To make a good image
: this is the synthesis time, that is, the time it takes for the earth to
I think there's confusion here. For me, the temporal sensitivity is the
change in sensitivity with time, not the integration time. It would
be the correction factor you have to apply to the visibilities
because some idiot has installed a flaky oscillator in one of the
antenna feeds and the signal from your calibrators is going up and
down by a factor of three every five minutes.
: > Sample precision: for Spatial -> pixel scale,
: As I've explained, if interferometry image data have been conventionally
: reduced, the pixel is a rough indicator of sample precision, but it isn't
: an immutable property of the detector. For extracted source catalogues
Doesn't matter. Characterization is about the dataset in your hand, not
about the detector. So sample precision is exactly that: the
pixelization you have chosen for your map, if a map is what you have.
It doesn't say anything about your point source position precision,
that's covered by the Resolution which is a different element of the
model. If you're dealing with visibility data and haven't made a map
yet, then there's no spatial sample precision to define, but there is
sample precision in the U V coordinates (at least in the pixelized U V
data I have seen coming out of AIPS) and in an ideal world that should
be recorded in U and V characterization axes. Of course it will be a
while before much software will do useful stuff with such axes.
: As an aside, there is a whole additional set of jargon for single dish
: observations, where you may need to know how to convert Tant (antenna
: temperature) into non-instrument-dependent units (using Tsys, efficiency
: and sensitivity) etc. etc.
That can be covered by Mappings on the observable, and by
special T-ant characterization axes if appropriate.
: The main lesson to me from this, is that I think we will never end up with
: one table which covers every domain and even if we did the words chosen
: would mean different things to different people, causing confusion. This
: exercise (which we should indeed repeat for other domains, such as your
: HST expertise, x-ray etc.) should just tell us if the column and row
: headings of the matrix are usable, and then establish a translation
I would say it's the rows (e.g. SensitivityFunction; those are the rows, right?)
that must be usable generally. The columns (spatial, etc.) can vary,
although the standard ones will almost always be usable.
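One way to picture "rows fixed, columns free" is a small Python sketch. The class and method names are hypothetical, and the property list holds only the row names that come up in this thread, not the full model.

```python
# Row names mentioned in this thread only; the real model may define more.
PROPERTIES = ("Bounds", "Resolution", "SamplePrecision", "Sensitivity")


class Characterization:
    """Hypothetical container: fixed property rows, free-form axis columns."""

    def __init__(self):
        self._cells = {}

    def set(self, axis, prop, value):
        # The rows are a closed set; the axis name can be anything.
        if prop not in PROPERTIES:
            raise KeyError("unknown property: " + prop)
        self._cells[(axis, prop)] = value

    def get(self, axis, prop):
        if prop not in PROPERTIES:
            raise KeyError("unknown property: " + prop)
        return self._cells.get((axis, prop))

    def axes(self):
        """Sorted names of the axes that have at least one entry."""
        return sorted({axis for axis, _ in self._cells})


char = Characterization()
char.set("Spatial", "SamplePrecision", "pixel scale of the map")
char.set("U", "SamplePrecision", "cell size of the gridded visibilities")
char.set("SourceDensity", "Bounds", (0.0, 5000.0))
```

A non-standard axis such as U or SourceDensity is accepted without any change to the class, while a misspelt or invented row name is rejected.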
: I suggest renaming either or both classes:
: Bounds > SensitivityBounds
: Sensitivity > SensitivityFunction
: - what do you think?
Uurgh. I think that's complicated and long, but I guess I can
live with it.