dtody at nrao.edu
Wed Apr 2 07:26:37 PST 2003
Hi Markus -
On Wed, 2 Apr 2003, Markus Dolensky wrote:
> As you know in mid May there will be an interoperability meeting in
> Cambridge (http://www.ivoa.net/twiki/bin/view/IVOA/InterOpMay2003).
> In preparation for this event I'd like to gather information on how to
> extend Simple Image Access to spectroscopy including SEDs.
Everyone, it seems, is interested in extending SIA in various ways. Arnold
has been looking at some aspects of this and it would be reasonable for
you to do so too. Roy and others are interested in the problem of object
identification and replica management. CDS and others have made a number
of suggestions as well. A reasonable approach might be to think about
what we would all like to do and discuss it in the upcoming meetings.
Below is some email I posted within NVO yesterday in preparation for
our upcoming team meeting here (which is tomorrow). This is just an
attempt to summarize many of the things we could do - we probably can't
do them all, so we should decide what our priorities are.
From dtody at nrao.edu Wed Apr 2 08:19:09 2003
Date: Tue, 1 Apr 2003 13:30:37 -0700 (MST)
From: Doug Tody <dtody at nrao.edu>
To: metadata at us-vo.org, tech at us-vo.org, pmr at us-vo.org
Subject: Call for proposals for SIAP Version 2
Arnold - Thanks for the SAO perspective on the extensions needed to SIA
to better handle your data. I would like to open this up for broader
discussion and solicit suggestions from everyone for further enhancement.
We will have at least one more version of SIA before it is replaced by
(or folded into) a more general facility. In considering what to add we
should ask ourselves the following question: what do we most need to add
to SIA to support our VO science prototypes over, say, the next year?
It is easy to come up with a list of features far longer than we can expect
to agree upon in the near future, or get service providers to implement.
We need to identify the highest priority additions, and in keeping with
the philosophy of SIA, find a reasonably simple way to implement these.
It must remain easy for a service provider to put up a basic SIA service.
The full data discovery and data access problem cannot be fully addressed
in a prototype and will ultimately require technology (e.g., a metadata
framework, updated UCDs, data models, etc.) which does not yet exist.
Possible extensions to SIA will be discussed (along with other data
access issues) at the NVO team meeting in Pasadena later this week,
at the IVOA interoperability workshop at Cambridge next month, and in
the various working groups. A reasonable plan would be to try to reach
agreement so that we can proceed with the next version by late May,
after the interoperability workshop.
Below is a strawman list of possible enhancements to the next version
of SIA. This is intended only to provoke further discussion. What else
is needed? What are the priorities? We can hopefully update the list
after we discuss this in Pasadena at the end of the week. - Doug
Simple Image Access
Potential version 2 enhancements
This is my personal highest priority for SIA. Currently there is
no SIA registry - the existing Web list is incomplete and does not
list most of the implementations already out there. Service
discovery is needed to actually use a VO service like SIA.
Image attributes
The following are all high priority candidates for additions to the
image metadata and/or query parameters to better characterize the
images dealt with by SIA.
Image provenance and identification
Needed to identify images, or select images from a particular
source. Minimum requirements:
data collection identifier (e.g., survey name)
dataset identifier (uniquely defined within data collection)
Issue: How to describe virtual data and relate it back to
physical datasets (e.g., relate an image cutout back to the
physical image it is derived from). This gets complicated
in the case of mosaics or other image combinations.
Issue: Recognition and handling of data replicas. We have
to be able to uniquely identify datasets within some namespace
before we can do anything with replicas. What constitutes
a replica may not be clear.
(This issue has been extensively discussed in the mail groups)
Spectral bandpass (already present)
Necessary to select images given the spectral band, or to
select the individual planes of an image cube.
A version of this is already present in SIA and could be used
as the model for other image attributes.
This would be useful for radio data where only a range of
spatial frequencies may be present in the image. Analogous
to spectral bandpass and could be handled in a similar fashion.
The approximate spatial resolution of the image. This can be
problematic if one tries to specify it too precisely, but for
multiwavelength data analysis even a crude estimate of the
spatial resolution can be very useful.
Sensitivity, limiting magnitude or flux, or flux / rms. Exposure
time is not useful here as it does not provide an absolute measure
of the detection limit of the image. A wavelength-independent
representation is needed.
If we extend the data model supported by SIA (see below) we might
want to add an image type attribute to specify the type of "image",
e.g., 2D sky projection, sky projection data cube, spectra, and
so forth. This issue was discussed earlier in connection with
raw and calibrated data and we decided to support only calibrated
data, but we could revisit the issue.
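As a rough illustration of how a client might use the attributes discussed so far (data collection identifier, dataset identifier, spectral bandpass), here is a minimal Python sketch. The record layout, the field names and the choice of meters as the bandpass unit are assumptions made for this example only; none of them come from the SIA specification.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class ImageRecord:
    """Hypothetical per-image metadata record (field names are illustrative)."""
    collection: str   # data collection identifier, e.g. a survey name
    dataset_id: str   # dataset identifier, unique within the collection
    band_lo: float    # lower bandpass limit, in meters (assumed unit)
    band_hi: float    # upper bandpass limit, in meters (assumed unit)


def band_overlaps(rec: ImageRecord, lo: float, hi: float) -> bool:
    """True if the image bandpass overlaps the requested interval [lo, hi]."""
    return rec.band_lo <= hi and rec.band_hi >= lo


def select_images(records: Iterable[ImageRecord],
                  collection: Optional[str] = None,
                  band: Optional[Tuple[float, float]] = None) -> List[ImageRecord]:
    """Keep records matching the given collection and/or bandpass interval."""
    selected = []
    for rec in records:
        if collection is not None and rec.collection != collection:
            continue
        if band is not None and not band_overlaps(rec, band[0], band[1]):
            continue
        selected.append(rec)
    return selected


# Made-up sample records: two optical images and one near-infrared image.
SAMPLE = [
    ImageRecord("SURVEY_A", "img001", 4.0e-7, 5.0e-7),
    ImageRecord("SURVEY_A", "img002", 1.1e-6, 1.4e-6),
    ImageRecord("SURVEY_B", "img001", 4.5e-7, 5.5e-7),
]
```

For example, `select_images(SAMPLE, band=(4.8e-7, 5.2e-7))` keeps the two optical records, and adding `collection="SURVEY_A"` narrows that to one. Other attributes (resolution, sensitivity) could presumably be handled with the same interval-overlap pattern.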
All of the image attribute parameters are potentially useful as query parameters.
POS,SIZE as currently defined is simple but is not well defined
at the pole. This is only worth fixing (for SIA) if we can find a
simple solution. POS is used mainly to define the query region - the
actual WCS image footprint given in the returned metadata can always
be used by the client to select the actual images to be retrieved.
It might be nice to be able to specify the coordinate system -
however any such query generalization places an extra burden on
service providers. The alternative is to specify a larger region in
ICRS coordinates and refine the image selection on the client side.
Region rotation is currently handled in the same way.
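The client-side refinement mentioned above can be sketched as follows. This is only an illustration: it assumes each returned image is summarized by a name and a center position in ICRS degrees, whereas a real client would test the full WCS footprint. The great-circle separation (haversine formula) is used because, unlike a box in RA and DEC, it stays well defined at the pole.

```python
import math
from typing import Iterable, List, Tuple


def angular_separation(ra1: float, dec1: float, ra2: float, dec2: float) -> float:
    """Great-circle separation in degrees between two positions given in degrees."""
    r1, d1, r2, d2 = (math.radians(v) for v in (ra1, dec1, ra2, dec2))
    h = (math.sin((d2 - d1) / 2.0) ** 2
         + math.cos(d1) * math.cos(d2) * math.sin((r2 - r1) / 2.0) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h))))


def refine(images: Iterable[Tuple[str, float, float]],
           ra: float, dec: float, radius: float) -> List[str]:
    """Return names of images whose center lies within `radius` degrees of (ra, dec).

    `images` is a hypothetical list of (name, center_ra, center_dec) tuples.
    """
    return [name for name, ira, idec in images
            if angular_separation(ra, dec, ira, idec) <= radius]
```

Two positions at DEC = 89.9 whose RA values differ by 180 degrees are only about 0.2 degrees apart on the sky, so a filter like `refine(images, 0.0, 89.9, 0.5)` keeps an image centered at RA = 180, DEC = 89.9 even though a naive comparison of RA values would reject it.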
It might be nice to have the ability to simultaneously query multiple
2D sky images
Currently supported. We need to characterize the data better
as discussed under Image Attributes above.
Spectral data cubes
Do we want to improve support for spectral data cubes? What is
required? Probably the most useful option would be the ability to
use a bandpass parameter to select a particular band to be returned.
Support for spectra could be added to SIA - but this could get
complicated. In what format should the spectra be returned,
FITS or XML? Currently SIA is flexible in the format of the
data returned. If most of the complexity of spectra is left
to the output dataset then it could be possible to discover and
retrieve spectra using SIA. It might be necessary to return only
some restricted metadata in the VOTable, similar to what is done
for images, with the real spectral metadata being returned in the
spectral dataset. More ambitious schemes are possible of course.
For most VO usage event list data is probably most useful if
rendered into an image by the image service. It might be useful
to be able to pass through event list specific image-generation
parameters to the service (e.g., a time filter) but it could
be difficult to standardize anything more than a pass-through.
In some cases (as Arnold suggests) it could be useful to retrieve
the actual event list as a table, but this could get complicated.
Currently SIA targets calibrated data, not raw data. Is there
a standard export format which we could use? Handling event
data can be difficult due to the complexity and to instrument
dependencies. In any case, it would not be difficult for SIA
to use image metadata to characterize event data and provide an
access reference, with the format of the data left unspecified.
Most of the comments for event lists apply here as well.
SIA could be used to retrieve this data but we have to ask how
useful this would be, compared to the archive access mechanisms
already available for retrieval of raw data. In the VO context
it is probably more interesting to look at on-the-fly calibration
and imaging, returning generated images to the client. This can
be done even with the current SIA.
Resource and service metadata
This needs further work in connection with the work on registries.
Further information on coverage is needed, as well as more
complete service specific metadata. The service should be able
to completely describe itself and the service registry should
query the service to obtain all information to be cached in the registry.
Currently we are using a URL-based scheme. Do we want to
add a web services-based protocol? This is certainly doable,
but would our client software be able to do anything useful
with the data?
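For reference, a query in the current URL-based scheme might be assembled as in the Python sketch below. The base URL is a placeholder, not a real service. POS and SIZE are the parameters discussed in this message; FORMAT and its default value are taken from my understanding of the SIA specification and should be treated as an assumption here.

```python
from urllib.parse import urlencode


def sia_query_url(base_url: str, ra: float, dec: float, size: float,
                  fmt: str = "image/fits") -> str:
    """Build a URL-based image query; positions and size are in decimal degrees."""
    params = {
        "POS": f"{ra},{dec}",   # center of the query region
        "SIZE": str(size),      # angular size of the query region
        "FORMAT": fmt,          # assumed parameter name and value (see lead-in)
    }
    # Base URLs may be bare, may already carry a query string, or may
    # already end in a separator; pick the joining character accordingly.
    if base_url.endswith(("?", "&")):
        sep = ""
    elif "?" in base_url:
        sep = "&"
    else:
        sep = "?"
    # Leave ',' and '/' unescaped so the query stays human-readable.
    return base_url + sep + urlencode(params, safe=",/")


# Placeholder endpoint, for illustration only.
EXAMPLE_URL = sia_query_url("http://example.org/sia", 180.0, -30.5, 0.2)
```

The point of the sketch is how little a client needs in order to issue a query under the URL-based scheme; a web services-based protocol would have to offer something beyond this to justify the extra machinery.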
Staging and messaging
The interface contains a preliminary specification for this
but thus far it has not been needed and has not been used.
Replica identification and selection
Is this an image metadata issue, or an access issue, or both?
Currently we are deferring this - it should be possible to add it
later without changing the current interface, which is data-centric.
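To illustrate why unique dataset identification has to come first, here is a minimal Python sketch of replica detection. It assumes, purely for the example, that each record carries a data collection identifier, a dataset identifier unique within that collection, and an access URL, and that two records with the same identifier pair are replicas. Whether that is the right definition of a replica is exactly the open question.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_replicas(
    records: Iterable[Tuple[str, str, str]]
) -> Dict[Tuple[str, str], List[str]]:
    """Group access URLs by (collection, dataset_id).

    `records` is a hypothetical iterable of (collection, dataset_id, access_url)
    tuples. Only identifiers that appear more than once are returned.
    """
    groups: Dict[Tuple[str, str], List[str]] = defaultdict(list)
    for collection, dataset_id, url in records:
        groups[(collection, dataset_id)].append(url)
    return {key: urls for key, urls in groups.items() if len(urls) > 1}


# Made-up records: the same dataset served from two placeholder sites.
RECORDS = [
    ("SURVEY_A", "img001", "http://site1.example.org/img001.fits"),
    ("SURVEY_A", "img001", "http://site2.example.org/img001.fits"),
    ("SURVEY_A", "img002", "http://site1.example.org/img002.fits"),
]
```

Here `group_replicas(RECORDS)` returns a single entry for `("SURVEY_A", "img001")` with both URLs. Without the collection identifier as a namespace, `img001` from two different surveys would be wrongly merged.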