International Virtual Observatory Alliance
We present a data model describing the structure of spectrophotometric datasets with spectral and temporal coordinates and associated metadata. This data model may be used to represent spectra, time series data, segments of SED (Spectral Energy Distributions) and other spectral or temporal associations.
This is a Proposed Recommendation, developed with the intention to support the Simple Spectral Access Protocol. The working group seeks confirmation that comments have been addressed to the satisfaction of the full IVOA community.
This document has been developed with support from the National Science Foundation's Information Technology Research Program (http://www.nsf.gov/) under Cooperative Agreement AST0122449 with The Johns Hopkins University, from the UK Particle Physics and Astronomy Research Council (PPARC, http://www.pparc.ac.uk), and from the European Commission's Sixth Framework Program (http://fp6.cordis.lu/fp6/home.cfm) via the Optical Infrared Coordination Network (OPTICON, http://www.astro-opticon.org).
The Virtual Observatory (VO) is a general term for a collection of federated resources that can be used to conduct astronomical research, education, and outreach.
The International Virtual Observatory Alliance (IVOA) (http://www.ivoa.net ) is a global collaboration of separately funded projects to develop standards and infrastructure that enable VO applications.
Spectra are stored in many different ways within the astronomical community. In this document we present a proposed abstraction for spectral data and serializations in VOTABLE, FITS, and XML, for use as a standard method of spectral data interchange.
We distinguish in several places between the implementation proposed in this document, referred to as Version 1, and capabilities proposed for possible later implementation.
2007 Jul 10 V1.01 RC3 Rev 1 - Minor edits following RFC
2007 May 15 V1.01 RC2 Rev 15 - Proposed recommendation.
2007 May 1 V1.01 RC2 Rev 14 - Trivial cover formatting
2007 Apr 30 V1.01 RC2 Rev 13 - Added Support.Extent as suggested by A. Micol - Improved text in several places
2007 Apr 26 V1.01 RC2 Rev 12 - UCD time.expo updated to time.duration/stop/start;obs.exposure - Updated VOTABLE examples
2007 Apr 25 V1.01 RC2 Rev 11 - Included the correct file; Rev 10 was bogus
2007 Apr 17 V1.01 RC2 Rev 10 - Fixed errors in XSD and in text - Revised Characterization text - Added RESTZ FITS keyword for CoordSys.SpectralFrame.Redshift
2007 Apr 12 V1.01 RC2 Rev 9 - Incorporate D Tody comments - DataID.Title mandatory - Changes to recommended case of Utype fields e.g. Redshift not redshift. - Utypes involving stat.error changed to put the stat.error first.
2007 Apr 4 V1.01 RC2 Rev 8 - FITS keyword TMID added; TDMINn/TDMAXn
2007 Apr 1 V1.01 RC2 Rev 7 - Modifications for compatibility with Char working draft: Moved Calibration utype from CharAxis.Accuracy to CharAxis Added SamplingPrecision.SampleExtent and SamplingPrecision.SamplingPrecisionRefVal.FillFactor
2007 Feb 12 V1.01 RC2 Rev 5 - Curation.Reference can have multiple instances
2007 Jan 17 V1.01 RC2 Rev 4 - Changed FITS keyword SIZE to DATALEN (D Tody request) - Added text describing use of non standard units. - Reformat units in Tables 2,3 to OGIP convention
2006 Dec 11 V1.01 RC2 Rev 3 - Fixed more typos in XSD and XML example
2006 Dec 6 V1.01 RC2 Rev 2 - Upgraded UCDs to version 1.21 - Added SpectralAxis.ResPower and SPECRP keyword for resolving power; added element to XSD. - XSD changed segmentType definition to put Data element at end of sequence. - XSD corrected type errors in a few cases in Curation type. - XSD added missing elements CreatorDID, Bandpass to DataID. - XML instance example corrected errors in Characterization axes. - FITS keywords changed: CREATOR to AUTHOR; DERERR to DERZERR - FITS added more TUTYPn keyword examples. - FITS added comment on VOCSID - Corrected mistakes in FITS and VOT examples - Clarified role of Aperture - Further clarified CreatorDID, PublisherDID, DatasetID distinction. - Clarifications and corrections in text
2006 Oct 22 V1.0 RC1 (since V0.98d Rev 4) - Added table numbers - Changed some defaults in Table 1 - Added flux UCDs for transmission curves, polarized flux - Amplified discussion of RedshiftFrame - Added Spectral location and bounds - Reorganized order of some sections - Further rationalization of FITS keywords, rewrote FITS section - Added TUCDn and TUTYPn
We need to represent a single 1-dimensional spectrum in sufficient detail to understand the differences between two spectra of the same object and between two spectra of different objects.
We need to represent time series photometry, with many photometry points of the same object at different times.
Finally, we need to represent associations of spectra, such as the segments of an echelle spectrum, or spectral energy distributions (SED) which consist of multiple spectra and photometry points, usually for a single object. The 'Spectral Associations' model will be described in a separate document which builds on the structures described here.
Our model for a spectrum is a set of one or more data points (photometry) each of which share the same contextual metadata (aperture, position, etc.). Specifically, a spectrum will have arrays of the following values:
|
|
and will have associated metadata including, for example,
|
|
In later sections we elaborate these concepts in detail, including some complications that we explicitly do not attempt to handle in this version. The data model fields and possible values are listed. We distinguish between optional and required fields in the text, as well as via a column in the tables which has values of MAN (Mandatory, i.e. required), REC (Recommended) and OPT (Optional). Where appropriate we list those values of the physical units which interoperable implementations are required to recognize.
Figure 1: UML class diagram for the spectral data model. The Characterization, Curation, DataID and Derived classes are shown in detail below in diagram form and with further text description in Section 5. The minimal required content is:
Note that each Spectrum instance has only one spectral coordinate axis. If you want to provide *both* flux-vs-wavelength and flux-vs-frequency for a single dataset, you must (in this version of the model) make two separate instances (VO resources).
Figure 2: Diagram for Data object
Figure 3: Diagram for Characterization object
Figure 4: Diagram for CoordSys object
Figure 5: Diagram for remaining metadata: Curation, DataID, Derived, Target objects
We adopt the WCS/OGIP convention for units: Document OGIP 93-001
(http://legacy.gsfc.nasa.gov/docs/heasarc/ofwg/docs/general/ogip_93_001/ogip_93_001.html).
Briefly, units are given in the form
10**(-14) erg/cm**2/s/Hz, 10**3 Jy Hz
i.e. with exponents denoted by **, division by /, multiplication by a space.
This format is mostly consistent with the AAS standards for online tables in journals (http://grumpy.as.arizona.edu/~gschwarz/unitstandards.html) except for the use of space rather than "." for multiplication and the fact that we do not require the use of SI units.
SI prefixes for units are to be recognized; for instance, the listing of "m" as a known unit for wavelength implies that "cm", "nm", and "um" (with "u" the OGIP convention for rendering "micro") are also acceptable.
Until IVOA generic unit conversion software is mature and widely deployed, it is helpful to interoperable applications to include a representation of the units in "base SI form", including only the base units kg, m, s (and possibly A, sr) with a numeric prefix. Pedro Osuna and Jesus Salgado have proposed a representation in the spirit of dimensional analysis, using the symbols M, L, T to signify kg, m, s respectively and omitting the ** for powers, so that
10**3 Jy Hz
which is equivalent to
10**-23 kg s**-3
is written compactly as
10-23MT-3
This alternate representation is supported for the main model fields (time, spectral coordinate and flux) only.
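As an illustration of the base SI form, the following minimal sketch reduces the first OGIP example above, 10**(-14) erg/cm**2/s/Hz, to its compact representation. The unit table and the function name `to_compact_si` are hypothetical and not part of the model; the sketch also assumes that the power of ten is always written out and that an exponent of 1 on M, L or T is omitted.

```python
# Hypothetical lookup table (not part of the data model): each unit maps to
# (power of ten relative to SI base units, exponents of kg, m, s).
UNIT_TABLE = {
    "erg": (-7, (1, 2, -2)),   # 1 erg = 10**-7 kg m**2 s**-2
    "cm": (-2, (0, 1, 0)),     # 1 cm = 10**-2 m
    "s": (0, (0, 0, 1)),
    "Hz": (0, (0, 0, -1)),     # 1 Hz = 1 s**-1
}


def to_compact_si(prefix_exponent, factors):
    """Return the compact base-SI string for 10**prefix_exponent times the
    product of unit**power over the (unit, power) pairs in `factors`."""
    exponent = prefix_exponent
    dims = [0, 0, 0]  # exponents of kg (M), m (L), s (T)
    for unit, power in factors:
        scale, unit_dims = UNIT_TABLE[unit]
        exponent += scale * power
        for i, d in enumerate(unit_dims):
            dims[i] += d * power
    text = "10" + str(exponent)
    for symbol, d in zip("MLT", dims):
        if d == 1:
            text += symbol            # assumed: an exponent of 1 is omitted
        elif d != 0:
            text += symbol + str(d)
    return text


# 10**(-14) erg/cm**2/s/Hz
flux_density = [("erg", 1), ("cm", -2), ("s", -1), ("Hz", -1)]
print(to_compact_si(-14, flux_density))  # prints 10-17MT-2
```

Working in integer powers of ten avoids floating-point rounding; a unit whose SI scale is not a power of ten would need a numeric prefix instead.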
Although the spectral model is flexible enough to permit different units for each field, as a matter of style we strongly recommend that whenever possible the same units should be used for compatible fields (e.g. flux and error on flux).
UCDs or Uniform Content Descriptors are the IVOA's standardized vocabulary for astronomical concepts. In this document we use UCDs as field attributes (for example, element attributes in XML) to distinguish alternate physics within the same data model roles - for example, to distinguish frequency versus wavelength on the spectral coordinate `X-axis'.
The current list of UCDs is http://cdsweb.u-strasbg.fr/UCD/ucd1p-words.txt with syntax defined in the UCD recommendation http://www.ivoa.net/Documents/latest/UCD.html.
UCDs should be case-insensitive.
UTYPE was a concept introduced in VOTABLE to label fields of a hierarchical data model. The word is now used generally to mean a standard identifier for a data model field. UTYPEs are also case-insensitive and are of the form a.b.c.d, where the dots indicate a 'has-a' hierarchy with the leftmost element being the containing model and the rightmost element being the lowest-level element referred to. This is quite close to a simple XPATH in an XML schema, but we chose to use dot rather than slash to emphasize that we are only specifying the element type, not the exact position in an instance (so no sophisticated query syntax). We use the terms 'data model field' and 'UTYPE' interchangeably.
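A minimal sketch of how an application might handle UTYPE strings under these rules is given below. The helper names `split_utype` and `same_utype` are hypothetical; the only behaviour taken from the text is the dot-separated hierarchy and the case-insensitive comparison.

```python
def split_utype(utype):
    """Split a UTYPE into its hierarchy, containing model first."""
    return utype.split(".")


def same_utype(a, b):
    """Compare two UTYPE strings without regard to case."""
    return a.casefold() == b.casefold()


parts = split_utype("Spectrum.Char.SpectralAxis.ResPower")
print(parts[0], parts[-1])
print(same_utype("Spectrum.Char.SpectralAxis.ResPower",
                 "spectrum.char.spectralaxis.respower"))
```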
The simple Packaging model for SSA describes the format of the associated dataset. Allowed values for the format are briefly listed here. Detailed serializations for formats 4 to 6 are not specified. The metadata (format 7) is not returned by the standard SSA call; it instead uses a new getCapabilities option (see the SSA protocol definition document for details). These packaging values will be part of the SSA protocol response, and are implicit in the individual serializations. We only discuss formats 1 to 3 in this document.
|
|
The DM fields (or UTYPEs) for the Spectrum DM are tabulated on the following pages. The field names are to be used as the UTYPE values in VOTABLE serializations and in the TUTYPn keys in the FITS serialization.
We specify fields that are MANDATORY (MUST), RECOMMENDED (SHOULD), or OPTIONAL (MAY). MANDATORY fields are in bold. MANDATORY means that the document must provide a value; however, the value may be UNKNOWN (the value exists but is not known) or N/A (not applicable: for example, RA and DEC for a moving object or absolute time for a theory simulation). RECOMMENDED means that a data provider should try to fill the relevant fields if possible, but the document is still compliant if they are omitted. However, particular serializations (FITS, VOTABLE, etc) may amend these requirements by specifying default values for the serialization.
These requirements apply specifically for the spectrum application of this model; the VO may specify different MANDATORY/RECOMMENDED/OPTIONAL requirements for time series and other applications of the same model.
Some optional ID and UCD fields are allowed but are not listed below.
The fields are explained in more detail in the following sections.
Astronomers use a number of different spectral coordinates to label the electromagnetic spectrum. The cases enumerated by Greisen et al. (2006) are listed below with their UCDs.
MANDATORY: Exactly one Spectrum.Char.SpectralAxis field should be present, with units and one of the UCD values listed below. We distinguish between the VO data model field name (which might be used for VOTABLE UTYPE), the FITS WCS name (provided for comparison only), and the UCD1+ names.
Note 1: For this version, only the first four entries, Wavelength, Frequency, Energy, and spectral channel, should be used for interoperable transmission of data - implementations are not required to understand (convert) the other UCD values.
Note 2: For the velocity cases, the UCD uses a spect.dopplerVeloc tree rather than a src.veloc tree, because the velocity here is really a labelling of a spectral coordinate, and the link to the physical radial velocity of the different emission sources contributing to the spectrum is rather indirect.
Two instances of the Flux object are supported: Flux and BackgroundModel. The Flux may be either the background-subtracted net flux or the total flux (the source+background), in the latter case hopefully with the BackgroundModel (see below). Net and total flux are distinguished by the `src.net' UCD adjective.
For each of these cases, there are many slightly different physical quantities covered by the general concept of Flux; we distinguish them by their UCD. The table contains a list of flux quantities that applications should expect to read and handle. If you create a Spectrum instance with a flux quantity or flux unit not in the list below, you should expect that applications will be able to propagate it and recognize it, but not be able to merge it or compare it with other Spectrum instances. (For example, an application trying to measure line wavelengths shouldn't care too much that it doesn't understand what the flux units are).
Note in particular the distinction between the unit count (an instrumental value) and the unit photon (used in the photon number flux, i.e. the number of photons incident; photon number flux = energy flux divided by photon energy).
Note: The concept of the "nu L-nu" or "lambda L-lambda" luminosity flux, or equivalently the luminosity per logarithmic energy interval L(log nu), is a distinct concept in the world of spectral energy distributions - and it's a different concept from the bolometric luminosity, which has the same units. The UCD board has not yet approved a UCD expressing this concept; we have to use phys.luminosity and infer the concept from the units. My solution for brightness temperature is also rather questionable.
Note: we propose the UCD spect.continuum to represent continuum flux.
We optionally allow a BackgroundModel value for each Flux value. We define NetFlux = TotalFlux - BackgroundModel. The name BackgroundModel, rather than Background, reminds us that it is an estimate: often, the BackgroundModel will be generated by taking a flux measurement at another location and rescaling it for any difference in exposure time or extraction aperture.
The BackgroundModel array is required to have the same UCD and units as the Flux array. It represents a model for the expected flux values if the Target had zero flux.
OPTIONAL: There may be at most one BackgroundModel.Value field present. It must have the same UCD as the Flux.
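The relation NetFlux = TotalFlux - BackgroundModel, together with the requirement that both arrays share the same UCD and units, can be sketched as follows. The `FluxArray` container and the sample UCD and unit strings are illustrative assumptions, not names defined by the model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FluxArray:
    """Hypothetical container for a flux-like array and its metadata."""
    values: List[float]
    ucd: str
    unit: str


def net_flux(total, background):
    """Return TotalFlux - BackgroundModel point by point.

    Raises ValueError if the two arrays differ in UCD, unit or length,
    since the subtraction is only meaningful when they match.
    """
    if total.ucd.casefold() != background.ucd.casefold():
        raise ValueError("BackgroundModel must have the same UCD as the Flux")
    if total.unit != background.unit:
        raise ValueError("BackgroundModel must have the same units as the Flux")
    if len(total.values) != len(background.values):
        raise ValueError("arrays must have the same length")
    return [t - b for t, b in zip(total.values, background.values)]


# Illustrative values only.
total = FluxArray([5.0, 7.0], "phot.flux.density;em.wl", "erg/cm**2/s/Angstrom")
background = FluxArray([1.0, 2.0], "phot.flux.density;em.wl", "erg/cm**2/s/Angstrom")
print(net_flux(total, background))  # prints [4.0, 5.0]
```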
For data with a time-series component, whether regularly sampled or sparse photometry points, the time coordinate is given by an elapsed time in some physical units (e.g. seconds or days) relative to a reference time.
This reference time is given in MJD as the field Spectrum.Char.TimeAxis.Coverage.Location, as described in the Characterization section. For a simple spectrum with no time-resolved data, this is the time of the observation (ideally the midpoint).
For time-resolved data, the time coordinate Spectrum.Data.TimeAxis.Value refers to the midpoint of the sample interval. See the Space-Time Coordinates document for details of time coordinate complexity.
The time unit is specified by a string, and the only valid values for this unit are 's' (seconds) and 'd' (days).
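As a sketch of the convention above, the absolute time of a sample can be recovered by adding the elapsed time to the MJD reference time held in Spectrum.Char.TimeAxis.Coverage.Location. The function name `sample_mjd` is hypothetical, and the sketch assumes a day of exactly 86400 seconds, which holds for time scales without leap seconds.

```python
SECONDS_PER_DAY = 86400.0


def sample_mjd(reference_mjd, elapsed, unit):
    """Return the MJD of a sample given its elapsed time and time unit.

    Only the two units allowed by the model, 's' and 'd', are accepted.
    """
    if unit == "d":
        return reference_mjd + elapsed
    if unit == "s":
        return reference_mjd + elapsed / SECONDS_PER_DAY
    raise ValueError("time unit must be 's' or 'd'")


print(sample_mjd(53000.0, 43200.0, "s"))  # prints 53000.5
print(sample_mjd(53000.0, 0.25, "d"))     # prints 53000.25
```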
In general we may consider position coordinates as part of the measurement (and possibly varying from point to point), but this capability is not included in the current document. The (celestial) position of the aperture for the spectrum is given in the spatial Spectrum.Char.SpatialAxis.Coverage.Location field. The Spectrum.Char.SpatialAxis.Coverage.Location.Value field is in the coordinates of CoordSys.SpaceFrame. The default is ICRS RA,Dec in decimal degrees.
We include accuracy models for both the coordinates (spectral, spatial and temporal) and the fluxes. The accuracy can appear in two places: in the global characterization, where it represents typical accuracy for the dataset, and in the data points themselves, which allows per-data-point errors. All the Accuracy fields are optional, both in the per-data-point fields and in the Characterization instances; the per-data fields default to the values in Characterization.
We express the bandpass for each spectral bin as a low and high value for the spectral coordinate, or as a width. The same is done for photometry points, which amounts to approximating a filter by a rectangular bandpass. Time bins are also given as low and high values or as a width. Note that the width values are suitable for Spectrum.Char (the global accuracy) while the bin low/high values only have meaning for Spectrum.Data (the per-data-point values).
Either BinSize, or both BinLow and BinHigh, may be present, but not both forms at once (possibly as a header parameter implying a constant value for each flux point). If absent, the bin limits are assumed to lie halfway between adjacent coordinate values and to be bounded by the range given in Char.*.Coverage.Extent.
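The default rule for bin limits can be sketched as below. The function name `default_bin_limits` is hypothetical, and the sketch assumes that the coordinate values are sorted in increasing order and that the overall range is available as an explicit pair of low and high values.

```python
def default_bin_limits(coords, range_low, range_high):
    """Return (lows, highs) for bins whose limits lie halfway between
    adjacent coordinate values, bounded by [range_low, range_high]."""
    n = len(coords)
    lows, highs = [], []
    for i, x in enumerate(coords):
        low = range_low if i == 0 else 0.5 * (coords[i - 1] + x)
        high = range_high if i == n - 1 else 0.5 * (x + coords[i + 1])
        lows.append(low)
        highs.append(high)
    return lows, highs


lows, highs = default_bin_limits([4190.0, 4200.0, 4210.0], 4185.0, 4215.0)
print(lows)   # prints [4185.0, 4195.0, 4205.0]
print(highs)  # prints [4195.0, 4205.0, 4215.0]
```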
In addition to the binning, we allow the model to express uncertainties (which may be larger than the bin width), both statistical and systematic. We allow one-sided or two-sided statistical errors but only one-sided systematic errors. You can specify StatErr, or StatErrHigh/StatErrLow, but not both. Statistical errors have the same units as the data, and systematic errors are dimensionless fractions (e.g. a 5 percent systematic error is expressed as 0.05).
For position we have a single statistical error - a two-sided error doesn't make sense for a 2D coordinate. Eventually we may want a full error ellipse, but this is too complicated for the present model.
We also use a very simple error model for the fluxes: we include plus and minus flux errors, and a quality flag. The errors are understood as 1 sigma gaussian errors which are uncorrelated for different points in the spectrum. If the data provider has only upper limit information, it should be represented by setting the flux value and the lower error value equal to the limit, and the upper error value equal to zero (e.g. 5 (+0,-5)). In general applications may choose to render measurements as upper limits if the flux value is less than some multiple (e.g. 3) of the lower error. We also allow a systematic error value, assumed constant across a given spectrum and fully correlated (so that, e.g. it does not enter into estimating spectral slopes).
CLARIFICATION: the two-sided errors StatErrLow and StatErrHigh are the plus/minus ERRORS, not the (value+error, value-error). In other words, if Value = 10 and there is a symmetric uncertainty of 3, then StatErrLow and StatErrHigh are both +3.0, and NOT 7.0, 13.0. This is different from the sampling description BinLow and BinHigh, which give the VALUES at the low and high end of the bin. Thus if the central wavelength of the bin is 4200.0, and the bin size is 10, then the BinLow and BinHigh values are 4195.0, 4205.0 and NOT 10.0, 10.0. Note that because of this, 0.0 is NOT an acceptable default for BinLow and BinHigh, while it IS acceptable (albeit unlikely) for StatErrLow and StatErrHigh.
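The difference between the two conventions, and the encoding of an upper limit described earlier, can be sketched as follows. The function names and dictionary keys are hypothetical, chosen only for illustration.

```python
def error_interval(value, stat_err_low, stat_err_high):
    """Errors are offsets: return the (low, high) interval they imply."""
    return value - stat_err_low, value + stat_err_high


def bin_limits_from_size(center, bin_size):
    """Bin limits are values: return (BinLow, BinHigh) for a centered bin."""
    return center - bin_size / 2.0, center + bin_size / 2.0


def upper_limit(limit):
    """Encode an upper limit: value and lower error equal the limit,
    upper error is zero."""
    return {"value": limit, "stat_err_low": limit, "stat_err_high": 0.0}


print(error_interval(10.0, 3.0, 3.0))      # prints (7.0, 13.0)
print(bin_limits_from_size(4200.0, 10.0))  # prints (4195.0, 4205.0)
print(upper_limit(5.0))
```

With this encoding the interval implied by an upper limit of 5 runs from 0 to 5, matching the 5 (+0,-5) notation.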
The StatErrLow, StatErrHigh, SysError fields for SpectralCoord, Time, Sky and Flux are optional; however, omitting these fields indicates that the errors are unknown. Data providers are STRONGLY encouraged to provide explicit error measures whenever possible.
We also include a trivial resolution model: a single number nominally representing a FWHM spectral or time resolution expressed in the same units as the spectral or time coordinate. The default is to assume that the resolution is equal to the BinSize if defined. The spatial (sky) resolution may be useful to know if it exceeds the aperture size; the default is to assume it is equal to the aperture size.
For the spectral characterization, we allow an alternative field called the spectral resolving power: Spectrum.Char.SpectralAxis.ResPower: this is the dimensionless Lambda/DeltaLambda. It is often preferred for spectra because it is often more constant across the spectrum than the resolution. ResPower and Resolution can be interchanged by dividing out Coverage.Location.
Similar quantities can't really be defined for temporal and spatial resolving power since there's no absolute time or spatial scale, so we call the spectral one out as a special case. One could define a temporal or spatial frequency using the bounds - i.e. just the number of resolution elements in the spectrum - but that's a slightly different concept.
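The interchange between spectral resolution and resolving power can be sketched as below, taking the resolving power as Lambda/DeltaLambda evaluated at the characteristic spectral location. The function names are hypothetical, and the location and resolution are assumed to be in the same units.

```python
def res_power(location, resolution):
    """Resolving power Lambda/DeltaLambda (dimensionless)."""
    return location / resolution


def resolution_from_res_power(location, power):
    """Resolution (same units as the spectral coordinate) from resolving power."""
    return location / power


print(res_power(5000.0, 2.0))                     # prints 2500.0
print(resolution_from_res_power(5000.0, 2500.0))  # prints 2.0
```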
The Quality model represents quality by an integer, with the following meanings: 0 is good data, 1 is data which is bad for an unspecified reason (e.g., no data in the sample interval), and other positive integers greater than 1 may be used to flag data which is bad or dubious for specific reasons.
The data provider may also define scalar string-valued metadata fields Quality.2, Quality.3... to define specific quality flags on a per-spectrum basis. Bitmasks, used in some archives such as SDSS, should be remapped to such independent Quality fields.
Quality defaults to zero, i.e. good data.
We also introduce a Calibration field which can have the values ABSOLUTE, RELATIVE or UNCALIBRATED. This is expected to be particularly useful to describe the flux. ABSOLUTE indicates that the values in the data are expected to be correct within the given uncertainty. RELATIVE indicates that although an unknown systematic error is present, the ratio of any two values will be correct. UNCALIBRATED indicates that although the values reflect a measurement of the given UCD, they are modified by an unspecified coordinate-dependent correction. Such values may be useful in the case of a spectrum with ABSOLUTE calibration on the wavelengths but UNCALIBRATED fluxes; the wavelengths of discontinuous features such as spectral lines can be measured on the assumption that the missing calibration function has no sharp discontinuities in the region of interest.
The Calibration fields are present in the CharacterizationAxis elements.
Most of the associated metadata are generic observational metadata that can be applied to future data models, and are not specific to spectra.
The CoordSys object is a simplified instance of the STC CoordSystem object. For XML serializations, it can be replaced by an actual STC CoordSystem instance.
CoordSys consists of 1 or more CoordFrame objects, each of which defines the coordinates for a particular axis. The CoordSys has an overall ID string, which is user-defined and arbitrary. Each CoordFrame also has a type, a UCD and a ReferencePosition; the Reference Position gives the origin of the coordinate system (and thus also its rest frame).
For the space, time, and spectral axes we define specialized CoordFrames for convenience: SpaceFrame, TimeFrame and SpectralFrame. The CoordFrame names (types) for SpaceFrame and TimeFrame must be from a controlled list; for other frames, the type is an arbitrary string.
Note: For compatibility with the Characterization schema, data model elements Spectrum.Char.SpatialAxis.CoordSys, etc. are allowed, but in Spectrum these must be trivial references to the overall Spectrum.CoordSys.
| Token | Meaning | Note |
| UNKNOWN | Unknown origin | |
| RELOCATABLE | Relative origin | Suitable for simulations |
| CUSTOM | Origin specified wrt another system | |
| TOPOCENTER | Location of the observing device | (telescope) |
| BARYCENTER | Solar system barycenter | |
| HELIOCENTER | Center of the Sun | |
| GEOCENTER | Center of the Earth | |
| EMBARYCENTER | Earth-Moon barycenter | |
| MOON | Center of the Moon | |
| MERCURY | Center of Mercury | |
| VENUS | Center of Venus | |
| MARS | Center of Mars | |
| JUPITER | Center of Jupiter | |
| SATURN | Center of Saturn | |
| URANUS | Center of Uranus | |
| NEPTUNE | Center of Neptune | |
| PLUTO | Center of Pluto | |
| LSRK | Kinematic local standard of rest | Redshift frame only |
| LSRD | Dynamic local standard of rest | Redshift frame only |
| GALACTIC_CENTER | Center of the Galaxy | |
| LOCAL_GROUP_CENTER | Barycenter of the Local Group | |
The SpaceFrame has an optional Equinox attribute which is used if the frame name is FK4 or FK5. The allowed frame names for SpaceFrame are listed below.
| Token | Meaning | Parameter(s) |
| UNKNOWN | Unknown frame | |
| CUSTOM | Custom frame | Pole, axis |
| AZ_EL | Azimuth and elevation | |
| BODY | Generic body (eg planet) | |
| ICRS | The ICRS frame | |
| FK4 | FK4 | Equinox |
| FK5 | FK5 | Equinox |
| ECLIPTIC | Ecliptic l,b | Equinox |
| GALACTIC_I | Old galactic LI,BI | |
| GALACTIC_II | Galactic LII,BII | |
| SUPER_GALACTIC | SGL, SGB | |
| MAG | Geomagnetic ref frame | |
| GSE | Geocentric Solar Ecliptic | |
| GSM | Geocentric Solar Magnetic | |
| SM | Solar Magnetic | |
| HGC | Heliographic | |
| HEE | Heliocentric Earth Ecliptic | |
| HEEQ | Heliocentric Earth Equatorial | |
| HCI | Heliocentric Inertial | |
| HCD | Heliocentric of Date | |
| GEO_C | Geocentric corotating | |
| GEO_D | Geodetic ref frame | Spheroid |
| MERCURY_C | Corotating planetocentric | |
| VENUS_C | Corotating planetocentric | |
| LUNA_C | Corotating planetocentric | |
| MARS_C | Corotating planetocentric | |
| JUPITER_C_III | Corotating planetocentric | |
| SATURN_C_III | Corotating planetocentric | |
| URANUS_C_III | Corotating planetocentric | |
| NEPTUNE_C_III | Corotating planetocentric | |
| PLUTO_C | Corotating planetocentric | |
| MERCURY_G | Corotating planetographic | |
| VENUS_G | Corotating planetographic | |
| LUNA_G | Corotating planetographic | |
| MARS_G | Corotating planetographic | |
| JUPITER_G_III | Corotating planetographic | |
| SATURN_G_III | Corotating planetographic | |
| URANUS_G_III | Corotating planetographic | |
| NEPTUNE_G_III | Corotating planetographic | |
| PLUTO_G | Corotating planetographic | |
The TimeFrame is defined by the frame name and the ReferencePosition. Allowed values of the name are given below.
One standard reference time in astronomy is the origin of Julian Day Number on the TT (Terrestrial Time) timescale, BC 4714 Nov 24 at 11:59:27.81 (proleptic Gregorian calendar; BC 4713 Jan 1 in the Julian calendar). Using TT is preferable to UTC because it does not contain leap seconds, so the elapsed time in days is just equal to the difference in JD values.
The ISO-8601 calendar format standard does not support dates before AD 1, so cannot express this reference time. Therefore, it is not a suitable format for internal representations of such reference times. However, non-default choices of reference time may be specified in external serializations by a date in ISO-8601 format, e.g. "2004-11-30T11:59:00.01".
In this version of the model we require use of MJD as the time type for absolute times. (ISO dates and JD are other possibilities covered by the STC document). Relative times in a time series may be in other units, relative to the TimeFrame.Zero value.
(Note that in the FITS serialization, the MJDREF keyword allows definition of reference times in decimal days relative to MJD 0.0 = JD 2400000.5.)
| Token | Meaning | Note |
| LOCAL | Relocatable (simulation) time | |
| TT | Terrestrial Time | |
| UTC | Coordinated Universal Time | |
| ET | Ephemeris Time | |
| TDB | Barycentric dynamical time | |
| TCG | Terrestrial Coordinate Time | |
| TCB | Barycentric Coordinate Time | |
| TAI | International Atomic Time | |
| LST | Local Sidereal Time | |
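The relation between ISO-8601 dates, MJD and JD used in this section can be sketched as follows. The sketch relies on two standard facts: MJD 0.0 corresponds to 1858-11-17T00:00:00, and JD = MJD + 2400000.5. It treats the input as a plain calendar date on a uniform time scale, so it makes no correction for leap seconds or for differences between the time scales tabulated above; the function names are hypothetical.

```python
from datetime import datetime

MJD_EPOCH = datetime(1858, 11, 17)  # calendar date of MJD 0.0
JD_MINUS_MJD = 2400000.5


def iso_to_mjd(iso):
    """Convert an ISO-8601 date-time string (no time zone) to MJD."""
    fmt = "%Y-%m-%dT%H:%M:%S.%f" if "." in iso else "%Y-%m-%dT%H:%M:%S"
    elapsed = datetime.strptime(iso, fmt) - MJD_EPOCH
    return elapsed.total_seconds() / 86400.0


def mjd_to_jd(mjd):
    """Convert MJD to JD."""
    return mjd + JD_MINUS_MJD


mjd = iso_to_mjd("2000-01-01T12:00:00")
print(mjd)             # prints 51544.5
print(mjd_to_jd(mjd))  # prints 2451545.0
```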
The spectral frame is defined by its ReferencePosition. Once the choice of wavelength versus frequency or energy has been made, the only free parameter is the location at which the spectrum would have the given spectral coordinates. For directly observed data this is the topocenter (location of the observation); spectra may be velocity-corrected to a given velocity frame, which may be defined by the location which is at rest in that velocity frame (e.g. the heliocenter). Strictly, the correction may not be just a velocity shift, but any kind of spectral shift including e.g. gravitational redshifts; it is still true that such a shift corresponds to a location (e.g. surface of a white dwarf star) that can be quoted as a reference position.
Since the frame is defined by its ReferencePosition, the frame name is not important, and will not be significant to software. We suggest that it may be filled by the name of the spectral coordinate, using FITS names such as 'WAVE', 'FREQ' or 'ENER'.
The spectral frame has an optional Redshift attribute to specify a rest frame; it is used only if the frame's ReferencePosition is "CUSTOM". This redshift is measured in dimensionless units, defined as DeltaLambda/Lambda, and may be negative. No specific interpretation of the shift as a cosmological or velocity shift effect is implied; we note for the record that some co-authors object to using the word `redshift' in this generic sense.
When you convert the spectral coordinate to velocity or redshift (relative to some assumed rest-frame spectral feature) you need to record some other metadata. Our field name containing this metadata is RedshiftFrame, but we emphasize that the name redshift does not imply that blueshifts are excluded, merely that, in both galactic and extragalactic astronomy, when a shift is interpreted as a velocity a positive value indicates a shift to the red. The concept of Redshift frame includes both cosmological and local Doppler velocities.
Note that you only use RedshiftFrame if you're measuring things in velocities; a rest-frame spectrum of a redshifted quasar whose spectral axis is in Angstroms will be described by a SpectralFrame. The reason we have BOTH SpectralFrame and RedshiftFrame is to support certain data products, particularly used in spectral line radioastronomy, in which a spectrum (possibly obtained in piecewise spectral regions) is refactored into a set of separate spectral segments centered on different spectral lines; each segment is assigned a velocity axis centered on that line (and the same pixel from the original spectrum can appear in multiple segments each with a different velocity coordinate); you then consider the data as a 2D array with a spectral axis (indexing the segments) and a velocity axis (for each segment/spectral line).
Other coordinate system information needed for velocity spectral coordinates includes the observation-fixed spectral frame, the observatory location, the source redshift, and the velocity zero point (in Greisen et al, SSYSOBS, OBSGEO, VELOSYS, RESTFRQ/RESTWAV). However, we omit these in the current model. The only metadata we provide is the Doppler Definition: optical, radio or pseudo-relativistic.
Notes on compatibility with, and differences from, STC 1.0:
OPTIONAL: All CoordSys values are optional, but data providers should take special care to check whether or not the defaults are appropriate for their data. The implications of the defaults are:
The Characterization metadata in this document are consistent with the IVOA Characterization data model draft as of March 2007. The Characterization model has a set of CharacterizationAxis objects. Each CharacterizationAxis describes the axis, and contains a Coverage describing the scope of the data, and optionally a Resolution and a Sampling object.
The CharacterizationAxis is identified by its UCD attribute. Spectrum instances should have Spatial, Time and Spectral characterization axes as well as FluxAxis.
To simplify things for the common axes, we define SpatialAxis, SpectralAxis, TimeAxis objects as special cases of CharacterizationAxis.
The CoordSystem element in CharacterizationAxis is there for compatibility with the Characterization document and, if present, should be a simple reference to the main Spectrum CoordSystem.
The Characterization fields will have a constant value for a given spectrum.
Note: In the SSA protocol/query response, we will restrict the Char units to meters (spectral coord), seconds (time coord), and decimal degrees (spatial), for simplicity and consistency with other parameters. We allow a more general approach for the full Spectrum instance (returned serializations); the units may be as described elsewhere in this document.
The coverage fields will have a constant value for a given spectrum. They describe the region of space, time and spectrum from which the data were taken. In the Characterization model, we define progressively more accurate descriptions of this region: Location gives a single characteristic point, Bounds gives a range within which the data lies, and Support gives the detailed spatial field of view footprint, on/off time ranges (including gaps) and spectral ranges. (A fourth level not yet supported, Sensitivity, will provide detailed depth information: exposure map, time sensitivity variation, spectral transmission curve).
There is a field for giving the effective exposure time (useful for selecting among multiple spectra from the same instrument). The aperture field is important to determine what part of an extended object is contributing to the spectrum; we allow a simple aperture description (Char.SpatialAxis.Coverage.Bounds.Extent) consisting of a single number representing the aperture size in decimal degrees. For a slit spectrum, the effective aperture on the sky is usually the slit width in the cross-dispersion direction, while for a fiber it may be a circular region. For an accurate description, a full region polygon is allowed in the Area field. Note that since the goal of the VO Spectrum description is to describe the data as it is now, not to describe where it came from, our 'aperture' is always the effective extraction aperture, not the original instrument aperture if that is different.
The units of the spectral Coverage.Bounds.Extent (or Coverage.Bounds.Start/Stop) and Coverage.Support should be the same as those of SpectralCoord.
For time, the Coverage.Bounds.Start/Stop is a pair of values giving the start and stop time. Coverage.Bounds.Extent is the total elapsed time (Stop - Start) while Coverage.Support.Extent is the effective exposure time (total length of all observing intervals times any statistical dead-time filling factor). In the full Characterization model, Coverage.Support provides a whole array of start-stop pairs indicating data accumulated over a series of intervals. We may add this to the Spectrum model in a later revision.
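The distinction between elapsed time and effective exposure time can be illustrated with a minimal Python sketch. The function name, the argument names and the assumption that the observing intervals do not overlap are ours, for illustration only; they are not part of the model.

```python
from typing import List, Tuple


def time_coverage(intervals: List[Tuple[float, float]],
                  fill_factor: float = 1.0):
    """Return (start, stop, bounds_extent, support_extent).

    intervals   -- (start, stop) pairs of observing intervals, assumed
                   non-overlapping and all in the same time units
    fill_factor -- statistical dead-time filling factor, between 0 and 1
    """
    if not intervals:
        raise ValueError("at least one observing interval is required")
    if not 0.0 <= fill_factor <= 1.0:
        raise ValueError("fill_factor must lie between 0 and 1")

    start = min(t0 for t0, _ in intervals)   # Coverage.Bounds.Start
    stop = max(t1 for _, t1 in intervals)    # Coverage.Bounds.Stop
    bounds_extent = stop - start             # total elapsed time

    # Effective exposure: summed interval lengths times the filling factor.
    on_time = sum(t1 - t0 for t0, t1 in intervals)
    support_extent = on_time * fill_factor   # Coverage.Support.Extent
    return start, stop, bounds_extent, support_extent
```

For two 100-second intervals separated by a 50-second gap, the elapsed time is 250 seconds while the effective exposure time is at most 200 seconds.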
The SpatialAxis.Coverage.Location and SpatialAxis.Coverage.Bounds.Extent, TimeAxis.Coverage.Location are required, as are either TimeAxis.Coverage.Bounds.Extent or TimeAxis.Coverage.Bounds.Start and Stop. If Extent is provided, Start and Stop are defined to be (Location - 0.5* Extent, Location +0.5*Extent).
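The rule for deriving Start and Stop from Location and Extent can be written as a short Python sketch; the function name is hypothetical and the rejection of negative extents is an assumption of the sketch.

```python
def bounds_from_extent(location: float, extent: float):
    """Return (start, stop) centered on location with total width extent."""
    if extent < 0:
        raise ValueError("extent must be non-negative")
    half = 0.5 * extent
    return location - half, location + half
```

With a Location of 100 and an Extent of 20 (in the same units), this gives a Start of 90 and a Stop of 110.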
The spectral equivalents, SpectralAxis.Coverage.Location and SpectralAxis.Coverage.Bounds.Start/Stop, are also required in the model; serializations may decide to omit them since they are easily derived from the data.
The SamplingPrecision.SamplingPrecisionRefVal.FillFactor (previously Coverage.Support.Fill) fields give the filling factor, a statistical way of indicating that an axis is only partly sampled. The full IVOA Characterization data model provides a more detailed SamplingPrecision tree; although we fill only part of this we retain the field names for compatibility.
FillFactor is used for dead time corrections (time axis), statistical corrections for gaps between active pixels (spatial axis), and so on. Its value should be between 0 and 1, with the default being 1. (Although we provide a SpectralAxis FillFactor for symmetry and completeness, we are not aware of any practical application for it).
In the optional Char.SpatialAxis.Coverage.Support.Area we describe the detailed aperture shape in absolute coords on the sky. However, we don't allow a full STC region description. Our simplified region model allows for (1) a circle and (2) a polygon in a string representation: either
circle x0 y0 r
or
polygon x1 y1 x2 y2 x3 y3 ...
for example
circle 233.70 -13.32 0.00043
polygon 233.70 -13.32 233.71 -13.30 ...
where the positions and radii are required to be in degrees, in the coordinate system defined by CoordSys.
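A reader of the simplified region string might be sketched in Python as follows. The class and function names are hypothetical, and two choices are assumptions of the sketch rather than requirements of the model: the shape keyword is matched case-insensitively, and a polygon must have at least three vertices.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Circle:
    x0: float  # center longitude-like coordinate, degrees
    y0: float  # center latitude-like coordinate, degrees
    r: float   # radius, degrees


@dataclass
class Polygon:
    vertices: List[Tuple[float, float]]  # (x, y) pairs, degrees


def parse_area(text: str) -> Union[Circle, Polygon]:
    """Parse 'circle x0 y0 r' or 'polygon x1 y1 x2 y2 ...'."""
    tokens = text.split()
    if not tokens:
        raise ValueError("empty region string")
    shape = tokens[0].lower()
    values = [float(t) for t in tokens[1:]]  # raises ValueError on non-numbers

    if shape == "circle":
        if len(values) != 3:
            raise ValueError("circle needs exactly x0, y0 and r")
        return Circle(*values)
    if shape == "polygon":
        if len(values) < 6 or len(values) % 2 != 0:
            raise ValueError("polygon needs at least three x y pairs")
        return Polygon(list(zip(values[0::2], values[1::2])))
    raise ValueError("unsupported region shape: " + tokens[0])
```

Any other shape keyword is rejected, reflecting the fact that a full STC region description is not allowed here.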
The Derived (short for Derived Data) object has useful, and optional, summary information about the spectrum. For now, we include the option of adding signal-to-noise and variability indicators and a measurement of the redshift.
The signal-to-noise is provided mainly as a way for searches to exclude data whose quality is insufficient for a particular study. Data providers may use their own definition, as we do not prescribe a uniform method to calculate it. A suitable method, used by the STScI MAST group, is to define the noise by the median absolute value of the difference between adjacent independent flux values in the spectrum. (The MAST definition multiplies this noise value by a 1.048 correction factor for precise applications). This method describes the high-spectral-frequency noise but does not take into account intermediate-spectral-frequency background 'noise'; projects which are background dominated may wish to include this in the noise definition. Furthermore, most spectra vary in SNR across their waveband; users should therefore only use this single SNR as a crude selection parameter.
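The adjacent-difference noise estimate described above can be sketched in Python as follows. The function names are hypothetical. The text does not say how the signal level is to be chosen, so the use of the median flux as the signal in crude_snr is purely an assumption of this sketch, and data providers remain free to use their own definition.

```python
from statistics import median
from typing import Sequence

MAST_CORRECTION = 1.048  # correction factor quoted in the text


def adjacent_difference_noise(flux: Sequence[float],
                              correction: float = MAST_CORRECTION) -> float:
    """Noise as the median absolute difference of adjacent flux values.

    The flux values are assumed to be statistically independent samples.
    """
    if len(flux) < 2:
        raise ValueError("at least two flux values are required")
    diffs = [abs(b - a) for a, b in zip(flux, flux[1:])]
    return correction * median(diffs)


def crude_snr(flux: Sequence[float]) -> float:
    """Single-number SNR; the median flux as 'signal' is an assumption."""
    noise = adjacent_difference_noise(flux)
    if noise == 0:
        return float("inf")
    return median(flux) / noise
```

Because the estimate uses only differences between neighbouring points, it is insensitive to slowly varying background, which is the limitation noted above.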
One common piece of derived data for a spectrum is the source redshift. We provide fields for both the redshift measured value and statistical error. As above, we define the redshift to be DeltaLambda/Lambda and it may be positive or negative. The Derived field represents a measurement of the redshift from the data; a field in the Target object is available to store the redshift of the source as known from other means.
We add a further optional measure of accuracy, the Confidence, which expresses a probability between 0 and 1 that the quoted errors do apply. This measure is used in the Sloan spectral service to provide a way of describing the estimated probability that the redshift is completely in error because the lines have been misidentified. Its default value is 1.0.
In general, such a Confidence could be useful for any measurement where the error probability distribution has multiple peaks in parameter space, and could later be added to the standard Accuracy model.
Note that there are two other redshifts in our model: the Target redshift, a useful piece of metadata particularly for extragalactic objects, considered as an externally known property of the target (and so defined even if no lines are visible in the spectrum); and the SpectralFrame redshift, used only if a "rest frame" spectrum is presented and representing the assumed redshift used to shift the spectrum.
The variability amplitude field allows data providers to supply a characteristic amplitude (a precise value is not required). It is dimensionless; a value of 0.2 implies a 20 percent variation around the mean value.
The Curation is an object consistent with the Curation information in the document "Resource Metadata for the Virtual Observatory Version 1.01", although some of the fields from RM curation have been moved to the DataID object, as discussed in the SSAP protocol document.
In Curation, we have added a Reference field for a bibliographic or documentation reference (this can occur multiple times), a Rights field (same as Resource.Rights) for public/proprietary, and a PublisherDID field for a publisher-specified IVORN to the data. The Curation.PublisherDID is the same as the Resource Metadata V1.10 Resource.Identifier.
Version is provided by the publisher or creator and may be any string.
Curation.Publisher is REQUIRED. All other fields are optional.
The Data Identification model gives the dataset ID for a particular spectrum, and its membership of larger collections. All DataId fields are optional.
There are three dataset identifiers in the model: one under Curation and two here. All of them are ivo: URIs as specified by the IVOA.
The DataID.CreatorDID is the dataset ID defined internally by the creator and may be entirely different from the DatasetID described above. It is used to identify a particular original exposure in an archive and will not necessarily change even if the VO object in question is a cutout or is otherwise further processed.
The Curation.PublisherDID is a dataset ID defined by a publisher of the data. It may be an internal ID used by the archive.
The DataID.DatasetID may be the same as Curation.PublisherDID; for this field we recommend a journal-based URI such as the IVOA/ADEC/ADS dataset identifier. By agreement between the AAS journals, the ADS and the ADEC (NASA data centers), dataset identifiers, described in http://vo.ads.harvard.edu/dv/ , will be used to link journal articles back to the archival datasets containing the relevant observational data. If analogous but independent systems of URI designation are later adopted by other centers (e.g. by European journals) and accepted by IVOA, they will be suitable in this field.
For example, a dataset held by an archive which curates many missions and telescopes may have an ID allocated by the original mission (CreatorDID), an ID used as an index by the multi-mission archive (PublisherDID), and the ADS-style ID (DatasetID). These may all be different, although we hope that many archives will choose to use the ADS ID as their index.
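The three identifiers can be pictured as a small record, sketched here in Python. The class name, the attribute names and the helper method are hypothetical, and the identifier strings are invented placeholders, not real dataset identifiers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetIdentifiers:
    creator_did: Optional[str] = None    # DataID.CreatorDID
    publisher_did: Optional[str] = None  # Curation.PublisherDID
    dataset_id: Optional[str] = None     # DataID.DatasetID

    def non_ivo_fields(self) -> List[str]:
        """Names of populated fields that are not 'ivo:' URIs."""
        return [name for name, value in vars(self).items()
                if value is not None and not value.startswith("ivo:")]


# Placeholder values for a dataset held by a multi-mission archive.
ids = DatasetIdentifiers(
    creator_did="ivo://example.mission/obs#0001",     # from the mission
    publisher_did="ivo://example.archive/index#42",   # archive index
    dataset_id="ivo://example.journal/dataset#A1",    # journal-style ID
)
```

All three values may differ, as in this placeholder record, or an archive may choose to reuse one identifier in more than one field.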
We introduce the concept of a dataset creation type, which can have one of the following values, described in more detail in the SSAP protocol document.
The dataset is associated with one or more Collections (instrument name, survey name, etc.) indicating some degree of compatibility with other datasets sharing the same Collection properties. Examples of possible Collection values are: "WFC", "Sloan", "BFS Spectrograph", "MSX Galactic Plane Survey".
We also include a DataID.Bandpass, which is a string describing the spectral range. It can be one of the strings in Resource-Service-Metadata's Spectral.Coverage (e.g. "Optical") or Spectral.Coverage.Bandpass (e.g. "B"). At the moment there is no fixed list of values for the RSM Spectral.Coverage.Bandpass.
For DataSource, see the SSAP protocol document.
In spectral data it is particularly important to be able to specify the target of the observation, which may be an astronomical source or some other target (calibration, diffuse background, etc.). By explicitly including a target model we can not only facilitate searches on particular types of target but also support archives of model spectra for which the Coverage fields may not be relevant. The Target.Name field is required; all other Target fields are optional.
The Target.pos field gives a nominal RA and Dec for the target, for example the catalog position of the source; the Coverage.Location fields in the spectrum indicate the actual telescope pointing position for that spectrum. (An SED might have a single Target object with a known position, but many Spectrum objects with slightly different telescope pointings). Similarly, the Target.redshift is the assumed actual redshift of the astronomical object, if applicable (again, usually from a catalog, NED, etc.), while the redshifts in the Derived objects in the spectrum (segment) indicate a redshift measured from that spectrum. The Target.redshift is normally used to store the cosmological redshift of extragalactic objects, although it may also be used to store the observed redshift of Galactic sources if that information is felt by the data provider to be useful.
At the moment there is no international standard list of valid values for Target class and spectral class. Nevertheless, an initial deployment of the VO would gain some benefit from using archive-specific classes, and would provide a framework for converging on a standard list.
The Spectrum object contains the Data object with the actual data; the Target and Derived objects; and the standard dataset metadata of CoordSys, Characterization, Curation and DataID. We also add a CustomParams field to allow for propagation of unmodelled application-specific metadata.
In addition, we add an SIDim field for each axis giving the SI units of the values in the Osuna-Salgado dimensional format.
In spectral associations (such as SED applications), the Spectrum model is reused for both Spectrum and TimeSeries and is renamed Segment. The Spectrum object is expected to be generalized to a higher level Dataset object.
Each Spectrum (or Segment) may have a Length attribute giving the number of flux points in the data (in some serializations this value is deduced from the size of the data arrays, while in others it is made explicit).
Each Spectrum (or Segment) may also have a Type attribute indicating whether the data is intended as a TimeSeries (data are same spectral coord, varying times), Photometry (data are different spectral coords with irregular gaps), Spectrum (data are different spectral coords in contiguous bins), or Mixed (some mixture of the above).
This attribute is optional and defaults to Spectrum.
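The Type attribute and its default can be sketched in Python as an enumeration. The class and function names are hypothetical; treating an empty string like a missing attribute is an assumption of the sketch.

```python
from enum import Enum
from typing import Optional


class SegmentType(Enum):
    TIME_SERIES = "TimeSeries"  # same spectral coord, varying times
    PHOTOMETRY = "Photometry"   # different spectral coords, irregular gaps
    SPECTRUM = "Spectrum"       # different spectral coords, contiguous bins
    MIXED = "Mixed"             # some mixture of the above


def resolve_type(value: Optional[str]) -> SegmentType:
    """Map a serialized Type string to SegmentType, defaulting to Spectrum."""
    if value is None or not value.strip():
        return SegmentType.SPECTRUM
    # Enum lookup by value raises ValueError for unrecognized strings.
    return SegmentType(value.strip())
```

A reader that finds no Type attribute therefore treats the data as a Spectrum.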
Segments are discussed in more detail in the Spectral Associations document which describes SEDs and other groupings.
The Spectrum model involves objects addressed by the proposed VO Observation and Quantity data models. Although these models have not yet been fully worked out, we may note that a single Spectrum maps to the Observation model, which will include the Curation and Characterization objects. The Flux and the spectral coordinate entries together with their associated errors and quality will be special cases of the Quantity model, as will the simpler individual parameters. The field structure presented here is consistent with current drafts of the models.
The model and serializations defined in this document are extensible in the following sense:
Greisen, E. W., Valdes, F. G., Calabretta, M. R. and Allen, S. L., 2006, A&A 446, 747.
Hanisch, R. (ed.), Resource Metadata for the VO, Version 1.01, 2004 Apr 26.
http://www.ivoa.net/Documents/latest/RM.html
Derriere, S. et al (eds.), UCD, Moving to UCD 1+, 2004 Oct 26.
http://www.ivoa.net/Documents/latest/UCD.html
In the following XML schema, we implement the model fairly directly.
Within a spectrum the data points are kept together in objects called Point.
Also, we have included a CustomParams element to allow site-specific metadata to be added.
The Coverage.Location fields have been collapsed to simple values rather than SEDCoord elements; this should perhaps be extended in a future version.
The Flux object is defined as an example of a more general SEDQuantity object, which is also used for the Sloan spectral service's redshift information.
A SED aggregation model is also included in the schema, as the top level element. This may be ignored until the SED model has been approved by IVOA.