|
|
International Virtual Observatory Alliance |
MOC – HEALPix Multi-Order
Coverage map
Version 1.0
IVOA Note 12th April 2012
Previous version(s):
None
Authors:
Thomas Boch
Tom Donaldson
Pierre Fernique
Wil O'Mullane
Martin Reinecke
Mark Taylor
Editor:
Pierre Fernique
![]()
Abstract
This note describes a simple and
powerful method to specify sky regions. The goal is to have a way for providing
very fast comparisons and data access methods. The principle is based on HEALPix sky tessellation. It boils down to defining a list
of sky cells, grouped hierarchically. This method has been tested and validated
in two tools: Aladin and Topcat,
and in a front-end tool for VizieR catalogs.
Status of This Document
This is an IVOA Note. This is the first release of this document.
This is an IVOA Note expressing suggestions from and opinions of
the authors. It
is intended to
share best practices, possible approaches, or other perspectives
on
interoperability with the Virtual Observatory. The document describes a
standard created by IVOA partners and is not a formal IVOA standard.
A list of current IVOA Recommendations and other technical documents can be found at http://www.ivoa.net/Documents/.
The authors would like to thank all HEALPix
developers for their great help in this study.
Contents
Table of Contents
Acknowledgements.................................................................................................... 2
1 Introduction................................................................................................................ 3
1.1 The idea................................................................................................................. 3
1.2 Examples............................................................................................................... 4
1.3 HEALPix tessellation........................................................................................... 5
1.3.1 Equal areas....................................................................................................... 5
1.3.2 Computing time................................................................................................. 5
1.3.3 Accuracy........................................................................................................... 5
1.3.4 Standard............................................................................................................ 5
1.4 Examples of use cases....................................................................................... 5
1.4.1 Visualize the coverage...................................................................................... 6
1.4.2 Compare the coverage of multiple data sets.................................................... 6
1.4.3 Remote cross-match (TOPCat use case)........................................................ 6
1.4.4 Multi-service positional search.......................................................................... 6
1.5 Scope and performance..................................................................................... 7
1.5.1 Scope of usage................................................................................................. 7
1.5.2 MOC resolution................................................................................................. 7
1.5.3 Size of MOC..................................................................................................... 8
1.5.4 MOC performance............................................................................................ 8
2 MOC encoding........................................................................................................... 9
2.1 MOC specificity..................................................................................................... 9
2.1.1 Celestial reference............................................................................................ 9
2.1.2 NESTED numbering......................................................................................... 9
2.1.3 Well-formed MOC encoding.......................................................................... 10
2.1.4 Sky coordinates and HEALPix cells............................................................... 10
2.2 Two MOC encoding formats............................................................................ 10
2.2.1 JSON MOC.................................................................................................... 11
2.2.2 FITS MOC....................................................................................................... 11
2.3 Existing implementations and libraries......................................................... 12
2.4 Links to VO registry............................................................................................ 12
References................................................................................................................. 13
Appendix A – MOC size vs MOC resolution......................................................... 15
Many VO protocols are based on simple cone queries
(CONE-SEARCH, SIA, SSA ...). This is mainly due to the fact that most
astronomical databases have been built for answering such requests
efficiently. But when the queries concern more complex sky regions (rectangles,
polygons, ...) the methods used are diverse, difficult
to compare or not interoperable, often complex and usually slow. This fact inevitably hampers the
development of VO tools for global comparisons, such as generic catalogs
cross-match, computation of data set intersections, etc..
In fact, the VO has no efficient method to correlate two data sets as soon as
they are not integrated in the same data base.
The method we describe in this paper attempts to influence this situation by providing a basis for
unique, simple and effective comparison of sky coverage data or “footprints”. We called it ′Multi-Order Coverage Map′ or ′MOC′.
This method has been tested
and validated in two of the main tools of the VO: Aladin
and Topcat. It has been used to describe all
available data sets provided by CDS (all VizieR
tables and Simbad). Such a system shared by
several data centers, or even being accepted as a standard at the IVOA level,
would significantly improve in the short term a lot of VO tools.
The goal is to get a spherical geometry system for sky regions providing
very fast and accurate union, interection and
equality operations. For this, the key idea is to base such a system on a
regular and hierarchical partitioning of the sphere.
We opted for the HEALPix tessalisation
[1] that divides the sphere into 12 diamonds, each of them sub-divided into 4
diamonds recursively. Thus the sphere at level 1 will consist of 48 diamonds,
192 diamonds at level 2, 768 at level 3 and so on.
A region of the sky will be determined by the list of the diamonds involved.
Union, intersection and equality on such lists are straighforward
and particularly fast if this list has been previously sorted.
To reduce the volume of such lists, four consecutive diamonds will be replaced
by their "parent", recursively. The deepest level will be defined by
the resolution required for describing the data set.

In order to illustrate the MOC usage, we represent in this
illustration an excerpt of GLIMPSE survey and its corresponding MOC map
GALEX AIS MOC (HEALPix order 8 – 70,000 cells) and SDSS (HEALPix order 10,- 225,000 cells):

Defining a sky region by a subset of regular sky tessellation is not a new idea. The advantage of sharing a homogeneous tessellation method has already been described in several articles, notably in the VO context [2], [3].
There are several methods of sphere partitioning. The three most commonly
used in astronomy are QBOX, HTM and HEALPix.
QBOX is projecting the sphere onto the 6 faces of a cube, each of them being
recursively subdivided into 4 squares. This method is the basis of the indexing
system of VizieR [5]. HTM is splitting the sphere into 8
triangles (octahedron), each of them being recursively subdivided into 4
triangles [4]. This method is notably used by the SLOAN database.
A comparison
between HTM and HEALPix has been already presented in
[4] in 2001.
By construction, HEALPix consists of diamonds with equal spherical surfaces. Thus the area of a region is trivial to compute. This is not the case either for QBOX or HTM for which the surface computation is a complex and necessarily slow task compared to HEALPix.
Unlike HTM, HEALPix has the peculiarity that the computing time
does not depend on the hierarchical level. Conversely HTM requires the
knowledge of the “parent” cell to determine the four sub-cells. It means that
in HEALPix the computation time is constant.
The best implementation is presently at 0.3 arcsec for HTM (level 20) [3], at 75 arcsec for QBOX [5]HEALPix provides a library which allows the calculation accuracy to 0.4 mas (level 29). [9]
There exist many HEALPix libraries: C++, Java, Fortran, IDL ... Also, there is a standard for encoding the HEALPix maps into a simple FITS file. And HEALPix was selected for several missions such as WMAP, Planck and GAIA soon. The HEALPix site is hosted at JPL [6].
These four points provide good reasons to choose HEALPix.
Such a system has a lot of potential use cases. We describe here the four main use cases for which there are already prototype implementations.
First of all, the MOC structure allows to visualize easily any data set coverage. The MOC meshing nature provides an easy interpretation. With a simple 4 diamond corner drawing approximation such as shown in illustration 1.2, the time required for drawing such coverage is typically a few milliseconds. If necessary, thanks to its hierarchical structure this drawing time can be limited by stopping the drawing process at a coarser level than the best resolution.
The computation of data set intersections is trivial (simple list comparisons). The result of such operations can be used in further operations. For instance it is straightforward to compute the sky region determined by the intersection of two MOC surveys, and after that, query a data base such as Simbad for retrieving objects inside this intersection. Further, we can imagine that the data base uses itself an internal spatial index based on HEALPix. In this case all intermediate sky computations will be removed and the response time will be the best that we can imagine.
One popular feature in TOPCAT [7] is the multiple cone search, which allows one to perform a remote positional cross-match between a local table and a remote one (accessible through the CS protocol). If the remote table has a sparse coverage, TOPCAT will fire up unnecessary queries for positions not covered by the table. Performance can be improved by providing TOPCAT with a way to retrieve the global coverage of the queried table.
An attractive if simple form of virtual observatory user query is to request all records from the whole VO at a given sky position. While this is in principle possible by dispatching one narrow positional query to every registered Cone Search, SSA or SIA service, in practice the number of queries required leads to an unacceptable load on both clients and services, and moreover most of these queries will deliver no results because most services lack coverage in the queried region. This "All-VO" search capability was provided in early versions of the AstroGrid AstroScope tool, and subsequently deprecated because of these resource usage issues. If footprint information was available for all registered services, only those services with coverage in the region of interest, and hence which might provide a non-empty result set, could be easily identified, providing a great reduction in the number of service queries needed for a comprehensive search. MOC offers the opportunity to provide this coverage information in a uniform way and to store it locally or centrally in a way which supports querying coverage at a given point for a large number of services.
MOC is typically designed for defining coverage regions of data sets, notably pixel surveys or source catalogs.
Even though MOC allows to describe regions on the sky, it is not designed to accurately define regions. For this use case, STC is better suited[1]. For instance, MOC is clearly not designed to draw instrument fields of view or such kind of overlays.
Conversely, the comparison of two regions will definitely be easier and
faster using MOC rather than STC. The main STC issue for comparisons is due to
the fact that there is no canonical way of expressing a region. Thus, a basic
equality test with STC may raise a complex spherical geometry problem. And this
kind of problem can be really difficult to implement in a
software. Conversely, a MOC region is always described in one way – a
list of cells – comparison algorithms are simple and fast.
The MOC resolution is determined by the greatest HEALPix level used for defining the region. It depends on
the accuracy chosen for the region definition.
As data sets do not follow the HEALPix cell
borders, a MOC is generally an upper-approximation of the data set coverage. The quality of this approximation will be directly
dependent on the chosen MOC resolution.
Note: There is no problem to compare two MOCs with
different resolutions.
This table provides the HEALPix cell angular
resolution for each HEALPix level.
|
Level |
Best cell resolution(*) |
|
Level |
Best cell resolution(*) |
|
0 |
58.63° |
|
15 |
6.442” |
|
1 |
29.32° |
|
16 |
3.221” |
|
2 |
14.66° |
|
17 |
1.61” |
|
3 |
7.329° |
|
18 |
805.2mas |
|
4 |
3.665° |
|
19 |
402.6mas |
|
5 |
1.832° |
|
20 |
201.3mas |
|
6 |
54.97' |
|
21 |
100.6mas |
|
7 |
27.48' |
|
22 |
50.32mas |
|
8 |
13.74' |
|
23 |
25.16mas |
|
9 |
6.871' |
|
24 |
12.58mas |
|
10 |
3.435' |
|
25 |
6.291mas |
|
11 |
1.718' |
|
26 |
3.145mas |
|
12 |
51.53” |
|
27 |
1.573mas |
|
13 |
25.77” |
|
28 |
786.3µas |
|
14 |
12.88” |
|
29 |
393.2µas |
(*) mean value. The HEALPix cell has
constant area, not constant linear dimensions.
The MOC describes a sky region as an explicit list of cells. The coding
size can vary a lot from a few bytes to several kilobytes and more if required.
As MOC is hierarchical, its size mainly depends of two factors:
1. The chosen MOC resolution;
2. The geometry of the region
(“compactness”).
Concretely, in case of MOCs describing pixel survey coverages,
the number of cells is mainly dependent on the size of the coverage perimeter
(size of borders).
In the case of MOCs describing source catalog coverages,
the number of cells is mainly dependent on the distribution and the density of
the catalog (for instance, for a catalog with a low density and a high
distribution, it is possible to choose a very good MOC resolution without
increasing significantly the MOC size – see appendix A).
The required MOC operations are classical set operations: equality, union, intersection... The algorithms for implementing this kind of operation are based on list manipulations. Good performance can be obtained by dichotomic algorithms on sorted lists.
To verify the power of such a system, a Java/TOMCAT prototype accessible via a Web browser has been developed. Its role is to load the 7,000 MOCs (VizieR catalog coverages) and wait for user requests.This prototype loads the MOCs at maximum level 8 corresponding to cells of 14 arcmin. At this level, the size of the 7,000 Mocs takes approximatively 90MB of memory without compression. The user request takes the form of an URL providing an unique HEALPix number. The goal is to retrieve the list of catalogs which intersect this cell. Similarly, it is possible to provide as parameter query, not a unique HEALPix cell, but a full MOC. The performances of such a prototype are provided in the table below. For instance less than 100 ms are required for retrieving the list of catalogs intersecting GALEX MOC.
|
2.5s |
for loading 7,000 VizieR MOCs (level 8) |
|
7ms |
for HEALPix query (amongst 7,000 VizieR tables, which tables are concerned by this cell?) |
|
90ms |
for MOC query (amongst 7,000 VizieR tables, which tables intersect this MOC (4000 cells)? |
|
30ms |
for generating MOC intersection (SDSS level 10 & GALEX level 8) |
(realized
on a basic laptop)
We can imagine that a data base using the same HEALPix tessellation for its spatial index will be very efficiently interoperable with a such method.
As HEALPix is already used for other purposes, it has its own vocabulary. A HEALPix level is called an ORDER and a cell number is called a NPIX.
HEALPix allows three celestial references: galactic, equatorial or ecliptic reference. Allowing various references would limit a lot the possibility to compare efficiently MOCs:there is no equivalence between HEALPix cell described in a reference and a cell, or a list of sub-cells expressed in another reference system. Consequently, the MOC definition is restricted to equatorial reference, and more precisely, ICRS reference. This choice has been motivated by the fact that most catalogs are natively provided with equatorial coordinates. And also by the fact that this restriction does not change anything for whole sky surveys (WMAP, PLANCK...)
Another restriction concerns the numbering scheme used in MOC for specifying the cell numbers. HEALPix supports both "NESTED" and "RING" numbering. It is possible to convert the cell numbers from one system to the other, but it takes time. For fast comparisons the MOC must use the NESTED numbering scheme only.
The NESTED HEALPIX numbering consists of enumerating all cells in a specific order. For instance, at level 1, there are 48 diamonds (12x4) enumerated from 0 to 47. In the NESTED numbering scheme, the 4 sub- diamonds of diamond M have the numbers: (Mx4)+3, (Mx4)+1, (Mx4)+2, (Mx4) in reading order. And reciprocally, the father number of diamond N is N/4.

Note: The level 0 is a special case, it contains only 12 diamonds enumerated from 0 to 11.
To keep the canonical property of MOC, the redundant cells are not allowed and the hierarchical encoding principle must be respected. Concretely, it is not allowed to encode 4 sibling cells instead of their parent cell (only at level 0 for the 12 first diamonds). By this simple constraint, there is one and only one way for describing a same region.
The way to determine which HEALPix cell contains such or such sky coordinates is described in the reference HEALPix article [1]. Several libraries already implement these algorithms [6]. For all tests concerning this note we have used the most recent java HEALPix library [9] written in the framework of the Gaia project.
These libraries are only required to generate or draw MOCs, and also to compare them with sky coordinates . But there is no need of these libraries for comparing MOCs.
In order to easily manipulate such HEALPix cell lists, ie list of NPIX, two complementary encoding formats are defined.
The first one is based on ASCII text string following the JSON convention [13] – easy to write, to read and to parse. Its main feature is its versatility rather than performance.
The second one is designed to be processed as quick as possible by a computer (no parsing procedure required, loading or writing in one fast step,...). It is an adaptation of the HEALPix standard for storing a classical HEALPix map as a FITS binary table.
A MOC “provider” must be able to deliver MOCs in both formats, a MOC “consumer” can only support one format.
A JSON MOC is written
as a simple ASCII stream following this syntax:
{ “order”:[npix,npix,...],
“order”:[npix, npix...],
... } .
The values must be fully sorted and the MOC must be well-formed.
Example of a JSON MOC
{“1“:[1,2,4], “2“:[12,13,14,21,23,25]}
The FITS MOC format consists of a single binary table in which each value merges both the level and the number of the cell according to the NUNIQ HEALPix coding method[2]. Depending on the maximum level, this table should be coded in 32-bits (maximum level less than 14) or 64-bits signed integers.
uniq = 4 * (4^order) + npix
The inverse operation is:
order = log2(uniq/4) /2
npix = uniq – 4 * (4^order)
The FITS values must be fully sorted and the MOC must be well-formed.
As the FITS MOC is derived from a more generic standard dedicated to HEALPix maps, the table FITS header must contains these following keywords/value.
PIXTYPE =
'HEALPIX ' / HEALPix magic code
ORDERING= 'NUNIQ ' / Order and NpixNest
coding method
COORDSYS= 'C ' / ICRS
reference frame
It must also contain the MOC dedicated keyword “HPXMOC” which provides the resolution of the MOC. This keyword can be used to distinguish a classical HEALPix map from a MOC.
Example of FITS MOC headers:
SIMPLE = T
BITPIX = 8
NAXIS = 0
EXTEND = T
END
XTENSION= 'BINTABLE' / HEALPix Multi Order Coverage map
BITPIX = 8
NAXIS = 2
NAXIS1 = 4
NAXIS2 = 16461
PCOUNT = 0
GCOUNT = 1
TFIELDS = 1
TFORM1 = '1J '
TTYPE1 = 'NPIX ' / HEALPix UNIQ pixel number
PIXTYPE = 'HEALPIX ' / HEALPix magic code
ORDERING= 'NUNIQ ' / Order and NpixNest coding method
COORDSYS= 'C ' / ICRS reference frame
HPXMOC = 12 / MOC resolution (best order)
END
Note:The FITS MOC is derived from the standard dedicated to HEALPix maps (partial definition ie a list a couples Cell_number/Pixel_value). However there are two specificities: 1- Only the cell number is stored, there is no associated pixel value. 2 – The HEALPix map concerns only one specific level. In FITS MOC, the level and the cell number are coded together.
Presently a Java library is available under GPL v3 license, supporting the two MOC encoding formats. It allows basic operations: equality test, union, intersection... [10].
1. Aladin tool [11] has implemented this java library in order to display MOCs on a background sky.
2. The TOPCAT and STILTS tools have implemented it for improving their multi-conesearch cross match facility. The cone corresponding to each source query is compared against the MOC of the service in order to consider only the sources present in the intersection of the data sets. This will reduce considerably the number of unnecessary queries.
3. A front-end of VizieR and Simbad has been deployed for providing the MOC associated to all VizieR tables, and Simbad data base – see next section.
We developed a web service getMoc (accessible through http://alasky.u-strasbg.fr/footprints/getMoc ) which delivers a MOC for a given cone search service. The cone search is identified by its base URL.
Example: http://alasky.u-strasbg.fr/footprints/getMoc?baseUrl=http%3A%2F%2Fvizier.u-strasbg.fr%2Fviz-bin%2Fvotable%2F-A%3F-source%3DIII%2F105%26 will provide the MOC file for the cone search service which base URL is http://vizier.u-strasbg.fr/viz-bin/votable/-A?-source=III/105
The getMoc
service can deliver MOCs for all 7,000+ VizieR cone
search services, and also for a couple of other services (SDSS DR8, UKIDSS DR7
provided by the Royal Observatory, Edinburgh)
This service is currently used by TOPCAT when performing multiple cone search,
in order to reduce the number of remote queries.
This mechanism is not limited to cone search and could be extended for other
kind of data (images or spectra data for instance).
In the mid-term, we advocate an extension of the registry schema which would
allow, for a given registry resource, to point to a MOC file describing the
resource coverage.
[1] Gorski K. et al ., HEALPix: A Framework for High-Resolution
Discretization and Fast Analysis of Data Distributed on the Sphere, The
Astrophysical Journal, Volume 622, Issue 2, pp. 759-771. 2005, 2005ApJ...622..759G
[2] O'Mullane, Gorski K., Sky, Indexation, Pixelization,
and the VO – Astronomical Data Analysis Software and Systems XIV – 347, 3 –
2005 2005ASPC..347....3O
[3] Budavàri T. et al., Searchable
sky coverage of astronomical observations: footprints and exposures -
Astronomical Society of the Pacific, 122, 1375 – 2010 2010PASP..122.1375B
[4] O'Mullane
et al., Splitting the
Sky - HTM and HEALPix, Proceedings of the MPA/ESO/MPE Workshop, Garching,
638 – 2001 2001misk.conf..638O
[5] VizieR QBOX , Ochsenbein F. - http://cdsarc.u-strasbg.fr/doc/man//qbox.htx
[6] HEALPix Web site (JPL) – http://healpix.jpl.nasa.gov/
[7] Taylor M. B., "TOPCAT
& STIL: Starlink Table/VOTable
Processing Software", Astronomical Data Analysis Software and Systems
XIV, 347, p29, 2005 2005ASPC..347...29T
[8] Space-Time Coordinate Metadata
for the Virtual Observatory - http://www.ivoa.net/Documents/latest/STC.html
[9] Hivon
E., Goski K., Reinecke M.
et al. - HEALPix java library - http://sourceforge.net/projects/healpix/
[10] Fernique
P., Thomas B. - MOC java library - http://cds.u-strasbg.fr/resources/lib/exe/fetch.php?media=mocsrc.jar
[11] Bonnarel
F., Fernique P. et al., The ALADIN interactive sky
atlas. A reference tool for identification of astronomical sources - Astron. Astrophys., Suppl. Ser., 143, 33-40 (2000) - April(I) 2000. 2000A&AS..143...33B
[12] Boch T. - CDS MOC server – http://alasky.u-strasbg.fr/footprints/list/
[13]
JavaScript Object Notation –
RFC 4627
The MOC size of source
catalogs depends of the density and the distribution of these sources over the
sky. We provide in this appendix four illustrating examples (Bright Star, Tycho, GCP, WISE) for which the MOCs are provided at resolution 6,7,8
and 9.
·
Bright Star Catalog
[V/50]: 9110 sources
|
Order |
Number of cells |
Increasing factor |
|
|
6 |
7939 |
|
|
|
7 |
8630 |
x 1.09 |
|
|
8 |
8842 |
x 1.02 |
|
|
9 |
8934 |
x 1.01 |
|
·
General Catalog of
Photometric Data [II/167]:165 896 sources
|
Order |
Number of cells |
Increasing factor |
|
|
6 |
19998 |
|
|
|
7 |
65616 |
x 3.28 |
|
|
8 |
99879 |
x 1.52 |
|
·
Tycho [I/239]:1 058 332
sources
|
Order |
Number of cells |
Increasing factor |
|
|
6 |
783 |
|
|
|
7 |
32320 |
x 41.27 |
|
|
8 |
347064 |
x 10.74 |
|
|
9 |
790483 |
x 2.28 |
|
WISE preliminary data
catalog [II/307]: 257 310 278 sources
|
Order |
Number of cells |
Increasing factor |
|
|
6 |
1322 |
|
|
|
7 |
2845 |
x 2.15 |
|
|
8 |
5709 |
x 2 |
|
|
9 |
12231 |
x 2.32 |
|