More on REGION implementation

Kona Andrews kea at roe.ac.uk
Thu Mar 1 06:09:55 PST 2007


Dear colleagues,

A few more comments on the implementation of REGION.

At the moment, our DataSet Access component does not support queries using
the REGION construct.  However, I would hope to support it in the DSA 
back-end using the same method by which we implement Cone Search (i.e. a 
CIRCLE REGION), where possible.

When a Cone Search request comes in, it is translated into ADQL/sql by 
populating a template string.  The template contains the basic ADQL/sql 
circle-region query, together with some placeholder terms that are replaced
by the appropriate values/expressions.  The replacement is carried out 
using simple string substitution from within Java, using values
calculated by the Java code.

For small-radius queries we use a haversine formula, and for larger-radius 
queries we use a greatcircle formula.  For each of these two cases, we
have four separate templates covering all possible combinations of column 
units and database trig units (radians/degrees).   This gives 8 templates
in total.  While having 8 different templates is a bit of a pain 
maintenance-wise, hopefully the templates don't actually *need* maintenance
once correctly defined, and the java code is a bit more streamlined as a
result.  However, the units issue could in principle be dealt with in 
Java, instead of by multiple templates, if that was preferred.

For example, this is the template for a haversine query where the column 
units are in radians and the DBMS trig functions are in degrees:
-----------------------------

SELECT * FROM INSERT_NAME_TABLE AS a WHERE
   (
      (a."INSERT_NAME_DEC" <= INSERT_VALUE_MAX_DEC_RAD)  AND
      (a."INSERT_NAME_DEC" >= INSERT_VALUE_MIN_DEC_RAD)
   )
   INSERT_RA_CLIP_CONDITION
   AND
   (
      (2.0 * ASIN(SQRT(
         POWER(SIN(((DEGREES(a."INSERT_NAME_DEC") -
               INSERT_VALUE_DEC_DEG) / 2.0)), 2)  +
           (
              (COS(DEGREES(a."INSERT_NAME_DEC") )  *
                 (
                    COS(INSERT_VALUE_DEC_DEG)  *
                    POWER(SIN(((DEGREES(a."INSERT_NAME_RA") -
                        INSERT_VALUE_RA_DEG) / 2.0)), 2)
                 )
              )
           )
      ) ) )
      < INSERT_VALUE_CIRCRADIUS_DEG
  );

-----------------------------
All the terms that begin with INSERT_ are substituted by appropriate
values using string substitution in the Java code;   the term 
INSERT_RA_CLIP_CONDITION is replaced by an entire expression representing 
an appropriate boxcut constraint on RA to make the search more efficient.

Where possible, we would hope to implement other conditions in a similar
way (obviously, using templates expressing just the necessary ADQL/s WHERE 
clauses rather than entire queries).  This approach wouldn't work for arbitrary 
user-defined polygons, but for simple shapes with fixed geometry it is 
reasonably effective.

I would concur with previous comments that it is important to be able to
specify explicitly which columns should be used in the region calculation,
to avoid ambiguity.

Note that as the above example shows, we are not explicitly dealing with
the question of the coordinate system(s) of the input query and the
database columns;  we assume that the coordinate system is the same for
both.  This assumption may not be correct, which is one of the reasons
why we are seeking clarification on the issue of default coordinate
systems in the VO at large.

All the best,
Kona
-- 
Kona Andrews        kea at roe.ac.uk
AstroGrid Project   http://www.astrogrid.org
IfA, Royal Observatory, Blackford Hill, Edinburgh EH9 3HJ



More information about the voql-teg mailing list