Region restrictions

Kona Andrews kea at roe.ac.uk
Fri Nov 3 02:21:26 PST 2006


Dear colleagues,

I am appending below some detailed feedback from my colleague
Jeff Lusted, whose advice I sought on the subject of the Region 
construct. 

His feedback reflects a lot of practical experience with the
issues of intertranslating ADQL/xml, ADQL/sql and SQL, and
the practical issues involved in constructing a graphical
Query Builder for ADQL, so I hope it will be helpful.

Message from Jeff follows.
------------------------------------------

My initial reaction, after having read the mails you sent me, is to
stick with the Region construct as it stands, i.e. not as a separate
clause. I'll go along with most of what Benjamin says, especially on not
deciding the standard on implementation issues/specials.

But these things need to be qualified!

The problem with Region is that everything I've seen about Region so far
gives no details of what table(s) is/are involved or what columns will
be involved in the search! Tables have to be inferred from the context,
and columns by the implementation. I suspect the implementation will
decide whether this is healpix, htm or plain ra & dec. But if the
latter, what if there is more than one column that can be construed as
ra (or dec)? As for tables, I assume that under the current Region spec
the region will be applied to each table in the query.

There is also the complication of whether such a separate Region clause
could be included in sub-queries, which after all are simply other forms
of select.
 
If the Region is not to be some super-filter clause, then a table (or 
tables!) need to be specified, perhaps with an optional indication of 
columns. Something along the lines of...

t.Region( 'Circle etc ', t.ra, t.dec )
or
t.Region( 'Circle etc ', t.htm )

in BNF...

 <region_predicate> ::=
    <table_reference>
    <dot>
    REGION 
    <left_paren>
    <region_specification> 
    [ <comma>
      <column_reference>[ { <comma> <column_reference> }... ] ]
    <right_paren>
 
Perhaps the column specification might be too much for some. I'm easy,
but it might be really handy to specify columns to be used in certain
situations.
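
To make the proposed predicate concrete, here is a minimal Python sketch
that recognises the table-qualified form above and pulls out the table,
the region string and the optional column list. The names
RegionPredicate and parse_region_predicate, and the regular expression
itself, are illustrative assumptions; they are not part of any ADQL
draft or existing parser.

```python
import re
from typing import List, NamedTuple


class RegionPredicate(NamedTuple):
    table: str          # the <table_reference> before the dot
    region: str         # the quoted <region_specification>
    columns: List[str]  # optional <column_reference> list, possibly empty


# Hypothetical pattern for: table . REGION ( 'spec' [, col [, col ...]] )
_PREDICATE = re.compile(
    r"""^\s*(?P<table>\w+)\s*\.\s*REGION\s*\(\s*
        '(?P<region>[^']*)'\s*
        (?P<cols>(?:,\s*\w+(?:\.\w+)?\s*)*)
        \)\s*$""",
    re.IGNORECASE | re.VERBOSE,
)


def parse_region_predicate(text: str) -> RegionPredicate:
    """Split a table-qualified Region predicate into its parts."""
    match = _PREDICATE.match(text)
    if match is None:
        raise ValueError(f"not a region predicate: {text!r}")
    columns = [c.strip() for c in match.group("cols").split(",") if c.strip()]
    return RegionPredicate(
        match.group("table"), match.group("region").strip(), columns
    )
```

With the columns omitted the list is simply empty, which leaves the
choice of search columns to the implementation, as discussed above.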

A better shorthand might be to include the list of tables within the
predicate (see example regarding derived table later).

Having a Region clause as some super-filter presupposes that the query
is only interested in one region of the sky. I think this may well be
the case but not for certain. It may be, for instance, that a query is
interested in comparing stars of a certain character between different
regions. 

Another complication in this area is that the spec (STC schema) allows
for semantics against one or more regions, but presumably against the
same "set" of tables. Thus a Region can quote another region inside it.
There is for instance a Region of unionType that is the union of two or
more regions. This is contained wholly within the Region construct in
the schema (which takes some imagination).

Sorry about the above points. They do need serious thought. A possible
compromise (worth exploring to see whether it resolves some of the
conundrums) is to allow for derived tables. Currently we do not seem to
support this idea within adql/s or adql/x as far as I can see. But it is
there in SQL92! So, here is the BNF for a table reference...


<table_reference> ::=
      <table_name> [ [ AS ] <correlation_name>
          [ <left_paren> <derived_column_list> <right_paren> ] ]
    | <derived_table> [ AS ] <correlation_name>
          [ <left_paren> <derived_column_list> <right_paren> ]
    | <joined_table>

where derived table is defined as ....

<derived_table> ::= <table_subquery>

It seems to me as if region is supposed to be some form of special
derived table, which obviously acts as a form of filter. So, the table
subquery might look like...

Select *
From Region( 'Circle etc', catalogue_A ) as Circle_A,
     Region( 'Ellipse etc', catalogue_B ) as Circle_B
     Where
          Circle_A.Q_Z_ABS >= 0.975 ;


I think this could be thought through and made to work. It may even be
possible to quote one region inside another if the spec is carefully
worked out. I suppose if you take this seriously, Region must be thought
of as a search condition and not as a completely separate clause. The
latter might be easier for the easy cases, but I think lacks
flexibility.
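
One way to read "Region as a special derived table" is that each
Region( ..., table ) reference could be expanded into an ordinary SQL92
derived table before the query is run. The sketch below shows only that
expansion step. The function name expand_region_table is hypothetical,
and the filter predicate is supplied by the caller, because how a region
string maps onto columns (ra/dec, htm, healpix) is exactly the open
question raised earlier.

```python
def expand_region_table(table: str, alias: str, predicate: str) -> str:
    """Render a Region(...) table reference as a plain SQL92 derived table.

    `predicate` is whatever search condition the implementation chooses
    to represent the region; this sketch does not attempt to derive it.
    """
    return f"(SELECT * FROM {table} WHERE {predicate}) AS {alias}"


# Example with a made-up bounding condition standing in for the region.
fragment = expand_region_table(
    "catalogue_A", "Circle_A", "ra BETWEEN 181.0 AND 182.0"
)
```

Because the result is itself a table reference, two such fragments can
sit side by side in one From clause, which is what allows a single query
to address more than one region.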

One final point. I don't think in the long run the string-based
semantics will work (e.g. 'Circle J2000 181.3 -0.76 6.5' ). Given the
complexity, we need to break things down into the constituent parts as
the STC schema attempts to do.
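
As a rough illustration of breaking such a string into constituent
parts, the Python sketch below turns the example into a typed record.
The field names and their order (frame, two coordinates, radius) are
assumptions read off the example string; units are not stated there and
are not assumed here. The Circle class is a stand-in for illustration,
not the STC schema's own type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Circle:
    frame: str     # assumed: coordinate frame token, e.g. "J2000"
    lon: float     # assumed: first coordinate (e.g. ra)
    lat: float     # assumed: second coordinate (e.g. dec)
    radius: float  # assumed: circle radius, units unspecified


def parse_circle(spec: str) -> Circle:
    """Parse a 'Circle <frame> <lon> <lat> <radius>' string."""
    parts = spec.split()
    if len(parts) != 5 or parts[0].lower() != "circle":
        raise ValueError(f"unsupported region string: {spec!r}")
    _, frame, lon, lat, radius = parts
    return Circle(frame, float(lon), float(lat), float(radius))


circle = parse_circle("Circle J2000 181.3 -0.76 6.5")
```

Once the parts are explicit, each can be validated and translated on its
own, which a single opaque string does not allow.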


-- 
Kona Andrews        kea at roe.ac.uk
AstroGrid Project   http://www.astrogrid.org
IfA, Royal Observatory, Blackford Hill, Edinburgh EH9 3HJ


