separating TAP and query languages
dtody at nrao.edu
Mon Feb 23 09:48:04 PST 2009
We have discussed this issue many times within NVO over the past
couple of years. We plan some advanced applications (cross match
portal, CASJOBs-like DBMS query portal) which will require ADQL and
eventually all the Grid functionality as well (UWS, VOSI, integrated
VOSpace, SSO authentication).
However, while it is not difficult to munge a basic SELECT statement
from ADQL to whatever the local DBMS requires, the spatial query
extensions have only recently been defined and have hardly even
been prototyped at this point, or used for real-world catalog
queries and cross-matches. My guess is that this technology will
not be generally usable and mature until somewhere around ADQL 2.0.
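To illustrate the easy half of that contrast, here is a minimal sketch of the kind of munging meant above: rewriting ADQL's "SELECT TOP n" into the trailing "LIMIT n" form. It assumes a PostgreSQL- or MySQL-style backend. The function name, table, and columns are hypothetical, and it deliberately ignores subqueries, string literals, and all of the spatial extensions.

```python
import re

# Matches a leading "SELECT [ALL|DISTINCT] TOP n " (ADQL puts TOP after the quantifier).
_TOP = re.compile(r"^\s*SELECT\s+(ALL\s+|DISTINCT\s+)?TOP\s+(\d+)\s+", re.IGNORECASE)


def adql_top_to_limit(adql: str) -> str:
    """Rewrite a leading ADQL 'SELECT TOP n' as a trailing 'LIMIT n'.

    Hypothetical helper: assumes the target DBMS accepts LIMIT and that the
    query is a single flat SELECT with no subqueries.
    """
    query = adql.strip().rstrip(";")
    m = _TOP.match(query)
    if m is None:
        # No TOP clause: pass the statement through unchanged.
        return query
    quantifier = (m.group(1) or "").strip().upper()
    head = f"SELECT {quantifier} " if quantifier else "SELECT "
    return f"{head}{query[m.end():]} LIMIT {m.group(2)}"


if __name__ == "__main__":
    print(adql_top_to_limit("SELECT TOP 10 ra, dec FROM stars WHERE mag < 15"))
```

Nothing comparable exists for the spatial functions, which have no one-line textual equivalent in most local DBMSs. That is the hard half.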
Similarly VOSpace (required for the CASJOBs use case at least) has not
been used seriously yet and in any case will undergo major revisions
in the process of moving from SOAP to REST. Again, serious prototyping
is needed and it will probably not be until VOSpace 2.0 that this
technology is mature. Integration of VOSpace with services like TAP
is something we are only starting to look at. UWS and SSO are just
coming on the scene as well.
Realistically it will take another year or two for this technology to
be prototyped in real applications, mature, go through the standards
process, and be sufficiently deployed in the community to be useful.
Given the complexity, the only way this can be done is if the major
projects provide ready-to-use toolkits (we are all working on them,
of course); however, these will also take time to be fully developed.
Hence, at least within NVO, we want something that addresses the most
common use cases and that we can use to build real applications for
our user community while all this advanced technology evolves. Roy's
"multicone" use case is part of this (part of param query, that is).
A multicone type capability combined with one or two simple range
restrictions on table fields is sufficient for the first stage of
most catalog cross match use cases, and can be refined further on the
client side with more sophisticated algorithms. ADQL is not required,
and the simpler param query (PQ) interface and higher level of
abstraction make it fairly easy to implement a robust capability.
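As a rough sketch of what a multicone request plus a single range restriction amounts to, the following brute-force Python version checks every catalog row against every input cone. It is not the service interface. The function names, the row layout (dicts with "ra" and "dec" in degrees), and the "mag" field are all assumptions made for illustration, and a real service would rely on the DBMS's spatial indexing rather than a double loop.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine form, stable for small angles)."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(a))))


def multicone(cones, catalog, field=None, lo=-math.inf, hi=math.inf):
    """Return (cone_index, row) pairs for rows that fall inside each cone.

    cones   : iterable of (ra_deg, dec_deg, radius_deg) tuples
    catalog : iterable of dict rows with "ra" and "dec" keys (hypothetical layout)
    field   : optional column name restricted to the closed range [lo, hi]
    """
    rows = list(catalog)
    matches = []
    for i, (ra, dec, radius) in enumerate(cones):
        for row in rows:
            # Cheap range restriction first, then the positional test.
            if field is not None and not (lo <= row[field] <= hi):
                continue
            if angular_separation_deg(ra, dec, row["ra"], row["dec"]) <= radius:
                matches.append((i, row))
    return matches
```

The (cone_index, row) pairs are the first-stage candidate list. Anything more sophisticated, such as positional error weighting or proper motion, would then be applied on the client side to those candidates.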
What is wrong with such a two-phase approach? We provide a simple
robust solution now which addresses 90% of the most common use cases
(mainly catalog access, cross-matching). Basic SELECT in ADQL can
be provided as well and will work reliably for simple use cases.
Meanwhile we prototype the more advanced use cases and continue to
evolve and standardize this complex technology. Within NVO at least
this is the approach we intend to follow.
On Mon, 23 Feb 2009, Roy Williams wrote:
>> How do you know that these things cost "years of effort"? Have you spent
>> years implementing them?
> I'm thinking of ADQL. Years to *robust* implementation (not just prototype).
> I think of 3-page ADQL expressions full of variable names and metadata.
> Getting the types and metadata right between SQLServer and ADQL. Proper error
> returns that tell the user how to fix it, rather than just
> NullPointerException. A REGION with a thousand holes intersecting one with a
> million holes. All the SOAP and VOSI and authentication and service logs to
> deal with. Documentation that you can read at different levels. To a data
> provider, implementing the ADQL monolith is VERY complex and needs weeks just
> to understand the dozen required documents.
> The point of the Multicone service is that you *don't* need to do ADQL! The
> point is that it gets 90% of what the astronomer wants for 10% of the effort!
> And it can be made robust as well.