TAP and large resultsets
kea at roe.ac.uk
Mon Jan 29 05:14:05 PST 2007
Hi Doug, all,
I don't disagree that paging is useful for simple clients - but I
do argue that there are (primarily sociological) problems with making
it compulsory.
We have a hard time already in persuading third-party adopters
to install our components, even though they can be installed with
minimal space usage and locked-down read-only JDBC access to their
datasets. If we tell them they need to provide either a (potentially
very) large disk cache or (even worse!) write access to their DBMS
for temporary tables, many of them simply won't install the component -
so no TAP at all.
I'm not arguing that we won't/shouldn't implement paging in our own
TAP services - just that we need to allow providers to switch
the paging function off if they wish - and that therefore it
shouldn't be a compulsory part of TAP.
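
To make the "switch it off" option concrete, here is a minimal sketch in Python of a service that streams rows over a read-only connection and refuses paged requests when the provider has disabled paging. Everything named here (stream_rows, handle_request, PagingNotSupported, the paging_enabled flag, the sources table) is hypothetical and not part of any TAP draft. SQLite's "PRAGMA query_only" stands in for locked-down read-only JDBC access.

```python
import sqlite3
from typing import Iterator, Optional, Tuple


class PagingNotSupported(Exception):
    """Raised when a paged request reaches a provider that switched paging off."""


def stream_rows(conn: sqlite3.Connection, sql: str, batch: int = 500) -> Iterator[Tuple]:
    """Yield rows in small batches; nothing is cached or written on the server."""
    cur = conn.execute(sql)
    while True:
        rows = cur.fetchmany(batch)
        if not rows:
            return
        yield from rows


def handle_request(conn: sqlite3.Connection, sql: str, *,
                   paging_enabled: bool, page: Optional[int] = None) -> Iterator[Tuple]:
    """Hypothetical request handler: paging is optional, streaming always works."""
    if page is not None:
        if not paging_enabled:
            raise PagingNotSupported("this provider has switched paging off")
        # The paged path needs a server-side result cache and is omitted here.
        raise NotImplementedError("paged path not shown in this sketch")
    return stream_rows(conn, sql)


def make_demo_db(n: int = 10) -> sqlite3.Connection:
    """Build a small in-memory table, then lock the connection to read-only."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sources (id INTEGER PRIMARY KEY, ra REAL)")
    conn.executemany("INSERT INTO sources VALUES (?, ?)",
                     [(i, i * 1.5) for i in range(n)])
    conn.commit()
    conn.execute("PRAGMA query_only = ON")  # further writes are rejected
    return conn
```

The point of the design is that the streaming path needs neither a disk cache nor write access, so a provider who sets paging_enabled to False gives up nothing but the paged interface.
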
On Sun, Jan 28, 2007 at 10:15:13PM -0700, Doug Tody wrote:
> Hi Kona, All -
> I agree that a fully streamed query could be a powerful way to deal
> with large queries, and we should consider supporting this. However,
> a fully streamed query is not fully general (e.g., no ORDER BY or
> anything else which requires management of the full result set on
> the server; can't handle all cases), and it is semantically complex
> for the client to be able to deal with potentially very large query
> responses. Another key point with paged queries is that we do this
> in part to attempt to make things more responsive given a slow
> Internet. Often the client will abort the whole operation after
> receiving the first page or so of the result set, and repeat the
> operation with different parameters.
> For very large queries we need advanced techniques such as use
> of asynchronous operations and VOStore, or a streaming query.
> For "modest" size queries one might define an upper limit for the size
> of the result set managed by the server without resorting to the more
> complex managed techniques, plus some options for how the client
> can get at the result set. This could include either making the
> upper limit small enough to return it in one go in interactive times
> (the most basic interface, a la cone search), or some scheme based
> on automated server side caching. If the result set is cached on the
> server, then it can be returned either via paging or via a streaming
> transfer (as we already do for other large datasets such as images).
> So long as the server does not have to manage writeable storage on
> behalf of a client, caching result sets on the server is not necessarily
> very complicated. TAP already assumes that a DBMS is involved, so it
> is not so difficult to store a result set in a temporary table managed
> transparently by the server, and deleted after some interval.
> I agree that for the simplest possible service we probably do not have
> to require that it support paged queries; however, having a simple way
> to deal with queries up to the point where we get into grid techniques,
> while still providing reasonable interactive performance, is important.
> - Doug
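
The temporary-table caching described in the quoted message could look roughly like the following Python sketch, using SQLite purely for illustration. The ResultCache class, its method names, the tap_result_ table prefix, and the ttl_seconds and max_rows parameters are all assumptions made for this example. Note that it requires write access to the database, which is exactly the requirement objected to in the reply above.

```python
import sqlite3
import time
from typing import Dict, List, Optional, Tuple


class ResultCache:
    """Hypothetical server-side cache: one temporary table per result set."""

    def __init__(self, conn: sqlite3.Connection,
                 ttl_seconds: float = 3600.0, max_rows: int = 10000):
        self.conn = conn
        self.ttl = ttl_seconds
        self.max_rows = max_rows          # upper limit on a cached result set
        self._created: Dict[str, float] = {}
        self._counter = 0

    def store(self, sql: str, now: Optional[float] = None) -> str:
        """Run the query once and keep at most max_rows rows in a temp table."""
        self._counter += 1
        name = f"tap_result_{self._counter}"   # generated here, never client input
        self.conn.execute(
            f"CREATE TEMP TABLE {name} AS "
            f"SELECT * FROM ({sql}) LIMIT {int(self.max_rows)}"
        )
        self._created[name] = time.time() if now is None else now
        return name

    def page(self, name: str, page: int, page_size: int = 100) -> List[Tuple]:
        """Return one page (zero-based) of a cached result set."""
        if name not in self._created:
            raise KeyError(f"unknown or expired result set: {name}")
        cur = self.conn.execute(
            f"SELECT * FROM {name} ORDER BY rowid LIMIT ? OFFSET ?",
            (page_size, page * page_size),
        )
        return cur.fetchall()

    def expire(self, now: Optional[float] = None) -> int:
        """Drop result tables older than the TTL; return how many were dropped."""
        now = time.time() if now is None else now
        stale = [n for n, t in self._created.items() if now - t >= self.ttl]
        for n in stale:
            self.conn.execute(f"DROP TABLE {n}")
            del self._created[n]
        return len(stale)
```

A service would call store() once per query, serve page() requests against the returned name, and run expire() periodically. Because the query executes only once, ORDER BY and similar whole-result operations remain consistent across pages.
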