Re: gzipped images in SIAP 1.0

From: Doug Tody <dtody-at-nrao.edu>
Date: Sun, 27 May 2007 16:40:34 -0600 (MDT)


Hi All -

I missed this conversation earlier, as it took place while I was still traveling. It is relevant not only for SIAP 1.0 but also for SSAP, where the problem is exactly the same, and in principle it can affect any future DAL protocol, so it is worthwhile to look at this carefully now. Note also that SSAP adds a COMPRESS parameter (discussed further below) and attempts to deal with the issue of compression more carefully than SIAP 1.0 did.

This message is a bit long as I deal with three distinct but related issues (these are separated by double blank lines below).

On Tue, 22 May 2007, Roy Williams wrote:
> I would like to know if it is possible for a compliant SIAP to return
> *compressed* FITS images.

Having read through all the postings, I think we need to keep two issues clearly separated here: 1) what the client application wants to see, and 2) what happens at the level of the HTTP protocol.

Compression can refer either to the data product which is returned to the client application (this is visible to the client), or to the HTTP protocol level, which already has built-in capabilities for compression (this is handled at the HTTP client-server level and is transparent to the client application). Hence we need to distinguish between compression as seen by the "client" or "client application", and compression as seen by the "HTTP-level client code": these are not the same thing.

In general, anything which is a request parameter (such as FORMAT) refers to *what the client app wants to see*. This is independent of the underlying protocol. Currently we use HTTP, but if we were to use something else, a request parameter such as FORMAT should still have the same semantics.

Hence if the client app asks for FORMAT=image/fits, it is illegal for it to end up with a GZIP-compressed FITS file. However, it is ok for the service to return a compressed file, so long as this is handled transparently at the level of the HTTP protocol. That is, when we fetch the data (as others have already pointed out), the HTTP response headers can be

     Content-Type: image/fits
     Content-Encoding: gzip

in which case the HTTP-level client code should transparently unzip the byte stream as it reads the data. NOTE though, it may not be advisable for the service to do this unless the HTTP-level client code issues an Accept-Encoding header which explicitly states that the client code can handle optional stream-level compression.
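
As a rough illustration of the HTTP-level case, here is a minimal Python sketch. It is not taken from the SIAP or SSAP documents; the function names decode_http_body and may_gzip_response, and the use of a plain dictionary for the headers, are assumptions made for the example. Real HTTP libraries usually do the decoding step internally.

```python
import gzip

def decode_http_body(headers, body):
    """HTTP-level client code: undo stream compression so that the
    application only ever sees the bytes named by Content-Type."""
    # Assumes header names are already normalized to this exact spelling.
    encoding = headers.get("Content-Encoding", "").strip().lower()
    if encoding in ("gzip", "x-gzip"):
        return gzip.decompress(body)
    return body

def may_gzip_response(accept_encoding):
    """Service side: compress the stream only if the client code
    explicitly listed gzip in Accept-Encoding with a non-zero q value."""
    for item in accept_encoding.split(","):
        parts = [p.strip().lower() for p in item.split(";")]
        if parts[0] not in ("gzip", "x-gzip"):
            continue  # the "*" wildcard is deliberately not treated as explicit
        q = 1.0
        for p in parts[1:]:
            if p.startswith("q="):
                try:
                    q = float(p[2:])
                except ValueError:
                    q = 0.0
        return q > 0
    return False
```

With this split, an application that asked for FORMAT=image/fits receives plain FITS bytes from decode_http_body whether or not the service compressed the stream.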

On Tue, 22 May 2007, Roy Williams wrote:
> -- Does anyone remember the intention of the comma-delimited list of MIME
> types? Should my code look for "application/x-gzip,image/fits"
> Or maybe the other way around?

If this refers to the FORMAT query parameter, which takes a list of MIME types, then this tells the service to describe, in the query response, only images with the given MIME types. If the service can't generate any of the requested MIME types it should return nothing. Hence if the client asks for image/jpeg and the service cannot return a JPEG, the response will be a null query (REQUEST_STATUS=OK and no data).
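
The filtering semantics described above could be sketched roughly as follows. This is only an illustration: the function select_by_format and the record layout (a dictionary with a "mime" key) are hypothetical, and the sketch handles only explicit MIME types, not any special FORMAT values a service may also accept.

```python
def select_by_format(available, format_param):
    """Keep only the image records whose MIME type is in the FORMAT list.

    `available` is a list of dicts with a "mime" key (hypothetical layout);
    `format_param` is the comma-delimited FORMAT value from the request.
    """
    wanted = {m.strip().lower() for m in format_param.split(",") if m.strip()}
    return [rec for rec in available if rec["mime"].lower() in wanted]

# A service that only holds FITS images, asked for JPEG, describes nothing:
images = [{"mime": "image/fits", "url": "a"}, {"mime": "image/fits", "url": "b"}]
empty = select_by_format(images, "image/jpeg")
both = select_by_format(images, "image/jpeg,image/fits")
```

Here empty is an empty list (the null query case), while both contains the two FITS records because image/fits appears in the list.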

As I mentioned earlier, SSAP adds a new parameter COMPRESS, which attempts to deal more explicitly with the issue of whole-file (gzip-style) compression. Since this is a request parameter, it refers to compression *as seen by the client application*. If the client enables compression (it is disabled by default) then the service is permitted to return compressed files, or not, as it sees fit. A client app that enables compression therefore has to be able to handle the returned data whether or not whole-file compression was applied. For this to work we have to limit the compression options to widely implemented algorithms such as gzip.

In this case the data product itself (the Content-Type) is compressed, independent of what is happening to the data stream at the HTTP level. What the returned MIME type should be is not entirely clear: it could be application/x-gzip or maybe something like image/fits;encoding=gzip.
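
Since the returned MIME type is not settled, one way a client application that has enabled COMPRESS might cope is to inspect the data itself. The sketch below is an assumption on my part, not something SSAP prescribes: the function name open_dataset is hypothetical, and it relies only on the standard two-byte gzip signature (0x1f 0x8b).

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip file

def open_dataset(payload):
    """Application-level handling: return the uncompressed dataset bytes
    whether or not the service chose to apply whole-file compression."""
    if payload[:2] == GZIP_MAGIC:
        return gzip.decompress(payload)
    return payload
```

An application that prefers to keep the file compressed (for example, to store it unchanged) would simply skip this step, which is the point of doing the compression at the data product level.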

This feature allows compressed files to be passed through the protocol unchanged, and manipulated by applications in compressed form. I think generic whole-file compression of this sort is a separate issue from something like tiled images with Rice compression; the latter is an internal feature of FITS. We might enable it at the protocol level in the future, but currently there is no attempt to support this.

COMPRESS has probably not had enough discussion, even though it has been in the SSAP specification for about a year. What do folks think: do we need both HTTP-level transparent data stream compression and actual dataset-level whole-file compression?

Received on 2007-05-28Z00:41:34