From arots at head.cfa.harvard.edu Thu Apr 1 07:20:52 2004
From: arots at head.cfa.harvard.edu (Arnold Rots)
Date: Thu, 1 Apr 2004 10:20:52 -0500 (EST)
Subject: New UCD1+ for SIA protocol
In-Reply-To: <200403241946.i2OJkEu1003422@xebec.cfa.harvard.edu>
Message-ID: <200404011520.i31FKqLc016068@xebec.cfa.harvard.edu>

I was asked to write something about the SCORE parameter for SIAP:

SCORE

  This parameter is intended to aid the user in selecting an image in
  cases where a SIAP query would return multiple images. This is
  especially intended for non-expert users and for users who have no
  need, for their present purposes, to check through the metadata
  provided in other parameters. The criteria for determining the SCORE
  are entirely up to the service provider, since different
  considerations come into play for different archives. Some hints are
  provided below. We implicitly trust the providers to choose the best
  algorithm for the majority of the users, based on the fact that they
  know their data properties better than anybody else.

  Output
    SCORE is a floating point value that ranks returned images
    according to their relevance for the query as perceived by the
    server. This is meant to aid the client (especially the
    non-specialist client) in choosing from a list of images in case
    more than one image satisfies the query criteria. The scale is
    always relative and only meaningful in the context of the result in
    which it is provided. The highest number represents the "best"
    image available that satisfies the query. There is no specified
    range. It may measure things like exposure time, image quality,
    proximity to the specified position, resolution, etc.

  Input
    SCORE is a string and may assume two values:

    SCORE=TOP
      If multiple images satisfy the query criteria, only the one with
      the highest value of SCORE (in output) is returned.

    SCORE=ALL (default)
      All images satisfying the query criteria are returned.
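The input semantics described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `ImageRecord` and `apply_score_param` are hypothetical names, not part of SIAP; treating the parameter value case-insensitively and breaking ties by list order are choices made here, not something the description specifies.

```python
from dataclasses import dataclass


@dataclass
class ImageRecord:
    """Hypothetical stand-in for one image row in a query response."""
    title: str
    score: float  # output SCORE: relative within this response, larger is better


def apply_score_param(matches, score_param="ALL"):
    """Apply the input SCORE parameter to images that already satisfy the query.

    Assumption: the value is compared case-insensitively.
    """
    mode = score_param.upper()
    if mode == "ALL":
        # Default: every matching image is returned.
        return list(matches)
    if mode == "TOP":
        if not matches:
            return []
        # Only the image with the highest output SCORE is returned;
        # on a tie, max() keeps the first one encountered.
        return [max(matches, key=lambda rec: rec.score)]
    raise ValueError(f"unsupported SCORE value: {score_param!r}")
```

Because the output SCORE is only meaningful within one response, the sketch never compares scores across separate calls.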
This issue is important for archives of pointed observations where
dozens of images may be available satisfying a single query. Expert
users may want to see the metadata on all of them and choose, but less
sophisticated users and more general services (the issue came up in
some of our prototypes/demos as a real problem!) are more likely to get
annoyed and say: just get me what you think is the best one. And
that's what this is about.

Note that SCORE is not necessarily a measure of data quality: a 1000 s
exposure of high-quality data may still score lower than a 100000 s
exposure of somewhat poorer quality. Nor does it say anything about
how well each image satisfies the query: all returned images are
expected to match the query's criteria. The question really is: among
all these matches, how well do we think the user will like them, or
how well will they fit the user's purposes?

I would like to emphasize two things:

1. The scoring scale should be relative and only have meaning within a
   particular response list. I.e., you cannot necessarily compare the
   scores that come from two different queries and deduce any
   meaningful conclusion. The scoring may not even be linear.
2. The user is free to like or to dislike, to trust or to distrust the
   scores that are returned. If the user does not like the scores,
   (s)he should look at the actual metadata returned with each image
   in the response.

Designing a scoring algorithm takes some thought and will be very
mission/observatory specific. To illustrate this let me list the
considerations that went into the Chandra archive's algorithm:

- Exposure time; clearly, longer exposures produce higher-quality
  images
- Instrument: ACIS-I/S, HRC-I/S; this is based on the different
  sensitivities and spectral responses
- Co-alignment with requested position; Chandra's PSF quickly gets
  worse off-axis
- Image resolution; we have two canned images at different resolutions
  and with different FOV.
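To show how considerations of this kind might be folded into a single relative number, here is a purely illustrative Python sketch. It is not the Chandra archive's algorithm, which is not described in this message; the function name `relative_score`, the weights, and every functional form below are invented placeholders.

```python
import math

# Placeholder weights, invented for illustration only; they make no
# statement about the real instruments.
INSTRUMENT_WEIGHT = {"ACIS-I": 1.0, "ACIS-S": 1.0, "HRC-I": 0.6, "HRC-S": 0.6}


def relative_score(exposure_s, instrument, off_axis_arcmin, pixel_arcsec):
    """Combine four considerations into one relative, non-linear score.

    The result is only meant for ranking images within one response list.
    """
    # Longer exposures score higher, with diminishing returns (log scale).
    exposure_term = math.log10(max(exposure_s, 1.0))
    # Unknown instruments get an arbitrary fallback weight.
    instrument_term = INSTRUMENT_WEIGHT.get(instrument, 0.5)
    # Penalize distance between the requested position and the aim point,
    # since the PSF degrades off-axis (the 16.0 scale is arbitrary).
    off_axis_term = 1.0 / (1.0 + off_axis_arcmin ** 2 / 16.0)
    # Finer pixels (smaller arcsec per pixel) give a modest bonus.
    resolution_term = 1.0 + 1.0 / (1.0 + pixel_arcsec)
    return exposure_term * instrument_term * off_axis_term * resolution_term
```

With these placeholder forms, a 100000 s exposure outranks a 1000 s exposure when everything else is equal, and an on-axis image outranks a far off-axis one, which is the qualitative behaviour the list describes.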
One might consider adding other factors, such as seeing/aspect
quality. It is likely that we will continue to fine-tune the
algorithm.

I don't think it is particularly useful to publish the algorithm or
its description. Frankly, I suspect that those who rely on SCORE
don't really care and that those who care want to inspect the metadata
to make their own decision.

There is no UCD that fits SCORE; it's a context-dependent thing and,
besides, it has a different datatype on input and output.

  - Arnold
--------------------------------------------------------------------------
Arnold H. Rots                                Chandra X-ray Science Center
Smithsonian Astrophysical Observatory                tel:  +1 617 496 7701
60 Garden Street, MS 67                              fax:  +1 617 495 7356
Cambridge, MA 02138                        arots at head.cfa.harvard.edu
USA                                     http://hea-www.harvard.edu/~arots/
--------------------------------------------------------------------------

From derriere at newb6.u-strasbg.fr Mon Apr 26 07:49:30 2004
From: derriere at newb6.u-strasbg.fr (Sebastien Derriere)
Date: Mon, 26 Apr 2004 16:49:30 +0200
Subject: towards UCD1+
Message-ID: <408D217A.58F8CB30@astro.u-strasbg.fr>

  Hello,

  I have placed on the IVOA wiki a PDF version of the latest document
on UCD: http://www.ivoa.net/internal/IVOA/IvoaUCD/WD-UCD-20040426.pdf
  This document describes the motivations and procedure to evolve
the first generation of UCD into a new scheme called UCD1+.
  The new scheme has been quite successfully tested on VizieR. It also
includes inputs from other fields, like the latest SIAP discussions.
  As the UCD1+ vocabulary is likely to evolve faster than the reference
document itself, the reference lists are available online, together
with old versions and history of changes:
http://vizier.u-strasbg.fr/UCD/lists/

  The document currently has a Working Draft status, but the idea is
to promote it to an IVOA Proposed Recommendation. This can be done
once agreement is reached within the discussion group, if I understand
well the IVOA document lifecycle.
  So...
time for discussion now !

Sebastien.
--
       _______
      /   ~  /,   Sebastien Derriere  mailto:derriere at astro.u-strasbg.fr
     / ~~~~ //    Observatoire de Strasbourg    Phone +33 (0) 390 242 444
    /______//     11, rue de l'universite     Telefax +33 (0) 390 242 417
   (______(/      F-67000 Strasbourg France

From mchill at dial.pipex.com Wed Apr 28 01:48:19 2004
From: mchill at dial.pipex.com (Martin Hill)
Date: Wed, 28 Apr 2004 09:48:19 +0100
Subject: towards UCD1+
In-Reply-To: <408D217A.58F8CB30@astro.u-strasbg.fr>
References: <408D217A.58F8CB30@astro.u-strasbg.fr>
Message-ID: <408F6FD3.6020009@dial.pipex.com>

Hi again

I agree entirely that UCDs should be used to describe what a value
represents rather than how it relates to other things. I think that
combining UCDs with data models will give us everything we need.

English/typos:

Could we use 'phrase' instead of 'words' since really 'phys.temp'
consists of two words 'phys' and 'temp' assembled together?

Sec 3.3: two cases of "applies a magnitude" I think should read
"applies to a magnitude"?

Most of these are requests for explanation rather than requests for
change:

An astronomical question: given UCDs are not supposed to describe
units, why do we still have UCDs like 'phot.mag'? Should this be
'phys.brightness', where the units are magnitudes or janskies or ergs
or whatever?

How are words assembled, e.g., why pos.eq.ra instead of pos.ra? How
will they be extended? We need more detail on how the UCDs are built
up.

Using the example of instrument temperature maximum: why is the
'stat.max' added at the end, instead of, say,
'phys.temp;stat.max;instr'? Indeed if 'stat.max' is secondary,
shouldn't the error be 'phys.temp;stat.error' and we add a
'stat.value' word? Or should we say that order does not matter? If
order *is* important this could all get rather fraught.

What is the conceptual difference between a word and a phrase? I.e.,
why phys.temp;stat.max;instr instead of 'instr.temp.max', the latter
being built from UCD words 'instr' 'temp' and 'max'?
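The assembly and ordering questions above can be made concrete with a small Python sketch. It assumes only what the examples in this message show: a UCD string is split on ';' into components, and each component is split on '.' into parts. The function names and the terms "component" and "part" are chosen here for illustration; whether order should matter is exactly the open question, so both comparisons are shown.

```python
def split_ucd(ucd):
    """Split a UCD string into ';'-separated components,
    each given as a list of its '.'-separated parts."""
    return [word.strip().split(".") for word in ucd.split(";") if word.strip()]


def same_components_in_order(ucd_a, ucd_b):
    """True if both UCDs have the same components in the same order."""
    return split_ucd(ucd_a) == split_ucd(ucd_b)


def same_components_any_order(ucd_a, ucd_b):
    """True if both UCDs have the same components, ignoring order."""
    return sorted(split_ucd(ucd_a)) == sorted(split_ucd(ucd_b))
```

Under this sketch, 'phys.temp;stat.max;instr' and 'phys.temp;instr;stat.max' differ if order is significant and match if it is not, whereas 'instr.temp.max' is a single component with three parts and matches neither.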
Requests for change:

Personally I would rather see full names used instead of abbreviations
(eg 'physical' instead of 'phys', 'position' instead of 'pos') for
anything other than really common terms (ra, dec, max, min).
Abbreviations make an insignificant difference to software
performance, and can cause confusion especially when non-astronomers
are involved in development.

Cheers,

Martin

Sebastien Derriere wrote:

> Hello,
>
> I have placed on the IVOA wiki a PDF version of the latest document
> on UCD: http://www.ivoa.net/internal/IVOA/IvoaUCD/WD-UCD-20040426.pdf
> This document describes the motivations and procedure to evolve
> the first generation of UCD into a new scheme called UCD1+.
> The new scheme has been quite successfully tested on VizieR. It also
> includes inputs from other fields, like the latest SIAP discussions.
> As the UCD1+ vocabulary is likely to evolve faster than the reference
> document itself, the reference lists are available online, together with
> old versions and history of changes:
> http://vizier.u-strasbg.fr/UCD/lists/
>
> The document currently has a Working Draft status, but the idea is
> to promote it to an IVOA Proposed Recommendation. This can be done
> once agreement is reached within the discussion group, if I understand
> well the IVOA document lifecycle.
> So... time for discussion now !
>
> Sebastien.

--
Martin Hill
Software Engineer
AstroGrid @ ROE
Tel: +44 7901 55 24 66
www.astrogrid.org