From amsr at jb.man.ac.uk Fri May 2 07:43:21 2003 From: amsr at jb.man.ac.uk (Anita Richards) Date: Fri, 2 May 2003 15:43:21 +0100 (BST) Subject: Pre-meeting UCD question Message-ID: Ray Plante's 'Requirements for the Future' is a very neat summary, and I am especially glad he has put on record the need for * Accessible documentation and * Backward compatibility - and I think it is a good basis for the Cambridge discussion There are a couple of things which have come up previously: The use of UCDs other than for column content descriptors - e.g. for the ResourceMetadata which summarises the content of a dataset for the Registry. Should these be taken from exactly the same set of UCDs? Or should these embody their context (e.g. by reserving superclass UCDs, such as the first element if we use Guy Rixon's atoms...)? How to qualify UCDs? At the moment we have some 'degenerate' UCDs, e.g. SPEC_WAVELENGTH in the IDHA model for both the high and low bandpass limits. This means you have to evaluate both, if you just want data above a certain frequency (say). Then we have some overspecified, e.g. all the PHOT ones for U B V R I, RADIO_1.6, _1.4 etc. In some cases this might be solved just by adding MAX/MIN or equivalent (LONG and SHORT for the bandpass as MAX/MIN is ambiguous unless you know if it is freq. or wavelength). However with XML we can be more intelligent, as has been pointed out, and give them properties or attributes or values. How do we cope with the cases where the required information (ie another UCD?) is elsewhere in the same data set, e.g. another column or in the header? e.g. for associating cumulative errors with a data point, or for realising that an entire catalogue is at 1.4 GHz or in the Cousins photometry convention? That is, we need not only to be able to select on the basis of UCDs, but to be able to interpret their properties at the Registry level. 
In some cases we might need to evaluate the data they describe at the Registry (as in the present bandpass example I gave), but perhaps that can be avoided, or should be, if possible, so that you only have to dive into the dataset itself when you are answering the query in detail? This can be summed up as saying that we want UCDs to describe data but we should avoid as much as possible using UCDs as data.

Thoughts on some of David's comments:

Units and accuracy

When we are using UCDs (assuming this includes the ResourceMetadata) in the Registry, we do not need high accuracy as long as we err on the side of inclusion (so a catalogue spanning 10 - 3001 GHz would be both radio and IR). If we want catalogues with photometry, we do not need to know what units the flux is measured in. However if we want a certain level of accuracy we do need units (at least if we are to avoid quality factors etc. which will be very arbitrary) - but here, the Registry could standardise everything to one sort of unit for each quantity, as the conversion does not need to be accurate (e.g. noise < 0.0001 Jy derived from a certain limiting magnitude) as long as it is rounded down/up as appropriate. This is analogous to Ray's squinting.

I think it is crucial that we do not tie UCDs to certain units or whatever, as we then make it difficult to interconvert. Any conversion errors should be carried in proper UCDs for errors - thus, if you ask for x-ray photometry in counts, you will get a result which will probably have greater accuracy than if you ask for it in Jy - but the latter is vital if you want to compare it with data at other wavelengths.

Cheers
a

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Dr. Anita M. S. Richards, AVO Astronomer
MERLIN/VLBI National Facility, University of Manchester,
Jodrell Bank Observatory, Macclesfield, Cheshire SK11 9DL, U.K.
tel +44 (0)1477 572683 (direct); 571321 (switchboard); 571618 (fax).
From aam at astro.caltech.edu Sun May 11 22:00:09 2003 From: aam at astro.caltech.edu (Ashish Mahabal) Date: Sun, 11 May 2003 22:00:09 -0700 (PDT) Subject: Some comments on UCDs Message-ID: Hi, Since I will not be personally present when many of you meet in Europe (now), I decided to send in my two cents to the group. Over the last few months, talking to several VO people from US and Europe, it is clear that everyone agrees that something like the UCDs is great. Not everyone agrees on (1) the number of UCDs, (2) the hierarchical nature they are currently arranged in, (3) mechanisms of adding UCDs, ... Suggestions range from leave UCDs as they are to completely replace them with alternate mechanisms (some of which have at least partially been thought of). Having worked with UCDs for making Topic Maps based on UCDs in catalogs (examples at http://www.astro.caltech.edu/~aam/science/topicmaps/ucd.html - a specialised application based on these will be ready soon), I have come to like the ease with which one can use UCDs to build structure around them (though they certainly have their limitations, as is seen by some "gaps" in some of the topic maps). I have not been required to deal with the entire set of UCDs (e.g. Atomic Data, Orbit etc. related UCDs) as I have not dealt with all kinds of catalogs. But the hierarchical structure seems very useable. Having error terms on various individual UCDs will be good and some work could go in there. Of the 1400 or so UCDs, the 100 most accessed catalogs at Vizier use only about 250 UCDs in all. Allowing people to make their own UCDs (just like standards should not be a plural, not everyone should create their own Universals) will bring us back to square one even if what the personal UCDs mean is well documented. My R magnitude UCD is GunnR and some one elses is GR (both having used the Thuan-Gunn system). Okay so far. But then I also use GR to mean Gamma Ratio (whatever that is), and all hell breaks loose. 
A lot of time may have to be spent in writing more intelligent software to deal with these issues, which could have been avoided in the first place.

So should there be "UCD police" who decide what new UCDs should get added? Perhaps not. But everybody concerned can try to come up with a list of what is currently missing, and what will probably be needed in the next 10 years (and later on there will be surprises of course), and simply add the resulting list. And keep a mechanism for adding more.

Should the list be changed from a hierarchy to something else? That should not really matter much. (Broadly speaking - software should be able to handle that).

Should the UCDs serve as an ontology for astronomy? I believe that in a sense they already do. However, I also believe that anything that is related to astronomy (e.g. astronomers, instruments, observatories etc.) should also be part of that ontology and could be represented by UCDs in the hierarchy (adding a few hundred more terms). These can then be used in many other tables (derived data products). So a growing branch could stay open there.

More later.

Cheers and wishing you all a good time,
-ashish

PS: Does someone plan to email session summaries to the group?

Ashish Mahabal, Caltech Astronomy, Pasadena, CA 91125
http://www.astro.caltech.edu/~aam aam at astro.caltech.edu
Lawyer: How far apart were the vehicles at the time of the collision?

From ael at star.le.ac.uk Fri May 16 04:14:44 2003
From: ael at star.le.ac.uk (Tony Linde)
Date: Fri, 16 May 2003 12:14:44 +0100
Subject: MyUCDs
Message-ID: <20030516111444.F2F465C479@smtp.us2.messagingengine.com>

I've posted a document on the IVOA wiki which summarises my understanding of the UCD discussion in the plenary session yesterday. Perhaps some people could peruse that and provide feedback on this mailing list about any errors or differences of opinion on the implications that I see. I hope it helps others with any of their own misunderstandings.
http://www.ivoa.net/twiki/bin/view/IVOA/TonyOnUCDs Cheers, Tony. __ Tony Linde Phone: +44 (0)116 223 1292 AstroGrid Project Manager Fax: +44 (0)116 252 3311 Dept of Physics & Astronomy Mobile: +44 (0)7753 603356 University of Leicester Email: ael at star.le.ac.uk Leicester, UK LE1 7RH Web: http://www.astrogrid.org From pfo at star.le.ac.uk Fri May 16 05:42:43 2003 From: pfo at star.le.ac.uk (Patricio F. Ortiz) Date: Fri, 16 May 2003 13:42:43 +0100 (BST) Subject: MyUCDs In-Reply-To: <20030516111444.F2F465C479@smtp.us2.messagingengine.com> Message-ID: On Fri, 16 May 2003, Tony Linde wrote: > I've posted a document on the IVOA wiki which summarises my understanding > of the UCD discussion in the plenary session yesterday. Perhaps some > people could peruse that and provide feedback on this mailing list about > any errors or differences of opinion on the implications that I see. I > hope it helps others with any of their own misunderstandings. > > http://www.ivoa.net/twiki/bin/view/IVOA/TonyOnUCDs Here go my comments to Tony's document: > My understanding of UCDs > > The way this came up was my question in the plenary UCD session about > how we can identify columns within a table uniquely. Basically, the > answer was that UCDs will not solve this problem and are not intended > to do so. This page will summarise what I now understand as the purpose > of UCDs and some of the implications of this. > > UCDs as Data Types > > Comment was made that UCDs can be considered as data types, so a column > in a table has a data type of, POS_EQ_RA, say. I assume that the > reasons for having UCDs as data types are to allow: > > * operations on columns: comparison, addition, subtraction, > multiplication, etc plus specific astronomical operations > * conversion between data types: eg converting between equitorial and > galactic coordinates As their name say, UCDs are meant to describe the content of a column uniquely. 
Within any given table, column names do not repeat (unless in exceptional cases where the authors want to repeat a column for easy reading). The problem arose when column names became degenerate when analyzing different tables/catalogues. The most common scenario for astronomy was a column called "Mag", meant to represent the brightness of a celestial object in some photometric system. The problem was/is that "Mag" was "overloaded", and just comparing two columns labeled "Mag" was asking for trouble. UCDs were introduced to break the degeneracy.

column (catalog=c1 name=Mag unit=mag UCD=PHOT_JHN_V)
has nothing to do with
column (catalog=c2 name=Mag unit=mag UCD=PHOT_STR_B)

> Do we thus need (or already have) some hierarchical structure of the
> UCDs based on allowable operations? In normal data types, we have
> numerical types, subdivided by integral and floating point, subdivided
> by storage size etc; one can add all numerical types but (generally)
> cannot add a number and a string (without pre-defining what such an
> addition will do).

Although one can add numerical types and not add strings or strings and numbers, not all numbers should be allowed to be added. One should not allow adding a velocity measured in km/s to a right ascension in equatorial coordinates (an angle). Whatever mechanism is used to perform table combination should be provided with this kind of knowledge.

> Aligned to that: should we define the operations that can be performed
> on the individual data types (UCDs), the rules for those operations
> given specific types, and the type resulting from such operations.

Yes and no. Thinking of all combinations is unrealistic; besides, it's only in adding/subtracting that you have problems. No one would stop you from taking ratios of columns or multiplying columns. As Peter Quinn pointed out in the plenary, some decisions should always be the responsibility of the astronomer.
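The point about additions can be sketched in a few lines of code. This is only an illustration, not an agreed mechanism: the grouping of UCDs into physical "kinds", the Column class, the can_add function and the placeholder velocity UCD VELOC_EXAMPLE are all assumptions made for the example; only the POS_ and PHOT_ names come from this thread.

```python
from dataclasses import dataclass

# Assumed grouping of UCDs into broad physical kinds (illustrative only).
UCD_KIND = {
    "POS_EQ_RA": "angle",
    "POS_EQ_DEC": "angle",
    "PHOT_JHN_V": "magnitude",
    "PHOT_STR_B": "magnitude",
    "VELOC_EXAMPLE": "velocity",  # placeholder name, not a real UCD
}


@dataclass(frozen=True)
class Column:
    name: str
    unit: str
    ucd: str


def can_add(a: Column, b: Column) -> bool:
    """Allow addition/subtraction only for columns of the same physical
    kind that are also expressed in the same unit."""
    kind_a = UCD_KIND.get(a.ucd)
    kind_b = UCD_KIND.get(b.ucd)
    if kind_a is None or kind_b is None:
        return False  # unknown UCD: leave the decision to the astronomer
    return kind_a == kind_b and a.unit == b.unit


ra = Column("RA", "deg", "POS_EQ_RA")
dec = Column("Dec", "deg", "POS_EQ_DEC")
vel = Column("v", "km/s", "VELOC_EXAMPLE")

print(can_add(ra, dec))  # two angles in the same unit
print(can_add(vel, ra))  # a velocity and an angle
```

Ratios and products are deliberately left unchecked here, in line with the remark that only adding and subtracting cause problems.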
> UCDs as Keywords > > In this context, the UCDs is part of the metadata for a table. It > indicates the type of data held in a table, so having POS_EQ_RA > identified with a table says that this table includes positional data > in equitorial coordinates. That said, maybe the UCD for the table > should include POS_EQ instead (since it is unlikely that it'll have RA > without DEC). Unlikely but not impossible. I've seen tables in which that's the case. UCD's were meant to be attached to columns, talking about a table's UCD is IMHO a confusion. A Table can have a set of UCDs attached to it (like a list), which may be shorter than the number of columns in the table if some columns happen to be representative of the same physical quantity therefore they DO have the same UCD. > So the idea of being able to query which resources have POS_EQ* makes sense. Sure it does. Not all catalogues contain RA-Dec. > UCDs as Pointers into Data Model > > This was a very interesting comment, that UCDs can be seen as a pointer > into the data model (DM). How this might be implemented and how > feasible it is is still open. I guess there are two potential problem > areas: > > * a UCD refers to multiple DM points (classes, objects or whatever > they are called) this is likely but does indicate areas in which the > UCDs are not the lowest level of metadata > > * one DM point is referred to by several UCDs > if this occurs, it would indicate that the DM requires further analysis Yes and no. Yes if we are talking about the so called "core-UCDs", which are the UCDs which can be attached to a column. No if we develop the concept of an "alias-UCD", which represents a list of UCDs design for discovery purposes. 
I'd say that a DM point should have one "core-UCD" and 0 or more "alias-UCDs".

> I suspect that, as the DM expands and covers more areas of astronomy,
> we will need a more efficient version of UCDs that accurately maps to
> the DM; the current '_' separated textual names will have limited
> extensibility (even with the additional modifiers agreed at this
> meeting).

Most likely.

> Unique Column Identification
>
> Given that we cannot use UCDs as unique column identifiers, how do we do this?
> It seems that the only possible unique identifier for a column in a
> table is the resourceID of the table (from the Registry) plus the
> columnName (for explanation of resourceID, see the discussion on this
> in the Registry mailing list:
> http://www.ivoa.net/forum/registry/0091.htm and related messages).

Absolutely. Within a table, the column name IS a unique column identifier. The pair catalogueName.columnName is unique in 99.9% of the cases, and for the future, one can impose that columnName be unique within a table (most DBMS won't be so happy if you assign 2 columns with the same name).

If we ever converge on assigning unique IDs to catalogues, then catalogueUniqueID.columnName is unique.

Adding a UCD to this structure would be redundant, e.g. catalogueUniqueID.columnName.columnUCD

> So, to summarise the discussion from the plenary session, a query can
> be sent to a table with either UCDs or column names or a mixture of
> both. If a UCD is included in a query, the data source can resolve this
> if there is only one column with that UCD or there are multiple columns
> but one has the modifier MAIN attached to only one of the column UCDs.
> Otherwise the query will fail.

Let me include here another element before the submission of a query to the resource handling catalogues: The registry. Any registry will contain metadata about the services listed, ie, the catalogues.
The registry would know (among other things),
- catalogueName (possibly catalogueUniqueID)
- catalogueTitle
- catalogueKeywords
- catalogueAuthor
- number of columns
- number of records
- column names, UCDs, units
  - name  ID,MAIN          ---
  - RA    POS_EQ_RA,MAIN   h:m:s
  - Dec   POS_EQ_DEC,MAIN  d:m:s
  ....
  - Vmag  PHOT_JHN_V       mag
  - Bmag  PHOT_JHN_B       mag
  - z     REDSHIFT         ---

IMHO, a query can be accurately formulated for any given resource after consulting the registry. It should be decided at the registry level what we want to extract from any given catalogue; therefore, the query received by the resource handling that catalogue has to make no decision, and better still, the query should not even be submitted if we know ahead of time that it will fail.

> Conclusion
>
> I hope people will provide feedback on the mailing list to these
> comments. I reiterate that they are only my understanding of what was
> said and my belief of the implications.

Another use of UCDs would be to discover resources (catalogues) which contain certain QUANTITIES.

Scenario 1

If I formulate a query along the lines of

"select catalogueName where UCDs include REDSHIFT POS_EQ_RA,MAIN POS_EQ_DEC,MAIN PHOT_JHN_V from registryXX"

I should get back a listing of catalogues which contain that information (plus probably the column names, units and UCDs which make the catalogue satisfy our request). Titles and keywords are short handles and do not always reveal everything that is listed in a catalogue. The astronomer should know what to do with this information; perhaps s/he will pick a few of these catalogues and submit a query to them.
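Scenario 1 can be mocked up in a few lines, purely as an illustration. The in-memory REGISTRY, the catalogue names catA and catB, and the function find_catalogues are all hypothetical; a real registry would of course be a service, and nothing here reflects an agreed interface.

```python
# Hypothetical registry: catalogue name -> list of (column, UCD, unit).
REGISTRY = {
    "catA": [
        ("name", "ID,MAIN", "---"),
        ("RA", "POS_EQ_RA,MAIN", "h:m:s"),
        ("Dec", "POS_EQ_DEC,MAIN", "d:m:s"),
        ("Vmag", "PHOT_JHN_V", "mag"),
        ("z", "REDSHIFT", "---"),
    ],
    "catB": [
        ("name", "ID,MAIN", "---"),
        ("RA", "POS_EQ_RA,MAIN", "h:m:s"),
        ("Dec", "POS_EQ_DEC,MAIN", "d:m:s"),
        ("Vmag", "PHOT_JHN_V", "mag"),
    ],
}


def find_catalogues(registry, wanted_ucds):
    """Return the catalogues that contain every requested UCD, together
    with the matching (column name, unit) pairs for each of those UCDs."""
    hits = {}
    for catalogue, columns in registry.items():
        by_ucd = {}
        for column, ucd, unit in columns:
            # Several columns may share one UCD, so keep a list per UCD.
            by_ucd.setdefault(ucd, []).append((column, unit))
        if all(ucd in by_ucd for ucd in wanted_ucds):
            hits[catalogue] = {ucd: by_ucd[ucd] for ucd in wanted_ucds}
    return hits


wanted = ["REDSHIFT", "POS_EQ_RA,MAIN", "POS_EQ_DEC,MAIN", "PHOT_JHN_V"]
result = find_catalogues(REGISTRY, wanted)
for catalogue, columns in result.items():
    print(catalogue, columns)
```

Because the lookup only touches registry metadata, catalogues that cannot answer (catB has no redshift column) are never contacted at all.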
Scenario 2

"select ucd;ID,MAIN ucd;POS_EQ_RA,MAIN ucd;POS_EQ_DEC,MAIN ucd;PHOT_JHN_V ucd;REDSHIFT from [catalogueName where UCDs include REDSHIFT PHOT_JHN_V POS_EQ_RA,MAIN POS_EQ_DEC,MAIN from registryXX] where ucd;REDSHIFT > 2.5"

The user doesn't know ahead of time if any catalogue exists to satisfy the query, but if one exists, s/he would like to print the equivalent of Name, RA, Dec, Vmag, and z.

This query should first find out a list of catalogues which contain all those UCDs and the names of their respective columns. Once this is known, individual queries should be sent to each of the catalogues requesting the columns which identify with the UCDs the user requested and which satisfy redshift > 2.5.

Note that I've used a very loose notation above on purpose.

Cheers,

Patricio

---
Patricio F. Ortiz pfo at star.le.ac.uk
AstroGrid project
Department of Physics and Astronomy
University of Leicester Tel: +44 (0)116 252 2015
LE1 7RH, UK

From mleoni at eso.org Mon May 19 07:48:43 2003
From: mleoni at eso.org (Marco C. Leoni)
Date: Mon, 19 May 2003 16:48:43 +0200
Subject: MyUCDs
References:
Message-ID: <3EC8EECB.1000604@eso.org>

Hi Patricio,
one question about the point below:

> Let me include here another element before the submission of a
> query to the resource handling catalogues: The registry.
>
> Any registry will contain metadata about the services listed, ie,
> the catalogues.
> The registry would know (among other things),
> - catalogueName (possibly catalogueUniqueID)
> - catalogueTitle
> - catalogueKeywords
> - catalogueAuthor
> - number of columns
> - number of records
> - column names, UCDs, units
>   - name  ID,MAIN          ---
>   - RA    POS_EQ_RA,MAIN   h:m:s
>   - Dec   POS_EQ_DEC,MAIN  d:m:s
>   ....
>   - Vmag  PHOT_JHN_V       mag
>   - Bmag  PHOT_JHN_B       mag
>   - z     REDSHIFT         ---
>

are you sure the Registry will include all this information? Perhaps "column names" and "units" will be resolved by the service provider, and not by the registry itself.
Cheers, Marco From pfo at star.le.ac.uk Mon May 19 08:17:13 2003 From: pfo at star.le.ac.uk (Patricio F. Ortiz) Date: Mon, 19 May 2003 16:17:13 +0100 (BST) Subject: MyUCDs In-Reply-To: <3EC8EECB.1000604@eso.org> Message-ID: On Mon, 19 May 2003, Marco C. Leoni wrote: > Hi Patricio, > one question about the point below: > > > Let me include here another element before the submission of a > > query to the resource handling catalogues: The registry. > > > > Any registry will contain metadata about the services listed, ie, > > the catalogues. > > The registry would know (among other things), > > - catalogueName (possibly catalogueUniqueID) > > - catalogueTitle > > - catalogueKeywords > > - catalogueAuthor > > - number of columns > > - number of records > > - column names, UCDs, units > > - name ID,MAIN --- > > - RA POS_EQ_RA,MAIN h:m:s > > - Dec POS_EQ_DEC,MAIN d:m:s > > .... > > - Vmag PHOT_JHN_V mag > > - Bmag PHOT_JHN_B mag > > - z REDSHIFT --- > > > > are you sure the Registry will include all these information? > Perhaps "column names" and "units" will be resolved by the service > provider, and not by the registry itself. > > Cheers, > Marco Hi Marco, well, I would have thought so. It's done already by vizier meta-information tables (which could be considered a local registry) and surely by other services. It's a good point though, how much is a registry supposed to know? In the scheme that Keith Noddle presented in Cambridge, I would expect at least the local register (read service provider's) should know about about this. Whether the higher level registers decide to collect this information is to be seen, but if you ask *me*, I would say yes, they should keep this information. We are not talking about a huge data volume here; the advantages are large though, as you don't have to go everywhere with your query. 
I just had to deal with getting a mortgage and run from bank to bank filling up applications not knowing if I satisfied all the conditions they asked; then a consultant phoned us and we dealt with him, he found out who would lend us the money and it worked fine. The analogy seems quite appropriate for our situation in VO. An astronomer wants to formulate a general query, he or she has no idea where this could be done, so s/he asks a service to lauch the query urbe et orbi... Surely the data will come back to the astronomer, but if we apply a filtering system which can select the services which may give a positive answer (and weed off the ones with a negative), then the number of transactions diminishes and the response time is shorter. An intermediate solution would be to send the query to broker-services urbe et orbi, and let those services do the filtering and send the query to places which may give a positive answer. IMHO not using this meta-information would be a real waste. Cheers, Patricio --- Patricio F. Ortiz pfo at star.le.ac.uk AstroGrid project Department of Physics and Astronomy University of Leicester Tel: +44 (0)116 252 2015 LE1 7RH, UK From mleoni at eso.org Mon May 19 09:10:16 2003 From: mleoni at eso.org (Marco C. Leoni) Date: Mon, 19 May 2003 18:10:16 +0200 Subject: MyUCDs & Registry References: Message-ID: <3EC901E8.40307@eso.org> Patricio F. Ortiz wrote: >On Mon, 19 May 2003, Marco C. Leoni wrote: > > >>Hi Patricio, >> one question about the point below: >> >> >> >>> Let me include here another element before the submission of a >>> query to the resource handling catalogues: The registry. >>> >>> Any registry will contain metadata about the services listed, ie, >>> the catalogues. 
>>> The registry would know (among other things), >>> - catalogueName (possibly catalogueUniqueID) >>> - catalogueTitle >>> - catalogueKeywords >>> - catalogueAuthor >>> - number of columns >>> - number of records >>> - column names, UCDs, units >>> - name ID,MAIN --- >>> - RA POS_EQ_RA,MAIN h:m:s >>> - Dec POS_EQ_DEC,MAIN d:m:s >>> .... >>> - Vmag PHOT_JHN_V mag >>> - Bmag PHOT_JHN_B mag >>> - z REDSHIFT --- >>> >>> >>> >>are you sure the Registry will include all these information? >>Perhaps "column names" and "units" will be resolved by the service >>provider, and not by the registry itself. >> >>Cheers, >> Marco >> >> > >Hi Marco, > >well, I would have thought so. It's done already by vizier meta-information >tables (which could be considered a local registry) and surely by other >services. It's a good point though, how much is a registry supposed to >know? In the scheme that Keith Noddle presented in Cambridge, I would >expect at least the local register (read service provider's) should know >about about this. Whether the higher level registers decide to collect this >information is to be seen, but if you ask *me*, I would say yes, they >should keep this information. We are not talking about a huge data volume >here; the advantages are large though, as you don't have to go everywhere >with your query. > >I just had to deal with getting a mortgage and run from bank to bank >filling up applications not knowing if I satisfied all the conditions they >asked; then a consultant phoned us and we dealt with him, he found out >who would lend us the money and it worked fine. The analogy seems quite >appropriate for our situation in VO. An astronomer wants to formulate a >general query, he or she has no idea where this could be done, so s/he >asks a service to lauch the query urbe et orbi... 
Surely the data will >come back to the astronomer, but if we apply a filtering system which >can select the services which may give a positive answer (and weed off >the ones with a negative), then the number of transactions diminishes >and the response time is shorter. > >An intermediate solution would be to send the query to broker-services urbe >et orbi, and let those services do the filtering and send the query to >places which may give a positive answer. > >IMHO not using this meta-information would be a real waste. > >Cheers, > >Patricio > >--- >Patricio F. Ortiz pfo at star.le.ac.uk >AstroGrid project >Department of Physics and Astronomy >University of Leicester Tel: +44 (0)116 252 2015 >LE1 7RH, UK > Patricio, I agree that a filtering system would be nice, and in fact a registry is supposed to do that (in my opinion): if necessary it will send an "urbi et orbi" query and give back to the astronomer only the relevant results, where this means compare them with the original requirements (e.g. exclude all the "no info about it" answers). The same before sending the query, if the registry has enough information about services (this is not the case if we think of registries containing only references to other registries, but this will simply add one more level to the structure). What I meant before is that probably we need only UCDs in the registry without any unique column names: when the query reaches the service then it will be translated and UCDs mapped to comlumn names. - On the other side, if the user already knows the unique column name then using it will remove any control about UCDs. - Last possibility, giving both unique column name and UCD will create a well-defined detailed query, with specific result even in the "urbi et orbi" case. Cheers, Marco -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From ael at star.le.ac.uk Mon May 19 08:14:37 2003 From: ael at star.le.ac.uk (Tony Linde) Date: Mon, 19 May 2003 16:14:37 +0100 Subject: MyUCDs In-Reply-To: <3EC8EECB.1000604@eso.org> Message-ID: <003401c31e19$5c9d3c50$1001a8c0@brolga> The Registry should answer the question 'what are the column names for resource X?', however that question might be phrased. Whether the registry retains that data for the resource or retrieves it from the resource itself when it is asked the question is entirely up to the registry builder. Cheers, Tony. > -----Original Message----- > From: Marco C. Leoni [mailto:mleoni at eso.org] > Sent: 19 May 2003 15:49 > To: Patricio F. Ortiz > Cc: ucd at ivoa.net > Subject: Re: MyUCDs > > > Hi Patricio, > one question about the point below: > > > Let me include here another element before the submission of a > > query to the resource handling catalogues: The registry. > > > > Any registry will contain metadata about the services > listed, ie, > > the catalogues. > > The registry would know (among other things), > > - catalogueName (possibly catalogueUniqueID) > > - catalogueTitle > > - catalogueKeywords > > - catalogueAuthor > > - number of columns > > - number of records > > - column names, UCDs, units > > - name ID,MAIN --- > > - RA POS_EQ_RA,MAIN h:m:s > > - Dec POS_EQ_DEC,MAIN d:m:s > > .... > > - Vmag PHOT_JHN_V mag > > - Bmag PHOT_JHN_B mag > > - z REDSHIFT --- > > > > are you sure the Registry will include all these information? > Perhaps "column names" and "units" will be resolved by the service > provider, and not by the registry itself. > > Cheers, > Marco > From pfo at star.le.ac.uk Mon May 19 10:25:28 2003 From: pfo at star.le.ac.uk (Patricio F. Ortiz) Date: Mon, 19 May 2003 18:25:28 +0100 (BST) Subject: MyUCDs & Registry In-Reply-To: <3EC901E8.40307@eso.org> Message-ID: On Mon, 19 May 2003, Marco C. Leoni wrote: > >Hi Marco, > > > >well, I would have thought so. 
It's done already by vizier meta-information > >tables (which could be considered a local registry) and surely by other > >services. It's a good point though, how much is a registry supposed to > >know? In the scheme that Keith Noddle presented in Cambridge, I would > >expect at least the local register (read service provider's) should know > >about about this. Whether the higher level registers decide to collect this > >information is to be seen, but if you ask *me*, I would say yes, they > >should keep this information. We are not talking about a huge data volume > >here; the advantages are large though, as you don't have to go everywhere > >with your query. > > > >I just had to deal with getting a mortgage and run from bank to bank > >filling up applications not knowing if I satisfied all the conditions they > >asked; then a consultant phoned us and we dealt with him, he found out > >who would lend us the money and it worked fine. The analogy seems quite > >appropriate for our situation in VO. An astronomer wants to formulate a > >general query, he or she has no idea where this could be done, so s/he > >asks a service to lauch the query urbe et orbi... Surely the data will > >come back to the astronomer, but if we apply a filtering system which > >can select the services which may give a positive answer (and weed off > >the ones with a negative), then the number of transactions diminishes > >and the response time is shorter. > > > >An intermediate solution would be to send the query to broker-services urbe > >et orbi, and let those services do the filtering and send the query to > >places which may give a positive answer. > > > >IMHO not using this meta-information would be a real waste. > > > >Cheers, > > > >Patricio > > > >--- > >Patricio F. 
Ortiz pfo at star.le.ac.uk
> >AstroGrid project
> >Department of Physics and Astronomy
> >University of Leicester Tel: +44 (0)116 252 2015
> >LE1 7RH, UK
> >
>
> Patricio,
>
> I agree that a filtering system would be nice, and in fact a
> registry is supposed to do that (in my opinion): if necessary it will
> send an "urbi et orbi" query and give back to the astronomer only the
> relevant results, where this means compare them with the original
> requirements (e.g. exclude all the "no info about it" answers). The same
> before sending the query, if the registry has enough information about
> services (this is not the case if we think of registries containing only
> references to other registries, but this will simply add one more level
> to the structure).
>
> What I meant before is that probably we need only UCDs in the registry
> without any unique column names: when the query reaches the service then
> it will be translated and UCDs mapped to column names.

I see your point now. However, remember that what's unique within a table is a column name, not a UCD; therefore, a UCD-driven request may expand to more columns than UCDs (just think of catalogues generated from sextractor) and generate columns which are absolutely meaningless for a user.

> - On the other side, if the user already knows the unique column name
> then using it will remove any control about UCDs.

Unique name within a table, because names are not unique across tables and bear different meanings, the very reason UCDs were introduced.

> - Last possibility, giving both unique column name and UCD will create a
> well-defined detailed query, with specific result even in the "urbi et
> orbi" case.

True, but then we may start getting partial answers. If I look for (colname=redshift, ucd=REDSHIFT) I would get those catalogues where redshift is called redshift but would leave out cases like (colname=z, ucd=REDSHIFT).

I think we should put together a list of questions and see what we need to solve them.
I can see several families of questions, from the very specialized to the very broad, from the one where the user knows what resource to consult to the ones where the user wants to discover resources. Searching the metadata at any level can prove quite productive!

I don't know about dropping the units from a registry (at any level). One problem in datamining is to make sure that one compares apples with apples, so if, for instance, I want to compare the extension of galaxies, it is important to know whether the units are in arcmin, degrees, arcsec or whatever.

Say that I launch a query where I request all data about galaxies around a particular location and I want to list ID, RA, DEC, diameter; maybe as a user I would like to restrict the search to catalogues which quote diameters in arcmin only. How do we do that? Alternatively, I could make it part of the query that I want the diameters expressed in arcmin regardless of how they are measured...

Food for thought. I'll put some queries together and hopefully we can construct a bunch of cases to explore the needs.

Cheers,

Patricio

---
Patricio F. Ortiz pfo at star.le.ac.uk
AstroGrid project
Department of Physics and Astronomy
University of Leicester Tel: +44 (0)116 252 2015
LE1 7RH, UK

From roy at cacr.caltech.edu Mon May 19 16:25:23 2003
From: roy at cacr.caltech.edu (Roy Williams)
Date: Mon, 19 May 2003 16:25:23 -0700
Subject: Cambridge presentations
References: <200305181119.h4IBJOKF011031@urania.cfa.harvard.edu>
Message-ID: <000901c31e5d$ef233680$6b91d783@cacr.caltech.edu>

Dear UCD list

The presentation of the UCD working group from Cambridge is at
http://www.ivoa.net/internal/IVOA/IvoaUCD/Cambridge-ucd.ppt
http://www.ivoa.net/internal/IVOA/IvoaUCD/Cambridge-ucd.pdf

There is a new UCD Steering Committee (below) who will be implementing some of the recommendations in the above, with the objective of making a draft document available at the next IVOA meeting in October.
Roy
--------------------------
UCD Steering Committee:
Sebastien Derriere, CDS, F
Norman Gray, Starlink UK
Jonathan McDowell, CfA, US
Francois Ochsenbein, CDS, F
Pedro Osuna, ESA, E
Andrea Preite-Martinez, CDS, F
Guy Rixon, Cambridge, UK
Roy Williams, Caltech, US

--------
Caltech Center for Advanced Computing Research
roy at cacr.caltech.edu
626 395 3670

From francois at vizir.u-strasbg.fr Mon May 19 16:25:00 2003
From: francois at vizir.u-strasbg.fr (Francois Ochsenbein)
Date: Tue, 20 May 2003 01:25:00 +0200
Subject: patricio UCD/Tony
In-Reply-To: Your message of Fri, 16 May 2003 16:00:33 +0200 (MEST) .
Message-ID: <200305192325.h4JNP0707168@vizir.u-strasbg.fr>

Following Tony's post on "My understanding of UCDs", I would first like to thank Tony for his action -- it helps to clarify the discussions -- and I support Patricio's details. I would like to add a few more details:

1. UCDs as Data Types:
   yes, UCDs help to define what is a "legal" operation, but they do not replace the units.

2. UCDs as Keywords:
   Querying by UCD to find which resources match a set of UCDs is one of the possible applications of the UCDs. For example, there is a version of VizieR which retrieves the catalogues containing a set of UCDs at
   http://vizier.u-strasbg.fr/viz-bin/VizieR?-ucd
   Note that this URL does more than retrieve resources: it can retrieve the results directly. By specifying e.g. PHOT_STR* in one of the UCD input fields, together with a position on the sky, the search finds those objects which are close to the position and have Stroemgren photometry.
   The UCDs allow more than just retrieving the catalogues having this parameter: they allow CONSTRAINTS to be specified on that UCD (Patricio's Scenario 2), something like
      Select ID, POS_EQ_RA_MAIN, POS_EQ_DEC_MAIN, REDSHIFT_HC
      From [all catalogues]
      Where REDSHIFT_HC > 3
   PROVIDING that a rule is defined to specify what to do in case of multiple columns matching a UCD.
   The default rule could be something like this for, say, the REDSHIFT_HC parameter:
   -> if a REDSHIFT_HC has a qualifier MAIN, use this one;
   -> otherwise, use the first non-null value of the possible REDSHIFT_HCs.
   The real advantage in using UCDs for actual queries is to be able to use GENERIC queries without having first to ask for the details of the catalogues (at the registry level or from the resource server). And when thousands of catalogues are involved, it makes a difference! Notice also that the rule of the "first non-null" value is not that easy to write as an SQL statement...

3. UCDs as Pointers into Data Model
   I'm not sure I understand the problem -- why could a DM point not refer to a set of UCDs?

--Francois
================================================================================
Francois Ochsenbein ------ Observatoire Astronomique de Strasbourg
11, rue de l'Universite F-67000 STRASBOURG Phone: +33-(0)390 24 24 29
Email: francois at astro.u-strasbg.fr (France) Fax: +33-(0)390 24 24 32
================================================================================

From mleoni at eso.org Tue May 20 00:39:41 2003
From: mleoni at eso.org (Marco C. Leoni)
Date: Tue, 20 May 2003 09:39:41 +0200
Subject: Cambridge presentations
References: <200305181119.h4IBJOKF011031@urania.cfa.harvard.edu> <000901c31e5d$ef233680$6b91d783@cacr.caltech.edu>
Message-ID: <3EC9DBBD.60703@eso.org>

Hi Roy,

just one thing: could you please add me to the list?

Thanks a lot!

Cheers,
Marco

Roy Williams wrote:

>Dear UCD list
>
>The presentation of the UCD working group from Cambridge is at
>http://www.ivoa.net/internal/IVOA/IvoaUCD/Cambridge-ucd.ppt
>http://www.ivoa.net/internal/IVOA/IvoaUCD/Cambridge-ucd.pdf
>
>There is a new UCD Steering Committee (below) who will be implementing some
>of the recommendations in the above, with the objective of making a draft
>document available at the next IVOA meeting in October.
>
>Roy
>--------------------------
>UCD Steering Committee:
>Sebastien Derriere, CDS, F
>Norman Gray, Starlink UK
>Jonathan McDowell, CfA, US
>Francois Ochsenbein, CDS, F
>Pedro Osuna, ESA, E
>Andrea Preite-Martinez, CDS, F
>Guy Rixon, Cambridge, UK
>Roy Williams, Caltech, US
>
>--------
>Caltech Center for Advanced Computing Research
>roy at cacr.caltech.edu
>626 395 3670
>

From ael at star.le.ac.uk Tue May 20 01:58:44 2003
From: ael at star.le.ac.uk (Tony Linde)
Date: Tue, 20 May 2003 09:58:44 +0100
Subject: patricio UCD/Tony
In-Reply-To: <200305192325.h4JNP0707168@vizir.u-strasbg.fr>
Message-ID: <007a01c31eae$04cf8ec0$1001a8c0@brolga>

> 1. UCDs as Data Types:
> yes, UCDs help to define what is a "legal" operation, but they do not
> replace the units.

Agreed, but units are not sufficient to define legal operations. Just because two columns have units of 'time' does not mean they can necessarily be combined in any meaningful way. But two columns from different tables with POS_EQ_RA,MAIN can be compared with a meaningful result, even if each has different units.

This is why I think that if UCDs are content descriptors, it makes sense to define rules about: which units are valid for a specific type of data; how one converts from one unit to another; which operations are valid on that data and on combinations of data with different UCDs and which UCDs result from those operations (eg POS* can be multiplied by FREQ* to get DENSITY - ridiculous but I'll let the astros come up with real examples).

> Note that this URL does more than retrieving resources: it can
> retrieve directly the results by specifying e.g. PHOT_STR* in one

But this is not a *standard* Registry function.

> PROVIDING that a rule is defined to specify what to do in case of
> multiple columns matching a UCD. The default rule could be

Therein lies the rub!

> use GENERIC queries without having first to ask for the details of
> the catalogues (at the registry level or from the resource server).

But the Registry can determine which of the catalogues being searched have duplicate UCDs which are not easily resolvable and can then ask the user to pick the appropriate columns - even in a set of '000s of catalogues, this is likely to only be a small number (I hope!).

> Notice also that the rule of the "first non-null" value is not
> that easy to write as an SQL statement...

It doesn't need to be easy. The dataset will receive a VOQL query and will translate UCDs in that to column names in a SQL query - it is in this translation that resolution of the correct column name will happen, not in the sql query.

> 3. UCDs as Pointers into Data Model
> I'm not sure to understand the problem -- why could not a DM point
> refer to a set of UCDs ?

One could have UCDs of CURRENCY, FLOAT and INTEGER all pointing to a DM entity of 'number'; this is not a problem but does indicate that perhaps the data model needs to be resolved to a lower level if UCDs exist for which there are no unique entities.

Cheers,
Tony.

> -----Original Message-----
> From: Francois Ochsenbein [mailto:francois at vizir.u-strasbg.fr]
> Sent: 20 May 2003 00:25
> To: ucd at ivoa.net
> Subject: Re: patricio UCD/Tony
>
> ...

From roy at cacr.caltech.edu Mon May 19 10:16:10 2003
From: roy at cacr.caltech.edu (Roy Williams)
Date: Mon, 19 May 2003 10:16:10 -0700
Subject: MyUCDs & Registry
References: <3EC901E8.40307@eso.org>
Message-ID: <007701c31e2a$57c5b3e0$6b91d783@cacr.caltech.edu>

Patrick and Marco

> are you sure the Registry will include all these information?
> Perhaps "column names" and "units" will be resolved by
> the service provider, and not by the registry itself.

My understanding is that every registry entry is based on the Resource and Service Metadata (RSM) document (*). Everything in the registry has the basic stuff like title and publisher and description, also spectral and sky coverage. Then there are extension pieces, and so far we only have a good idea of extending to services.
The metadata for these includes the URL, the mime type of what comes back, and other stuff that defines standard VO services (eg SIA, Cone etc).

What is missing from the RSM is an extension that deals specifically with tables. It would include much of the VOTable header information I assume, including the UCDs.

I think that now would be a good time to sketch an XML schema for catalog metadata. We already have curation and coverage from the Resource base, but then how is the extension document different from a VOTable header?

Roy

(*) http://www.ivoa.net/internal/IVOA/IvoaResReg/ResourceServiceMetadataV7.doc

--------
Caltech Center for Advanced Computing Research
roy at cacr.caltech.edu
626 395 3670

> Any registry will contain metadata about the services listed, ie,
> the catalogues.
> The registry would know (among other things),
>   - catalogueName (possibly catalogueUniqueID)
>   - catalogueTitle
>   - catalogueKeywords
>   - catalogueAuthor
>   - number of columns
>   - number of records
>   - column names, UCDs, units
>     - name    ID,MAIN           ---
>     - RA      POS_EQ_RA,MAIN    h:m:s
>     - Dec     POS_EQ_DEC,MAIN   d:m:s
>     ....
>     - Vmag    PHOT_JHN_V        mag
>     - Bmag    PHOT_JHN_B        mag
>     - z       REDSHIFT          ---
>

are you sure the Registry will include all these information?
Perhaps "column names" and "units" will be resolved by the service provider, and not by the registry itself.

Cheers,
Marco

From rplante at poplar.ncsa.uiuc.edu Mon May 19 10:24:28 2003
From: rplante at poplar.ncsa.uiuc.edu (Ray Plante)
Date: Mon, 19 May 2003 12:24:28 -0500 (CDT)
Subject: MyUCDs & Registry
In-Reply-To: <3EC901E8.40307@eso.org>
Message-ID:

Hi Guys,

The current registry model allows catalogs to be registered individually, and the descriptions stored in the registry could contain all the information that Patricio listed. The main question is how fine-grained you want the registry to be and, thus, whether catalogs are within your limit. I think we heard at the interop meeting last week sufficient interest in registering catalogs.
So then the next step would be to model the catalog description, using the generic Resource metadata as a starting point. We would ask ourselves what additional information do we want in the description. Column UCDs and names would be an example of that type of information. I'll note that in our prototype schema for describing SIA services, we included a description of the columns returned by an image query. This was done using the VOTable FIELD elements; thus, it included both the local names and the UCDs (along with the IDs, type, and anything else you can stick in a FIELD element and its children). This could be done for a catalog description as well (regardless of whether it is actually available in VOTable format). hope this helps, Ray On Mon, 19 May 2003, Marco C. Leoni wrote: > > > Patricio F. Ortiz wrote: > > >On Mon, 19 May 2003, Marco C. Leoni wrote: > > > > > >>Hi Patricio, > >> one question about the point below: > >> > >> > >> > >>> Let me include here another element before the submission of a > >>> query to the resource handling catalogues: The registry. > >>> > >>> Any registry will contain metadata about the services listed, ie, > >>> the catalogues. > >>> The registry would know (among other things), > >>> - catalogueName (possibly catalogueUniqueID) > >>> - catalogueTitle > >>> - catalogueKeywords > >>> - catalogueAuthor > >>> - number of columns > >>> - number of records > >>> - column names, UCDs, units > >>> - name ID,MAIN --- > >>> - RA POS_EQ_RA,MAIN h:m:s > >>> - Dec POS_EQ_DEC,MAIN d:m:s > >>> .... > >>> - Vmag PHOT_JHN_V mag > >>> - Bmag PHOT_JHN_B mag > >>> - z REDSHIFT --- > >>> > >>> > >>> > >>are you sure the Registry will include all these information? > >>Perhaps "column names" and "units" will be resolved by the service > >>provider, and not by the registry itself. > >> > >>Cheers, > >> Marco > >> > >> > > > >Hi Marco, > > > >well, I would have thought so. 
It's done already by vizier meta-information > >tables (which could be considered a local registry) and surely by other > >services. It's a good point though, how much is a registry supposed to > >know? In the scheme that Keith Noddle presented in Cambridge, I would > >expect at least the local register (read service provider's) should know > >about about this. Whether the higher level registers decide to collect this > >information is to be seen, but if you ask *me*, I would say yes, they > >should keep this information. We are not talking about a huge data volume > >here; the advantages are large though, as you don't have to go everywhere > >with your query. > > > >I just had to deal with getting a mortgage and run from bank to bank > >filling up applications not knowing if I satisfied all the conditions they > >asked; then a consultant phoned us and we dealt with him, he found out > >who would lend us the money and it worked fine. The analogy seems quite > >appropriate for our situation in VO. An astronomer wants to formulate a > >general query, he or she has no idea where this could be done, so s/he > >asks a service to lauch the query urbe et orbi... Surely the data will > >come back to the astronomer, but if we apply a filtering system which > >can select the services which may give a positive answer (and weed off > >the ones with a negative), then the number of transactions diminishes > >and the response time is shorter. > > > >An intermediate solution would be to send the query to broker-services urbe > >et orbi, and let those services do the filtering and send the query to > >places which may give a positive answer. > > > >IMHO not using this meta-information would be a real waste. > > > >Cheers, > > > >Patricio > > > >--- > >Patricio F. 
Ortiz pfo at star.le.ac.uk > >AstroGrid project > >Department of Physics and Astronomy > >University of Leicester Tel: +44 (0)116 252 2015 > >LE1 7RH, UK > > > > Patricio, > > I agree that a filtering system would be nice, and in fact a > registry is supposed to do that (in my opinion): if necessary it will > send an "urbi et orbi" query and give back to the astronomer only the > relevant results, where this means compare them with the original > requirements (e.g. exclude all the "no info about it" answers). The same > before sending the query, if the registry has enough information about > services (this is not the case if we think of registries containing only > references to other registries, but this will simply add one more level > to the structure). > > What I meant before is that probably we need only UCDs in the registry > without any unique column names: when the query reaches the service then > it will be translated and UCDs mapped to comlumn names. > - On the other side, if the user already knows the unique column name > then using it will remove any control about UCDs. > - Last possibility, giving both unique column name and UCD will create a > well-defined detailed query, with specific result even in the "urbi et > orbi" case. > > > Cheers, > Marco > From greene at stsci.edu Mon May 19 10:43:23 2003 From: greene at stsci.edu (Gretchen Greene) Date: Mon, 19 May 2003 13:43:23 -0400 Subject: MyUCDs & Registry In-Reply-To: Message-ID: <000a01c31e2e$25517490$2bfaa782@stsci.edu> If the registry is going to supply 'Subject' , i.e. keywords for targeting specific astronomical topics, then I think it would equally serve to provide the UCDs and their associated tags (units, type, etc.) for refining queries. Not to sound harsh, my only question is why do we have to consider generating more schema files to do this? UCD's are fundamental descriptors in my mind and could be included in the base terms as optional elements. 
I guess the discretion comes from service providers with column descriptions not mapped into UCD's?

-Gretchen

From rplante at poplar.ncsa.uiuc.edu Mon May 19 11:42:58 2003
From: rplante at poplar.ncsa.uiuc.edu (Ray Plante)
Date: Mon, 19 May 2003 13:42:58 -0500 (CDT)
Subject: MyUCDs & Registry
In-Reply-To: <000a01c31e2e$25517490$2bfaa782@stsci.edu>
Message-ID:

Hi Gretchen,

On Mon, 19 May 2003, Gretchen Greene wrote:
> If the registry is going to supply 'Subject' , i.e. keywords for
> targeting specific astronomical topics, then I think it would equally
> serve to provide the UCDs and their associated tags (units, type, etc.)
> for refining queries.
>
> Not to sound harsh, my only question is why do we have to consider
> generating more schema files to do this? UCD's are fundamental
> descriptors in my mind and could be included in the base terms as
> optional elements. I guess the discretion comes from service providers
> with column descriptions not mapped into UCD's?

There is precedent for integrating it into the generic resource metadata: the current VOResource contains Coverage. However, it was pointed out last week that it is unclear, for example, what Coverage means when it applies to an organization. It was suggested (and I plan to look at this) that Coverage only be associated with descriptions of Data Collections and Services.

I mention this because it illustrates a basic modeling issue. When UCDs are associated with a description of an SIA service, it can be made clear what the role the UCDs play in the service (i.e. these are the UCDs associated with the columns returned from an image query). Similarly, the connection is clear when made part of a Catalog description. However, if they are associated with a generic resource, their role is ambiguous. That is, what does it mean to search for Organizations based on UCDs?

> my only question is why do we have to consider
> generating more schema files to do this?

Your question suggests that dealing with multiple schema files has additional overhead costs associated with it (as opposed to just having one schema). Do you see this as a problem?

We put the metadata that is not purely generic Resource metadata into a separate schema because it makes for a friendlier extension mechanism. If we add new metadata that, say, specifically describes Catalogs into the VOResource schema, it affects all users of this schema regardless of whether they care about Catalogs. (They may have to change their software to cope with the new version.) However, if we put the metadata into its own schema that extends VOResource, it only affects those that want to describe Catalogs.

cheers,
Ray

From greene at stsci.edu Mon May 19 12:21:40 2003
From: greene at stsci.edu (Gretchen Greene)
Date: Mon, 19 May 2003 15:21:40 -0400
Subject: MyUCDs & Registry
In-Reply-To:
Message-ID: <000c01c31e3b$dff0a660$2bfaa782@stsci.edu>

Thanks for the insights Ray, I realize I missed out on some discussions that occurred at the IVOA. Still, in building this first registry with the existing schemas, I am finding that there are subtleties to the implementation, and my concern is that chasing down multiple schema versions and metadata definitions prohibits efficient prototyping. For example, the existing repositories (OAI, previous Cone registry, etc.) all have varying elements and content. Now which schema files do I pursue to bring about uniformity? This is only the very beginning.

The other point is that there are not huge numbers of entries/resources yet in these registries (remember I'm used to the GSC2 billions) and so it seems a little odd to create a highly designed configuration before ever implementing one even though the design ideas are good ones.

So...I cheer on the designing but encourage people to start working with the standard schemas/files and see how they interface before going too far.

-Gretchen

From dmink at cfa.harvard.edu Mon May 19 12:40:44 2003
From: dmink at cfa.harvard.edu (Doug Mink)
Date: Mon, 19 May 2003 15:40:44 -0400
Subject: MyUCDs & Registry
References: <000c01c31e3b$dff0a660$2bfaa782@stsci.edu>
Message-ID: <3EC9333C.CFC31104@cfa.harvard.edu>

Gretchen Greene wrote:
> The other point is that there are not huge numbers of entries/resources
> yet in these registries (remember I'm used to the GSC2 billions) and so
> it seems a little odd to create a highly designed configuration before
> ever implementing one even though the design ideas are good ones.

At the Center for Astrophysics, we tried gathering descriptions of as many types of archived data as we could find hanging around here and tried to get knowledgeable people to describe them in a uniform way which we developed in parallel to the larger VO effort. We thought that working from the archives up could be complementary to the grand design work that many others were doing. We do not have a VO standard interface to the information we collected, but I think that the data could be translated into the protocols which are being discussed. We would be interested in what other people who have data they want to publish to the VO think about the parameters we have been able to collect.
Check it out at http://tdc-www.harvard.edu/vo

-Doug Mink

From schaaff at newb6.u-strasbg.fr Tue May 27 03:00:00 2003
From: schaaff at newb6.u-strasbg.fr (Andre Schaaff)
Date: Tue, 27 May 2003 12:00:00 +0200
Subject: New Web Service at CDS
Message-ID: <3ED33720.8424CD86@astro.u-strasbg.fr>

Hello,

A UCD resolver is now available as a Web Service at CDS. See http://cdsweb.u-strasbg.fr/cdsws.gml for more details and an example of use.

Regards,
André

From aam at astro.caltech.edu Sun May 11 22:00:09 2003
From: aam at astro.caltech.edu (Ashish Mahabal)
Date: Sun, 11 May 2003 22:00:09 -0700 (PDT)
Subject: Some comments on UCDs
Message-ID:

Hi,

Since I will not be personally present when many of you meet in Europe (now), I decided to send in my two cents to the group.

Over the last few months, talking to several VO people from US and Europe, it is clear that everyone agrees that something like the UCDs is great. Not everyone agrees on (1) the number of UCDs, (2) the hierarchical nature they are currently arranged in, (3) mechanisms of adding UCDs, ... Suggestions range from leave UCDs as they are to completely replace them with alternate mechanisms (some of which have at least partially been thought of).

Having worked with UCDs for making Topic Maps based on UCDs in catalogs (examples at http://www.astro.caltech.edu/~aam/science/topicmaps/ucd.html - a specialised application based on these will be ready soon), I have come to like the ease with which one can use UCDs to build structure around them (though they certainly have their limitations, as is seen by some "gaps" in some of the topic maps). I have not been required to deal with the entire set of UCDs (e.g. Atomic Data, Orbit etc. related UCDs) as I have not dealt with all kinds of catalogs. But the hierarchical structure seems very usable.
Having error terms on various individual UCDs will be good and some work could go in there. Of the 1400 or so UCDs, the 100 most accessed catalogs at Vizier use only about 250 UCDs in all.

Allowing people to make their own UCDs (just like standards should not be a plural, not everyone should create their own Universals) will bring us back to square one even if what the personal UCDs mean is well documented. My R magnitude UCD is GunnR and someone else's is GR (both having used the Thuan-Gunn system). Okay so far. But then I also use GR to mean Gamma Ratio (whatever that is), and all hell breaks loose. A lot of time may have to be spent writing more intelligent software to deal with these issues, which could have been avoided in the first place.

So should there be "UCD police" who decide what new UCDs should get added? Perhaps not. But everybody concerned can try to come up with a list of what is currently missing, and what will probably be needed in the next 10 years (and later on there will be surprises of course), and simply add the resulting list. And keep a mechanism for adding more.

Should the list be changed from a hierarchy to something else? That should not really matter much. (Broadly speaking - software should be able to handle that).

Should the UCDs serve as an ontology for astronomy? I believe that in a sense they already do. However, I also believe that anything that is related to astronomy (e.g. astronomers, instruments, observatories etc.) should also be part of that ontology and could be represented by UCDs in the hierarchy (adding a few hundred more terms). These can then be used in many other tables (derived data products). So a growing branch could stay open there.

More later.

Cheers and wishing you all a good time,
-ashish

PS: Does anyone plan to email session summaries to the group?
Ashish Mahabal, Caltech Astronomy, Pasadena, CA 91125 http://www.astro.caltech.edu/~aam aam at astro.caltech.edu Lawyer: How far apart were the vehicles at the time of the collision? From ael at star.le.ac.uk Fri May 16 04:14:44 2003 From: ael at star.le.ac.uk (Tony Linde) Date: Fri, 16 May 2003 12:14:44 +0100 Subject: MyUCDs Message-ID: <20030516111444.F2F465C479@smtp.us2.messagingengine.com> I've posted a document on the IVOA wiki which summarises my understanding of the UCD discussion in the plenary session yesterday. Perhaps some people could peruse that and provide feedback on this mailing list about any errors or differences of opinion on the implications that I see. I hope it helps others with any of their own misunderstandings. http://www.ivoa.net/twiki/bin/view/IVOA/TonyOnUCDs Cheers, Tony. __ Tony Linde Phone: +44 (0)116 223 1292 AstroGrid Project Manager Fax: +44 (0)116 252 3311 Dept of Physics & Astronomy Mobile: +44 (0)7753 603356 University of Leicester Email: ael at star.le.ac.uk Leicester, UK LE1 7RH Web: http://www.astrogrid.org From pfo at star.le.ac.uk Fri May 16 05:42:43 2003 From: pfo at star.le.ac.uk (Patricio F. Ortiz) Date: Fri, 16 May 2003 13:42:43 +0100 (BST) Subject: MyUCDs In-Reply-To: <20030516111444.F2F465C479@smtp.us2.messagingengine.com> Message-ID: On Fri, 16 May 2003, Tony Linde wrote: > I've posted a document on the IVOA wiki which summarises my understanding > of the UCD discussion in the plenary session yesterday. Perhaps some > people could peruse that and provide feedback on this mailing list about > any errors or differences of opinion on the implications that I see. I > hope it helps others with any of their own misunderstandings. > > http://www.ivoa.net/twiki/bin/view/IVOA/TonyOnUCDs Here go my comments to Tony's document: > My understanding of UCDs > > The way this came up was my question in the plenary UCD session about > how we can identify columns within a table uniquely. 
> Basically, the answer was that UCDs will not solve this problem and are
> not intended to do so. This page will summarise what I now understand as
> the purpose of UCDs and some of the implications of this.
>
> UCDs as Data Types
>
> Comment was made that UCDs can be considered as data types, so a column
> in a table has a data type of, POS_EQ_RA, say. I assume that the
> reasons for having UCDs as data types are to allow:
>
> * operations on columns: comparison, addition, subtraction,
>   multiplication, etc plus specific astronomical operations
> * conversion between data types: eg converting between equatorial and
>   galactic coordinates

As their name says, UCDs are meant to describe the content of a column uniquely. Within any given table, column names do not repeat (except in exceptional cases where the authors want to repeat a column for easy reading). The problem arose when column names turned out to be degenerate across different tables/catalogues. The most common scenario in astronomy was a column called "Mag", meant to represent the brightness of a celestial object in some photometric system. The problem was/is that "Mag" was "overloaded", and just comparing two columns labeled "Mag" was asking for trouble. UCDs were introduced to break the degeneracy:

   column (catalog=c1 name=Mag unit=mag UCD=PHOT_JHN_V)

has nothing to do with

   column (catalog=c2 name=Mag unit=mag UCD=PHOT_STR_B)

> Do we thus need (or already have) some hierarchical structure of the
> UCDs based on allowable operations? In normal data types, we have
> numerical types, subdivided by integral and floating point, subdivided
> by storage size etc; one can add all numerical types but (generally)
> cannot add a number and a string (without pre-defining what such an
> addition will do).

Although one can add numerical types and not add strings or strings and numbers, not all numbers should be allowed to be added.
One should not allow adding a velocity measured in km/s to a right ascension in equatorial coordinates (an angle). Whatever mechanism is used to perform table combination should be provided with this kind of knowledge.

> Aligned to that: should we define the operations that can be performed
> on the individual data types (UCDs), the rules for those operations
> given specific types, and the type resulting from such operations.

Yes and no. Thinking of every combination is unrealistic; besides, it is only in adding/subtracting that you run into problems. Nothing prevents you from taking ratios of columns or multiplying columns. As Peter Quinn pointed out in the plenary, some decisions should always be the responsibility of the astronomer.

> UCDs as Keywords
>
> In this context, the UCD is part of the metadata for a table. It
> indicates the type of data held in a table, so having POS_EQ_RA
> identified with a table says that this table includes positional data
> in equatorial coordinates. That said, maybe the UCD for the table
> should include POS_EQ instead (since it is unlikely that it'll have RA
> without DEC).

Unlikely but not impossible. I've seen tables in which that's the case. UCDs were meant to be attached to columns; talking about a table's UCD is IMHO a confusion. A table can have a set of UCDs attached to it (like a list), which may be shorter than the number of columns in the table if some columns represent the same physical quantity and therefore DO have the same UCD.

> So the idea of being able to query which resources have POS_EQ* makes sense.

Sure it does. Not all catalogues contain RA-Dec.

> UCDs as Pointers into Data Model
>
> This was a very interesting comment, that UCDs can be seen as a pointer
> into the data model (DM). How this might be implemented and how
> feasible it is is still open.
> I guess there are two potential problem areas:
>
> * a UCD refers to multiple DM points (classes, objects or whatever
>   they are called) this is likely but does indicate areas in which the
>   UCDs are not the lowest level of metadata
>
> * one DM point is referred to by several UCDs
>   if this occurs, it would indicate that the DM requires further analysis

Yes and no. Yes if we are talking about the so-called "core-UCDs", which are the UCDs that can be attached to a column. No if we develop the concept of an "alias-UCD", which represents a list of UCDs designed for discovery purposes. I'd say that a DM point should have one "core-UCD" and 0 or more "alias-UCDs".

> I suspect that, as the DM expands and covers more areas of astronomy,
> we will need a more efficient version of UCDs that accurately maps to
> the DM; the current '_' separated textual names will have limited
> extensibility (even with the additional modifiers agreed at this
> meeting).

Most likely.

> Unique Column Identification
>
> Given that we cannot use UCDs as unique column identifiers, how do we do this?
> It seems that the only possible unique identifier for a column in a
> table is the resourceID of the table (from the Registry) plus the
> columnName (for explanation of resourceID, see the discussion on this
> in the Registry mailing list:
> http://www.ivoa.net/forum/registry/0091.htm and related messages).

Absolutely. Within a table, the column name IS a unique column identifier. The pair catalogueName.columnName is unique in 99.9% of cases, and for the future one can require that columnName be unique within a table (most DBMSs will not be happy if you give two columns the same name).
If we ever converge on assigning unique IDs to catalogues, then catalogueUniqueID.columnName is unique. Adding a UCD to this structure would be redundant, e.g. catalogueUniqueID.columnName.columnUCD

> So, to summarise the discussion from the plenary session, a query can
> be sent to a table with either UCDs or column names or a mixture of
> both. If a UCD is included in a query, the data source can resolve this
> if there is only one column with that UCD or there are multiple columns
> but one has the modifier MAIN attached to only one of the column UCDs.
> Otherwise the query will fail.

Let me include here another element before the submission of a query to the resource handling catalogues: The registry.

Any registry will contain metadata about the services listed, ie, the catalogues. The registry would know (among other things),
 - catalogueName (possibly catalogueUniqueID)
 - catalogueTitle
 - catalogueKeywords
 - catalogueAuthor
 - number of columns
 - number of records
 - column names, UCDs, units
    - name   ID,MAIN           ---
    - RA     POS_EQ_RA,MAIN    h:m:s
    - Dec    POS_EQ_DEC,MAIN   d:m:s
    ....
    - Vmag   PHOT_JHN_V        mag
    - Bmag   PHOT_JHN_B        mag
    - z      REDSHIFT          ---

IMHO, a query can be accurately formulated for any given resource after consulting the registry. It should be decided at the registry level what we want to extract from any given catalogue; the resource handling that catalogue then has no decision to make about the query it receives and, best of all, the query should not even be submitted if we know ahead of time that it will fail.

> Conclusion
>
> I hope people will provide feedback on the mailing list to these
> comments. I reiterate that they are only my understanding of what was
> said and my belief of the implications.

Another use of UCDs would be to discover resources (catalogues) which contain certain QUANTITIES.
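A minimal sketch of the registry record listed above, and of the check that lets a client skip a query it already knows will fail. Everything named here (Column, CatalogueRecord, resolve, discover, the two toy catalogues) is hypothetical and only illustrates the idea; it is not an existing registry interface. The MAIN qualifier is modelled as a boolean flag, and resolve follows the rule summarised from the plenary: a UCD resolves if exactly one column carries it, or if exactly one of several carries MAIN.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Column:
    name: str
    ucd: str
    main: bool = False   # stands in for the ",MAIN" qualifier (assumption)
    unit: str = ""


@dataclass
class CatalogueRecord:
    catalogue_name: str
    columns: list = field(default_factory=list)

    def resolve(self, ucd):
        """Return the one column a UCD refers to, or None if absent or ambiguous."""
        matches = [c for c in self.columns if c.ucd == ucd]
        if len(matches) == 1:
            return matches[0]
        mains = [c for c in matches if c.main]
        return mains[0] if len(mains) == 1 else None


def discover(registry, ucds):
    """Names of the catalogues in which every requested UCD resolves to one column."""
    return [r.catalogue_name for r in registry
            if all(r.resolve(u) is not None for u in ucds)]


# Two made-up catalogues, loosely following the listing above.
registry = [
    CatalogueRecord("c1", [
        Column("name", "ID", main=True),
        Column("RA", "POS_EQ_RA", main=True, unit="h:m:s"),
        Column("Dec", "POS_EQ_DEC", main=True, unit="d:m:s"),
        Column("Vmag", "PHOT_JHN_V", unit="mag"),
        Column("z", "REDSHIFT"),
    ]),
    CatalogueRecord("c2", [
        Column("RA", "POS_EQ_RA", main=True, unit="deg"),
        Column("Dec", "POS_EQ_DEC", main=True, unit="deg"),
        Column("Bmag", "PHOT_JHN_B", unit="mag"),
    ]),
]

wanted = ["REDSHIFT", "POS_EQ_RA", "POS_EQ_DEC", "PHOT_JHN_V"]
print(discover(registry, wanted))
```

Only catalogues returned by discover would be sent the real query; the others are never contacted.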
Scenario 1

If I formulate a query along the lines of

   "select catalogueName where UCDs include REDSHIFT POS_EQ_RA,MAIN POS_EQ_DEC,MAIN PHOT_JHN_V from registryXX"

I should get back a listing of catalogues which contain that information (plus probably the column names, units and UCDs which make the catalogue satisfy our request). Titles and keywords are short handles and do not always tell you everything that is listed in a catalogue. The astronomer should know what to do with this information; perhaps s/he will pick a few of these catalogues and submit a query to them.

Scenario 2

   "select ucd;ID,MAIN ucd;POS_EQ_RA,MAIN ucd;POS_EQ_DEC,MAIN ucd;PHOT_JHN_V ucd;REDSHIFT
    from [catalogueName where UCDs include REDSHIFT PHOT_JHN_V POS_EQ_RA,MAIN POS_EQ_DEC,MAIN from registryXX]
    where ucd;REDSHIFT > 2.5"

The user doesn't know ahead of time whether any catalogue exists to satisfy the query, but if one exists, s/he would like to print the equivalent of Name, RA, Dec, Vmag, and z.

This query should first find the list of catalogues which contain all those UCDs, together with the names of the respective columns. Once this is known, individual queries should be sent to each of the catalogues, requesting the columns which correspond to the UCDs the user asked for and which satisfy redshift > 2.5.

Note that I've used a very loose notation above on purpose.

Cheers,

Patricio

---
Patricio F. Ortiz pfo at star.le.ac.uk
AstroGrid project
Department of Physics and Astronomy
University of Leicester Tel: +44 (0)116 252 2015
LE1 7RH, UK

From mleoni at eso.org Mon May 19 07:48:43 2003
From: mleoni at eso.org (Marco C. Leoni)
Date: Mon, 19 May 2003 16:48:43 +0200
Subject: MyUCDs
References:
Message-ID: <3EC8EECB.1000604@eso.org>

Hi Patricio,
one question about the point below:

> Let me include here another element before the submission of a
> query to the resource handling catalogues: The registry.
>
> Any registry will contain metadata about the services listed, ie,
> the catalogues.
> The registry would know (among other things), > - catalogueName (possibly catalogueUniqueID) > - catalogueTitle > - catalogueKeywords > - catalogueAuthor > - number of columns > - number of records > - column names, UCDs, units > - name ID,MAIN --- > - RA POS_EQ_RA,MAIN h:m:s > - Dec POS_EQ_DEC,MAIN d:m:s > .... > - Vmag PHOT_JHN_V mag > - Bmag PHOT_JHN_B mag > - z REDSHIFT --- > are you sure the Registry will include all these information? Perhaps "column names" and "units" will be resolved by the service provider, and not by the registry itself. Cheers, Marco From pfo at star.le.ac.uk Mon May 19 08:17:13 2003 From: pfo at star.le.ac.uk (Patricio F. Ortiz) Date: Mon, 19 May 2003 16:17:13 +0100 (BST) Subject: MyUCDs In-Reply-To: <3EC8EECB.1000604@eso.org> Message-ID: On Mon, 19 May 2003, Marco C. Leoni wrote: > Hi Patricio, > one question about the point below: > > > Let me include here another element before the submission of a > > query to the resource handling catalogues: The registry. > > > > Any registry will contain metadata about the services listed, ie, > > the catalogues. > > The registry would know (among other things), > > - catalogueName (possibly catalogueUniqueID) > > - catalogueTitle > > - catalogueKeywords > > - catalogueAuthor > > - number of columns > > - number of records > > - column names, UCDs, units > > - name ID,MAIN --- > > - RA POS_EQ_RA,MAIN h:m:s > > - Dec POS_EQ_DEC,MAIN d:m:s > > .... > > - Vmag PHOT_JHN_V mag > > - Bmag PHOT_JHN_B mag > > - z REDSHIFT --- > > > > are you sure the Registry will include all these information? > Perhaps "column names" and "units" will be resolved by the service > provider, and not by the registry itself. > > Cheers, > Marco Hi Marco, well, I would have thought so. It's done already by vizier meta-information tables (which could be considered a local registry) and surely by other services. It's a good point though, how much is a registry supposed to know? 
In the scheme that Keith Noddle presented in Cambridge, I would expect that at least the local registry (read: the service provider's) should know about this. Whether the higher-level registries decide to collect this information remains to be seen, but if you ask *me*, I would say yes, they should keep this information. We are not talking about a huge data volume here; the advantages are large though, as you don't have to go everywhere with your query.

I just had to deal with getting a mortgage, and ran from bank to bank filling in applications without knowing whether I satisfied all the conditions they asked for; then a consultant phoned us and we dealt with him, he found out who would lend us the money and it worked fine. The analogy seems quite appropriate for our situation in the VO. An astronomer wants to formulate a general query, he or she has no idea where this could be done, so s/he asks a service to launch the query urbi et orbi... Surely the data will come back to the astronomer, but if we apply a filtering system which can select the services which may give a positive answer (and weed out the ones with a negative), then the number of transactions diminishes and the response time is shorter.

An intermediate solution would be to send the query to broker-services urbi et orbi, and let those services do the filtering and send the query to places which may give a positive answer.

IMHO not using this meta-information would be a real waste.

Cheers,

Patricio

---
Patricio F. Ortiz pfo at star.le.ac.uk
AstroGrid project
Department of Physics and Astronomy
University of Leicester Tel: +44 (0)116 252 2015
LE1 7RH, UK

From mleoni at eso.org Mon May 19 09:10:16 2003
From: mleoni at eso.org (Marco C. Leoni)
Date: Mon, 19 May 2003 18:10:16 +0200
Subject: MyUCDs & Registry
References:
Message-ID: <3EC901E8.40307@eso.org>

Patricio F. Ortiz wrote:
>On Mon, 19 May 2003, Marco C.
Leoni wrote: > > >>Hi Patricio, >> one question about the point below: >> >> >> >>> Let me include here another element before the submission of a >>> query to the resource handling catalogues: The registry. >>> >>> Any registry will contain metadata about the services listed, ie, >>> the catalogues. >>> The registry would know (among other things), >>> - catalogueName (possibly catalogueUniqueID) >>> - catalogueTitle >>> - catalogueKeywords >>> - catalogueAuthor >>> - number of columns >>> - number of records >>> - column names, UCDs, units >>> - name ID,MAIN --- >>> - RA POS_EQ_RA,MAIN h:m:s >>> - Dec POS_EQ_DEC,MAIN d:m:s >>> .... >>> - Vmag PHOT_JHN_V mag >>> - Bmag PHOT_JHN_B mag >>> - z REDSHIFT --- >>> >>> >>> >>are you sure the Registry will include all these information? >>Perhaps "column names" and "units" will be resolved by the service >>provider, and not by the registry itself. >> >>Cheers, >> Marco >> >> > >Hi Marco, > >well, I would have thought so. It's done already by vizier meta-information >tables (which could be considered a local registry) and surely by other >services. It's a good point though, how much is a registry supposed to >know? In the scheme that Keith Noddle presented in Cambridge, I would >expect at least the local register (read service provider's) should know >about about this. Whether the higher level registers decide to collect this >information is to be seen, but if you ask *me*, I would say yes, they >should keep this information. We are not talking about a huge data volume >here; the advantages are large though, as you don't have to go everywhere >with your query. > >I just had to deal with getting a mortgage and run from bank to bank >filling up applications not knowing if I satisfied all the conditions they >asked; then a consultant phoned us and we dealt with him, he found out >who would lend us the money and it worked fine. The analogy seems quite >appropriate for our situation in VO. 
>An astronomer wants to formulate a
>general query, he or she has no idea where this could be done, so s/he
>asks a service to launch the query urbi et orbi... Surely the data will
>come back to the astronomer, but if we apply a filtering system which
>can select the services which may give a positive answer (and weed out
>the ones with a negative), then the number of transactions diminishes
>and the response time is shorter.
>
>An intermediate solution would be to send the query to broker-services urbi
>et orbi, and let those services do the filtering and send the query to
>places which may give a positive answer.
>
>IMHO not using this meta-information would be a real waste.
>
>Cheers,
>
>Patricio
>
>---
>Patricio F. Ortiz pfo at star.le.ac.uk
>AstroGrid project
>Department of Physics and Astronomy
>University of Leicester Tel: +44 (0)116 252 2015
>LE1 7RH, UK
>

Patricio,

I agree that a filtering system would be nice, and in fact a registry is supposed to do that (in my opinion): if necessary it will send an "urbi et orbi" query and give back to the astronomer only the relevant results, which means comparing them with the original requirements (e.g. excluding all the "no info about it" answers). The same applies before sending the query, if the registry has enough information about the services (this is not the case if we think of registries containing only references to other registries, but this will simply add one more level to the structure).

What I meant before is that we probably need only UCDs in the registry, without any unique column names: when the query reaches the service, it will be translated and the UCDs mapped to column names.
- On the other hand, if the user already knows the unique column name, then using it removes any check on UCDs.
- Last possibility: giving both the unique column name and the UCD will create a well-defined, detailed query, with a specific result even in the "urbi et orbi" case.
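The three possibilities above (UCD only, column name only, both) can be sketched as a small service-side translation step. This is only an illustration under stated assumptions: the function name translate, the (column name, UCD) pair layout and the two toy tables are made up here and are not part of any existing service.

```python
def translate(columns, ucd=None, name=None):
    """Return the column names of one table that satisfy the request.

    columns is a list of (column_name, ucd) pairs (a hypothetical layout).
    A criterion left as None is simply not checked.
    """
    if ucd is None and name is None:
        raise ValueError("give a UCD, a column name, or both")
    return [col for col, col_ucd in columns
            if (ucd is None or col_ucd == ucd)
            and (name is None or col == name)]


# Two made-up tables: the same quantity under different column names,
# and the same column name for different quantities.
table_a = [("redshift", "REDSHIFT"), ("Mag", "PHOT_JHN_V")]
table_b = [("z", "REDSHIFT"), ("Mag", "PHOT_STR_B")]

for label, table in (("table_a", table_a), ("table_b", table_b)):
    print(label,
          translate(table, ucd="REDSHIFT"),                    # UCD only
          translate(table, name="Mag"),                        # name only
          translate(table, ucd="REDSHIFT", name="redshift"))   # both
```

By name alone, "Mag" matches in both tables although the two columns carry different UCDs; with both name and UCD, only a table that happens to use that exact column name matches.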
Cheers, Marco -------------- next part -------------- An HTML attachment was scrubbed... URL: From ael at star.le.ac.uk Mon May 19 08:14:37 2003 From: ael at star.le.ac.uk (Tony Linde) Date: Mon, 19 May 2003 16:14:37 +0100 Subject: MyUCDs In-Reply-To: <3EC8EECB.1000604@eso.org> Message-ID: <003401c31e19$5c9d3c50$1001a8c0@brolga> The Registry should answer the question 'what are the column names for resource X?', however that question might be phrased. Whether the registry retains that data for the resource or retrieves it from the resource itself when it is asked the question is entirely up to the registry builder. Cheers, Tony. > -----Original Message----- > From: Marco C. Leoni [mailto:mleoni at eso.org] > Sent: 19 May 2003 15:49 > To: Patricio F. Ortiz > Cc: ucd at ivoa.net > Subject: Re: MyUCDs > > > Hi Patricio, > one question about the point below: > > > Let me include here another element before the submission of a > > query to the resource handling catalogues: The registry. > > > > Any registry will contain metadata about the services > listed, ie, > > the catalogues. > > The registry would know (among other things), > > - catalogueName (possibly catalogueUniqueID) > > - catalogueTitle > > - catalogueKeywords > > - catalogueAuthor > > - number of columns > > - number of records > > - column names, UCDs, units > > - name ID,MAIN --- > > - RA POS_EQ_RA,MAIN h:m:s > > - Dec POS_EQ_DEC,MAIN d:m:s > > .... > > - Vmag PHOT_JHN_V mag > > - Bmag PHOT_JHN_B mag > > - z REDSHIFT --- > > > > are you sure the Registry will include all these information? > Perhaps "column names" and "units" will be resolved by the service > provider, and not by the registry itself. > > Cheers, > Marco > From pfo at star.le.ac.uk Mon May 19 10:25:28 2003 From: pfo at star.le.ac.uk (Patricio F. Ortiz) Date: Mon, 19 May 2003 18:25:28 +0100 (BST) Subject: MyUCDs & Registry In-Reply-To: <3EC901E8.40307@eso.org> Message-ID: On Mon, 19 May 2003, Marco C. 
Leoni wrote: > >Hi Marco, > > > >well, I would have thought so. It's done already by vizier meta-information > >tables (which could be considered a local registry) and surely by other > >services. It's a good point though, how much is a registry supposed to > >know? In the scheme that Keith Noddle presented in Cambridge, I would > >expect at least the local register (read service provider's) should know > >about about this. Whether the higher level registers decide to collect this > >information is to be seen, but if you ask *me*, I would say yes, they > >should keep this information. We are not talking about a huge data volume > >here; the advantages are large though, as you don't have to go everywhere > >with your query. > > > >I just had to deal with getting a mortgage and run from bank to bank > >filling up applications not knowing if I satisfied all the conditions they > >asked; then a consultant phoned us and we dealt with him, he found out > >who would lend us the money and it worked fine. The analogy seems quite > >appropriate for our situation in VO. An astronomer wants to formulate a > >general query, he or she has no idea where this could be done, so s/he > >asks a service to lauch the query urbe et orbi... Surely the data will > >come back to the astronomer, but if we apply a filtering system which > >can select the services which may give a positive answer (and weed off > >the ones with a negative), then the number of transactions diminishes > >and the response time is shorter. > > > >An intermediate solution would be to send the query to broker-services urbe > >et orbi, and let those services do the filtering and send the query to > >places which may give a positive answer. > > > >IMHO not using this meta-information would be a real waste. > > > >Cheers, > > > >Patricio > > > >--- > >Patricio F. 
Ortiz pfo at star.le.ac.uk
> >AstroGrid project
> >Department of Physics and Astronomy
> >University of Leicester Tel: +44 (0)116 252 2015
> >LE1 7RH, UK
> >
>
> Patricio,
>
> I agree that a filtering system would be nice, and in fact a
> registry is supposed to do that (in my opinion): if necessary it will
> send an "urbi et orbi" query and give back to the astronomer only the
> relevant results, where this means compare them with the original
> requirements (e.g. exclude all the "no info about it" answers). The same
> before sending the query, if the registry has enough information about
> services (this is not the case if we think of registries containing only
> references to other registries, but this will simply add one more level
> to the structure).
>
> What I meant before is that probably we need only UCDs in the registry
> without any unique column names: when the query reaches the service then
> it will be translated and UCDs mapped to column names.

I see your point now. However, remember that what is unique within a table is the column name, not the UCD; a UCD-driven request may therefore expand to more columns than UCDs (just think of catalogues generated by SExtractor) and return columns which are absolutely meaningless to a user.

> - On the other side, if the user already knows the unique column name
> then using it will remove any control about UCDs.

Unique within a table, that is: names are not unique across tables and bear different meanings, which is the very reason UCDs were introduced.

> - Last possibility, giving both unique column name and UCD will create a
> well-defined detailed query, with specific result even in the "urbi et
> orbi" case.

True, but then we may start getting partial answers. If I look for (colname=redshift, ucd=REDSHIFT) I would get those catalogues where redshift is called redshift, but would leave out cases like (colname=z, ucd=REDSHIFT).

I think we should put together a list of questions and see what we need to solve them.
I can see several families of questions, from the very specialized to the very broad, from the ones where the user knows what resource to consult to the ones where the user wants to discover resources. Searching the metadata at any level can prove quite productive!

I don't know about dropping the units from a registry (at any level). One problem in data mining is to make sure that one compares apples with apples, so if, for instance, I want to compare the extent of galaxies, it is important to know whether the units are arcmin, degrees, arcsec or whatever. Say that I launch a query requesting all data about galaxies around a particular location, and I want to list ID, RA, DEC, diameter; as a user I might like to restrict the query to catalogues which quote diameters in arcmin only. How do we do that? Alternatively, I could make it part of the query that I want the diameters expressed in arcmin regardless of how they are measured...

Food for thought. I'll put some queries together and hopefully we can construct a bunch of cases to explore the needs.

Cheers,

Patricio

---
Patricio F. Ortiz pfo at star.le.ac.uk
AstroGrid project
Department of Physics and Astronomy
University of Leicester Tel: +44 (0)116 252 2015
LE1 7RH, UK

From roy at cacr.caltech.edu Mon May 19 16:25:23 2003
From: roy at cacr.caltech.edu (Roy Williams)
Date: Mon, 19 May 2003 16:25:23 -0700
Subject: Cambridge presentations
References: <200305181119.h4IBJOKF011031@urania.cfa.harvard.edu>
Message-ID: <000901c31e5d$ef233680$6b91d783@cacr.caltech.edu>

Dear UCD list

The presentation of the UCD working group from Cambridge is at
http://www.ivoa.net/internal/IVOA/IvoaUCD/Cambridge-ucd.ppt
http://www.ivoa.net/internal/IVOA/IvoaUCD/Cambridge-ucd.pdf

There is a new UCD Steering Committee (below) who will be implementing some of the recommendations in the above, with the objective of making a draft document available at the next IVOA meeting in October.
Roy
--------------------------
UCD Steering Committee:
Sebastien Derriere, CDS, F
Norman Gray, Starlink UK
Jonathan McDowell, CfA, US
Francois Ochsenbein, CDS, F
Pedro Osuna, ESA, E
Andrea Preite-Martinez, CDS, F
Guy Rixon, Cambridge, UK
Roy Williams, Caltech, US

--------
Caltech Center for Advanced Computing Research
roy at cacr.caltech.edu
626 395 3670

From francois at vizir.u-strasbg.fr Mon May 19 16:25:00 2003
From: francois at vizir.u-strasbg.fr (Francois Ochsenbein)
Date: Tue, 20 May 2003 01:25:00 +0200
Subject: patricio UCD/Tony
In-Reply-To: Your message of Fri, 16 May 2003 16:00:33 +0200 (MEST) .
Message-ID: <200305192325.h4JNP0707168@vizir.u-strasbg.fr>

Following Tony's post on "My understanding of UCDs", I would first like to thank Tony for his action -- it helps to clarify the discussions -- and I support Patricio's details. I would like to add a few more details:

1. UCDs as Data Types:
   yes, UCDs help to define what is a "legal" operation, but they do not replace the units.

2. UCDs as Keywords:
   Querying by UCDs to retrieve the resources which match a set of UCDs is one of the possible applications of the UCDs. For example, there is a version of VizieR which retrieves the catalogues containing a set of UCDs at http://vizier.u-strasbg.fr/viz-bin/VizieR?-ucd

   Note that this URL does more than retrieve resources: it can retrieve the results directly. By specifying e.g. PHOT_STR* in one of the UCD input specifications, together with a position on the sky, the search finds those objects which are close to the position and have Stroemgren photometry.

   The UCDs allow more than just retrieving the catalogues having this parameter: they allow CONSTRAINTS to be specified on that UCD (Patricio's Scenario 2), something like

      Select ID, POS_EQ_RA_MAIN, POS_EQ_DEC_MAIN, REDSHIFT_HC
      From [all catalogues]
      Where REDSHIFT_HC > 3

   PROVIDING that a rule is defined to specify what to do in case of multiple columns matching a UCD.
   The default rule could be something like, for, say, the REDSHIFT_HC
   parameter:
   -> if a REDSHIFT_HC has a qualifier MAIN, use this one;
   -> otherwise, use the first non-null value of the possible
      REDSHIFT_HCs.

   The real advantage in using UCDs for actual queries is to be able to
   use GENERIC queries without having first to ask for the details of
   the catalogues (at the registry level or from the resource server).
   And when thousands of catalogues are involved, it makes a difference!

   Notice also that the rule of the "first non-null" value is not
   that easy to write as an SQL statement...

3. UCDs as Pointers into Data Model
   I'm not sure I understand the problem -- why could a DM point not
   refer to a set of UCDs?

--Francois
================================================================================
Francois Ochsenbein       ------       Observatoire Astronomique de Strasbourg
   11, rue de l'Universite F-67000 STRASBOURG       Phone: +33-(0)390 24 24 29
Email: francois at astro.u-strasbg.fr   (France)         Fax: +33-(0)390 24 24 32
================================================================================

From mleoni at eso.org  Tue May 20 00:39:41 2003
From: mleoni at eso.org (Marco C. Leoni)
Date: Tue, 20 May 2003 09:39:41 +0200
Subject: Cambridge presentations
References: <200305181119.h4IBJOKF011031@urania.cfa.harvard.edu>
 <000901c31e5d$ef233680$6b91d783@cacr.caltech.edu>
Message-ID: <3EC9DBBD.60703@eso.org>

Hi Roy,
   just one thing: could you please add me to the list?
Thanks a lot!

Cheers,
Marco

Roy Williams wrote:

>Dear UCD list
>
>The presentation of the UCD working group from Cambridge is at
>http://www.ivoa.net/internal/IVOA/IvoaUCD/Cambridge-ucd.ppt
>http://www.ivoa.net/internal/IVOA/IvoaUCD/Cambridge-ucd.pdf
>
>There is a new UCD Steering Committee (below) who will be implementing some
>of the recommendations in the above, with the objective of making a draft
>document available at the next IVOA meeting in October.
>
>Roy
>--------------------------
>UCD Steering Committee:
>Sebastien Derriere, CDS, F
>Norman Gray, Starlink UK
>Jonathan McDowell, CfA, US
>Francois Ochsenbein, CDS, F
>Pedro Osuna, ESA, E
>Andrea Preite-Martinez, CDS, F
>Guy Rixon, Cambridge, UK
>Roy Williams, Caltech, US
>
>--------
>Caltech Center for Advanced Computing Research
>roy at cacr.caltech.edu
>626 395 3670
>

From ael at star.le.ac.uk  Tue May 20 01:58:44 2003
From: ael at star.le.ac.uk (Tony Linde)
Date: Tue, 20 May 2003 09:58:44 +0100
Subject: patricio UCD/Tony
In-Reply-To: <200305192325.h4JNP0707168@vizir.u-strasbg.fr>
Message-ID: <007a01c31eae$04cf8ec0$1001a8c0@brolga>

> 1. UCDs as Data Types:
>    yes, UCDs help to define what is a "legal" operation, but they do not
>    replace the units.

Agreed, but units are not sufficient to define legal operations. Just
because two columns have units of 'time' does not mean they can necessarily
be combined in any meaningful way. But two columns from different tables
with POS_EQ_RA,MAIN can be compared with a meaningful result, even if each
has different units.

This is why I think that if UCDs are content descriptors, it makes sense to
define rules about: which units are valid for a specific type of data; how
one converts from one unit to another; which operations are valid on that
data and on combinations of data with different UCDs and which UCDs result
from those operations (eg POS* can be multiplied by FREQ* to get DENSITY -
ridiculous but I'll let the astros come up with real examples).

>    Note that this URL does more than retrieving resources: it can
>    retrieve directly the results by specifying e.g. PHOT_STR* in one

But this is not a *standard* Registry function.

>    PROVIDING that a rule is defined to specify what to do in case of
>    multiple columns matching a UCD. The default rule could be

Therein lies the rub!

>    use GENERIC queries without having first to ask for the details of
>    the catalogues (at the registry level or from the resource server).

But the Registry can determine which of the catalogues being searched have
duplicate UCDs which are not easily resolvable and can then ask the user to
pick the appropriate columns - even in a set of '000s of catalogues, this is
likely to only be a small number (I hope!).

>    Notice also that the rule of the "first non-null" value is not
>    that easy to write as an SQL statement...

It doesn't need to be easy. The dataset will receive a VOQL query and will
translate UCDs in that to column names in an SQL query - it is in this
translation that resolution of the correct column name will happen, not in
the SQL query.

> 3. UCDs as Pointers into Data Model
>    I'm not sure I understand the problem -- why could a DM point not
>    refer to a set of UCDs?

One could have UCDs of CURRENCY, FLOAT and INTEGER all pointing to a DM
entity of 'number'; this is not a problem but does indicate that perhaps the
data model needs to be resolved to a lower level if UCDs exist for which
there are no unique entities.

Cheers,
Tony.

> -----Original Message-----
> From: Francois Ochsenbein [mailto:francois at vizir.u-strasbg.fr]
> Sent: 20 May 2003 00:25
> To: ucd at ivoa.net
> Subject: Re: patricio UCD/Tony
>
> ...

From roy at cacr.caltech.edu  Mon May 19 10:16:10 2003
From: roy at cacr.caltech.edu (Roy Williams)
Date: Mon, 19 May 2003 10:16:10 -0700
Subject: MyUCDs & Registry
References: <3EC901E8.40307@eso.org>
Message-ID: <007701c31e2a$57c5b3e0$6b91d783@cacr.caltech.edu>

Patrick and Marco

> are you sure the Registry will include all this information?
> Perhaps "column names" and "units" will be resolved by
> the service provider, and not by the registry itself.

My understanding is that every registry entry is based on the Resource and
Service Metadata (RSM) document (*). Everything in the registry has the
basic stuff like title and publisher and description, also spectral and sky
coverage. Then there are extension pieces, and so far we only have a good
idea of extending to services.
The metadata for these includes the URL, the mime type of what comes back,
and other stuff that defines standard VO services (eg SIA, Cone etc).

What is missing from the RSM is an extension that deals specifically with
tables. It would include much of the VOTable header information I assume,
including the UCDs.

I think that now would be a good time to sketch an XML schema for catalog
metadata. We already have curation and coverage from the Resource base, but
then how is the extension document different from a VOTable header?

Roy

(*) http://www.ivoa.net/internal/IVOA/IvoaResReg/ResourceServiceMetadataV7.doc

--------
Caltech Center for Advanced Computing Research
roy at cacr.caltech.edu
626 395 3670

>     Any registry will contain metadata about the services listed, ie,
>     the catalogues.
>     The registry would know (among other things),
>       - catalogueName (possibly catalogueUniqueID)
>       - catalogueTitle
>       - catalogueKeywords
>       - catalogueAuthor
>       - number of columns
>       - number of records
>       - column names, UCDs, units
>          - name    ID,MAIN          ---
>          - RA      POS_EQ_RA,MAIN   h:m:s
>          - Dec     POS_EQ_DEC,MAIN  d:m:s
>          ....
>          - Vmag    PHOT_JHN_V       mag
>          - Bmag    PHOT_JHN_B       mag
>          - z       REDSHIFT         ---
>
are you sure the Registry will include all this information?
Perhaps "column names" and "units" will be resolved by the service
provider, and not by the registry itself.

Cheers,
  Marco

From rplante at poplar.ncsa.uiuc.edu  Mon May 19 10:24:28 2003
From: rplante at poplar.ncsa.uiuc.edu (Ray Plante)
Date: Mon, 19 May 2003 12:24:28 -0500 (CDT)
Subject: MyUCDs & Registry
In-Reply-To: <3EC901E8.40307@eso.org>
Message-ID:

Hi Guys,

The current registry model allows for catalogs to be registered
individually, and the descriptions stored in the registry could contain
all the information that Patricio listed. The main question is how
fine-grained you want the registry to be and, thus, whether catalogs are
within your limit. I think we heard at the interop meeting last week
sufficient interest in registering catalogs.
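
As a rough illustration of the kind of catalogue description being
discussed, the sketch below models a registry entry holding the fields from
Patricio's listing (column names, UCDs, units) and uses it to pre-select the
catalogues that carry a requested set of UCDs before any query is sent out.
The names Column, CatalogueEntry and catalogues_with_ucds, and the two
sample entries, are invented for this example; nothing here is taken from
the RSM document or from the VOResource schema.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase
from typing import List, Optional


@dataclass
class Column:
    name: str                   # local column name, e.g. "Vmag"
    ucd: str                    # UCD attached to the column
    unit: Optional[str] = None  # None where the listing shows "---"


@dataclass
class CatalogueEntry:
    # Hypothetical registry record; field names are assumptions.
    catalogue_name: str
    title: str
    columns: List[Column] = field(default_factory=list)

    def has_ucd(self, pattern: str) -> bool:
        # A trailing "*" acts as a wildcard, as in PHOT_STR*.
        return any(fnmatchcase(c.ucd, pattern) for c in self.columns)


def catalogues_with_ucds(registry, patterns):
    """Return the entries that have a column for every requested UCD pattern."""
    return [e for e in registry if all(e.has_ucd(p) for p in patterns)]


# Two made-up entries; the columns follow Patricio's listing.
registry = [
    CatalogueEntry("cat_A", "Hypothetical galaxy catalogue", [
        Column("name", "ID,MAIN"),
        Column("RA", "POS_EQ_RA,MAIN", "h:m:s"),
        Column("Dec", "POS_EQ_DEC,MAIN", "d:m:s"),
        Column("Vmag", "PHOT_JHN_V", "mag"),
        Column("z", "REDSHIFT"),
    ]),
    CatalogueEntry("cat_B", "Hypothetical photometric catalogue", [
        Column("name", "ID,MAIN"),
        Column("Bmag", "PHOT_JHN_B", "mag"),
    ]),
]

selected = catalogues_with_ucds(registry, ["POS_EQ_RA,MAIN", "REDSHIFT"])
print([e.catalogue_name for e in selected])
```

Only the entries returned by the filter would need to receive the actual
query; whether such a record lives in a local or a higher-level registry is
exactly the question left open in this thread.
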

So then the next step would be to model the catalog description, using the
generic Resource metadata as a starting point. We would ask ourselves what
additional information we want in the description. Column UCDs and names
would be an example of that type of information.

I'll note that in our prototype schema for describing SIA services, we
included a description of the columns returned by an image query. This was
done using the VOTable FIELD elements; thus, it included both the local
names and the UCDs (along with the IDs, type, and anything else you can
stick in a FIELD element and its children). This could be done for a catalog
description as well (regardless of whether it is actually available in
VOTable format).

hope this helps,
Ray

On Mon, 19 May 2003, Marco C. Leoni wrote:

>
> Patricio F. Ortiz wrote:
>
> >On Mon, 19 May 2003, Marco C. Leoni wrote:
> >
> >>Hi Patricio,
> >>   one question about the point below:
> >>
> >>>    Let me include here another element before the submission of a
> >>>    query to the resource handling catalogues: The registry.
> >>>
> >>>    Any registry will contain metadata about the services listed, ie,
> >>>    the catalogues.
> >>>    The registry would know (among other things),
> >>>      - catalogueName (possibly catalogueUniqueID)
> >>>      - catalogueTitle
> >>>      - catalogueKeywords
> >>>      - catalogueAuthor
> >>>      - number of columns
> >>>      - number of records
> >>>      - column names, UCDs, units
> >>>         - name    ID,MAIN          ---
> >>>         - RA      POS_EQ_RA,MAIN   h:m:s
> >>>         - Dec     POS_EQ_DEC,MAIN  d:m:s
> >>>         ....
> >>>         - Vmag    PHOT_JHN_V       mag
> >>>         - Bmag    PHOT_JHN_B       mag
> >>>         - z       REDSHIFT         ---
> >>>
> >>are you sure the Registry will include all this information?
> >>Perhaps "column names" and "units" will be resolved by the service
> >>provider, and not by the registry itself.
> >>
> >>Cheers,
> >>  Marco
> >>
> >
> >Hi Marco,
> >
> >well, I would have thought so. It's done already by vizier meta-information
> >tables (which could be considered a local registry) and surely by other
> >services. It's a good point though, how much is a registry supposed to
> >know? In the scheme that Keith Noddle presented in Cambridge, I would
> >expect at least the local register (read service provider's) should know
> >about this. Whether the higher level registers decide to collect this
> >information is to be seen, but if you ask *me*, I would say yes, they
> >should keep this information. We are not talking about a huge data volume
> >here; the advantages are large though, as you don't have to go everywhere
> >with your query.
> >
> >I just had to deal with getting a mortgage and ran from bank to bank
> >filling in applications not knowing if I satisfied all the conditions they
> >asked; then a consultant phoned us and we dealt with him, he found out
> >who would lend us the money and it worked fine. The analogy seems quite
> >appropriate for our situation in VO. An astronomer wants to formulate a
> >general query, he or she has no idea where this could be done, so s/he
> >asks a service to launch the query urbi et orbi... Surely the data will
> >come back to the astronomer, but if we apply a filtering system which
> >can select the services which may give a positive answer (and weed out
> >the ones with a negative), then the number of transactions diminishes
> >and the response time is shorter.
> >
> >An intermediate solution would be to send the query to broker-services urbi
> >et orbi, and let those services do the filtering and send the query to
> >places which may give a positive answer.
> >
> >IMHO not using this meta-information would be a real waste.
> >
> >Cheers,
> >
> >Patricio
> >
> >---
> >Patricio F. Ortiz                 pfo at star.le.ac.uk
> >AstroGrid project
> >Department of Physics and Astronomy
> >University of Leicester           Tel: +44 (0)116 252 2015
> >LE1 7RH, UK
> >
>
> Patricio,
>
> I agree that a filtering system would be nice, and in fact a
> registry is supposed to do that (in my opinion): if necessary it will
> send an "urbi et orbi" query and give back to the astronomer only the
> relevant results, where this means compare them with the original
> requirements (e.g. exclude all the "no info about it" answers). The same
> before sending the query, if the registry has enough information about
> services (this is not the case if we think of registries containing only
> references to other registries, but this will simply add one more level
> to the structure).
>
> What I meant before is that probably we need only UCDs in the registry
> without any unique column names: when the query reaches the service then
> it will be translated and UCDs mapped to column names.
> - On the other side, if the user already knows the unique column name
> then using it will remove any control about UCDs.
> - Last possibility, giving both unique column name and UCD will create a
> well-defined detailed query, with specific result even in the "urbi et
> orbi" case.
>
>
> Cheers,
>    Marco
>

From greene at stsci.edu  Mon May 19 10:43:23 2003
From: greene at stsci.edu (Gretchen Greene)
Date: Mon, 19 May 2003 13:43:23 -0400
Subject: MyUCDs & Registry
In-Reply-To:
Message-ID: <000a01c31e2e$25517490$2bfaa782@stsci.edu>

If the registry is going to supply 'Subject', i.e. keywords for
targeting specific astronomical topics, then I think it would equally
serve to provide the UCDs and their associated tags (units, type, etc.)
for refining queries.

Not to sound harsh, my only question is why do we have to consider
generating more schema files to do this? UCD's are fundamental
descriptors in my mind and could be included in the base terms as
optional elements.
I guess the discretion comes from service providers
with column descriptions not mapped into UCD's?

-Gretchen

From rplante at poplar.ncsa.uiuc.edu  Mon May 19 11:42:58 2003
From: rplante at poplar.ncsa.uiuc.edu (Ray Plante)
Date: Mon, 19 May 2003 13:42:58 -0500 (CDT)
Subject: MyUCDs & Registry
In-Reply-To: <000a01c31e2e$25517490$2bfaa782@stsci.edu>
Message-ID:

Hi Gretchen,

On Mon, 19 May 2003, Gretchen Greene wrote:
> If the registry is going to supply 'Subject' , i.e. keywords for
> targeting specific astronomical topics, then I think it would equally
> serve to provide the UCDs and their associated tags (units, type, etc.)
> for refining queries.
>
> Not to sound harsh, my only question is why do we have to consider
> generating more schema files to do this? UCD's are fundamental
> descriptors in my mind and could be included in the base terms as
> optional elements. I guess the discretion comes from service providers
> with column descriptions not mapped into UCD's?

There is precedent for integrating it into the generic resource
metadata: the current VOResource contains Coverage. However, it was
pointed out last week that it is unclear, for example, what Coverage means
when it applies to an organization. It was suggested (and I plan to look
at this) that Coverage only be associated with descriptions of Data
Collections and Services.

I mention this because it illustrates a basic modeling issue. When UCDs
are associated with a description of an SIA service, it can be made clear
what role the UCDs play in the service (i.e. these are the UCDs
associated with the columns returned from an image query). Similarly, the
connection is clear when made part of a Catalog description. However, if
they are associated with a generic resource, their role is ambiguous. That
is, what does it mean to search for Organizations based on UCDs?

> my only question is why do we have to consider
> generating more schema files to do this?

Your question suggests that dealing with multiple schema files has
additional overhead costs associated with it (as opposed to just having one
schema). Do you see this as a problem?

We put the metadata that is not purely generic Resource metadata into a
separate schema because it makes for a friendlier extension mechanism. If
we add new metadata that, say, specifically describes Catalogs into the
VOResource schema, it affects all users of this schema regardless of
whether they care about Catalogs. (They may have to change their software
to cope with the new version.) However, if we put the metadata into its
own schema that extends VOResource, it only affects those that want to
describe Catalogs.

cheers,
Ray

From greene at stsci.edu  Mon May 19 12:21:40 2003
From: greene at stsci.edu (Gretchen Greene)
Date: Mon, 19 May 2003 15:21:40 -0400
Subject: MyUCDs & Registry
In-Reply-To:
Message-ID: <000c01c31e3b$dff0a660$2bfaa782@stsci.edu>

Thanks for the insights Ray, I realize I missed out on some discussions
that occurred at the IVOA. Still, in building this first registry with
the existing schemas, I am finding that there are subtleties to the
implementation, and my concern is that chasing down multiple schema
versions and metadata definitions prohibits efficient prototyping. For
example, the existing repositories (OAI, previous Cone registry, etc.) all
have varying elements and content. Now which schema files do I pursue to
bring about uniformity? This is only the very beginning.

The other point is that there are not huge numbers of entries/resources
yet in these registries (remember I'm used to the GSC2 billions) and so
it seems a little odd to create a highly designed configuration before
ever implementing one even though the design ideas are good ones.

So...I cheer on the designing but encourage people to start working with
the standard schemas/files and see how they interface before going too
far.

-Gretchen

From dmink at cfa.harvard.edu  Mon May 19 12:40:44 2003
From: dmink at cfa.harvard.edu (Doug Mink)
Date: Mon, 19 May 2003 15:40:44 -0400
Subject: MyUCDs & Registry
References: <000c01c31e3b$dff0a660$2bfaa782@stsci.edu>
Message-ID: <3EC9333C.CFC31104@cfa.harvard.edu>

Gretchen Greene wrote:
> The other point is that there are not huge numbers of entries/resources
> yet in these registries (remember I'm used to the GSC2 billions) and so
> it seems a little odd to create a highly designed configuration before
> ever implementing one even though the design ideas are good ones.

At the Center for Astrophysics, we tried gathering descriptions of as many
types of archived data as we could find hanging around here and tried to
get knowledgeable people to describe them in a uniform way which we
developed in parallel to the larger VO effort. We thought that working
from the archives up could be complementary to the grand design work that
many others were doing. We do not have a VO standard interface to the
information we collected, but I think that the data could be translated
into the protocols which are being discussed. We would be interested in
what other people who have data they want to publish to the VO think about
the parameters we have been able to collect.
Check it out at http://tdc-www.harvard.edu/vo

-Doug Mink

From schaaff at newb6.u-strasbg.fr  Tue May 27 03:00:00 2003
From: schaaff at newb6.u-strasbg.fr (Andre Schaaff)
Date: Tue, 27 May 2003 12:00:00 +0200
Subject: New Web Service at CDS
Message-ID: <3ED33720.8424CD86@astro.u-strasbg.fr>

Hello,

A UCD resolver is now available as a Web Service at CDS.
See http://cdsweb.u-strasbg.fr/cdsws.gml for more details and an example
of use.

Regards,

André