From dtody at nrao.edu Sun Oct 2 15:58:22 2011
From: dtody at nrao.edu (Douglas Tody)
Date: Sun, 2 Oct 2011 16:58:22 -0600 (MDT)
Subject: Support for data containing NaN values
In-Reply-To:
References:
Message-ID:

A couple of points:

  o The only place to consider using NaN is in data arrays; it is not
    supported (currently at least) for any VO metadata or for RDBMS
    mappings (common for tables containing metadata), as FO notes.

  o It is generally unwise to use NaN or Inf or -Inf etc. in data
    arrays; this will likely cause problems with software. This is not
    really a VO issue; VO doesn't care so long as the data format (FITS
    or whatever) supports it. However it will cause problems with
    general client software unless the software intended to consume the
    data is very constrained. A better approach, for both reliability
    and efficiency, is to carry along a separate mask to explicitly
    note the status of individual pixels or data values (this could
    even be a valid place to use STC!). Then general s/w muddles along
    ok, but more sophisticated software can support the nuances.

So my suggestion would be to use a predefined value a la FITS, which is
hopefully ignored by most naive processing software (but mapped to
whatever by more advanced software), and a mask if one really wants to
do it right. Of course, we do not yet have fully defined standards for
the use of masks, e.g., for pixel/data value quality, uncertainties,
etc.

 - Doug

On Mon, 26 Sep 2011, Randy Thompson wrote:

> Thanks for the replies. The question initially came up in testing a
> VAO tool which will convert various input file formats to "compliant"
> VOTable or FITS files. If NaNs are included in the input files the
> values are passed unchanged. It sounds like this does not violate
> any VO standards, and other VO applications should be expected to
> handle them in data contained in either FITS or VOTable format.
>
> We are also archiving FITS files from a project using -inf values
> to flag bad data points and wondered if any further processing would
> be required before these files would be considered "VO compliant".
>
> Randy
>
>
> On 9/26/11 8:33 AM, "Tom McGlynn" wrote:
>
>> I'm not sure that Mark's comments really address Randy's
>> question. Suppose we have an original dataset, O, in some non-VO
>> format and a VO serialization of this dataset, V. Both O and V may
>> contain NaNs. As Mark points out, a NaN in V is the recommended
>> representation for a null value. So in any context in which null
>> values are distinct from NaNs, VOTables cannot distinguish them, i.e.,
>> a NaN in the VOTable does not in general mean that there was a NaN in
>> the original data. So if you wish to preserve NaNs, VOTables are not
>> currently a safe way to do so.
>>
>> As alluded to, this particular issue has been discussed before and
>> some thoughts have been collected at
>> http://www.ivoa.net/cgi-bin/twiki/bin/view/IVOA/VOTableIssues.
>>
>> Tom
>>
>> Mark Taylor wrote:
>>> Mike,
>>>
>>> in fact VOTable does permit NaN values in (single or double) floating
>>> point data; I think this is a consequence of the design decision
>>> that the VOTable model for data is to be as close as possible to
>>> that of FITS. Furthermore, NaN is how VOTable recommends to represent
>>> NULL values in floating point data (again, following FITS) - whether
>>> that's a good idea or not is a question that has been debated elsewhere,
>>> but that's what VOTable section 6 says
>>>
>>> (http://www.ivoa.net/Documents/VOTable/20091130/REC-VOTable-1.2.html#ToC41)
>>>
>>> Mark
>>>
>>> On Mon, 26 Sep 2011, Mike Fitzpatrick wrote:
>>>
>>>> I took the question differently: If VO allows FITS data, and
>>>> FITS data allows NaN, then apps should of course allow
>>>> for this. OTOH, if the question is whether "VO data" as is
>>>> serialized in a VOTable allows NaN values then I think the
>>>> answer is 'no' (but I'd have to check). There are similar issues
>>>> with how NULL values are handled, but again it depends on
>>>> whether it is in the serialized VOTable or the end data product
>>>> being accessed. The DAL protocols don't say anything about
>>>> NaN/NULL beyond how they might be serialized; FITS is FITS
>>>> and if that's what the app retrieves in the end then that is
>>>> the standard to follow when interpreting the data. Was that
>>>> your question?
>>>>
>>>> My $0.02,
>>>> -Mike
>>>>
>>>>
>>>>
>>>> On Mon, Sep 26, 2011 at 2:06 AM, Mark
>>>> Taylor wrote:
>>>>
>>>>> On Thu, 22 Sep 2011, Randy Thompson wrote:
>>>>>
>>>>>> As a general question, does data containing NaN values
>>>>>> violate any VO standards or protocols, and if not, should VO
>>>>>> applications be expected to accept them as input?
>>>>>
>>>>> the question is rather broad ("data" can take many forms), but on
>>>>> the whole the answer is that most standards and software in the VO
>>>>> should and do behave sensibly in the presence of NaN-valued floating
>>>>> point values.
>>>>>
>>>>> --
>>>>> Mark Taylor Astronomical Programmer Physics, Bristol University,
>>>>> UK
>>>>> m.b.taylor at bris.ac.uk +44-117-928-8776
>>>>> http://www.star.bris.ac.uk/~mbt/
>>>>>
>>>>
>>>
>>> --
>>> Mark Taylor Astronomical Programmer Physics, Bristol University, UK
>>> m.b.taylor at bris.ac.uk +44-117-928-8776 http://www.star.bris.ac.uk/~mbt/
>>
>
>

From patrick.dowler at nrc-cnrc.gc.ca Thu Oct 6 12:57:05 2011
From: patrick.dowler at nrc-cnrc.gc.ca (Patrick Dowler)
Date: Thu, 6 Oct 2011 12:57:05 -0700
Subject: DAL interop sessions
Message-ID: <201110061257.06087.patrick.dowler@nrc-cnrc.gc.ca>

The rough schedule for the four DAL sessions is now available from the
main programme web page.
The direct link for DAL is:

http://www.ivoa.net/cgi-bin/twiki/bin/view/IVOA/InterOpOct2011DAL

Aside from TAPRegExt (which is nearing completion), the sessions will be
dominated by discussion rather than formal presentations. Anyone who
wants to add a discussion topic to the agenda (and I use that term
loosely :-) should add it to the appropriate linked wiki page. We will
have that page displayed during the session and it will be updated
shortly afterwards with the results.

For people that are *not attending* the meeting but care about the
topic: please add to the discussion page in advance of the session with
just enough detail to get things going.

--
Patrick Dowler
Tel/Tél: (250) 363-0044
Canadian Astronomy Data Centre
National Research Council Canada
5071 West Saanich Road
Victoria, BC V9E 2M7

Centre canadien de donnees astronomiques
Conseil national de recherches Canada
5071, chemin West Saanich
Victoria (C.-B.) V9E 2M7

From rthomp at stsci.edu Fri Oct 7 06:20:37 2011
From: rthomp at stsci.edu (Randy Thompson)
Date: Fri, 7 Oct 2011 13:20:37 +0000
Subject: Support for data containing NaN values
In-Reply-To:
Message-ID:

Hi Doug,

Thanks for the reply. I agree the special IEEE data values are best
avoided and we can recommend projects do this, but if it doesn't
violate any standard we can only make recommendations.

Randy

From francois.bonnarel at astro.unistra.fr Fri Oct 14 01:46:39 2011
From: francois.bonnarel at astro.unistra.fr (François Bonnarel)
Date: Fri, 14 Oct 2011 10:46:39 +0200
Subject: towards a DataLink IVOA Note
Message-ID: <4E97F6EF.7050806@astro.unistra.fr>

Dear all,

Next week in Pune a DAL working group session will be focused on
DataLink. I present here first ideas for an IVOA note on this topic...
Sorry for not providing this sooner to people willing to co-write the
note... But I think after discussing these ideas next week we can move
rapidly together to something more collective...

Cheers
François

Introduction
---------------

Discussion in the IVOA has shown that "DataLink" is a useful concept
within the scope of the Generic Dataset (GDS) architecture. See "Service
architecture and standard profile"
(http://www.ivoa.net/internal/IVOA/SiaInterface/DAL2_Architecture.pdf)
for a first description of the GDS concept, and a presentation by Doug
Tody at the Strasbourg May 2009 Interop meeting
(http://www.ivoa.net/internal/IVOA/200905DALSessions/siapv2-may09.pdf)
for the first mention of the DataLink concept.

Within the scope of DAL protocols, the generic dataset protocol concept
illustrates the need for a type of service valid for any kind of data,
as a counterpart of typed interfaces such as SSA and SIA, which can
describe only a single category of data but can do it in finer detail,
with a data model specific to the data. A second important difference is
that the generic dataset can describe only datasets or files as they are
stored in some archive, whereas the typed interfaces can describe, and
provide access to, both static archival datasets and virtual data. In
addition, they actually DRIVE the generation of the latter. Finally,
since the generic dataset can describe any type of data, it can also
describe all sorts of complex data.

This approach provides a lot of flexibility for both describing and
accessing data. A complex observation consisting of several related data
products can be described via the generic dataset query mechanism. For
example we might have a survey field consisting of a spectral data cube,
some 2-D projections of the cube (integrated flux, maps for a given
wavelength, velocity/position maps, etc.), a source catalog for the
field computed from the 2-D continuum, and possibly some extracted
spectra of objects in the field.

It is the work of client applications to deal with the datasets.
Simple clients will discover and help to retrieve entire datasets, while
more sophisticated ones could make use of the metadata to go further in
the analysis of the data. This phase could require interpretation and
use of new data links attached to the dataset under consideration.

Within the frame of the generic dataset, "DataLinks" relate a data
discovery table record to some other "object". A record can have a
number of such links. The table representation of links is the most
convenient one in the TAP era. Links contain at least a dataset ID used
as a key, a link type and a URL. DataLinks link a record in the dataset
table to files of some types, a standard or custom service to access the
data, an HTML page, etc.

Discovery with ObsTAP
------------------------------

ObsTAP actually plays the role of a data discovery service within the
frame of generic dataset protocols. Building on the work done on data
models (ref) and TAP (ref), it recently became possible to define a
standard service protocol to expose standard metadata describing
available datasets: ObsTAP (ref). In general, any data model can be
mapped to a relational database and exposed directly with the TAP
protocol. The goal of ObsTAP is to provide such a capability based upon
an essential subset of the general observational data model.
Specifically, this effort defined a database table to describe
astronomical datasets (data products) stored in archives that can be
queried directly with the TAP protocol. This is very useful for global
data discovery, as any type of data can be described in a
straightforward and uniform fashion. The described datasets can be
directly downloaded, or linked to IVOA Data Access Layer (DAL) protocols
such as those for accessing images (SIA) or spectra (SSA), or to any
other kind of service. These links can potentially be used to perform
more advanced data access operations on the referenced datasets.
Actually this is what is behind the "DataLink" concept, which we will
describe now.

Linking the Discovery results with "something else"
--------------------------------------------------------------------

Suppose we have interrogated an ObsTAP service. Each row in the result
page describes a dataset. The description contains a field named
"reference". It could be there for direct retrieval or to provide a
richer type of "links".

What kind of "links" could be provided to the user for such a dataset?
One can imagine various types such as:

 - direct retrieval of the full dataset
 - access to a part of the dataset when the internal structure is known
 - access to a service with an ObsId-like parameter forced to a fixed
   value
 - access to related files: previews, visualisations, calibration
   files, etc.

Of course each dataset described in the ObsTAP service query response
may have several links like this, and the nature of each link has to be
described somehow. The reference itself (generally a web access) has to
be described by its URL, its format and its size, as in SSA or ObsTAP
query responses, but in addition the structure of the file also has to
be given.

Describing the link in practice
---------------------------------------

Concretely, how will the links be described? A small package of
attributes (data model package) can be defined for this. We will define:

 - the ObsId attribute
 - an attribute giving the meaning (or semantics) of the link
   (calibration file, SIA DESCRIPTION, or catalogue part in a complex
   dataset such as an archive)
 - an attribute describing the IVOA type of the link: simple retrieval,
   other ObsTAP, SIA or SSA service (with either query or DataAccess
   method), UWS service, etc.
 - an Access package detailing the structure of the link (see the
   Characterisation 2 reference for a more complete description of this
   Access package). It contains:
    * a URL or URI
    * a MIME type, e.g. image/fits
    * an estimated size for the response
    * a subtype: table, votable, mef, archive
    * a set of internal attributes:
       . path
       . array
       . row
       . field
       . extnum
       . extname

The "internal" attributes are really important for locating the link
target inside a complex dataset:

 - The "path" attribute describes the file path and name inside an
   "archive" dataset.
 - The "array" attribute defines a cutout in an n-dimensional array
   image using the cfitsio syntax: [50:100,70:200] is the subimage
   extracted from pixel 50 to 100 in x and 70 to 200 in y.
 - The "extnum" or "extname" attribute designates the extension number
   or name in a multi-extension FITS file. "extname" can also be used to
   designate a RESOURCE or TABLE name in a complex VOTABLE document.
 - "field" and "row" have their obvious meanings in a FITS or VOTABLE
   table.

All these attributes are optional, and by using this ordering:

  [path][extnum|extname][field][row][array]

it should be possible to locate any kind of significant structure in
archives or datasets that use the most common astronomy file standards.

Building a Data Link service
-------------------------------------

A DataLink service is a simple DAL service providing DataLinks for a set
of ObsIds. The result is presented as a VOTABLE with one field per
attribute (the attributes described above). The input parameter is an
ObsId. A set of ObsIds stored in a file can also be given as input.

Let's give examples of queries and query responses. A service query
could be of the following form:

http://aaa.bbbb.fr/dal-services/datalink?obsid="ivoa://xxx.yyy.edu/123345"

where ivoa://xxx.yyy.edu/123345 is an IVOA identifier of a specific
observation which could have been provided by an ObsTAP query.
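
As a rough illustration of the internal attributes and the query form
described above, here is a minimal Python sketch. The class and function
names (InternalLocator, datalink_query_url) are hypothetical, and so is
the choice of serialising the locator as concatenated bracketed parts;
the note only fixes the ordering of the attributes. The sketch also
percent-encodes the identifier instead of wrapping it in double quotes,
which is an assumption about how a real service would want to be called.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlencode


@dataclass
class InternalLocator:
    """Optional "internal" attributes locating a structure inside a dataset.

    Hypothetical container for illustration; not part of any IVOA standard.
    """
    path: Optional[str] = None     # file path and name inside an archive
    extnum: Optional[int] = None   # FITS extension number
    extname: Optional[str] = None  # FITS extension, RESOURCE or TABLE name
    field: Optional[str] = None    # table field
    row: Optional[int] = None      # table row
    array: Optional[str] = None    # cutout section, e.g. "50:100,70:200"

    def to_string(self) -> str:
        """Join the attributes that are set, in the order
        [path][extnum|extname][field][row][array]."""
        if self.extnum is not None and self.extname is not None:
            raise ValueError("give extnum or extname, not both")
        ext = self.extnum if self.extnum is not None else self.extname
        parts = [self.path, ext, self.field, self.row, self.array]
        return "".join(f"[{p}]" for p in parts if p is not None)


def datalink_query_url(base_url: str, obs_id: str) -> str:
    """Build a GET URL for a DataLink query, percent-encoding the ObsId."""
    return f"{base_url}?{urlencode({'obsid': obs_id})}"


if __name__ == "__main__":
    # A FITS image stored in the "image" directory of a tar archive.
    print(InternalLocator(path="image/cccc.fits").to_string())
    # A cutout in the second extension of a multi-extension FITS file.
    print(InternalLocator(extnum=2, array="50:100,70:200").to_string())
    print(datalink_query_url("http://aaa.bbbb.fr/dal-services/datalink",
                             "ivoa://xxx.yyy.edu/123345"))
```

Keeping the locator separate from the URL means the same access URL can
be reused for several links that differ only in which part of the
dataset they point to.
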

The query response could be something like this:

In this response example, where the main dataset is a tar archive, the
first record links to the full retrieval of the whole archive, and the
second record links to a FITS image cccc.fits in the directory "image"
of the tar file. The last record links to the query method of a SIAP
service which will answer with descriptions of the images sharing the
"ivoa://xxx.yyy.edu/123345" ObsId.

Extensions of ObsTAP for dataLinking
----------------------------------------------------

In any case a future version of ObsTAP could benefit from defining an
additional FIELD with utype "dataLink" which would point to a DataLink
service (a PARAM could also be sufficient), using the ObsId value of
each record as the main parameter for the query.

In addition, an ObsTAP service is a TAP service and may have several
query languages. The mandatory ADQL interface can usefully be
complemented by PQL, for example. When an ObsTAP service is used through
a PQL interface, the standard doesn't require a single-table response.
This allows DAL extensions (additional tables) to be added to the main
standard ObsTAP table; see the SSA recommendation for a definition of
the DAL extension mechanism. Adding a specific DataLink response to the
main ObsTAP table then becomes possible. The ObsId FIELD, which is
common to the two tables, relates records in the main ObsTAP table to
records in the concatenated DataLink response, and can be used as a
reference key. This approach spares clients from having to extract the
URL of the DataLink service from the main table and start a new query on
another service.

From norman at astro.gla.ac.uk Fri Oct 14 06:00:45 2011
From: norman at astro.gla.ac.uk (Norman Gray)
Date: Fri, 14 Oct 2011 14:00:45 +0100
Subject: towards a DataLink IVOA Note
In-Reply-To: <4E97F6EF.7050806@astro.unistra.fr>
References: <4E97F6EF.7050806@astro.unistra.fr>
Message-ID: <35AF19FC-DEA3-4E34-A293-D40DB0434909@astro.gla.ac.uk>

François and all, hello.

On 2011 Oct 14, at 09:46, François Bonnarel wrote:

> Next week in Pune a DAL working group session will be focused on DataLink.
> I present here first ideas for an IVOA note on this topic... Sorry for not providing this sooner to people willing to co-write the note...
> But I think after discussing these ideas next week we can move rapidly together to something more collective...

This is very interesting. Looking at the links you posted, it looks as
if some part of what you're describing -- the inter-dataset linkage --
is supported by the prototype AstroDAbis service which Bob Mann, Dave
Morris and I have been working on (there are a few details at the
funder-mandated project blog: ). We hope this will turn into a long-term
supported service, and would welcome comments.

Another large fraction of the desired capability would seem to be
supplied by data DOIs, such as those being developed by the DataCite
consortium of libraries . They're shaking down the specification, and
promoting this as a general world-wide framework for referring to and
linking to datasets in a long-term stable way.

Best wishes,

Norman

--
Norman Gray : http://nxg.me.uk
SUPA School of Physics and Astronomy, University of Glasgow, UK

From genova at newb6.u-strasbg.fr Sat Oct 15 00:02:14 2011
From: genova at newb6.u-strasbg.fr (Francoise Genova)
Date: Sat, 15 Oct 2011 09:02:14 +0200 (MEST)
Subject: towards a DataLink IVOA Note
Message-ID: <201110150702.p9F72Ee13290@cluster.u-strasbg.fr>

Dear Francois & Norman,

It is planned to discuss data citation in the DataCP
session
http://www.ivoa.net/cgi-bin/twiki/bin/view/IVOA/InterOpOct2011DCP

I am really interested to discuss the DOI capabilities to
fulfil astronomy needs (e.g. sustainability - which requires
a strong and sustainable organisation among other qualities,
scalability, etc)

Francoise

From dburke at cfa.harvard.edu Sat Oct 15 04:55:20 2011
From: dburke at cfa.harvard.edu (Douglas Burke)
Date: Sat, 15 Oct 2011 07:55:20 -0400
Subject: towards a DataLink IVOA Note
In-Reply-To: <35AF19FC-DEA3-4E34-A293-D40DB0434909@astro.gla.ac.uk>
References: <4E97F6EF.7050806@astro.unistra.fr> <35AF19FC-DEA3-4E34-A293-D40DB0434909@astro.gla.ac.uk>
Message-ID: <4E9974A8.2020803@cfa.harvard.edu>

On Fri Oct 14 09:00:45 2011, Norman Gray wrote:
>
> François and all, hello.
>
> On 2011 Oct 14, at 09:46, François Bonnarel wrote:
>
>> Next week in Pune a DAL working group session will be focused on DataLink.
>> I present here first ideas for an IVOA note on this topic... Sorry for not providing this sooner to people willing to co-write the note...
>> But I think after discussing these ideas next week we can move rapidly together to something more collective...
>
> This is very interesting. Looking at the links you posted, it looks as if some part of what you're describing -- the inter-dataset linkage -- is supported by the prototype AstroDAbis service which Bob Mann, Dave Morris and I have been working on (there are a few details at the funder-mandated project blog:).
> We hope this will turn into a long-term supported service, and would welcome comments.
>
> Another large fraction of the desired capability would seem to be supplied by data DOIs, such as those being developed by the DataCite consortium of libraries. They're shaking down the specification, and promoting this as a general world-wide framework for referring to and linking to datasets in a long-term stable way.
>
> Best wishes,
>
> Norman
>
>

Just a quick note to say that the ADS AstroExplorer team is interested
in this area, as we want to link literature and data. Alberto will be at
the Pune meeting.

Doug

-------------------------------------------------------------------
Doug Burke              | http://hea-www.harvard.edu/~dburke/
Harvard-Smithsonian     | Email: dburke at cfa.harvard.edu
Center for Astrophysics | Phone: (617) 496 7853
60 Garden Street MS-2   | Fax: (617) 495 7356
Cambridge, MA 02138     | Office: B-440
-------------------------------------------------------------------

From norman at astro.gla.ac.uk Sat Oct 15 08:15:54 2011
From: norman at astro.gla.ac.uk (Norman Gray)
Date: Sat, 15 Oct 2011 16:15:54 +0100
Subject: towards a DataLink IVOA Note
In-Reply-To: <201110150702.p9F72Ee13290@cluster.u-strasbg.fr>
References: <201110150702.p9F72Ee13290@cluster.u-strasbg.fr>
Message-ID:

Françoise, hello.

On 15 Oct 2011, at 08:02, Francoise Genova wrote:

> It is planned to discuss data citation in the DataCP
> session
> http://www.ivoa.net/cgi-bin/twiki/bin/view/IVOA/InterOpOct2011DCP
>
> I am really interested to discuss the DOI capabilities to
> fulfil astronomy needs (e.g. sustainability - which requires
> a strong and sustainable organisation among other qualities,
> scalability, etc)

Alberto has scheduled me for a quick talk, presuming that Skype holds
up.

For what it's worth, I think the issues are the following:

  1. What should astronomy use for literature citation?

  2. What should astronomy use for long-term (online) document citation?

  3. What should astronomy use for data citation?

1. Seems sorted with arXiv IDs and DOIs.

2. PURLs or ARKs I think solve this. DOIs are also a possibility, but I
have a strong preference for things with 'http://' at the front, and
http://dx.doi.org doesn't _really_ do it. I wouldn't, however, go to the
stake for it.

3. It really sounds as if DataCite can handle this. They certainly seem
serious about making sure they can, and they seem to be the sort of
people (national libraries and the like) who can pull off something
international, fully scalable and fully sustainable. It seems that the
only real thing to sort out there is what the funding model should be,
deciding between a couple of possibilities (the issue is how to have a
funding model that's fair to people who want to mint 10 DOI/yr and
10^9 DOI/yr).

The DataCite members are at . Should the IVOA become an associate member
of this?

I'm pretty sure that we -- meaning the IVOA -- don't have to invent
anything in this space. Indeed I'd go further: it is necessary that we
do not invent anything here.

Enjoy the Indian evening.

All the best,

Norman

--
Norman Gray : http://nxg.me.uk
SUPA School of Physics and Astronomy, University of Glasgow, UK

From genova at newb6.u-strasbg.fr Sat Oct 15 10:06:06 2011
From: genova at newb6.u-strasbg.fr (Francoise Genova)
Date: Sat, 15 Oct 2011 19:06:06 +0200 (MEST)
Subject: towards a DataLink IVOA Note
Message-ID: <201110151706.p9FH66f16264@cluster.u-strasbg.fr>

(for the readers who are on the datacp list and not on the
dal list: the starting point for this discussion is
on the dal list)

Hi Norman,

I tend to think that having sustainable solutions
for data citation is a critical requirement
(ie, the funding model and its sustainability is
not a detail).
I remember very well at the beginning of the web, when CDS/ADS/NED and the journals decided to use the bibcode/refcode to network bibliographic services: at that time it was not at all evident to see who was going to win the battle which was finally won by the DOI for which concerns bibliographic reference citations. I'll post the talk I have prepared for the DataCP meeting. It contains two links to recent meetings/actions which tend to show that the debate on solutions for which concerns citations of data (or even on a compilation of best practices in that domain) is not over: http://sites.nationalacademies.org/PGA/brdi/PGA_063656 and http://www.codata.org/taskgroups/TGdatacitation/index.html I would be interested to know if there is detailed information somewhere about the scalability of the DataCite solution. People I know from other disciplines who push for it want to declare 'campaign' data sets, and not each observation of a large observatory which is operated for many years. Cheers Francoise From aaccomazzi at cfa.harvard.edu Sat Oct 15 11:03:38 2011 From: aaccomazzi at cfa.harvard.edu (Alberto Accomazzi) Date: Sat, 15 Oct 2011 14:03:38 -0400 Subject: towards a DataLink IVOA Note In-Reply-To: <201110151706.p9FH66f16264@cluster.u-strasbg.fr> References: <201110151706.p9FH66f16264@cluster.u-strasbg.fr> Message-ID: <4E99CAFA.2070508@cfa.harvard.edu> Hi Francoise, Thank you for bringing the discussion to DC&P. Just a couple of additional thoughts on this: 1. All the projects which make use of DataCite that I know of are quite happy with a "macro" approach, which involves assigning a single DOI to all the data products published in a paper/study. I will show an examples of this in my talk. This does not mean that we can't go any finer, but it's just something to be aware of when the question of scaling comes up. 2. 
The issue of what should be a citeable nugget and how it should be expressed in a document is yet a separate problem, as is the technical issue of how the resource should be de-referenced and what safeguards exist behind the infrastructure that supports this de-referencing. Unfortunately it's hard to talk about one issue without bringing into the discussion the other(s), which often only makes taking decisions more difficult. Francoise, is 10 minutes enough for your DC&P presentation? Originally I thought you would just say something about the WDS, but maybe there's more you want to say. Please let me know so I can arrange the schedule accordingly. Thanks, -- Alberto Francoise Genova wrote, On 10/15/11 1:06 PM: > (for the readers who are on the datacp list and not on the > dal list: the strating point for this discussion is > on the dal list) > > Hi Norman, > > I tend to think that having sustainable solutions > for data citation is a critical requirement > (ie, the funding model and its sustainability is > not a detail). I remember very well at the > beginning of the web, when CDS/ADS/NED and > the journals decided to use the bibcode/refcode > to network bibliographic services: > at that time it was not at all evident to see > who was going to win the battle which was finally > won by the DOI for which concerns bibliographic > reference citations. > > I'll post the talk I have prepared > for the DataCP meeting. > It contains two links to recent meetings/actions > which tend to show that the debate on solutions for > which concerns citations of data (or even on a > compilation of best practices in that domain) is not over: > > http://sites.nationalacademies.org/PGA/brdi/PGA_063656 > > and > > http://www.codata.org/taskgroups/TGdatacitation/index.html > > I would be interested to know if there is detailed information > somewhere about the scalability of the DataCite solution. 
> People I know from other disciplines who push for it
> want to declare 'campaign' data sets, and not each observation
> of a large observatory which is operated for many years.
>
> Cheers
>
> Francoise

--
Dr. Alberto Accomazzi aaccomazzi(at)cfa harvard edu
Program Manager
NASA Astrophysics Data System ads.harvard.edu
Harvard-Smithsonian Center for Astrophysics www.cfa.harvard.edu
60 Garden St, MS 83, Cambridge, MA 02138, USA

From norman at astro.gla.ac.uk Mon Oct 17 03:42:30 2011
From: norman at astro.gla.ac.uk (Norman Gray)
Date: Mon, 17 Oct 2011 11:42:30 +0100
Subject: towards a DataLink IVOA Note
In-Reply-To: <201110151706.p9FH66f16264@cluster.u-strasbg.fr>
References: <201110151706.p9FH66f16264@cluster.u-strasbg.fr>
Message-ID: <7DD51514-243E-41E1-B088-0718EF806A35@astro.gla.ac.uk>

Françoise, hello.

On 2011 Oct 15, at 18:06, Francoise Genova wrote:

> I tend to think that having sustainable solutions
> for data citation is a critical requirement
> (i.e., the funding model and its sustainability is
> not a detail). I remember very well at the
> beginning of the web, when CDS/ADS/NED and
> the journals decided to use the bibcode/refcode
> to network bibliographic services:
> at that time it was not at all evident
> who was going to win the battle, which was finally
> won by the DOI as far as bibliographic
> reference citations are concerned.

Ah, I didn't mean to suggest that the funding model was just a detail, just that the people involved seem comfortable that the remaining problems are not technical ones (of course the non-technical problems can be the hardest ones).

I was in touch this morning with some of the UK DataCite people, and they said:

> Scalability of the level you suggest [I'd said: what about 10^8 DOI/yr?] has not been tested but we see no reason why our service cannot provide for this. Essentially we're happy to do this but suggest that some planning and staging is required.
They're talking about a flat subscription, around GBP 1000 + GBP 500/year.

----

But I say all this with some diffidence. I don't really have standing here -- I'm not an archive, and I'm more likely to be a consumer of the IVOA's citation solution than a provider for it. However, as a result of projects over the last couple of years I've ended up talking to library people, and people involved in whole-academy digital preservation (as of course will you and others in the DC&P group), and I have the uncomfortable perception that the rest of the academy seems to be pressing ahead with plausible solutions while astronomy -- which would be a natural leader, given its experience -- seems disconnected from this activity. Of course, this is at base due to a lack of FTEs, and perhaps due to a lack of obvious community urgency. So this is me being Community, calling out from the floor!

> I'll post the talk I have prepared
> for the DataCP meeting.
> It contains two links to recent meetings/actions
> which tend to show that the debate on solutions for
> the citation of data (or even on a
> compilation of best practices in that domain) is not over:

This is very reassuring, from my point of view.

> I would be interested to know if there is detailed information
> somewhere about the scalability of the DataCite solution.
> People I know from other disciplines who push for it
> want to declare 'campaign' data sets, and not each observation
> of a large observatory which is operated for many years.

I hope the comments above are relevant to this. My impression is that the DataCite people haven't dedicated resources to experimenting with large-scale performance, because there hadn't been a community banging on their door demanding it.

It sounds as if we're on the edge of a substantial step forwards. But we seem to have been on this threshold for quite a long time, now.
Best wishes,

Norman

--
Norman Gray : http://nxg.me.uk
SUPA School of Physics and Astronomy, University of Glasgow, UK

From mireille.louys at unistra.fr Tue Oct 18 03:34:38 2011
From: mireille.louys at unistra.fr (Mireille Louys)
Date: Tue, 18 Oct 2011 12:34:38 +0200
Subject: [PQL]how to ask for ordered data in the response?
Message-ID: <20111018123438.utidona6hcc80cos@webmail.u-strasbg.fr>

Hi all,

While quickly browsing the PQL docs mentioned during the PQL session at Pune, I wondered how to define a keyword in the PQL parameter list that tells the server to return ordered data in the query result. Is this already addressed somewhere?

Thanks,
Mireille

--
Mireille Louys, assistant professor at UDS: ENSPS, Laboratoire ICube et CDS
Observatoire de Strasbourg
mail to: mireille.louys at unistra.fr
Tel: +33 3 68 85 24 34
Address 1: CDS/Observatoire de Strasbourg
11, rue de l'Université
67000 STRASBOURG

From m.b.taylor at bristol.ac.uk Fri Oct 28 01:30:49 2011
From: m.b.taylor at bristol.ac.uk (Mark Taylor)
Date: Fri, 28 Oct 2011 09:30:49 +0100 (BST)
Subject: TAPRegExt upload URIs
Message-ID:

Hi DAL,

to anyone out there running TAP services which support upload: the most recent draft of the TAPRegExt standard requires upload methods to be declared using elements with IDs like this: (note the new "std/" in the URI).

The most recent versions of TOPCAT (v3.9) and also taplint (STILTS v2.4), released today, look for these URIs. That means that if your service uses one of the older forms instead, newer TOPCAT versions will fail to recognise that it supports uploads. TOPCAT looks for these elements in the document emitted by the /capabilities endpoint; the content of any registry record is not examined.

So, to enable TOPCAT users to perform uploads to your TAP service, you are encouraged to update your capabilities declaration accordingly.
(TOPCAT version 3.8 looked for "ivo://ivoa.org/tap/uploadmethods#inline", so if you want to maintain compatibility with older versions, you could declare an upload method with that ID too. Declaration of such additional "custom" upload methods is permitted within TAPRegExt).

Mark

--
Mark Taylor Astronomical Programmer Physics, Bristol University, UK
m.b.taylor at bris.ac.uk +44-117-928-8776 http://www.star.bris.ac.uk/~mbt/
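
For service operators who want to see which upload method IDs their capabilities document actually declares, a minimal Python sketch follows. It is only an illustration under stated assumptions: the element name `uploadMethod` and the attribute name `ivo-id` are assumed here rather than taken from the message above, the sample document is invented and heavily trimmed, and the "new style" ID in it is a placeholder, not the real TAPRegExt URI. The only thing the check relies on is the "std/" path segment mentioned in the message.

```python
import xml.etree.ElementTree as ET

# Old-style ID quoted in the message above.
OLD_INLINE_ID = "ivo://ivoa.org/tap/uploadmethods#inline"
# Placeholder only: this is NOT the real TAPRegExt URI.
NEW_STYLE_PLACEHOLDER = "ivo://example.invalid/std/SomeStandard#upload-inline"

# Hypothetical, heavily trimmed capabilities document (assumed layout).
SAMPLE_CAPABILITIES = f"""<capabilities>
  <capability>
    <uploadMethod ivo-id="{NEW_STYLE_PLACEHOLDER}"/>
    <uploadMethod ivo-id="{OLD_INLINE_ID}"/>
  </capability>
</capabilities>"""


def local_name(tag):
    """Strip any '{namespace}' prefix that ElementTree puts on a tag."""
    return tag.rsplit("}", 1)[-1]


def upload_method_ids(capabilities_xml):
    """Return the ivo-id of every uploadMethod element, in document order."""
    root = ET.fromstring(capabilities_xml)
    return [
        elem.get("ivo-id")
        for elem in root.iter()
        if local_name(elem.tag) == "uploadMethod" and elem.get("ivo-id")
    ]


def has_std_segment(ivo_id):
    """True if the URI contains the 'std/' path segment."""
    return "/std/" in ivo_id


def summarise(capabilities_xml):
    """Split the declared upload method IDs into new-style and other forms."""
    ids = upload_method_ids(capabilities_xml)
    return {
        "new_style": [i for i in ids if has_std_segment(i)],
        "other": [i for i in ids if not has_std_segment(i)],
    }


if __name__ == "__main__":
    print(summarise(SAMPLE_CAPABILITIES))
```

In practice the XML text would come from the service's own /capabilities endpoint rather than from a string literal. A service that lists IDs only under "other" would, per the message above, not be recognised by newer TOPCAT versions as supporting uploads.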
ivoa://xxx.yyy.edu/123345 full dataset retrieval http://xxx.yyy.de/archive.tar 3.4Gb archive/tar none none none none none none none
ivoa://xxx.yyy.edu/123345 image retrieval http://xxx.yyy.de/archive.tar 1Gb image/fits none image/cccc.fits none none none none none
ivoa://xxx.yyy.edu/123345 image metadata sia http://xxx.yyy.de/sinea?query&obsid="ivoa://xxx.yyy.edu/123345" 1Kb application/xml+votable none none none none none none none