Format of tokens
Frederic V. Hessman
Hessman at Astro.physik.Uni-Goettingen.DE
Tue Nov 13 09:08:37 PST 2007
> My concern is that there is a discrepancy between Rick’s SKOS model
> generated by his script and the original files. My feeling is that
> the SKOS model representing the IAU Thesaurus that is to be
> published by the IVOA should be an accurate model. If we cannot
> produce an accurate SKOS model but claim that it is, then people
> will not trust the IVOAT or any of the semantics works involving
> vocabularies and ontologies.
The problem was simply that I had forgotten to delete the entries
which turned into aliases. The real raw statistics are:
Number of initial entries: 2950
Number of explicit narrower entries (with BTs): 1226
Number of explicit broader entries (with NTs): 512
Number of entries with references (with RTs): 2134
Final number of SKOS Concepts: 2551
Number of TopConcepts: 1325
Thus, you can't assume that the BTs and NTs are all present in the
original (trex.txt). Alasdair's figure of 512 top concepts assumed
that the IAU thesaurus was reasonably complete and self-consistent.
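
The figures above appear consistent with counting as a top concept every
concept that has no explicit BT: 2551 - 1226 = 1325. A minimal sketch of
that count follows. It assumes a hypothetical in-memory layout (one dict
per concept with a "BT" list) and placeholder labels; it is not the actual
conversion script, and the real trex.txt format may differ.

```python
def top_concepts(concepts):
    """Return the labels of concepts that list no explicit broader term (BT)."""
    return [label for label, entry in concepts.items() if not entry.get("BT")]


# Placeholder entries (hypothetical, not taken from the thesaurus).
concepts = {
    "A": {"BT": []},
    "B": {"BT": ["A"]},
    "C": {"BT": ["B"]},
    "D": {},  # no BT field at all: also counted as a top concept
}

tops = top_concepts(concepts)
print(tops)

# The same rule applied to the reported totals.
final_concepts = 2551
with_explicit_bt = 1226
print(final_concepts - with_explicit_bt)
```

Under this rule, any concept whose BT was simply left out of the original
file ends up as a top concept, which would explain a count much larger
than 512.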
> What should be the base URI for the thesauri? Can we
> formalise this work within the semantics group and give the
> thesauri a home within the IVOA domain?
I assume this will be needed only after it's possible to have access
to the IVOA domain by working members of the semantics group. I'll
be happier than most when my own address disappears, but I figured
that a real "incorrect" address is better than an imaginary or
cumbersome "correct" address.
Why don't we simply apply for an obvious root URI like
> · Looking at the namespace imports, rdfs, owl and iau93 are
> not used within the document.
Not yet, yes. Easy to get rid of for now.
> · The declared top level concepts should accurately match
> those of the original IAU Thesaurus. (At the moment Rick’s script
> does not generate anything close to the proper model here.)
Well, better than you thought and better now that I've found the
> · The relationships within each concept need to point to
> other concepts. (Although Rick has sorted this out, the version on
> the web is still wrong.)
> · The 398 terms which declare Use relationships should only
> appear as skos:altLabel. For example “ab variable stars” should not
> appear as a concept but as an alternative label for “Bailey Types”
> and “RR Lyrae Stars”.
This problem is solved (it was the bug).
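
For illustration, the alias handling described in the quoted point could
look roughly like the sketch below. The dict layout and the function name
fold_aliases are assumptions made for this example, not the script's real
data structures; only the "ab variable stars" example comes from the
message above.

```python
def fold_aliases(entries):
    """Drop entries that carry USE references and record them as
    alternative labels on the concepts they point to."""
    # Only entries without a USE reference survive as concepts.
    concepts = {
        label: {"altLabel": []}
        for label, entry in entries.items()
        if not entry.get("USE")
    }
    for label, entry in entries.items():
        for target in entry.get("USE", []):
            if target in concepts:  # ignore dangling references
                concepts[target]["altLabel"].append(label)
    return concepts


entries = {
    "AB VARIABLE STARS": {"USE": ["BAILEY TYPES", "RR LYRAE STARS"]},
    "BAILEY TYPES": {},
    "RR LYRAE STARS": {},
}

concepts = fold_aliases(entries)
print(sorted(concepts))
print(concepts["BAILEY TYPES"]["altLabel"])
```

Forgetting the first step (filtering out the USE entries) leaves the
aliases in the output as concepts of their own, which is the kind of bug
described above.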
> · Agreement on the format of labels. At the moment Rick has
> left them as they appear in thesaurus files but I feel that it
> would be more user friendly to use lower case with the first word
Frankly, the original document uses (practically) all capitals and we
want to convert the original thesaurus using as few changes as
necessary (the only point of doing it), so why not keep the original
labels? If people hate to be shouted at and think that the IAU93
isn't very user-friendly, all the better. Any other format will have
problems: e.g. you don't really want to turn "BAADE WESSELINK METHOD"
into "Baade wesselink method" - you want people to use the IVOAT and
see "Baade-Wesselink method".
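
The label problem is easy to reproduce. The sketch below shows what a
purely mechanical case conversion does to the example above, and one
possible workaround: a hand-maintained table of exceptions. The table
OVERRIDES and the function display_label are hypothetical names for this
example only.

```python
# Hand-curated exceptions (hypothetical table, one entry shown).
OVERRIDES = {
    "BAADE WESSELINK METHOD": "Baade-Wesselink method",
}


def display_label(original):
    """Return a curated label if one exists, else a naive conversion."""
    # str.capitalize() upper-cases the first character and lower-cases the
    # rest, so proper names and hyphens are lost.
    return OVERRIDES.get(original, original.capitalize())


naive = "BAADE WESSELINK METHOD".capitalize()
print(naive)
print(display_label("BAADE WESSELINK METHOD"))
```

Any such table has to be built by hand, which is an argument for leaving
the converted IAU labels untouched and doing the curation in the IVOAT.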
> · Agreement on the format of identifiers. The options that
> have been considered are:
> 1. Generating a new unique identifier, e.g. some number
> 2. Using camel back notation based on the preferred label, so
> “Bailey Types” would have the identifier “BaileyTypes”
> 3. Using a lower case only version of the preferred label, so
> “Bailey Types” would have the identifier “baileytypes”
> Please see the appropriate thread in the semantics list for a full
> discussion of this issue.
... from which you'll see that there are few people who really care.
I still haven't seen any recent complaints about compromise notation
#2 but previous stronger complaints about #1 and #3. Barring
complaints, can we simply adopt #2? There is no perfect solution
(e.g. "Ba II stars" -> "BaIiStars", which looks like something else).
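
As a rough sketch of options 2 and 3, assuming the identifier is derived
from the preferred label by splitting on whitespace only (the function
names are made up for this example, and the real script may treat
punctuation or digits differently):

```python
def camel_id(label):
    """Option 2: capitalise each word and join without spaces."""
    return "".join(word.capitalize() for word in label.split())


def lower_id(label):
    """Option 3: lower-case everything and drop the spaces."""
    return "".join(label.lower().split())


print(camel_id("Bailey Types"))
print(lower_id("Bailey Types"))
print(camel_id("Ba II stars"))
```

The last call shows the weakness mentioned above: capitalising each word
mechanically turns "II" into "Ii".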
> Once we have agreement on these issues, then the results can be
> applied to the IVOAT.
... and the rest of the thesauri we're going to generate in this
> Cheers (I think I’m going to go for a long drink to recover from
Now you all know how many beers you all owe me.