Format of tokens (was Re: Fwd: Re: IVOA Thesaurus)
mjg at cacr.caltech.edu
Thu Nov 1 14:50:14 PDT 2007
I think that Norman's ideas actually make quite good sense.
Douglas Burke wrote:
> Brian Thomas wrote:
>> On Thursday 01 November 2007 1:06:55 pm Frederic V. Hessman wrote:
>>> At the time, there were lots of voices saying that, while you are
>>> perfectly correct (and I'd prefer to have them as humanly readable
>>> as possible), the realities of computer-based parsing mean that a
>>> trivial token format costs less pain.
>>> How about an official show of hands?
>> Could we have the arguments against human readable again first,
>> before voting?
> Norman wrote the following in an email on Oct 10 - Versions and
> namespaces (was: Vocab AND Ontology?) - where >> indicates a quote
> from Rick.
> >> I personally find the revamped token list to be much more
> >> palatable (which is obviously why I did it), being nearly human-usable
> >> (I don't like to be shouted at by capitalized tokens) and with
> >> implicit additional info (e.g. formal names of people and objects).
> Doug brought up the issue of how to generate the concept names, as URI
> fragments. This is a stylistic point, but I think an important one.
> I'd like to suggest a rather drastic canonicalisation, so that "He+
> ionization zone" would turn into #heionizationzone. This is a
> pragmatic middle ground between having the concept name mirror the
> label, and having it fully opaque (such as #concept12345).
> Having it consist of only lowercase alpha means (a) we're guaranteed
> to avoid any parsing troubles, with RDF parsers or with anything else;
> (b) it's clear to anyone looking at this that they're not supposed to
> be displaying the concept name, but using the concept's 'Label' and
> declared relationships instead; while (c) it retains some mnemonic value.
> There is a case which can be made for having fully opaque concept
> names (this is what's done in the Gene Ontology, for example): it's
> point (b) above, plus it removes any temptation to argue about
> relationships based on the name alone. Despite that, I think there's
> value in making it at least partly human-recognisable.
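
For concreteness, the canonicalisation Norman describes might be sketched as below. This is only an illustration, not anything proposed in the thread: the function name concept_fragment is hypothetical, and it assumes "only lowercase alpha" means ASCII a-z, so digits, punctuation, whitespace and accented letters are all dropped.

```python
import re


def concept_fragment(label: str) -> str:
    """Turn a human-readable label into a lowercase-alpha URI fragment.

    Hypothetical helper: lowercases the label, then removes every
    character that is not an ASCII letter a-z.
    """
    return "#" + re.sub(r"[^a-z]", "", label.lower())


print(concept_fragment("He+ ionization zone"))  # prints "#heionizationzone"
```

One consequence of so drastic a rule is that distinct labels can collapse to the same fragment (labels differing only in a digit or a "+" sign, for instance), so a real vocabulary build would presumably need a uniqueness check on the generated names.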