Clarify semantics aspects #71
This introduces a potentially document-breaking change, namely the requirement that datalink/core concept URIs must be relative (i.e., not include the URI). I think everyone has always done it like this, and making this guaranteed makes it a bit simpler to correctly deal with the semantics column (actually, both implementations I know that use values from the semantics columns already make that assumption). This depends on the ivoatex update that comes with PR ivoa-std#70 for citations to resolve.
On 25/10/2021 at 11:28, msdemlei wrote:
> Clarification of the meaning and use of semantics and content_qualifier.
> This introduces a potentially document-breaking change, namely the
> requirement that datalink/core concept URIs must be relative (i.e., not
> include the URI). I think everyone has always done it like this, and
> making this guaranteed makes it a bit simpler to correctly deal with the
> semantics column (actually, both implementations I know that use values
> from the semantics columns already make that assumption).
Well. During the last DAL running meeting we apparently reached a
consensus that content_qualifier would be required to contain full URIs.
But you were not attending, Markus.
That's why the initial text of the first PR, #51, with content_qualifier
was rewritten as in the recently merged master.
Your new text goes in the other direction.
See:
https://wiki.ivoa.net/internal/IVOA/IvoaDAL_RunningMeetings/IVOA_DAL_RM12.txt
> … This depends on the ivoatex update that comes with PR #70 for
> citations to resolve.
> This, I claim, would solve Issue #67.
Commit Summary
* Updating Example 4.5 ("custom access data service") <586d87e>
* Updating ivoatex. <2e4f430>
* Clarification of the meaning and use of semantics and content_qualifier. <5ce399f>
File Changes
* M .gitignore (2)
  <https://github.com/ivoa-std/DataLink/pull/71/files#diff-bc37d034bad564583790a46f19d807abfe519c5671395fd494d8cce506c42947>
* M DataLink.tex (285)
  <https://github.com/ivoa-std/DataLink/pull/71/files#diff-7afbca7274a5a8b32496d79cc4cc63315fe13869b4e334b784218a2379ff4f63>
* M ivoatex (2)
  <https://github.com/ivoa-std/DataLink/pull/71/files#diff-1da03da606aed8b7ca688ea31f0524476ac7b511cdad1b075aa5b9dadbe4d0f2>
|
On Mon, Oct 25, 2021 at 02:59:00AM -0700, Bonnarel wrote:
> On 25/10/2021 at 11:28, msdemlei wrote:
> > it a bit simpler to correctly deal with the semantics column (actually,
> > both implementations I know that use values from the semantics columns
> > already make that assumption).
>
> Well. During the last DAL running meeting we apparently reached a
> consensus that content_qualifier would be required to contain full URIs.
> But you were not attending, Markus.
> That's why the initial text of the first PR, #51, with content_qualifier
> was rewritten as in the recently merged master.
> Your new text goes in the other direction.
> See:
> https://wiki.ivoa.net/internal/IVOA/IvoaDAL_RunningMeetings/IVOA_DAL_RM12.txt
Hm. I wonder what Pat's concerns about "undesirable usage" were.
I have no *strong* opinion either way, but if I had been at that
telecon, I'd have said:
(a) Well, it *would* be nicer if content_qualifier worked the same
way as semantics; it's certainly a bit odd that vocabularies are used
in two different ways in the same standard.
(b) Having a standard vocabulary increases the chances that people
will actually do the right thing and take terms from it rather than
just dump in random URIs that no client at all will understand (in
which case it's not machine readable, which kind of defeats the
purpose).
(c) Nobody wants to have long URIs when short words would do most of
the time, which, I think, for content_qualifier is a reasonable
expectation (though I admit I'm not sure what use cases for the long
URIs are there).
(d) Comparing URIs (whose schemes and perhaps authority parts are
supposed to be case-insensitive, with path parts and fragment
identifiers quite certainly case-sensitive) is a huge pain. Let's
spare normal clients that pain.
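To illustrate point (d), here is a minimal Python sketch of what a careful client would have to do to compare two full concept URIs. The function names are hypothetical and the normalisation is deliberately simplified (it ignores ports, percent-encoding and userinfo); it is not taken from any existing implementation.

```python
from urllib.parse import urlsplit


def concept_key(uri: str) -> tuple:
    """Build a comparison key for a full concept URI.

    Scheme and authority are compared case-insensitively, while path,
    query, and fragment are kept verbatim (case-sensitive).
    """
    parts = urlsplit(uri)
    return (
        parts.scheme.lower(),
        parts.netloc.lower(),
        parts.path,
        parts.query,
        parts.fragment,
    )


def same_concept(uri_a: str, uri_b: str) -> bool:
    """Return True if both URIs identify the same concept under the rules above."""
    return concept_key(uri_a) == concept_key(uri_b)


# Scheme and host differ only in case: treated as the same concept.
print(same_concept("HTTP://WWW.IVOA.NET/rdf/product-type#spectrum",
                   "http://www.ivoa.net/rdf/product-type#spectrum"))
# Fragment differs in case: treated as different concepts.
print(same_concept("http://www.ivoa.net/rdf/product-type#Spectrum",
                   "http://www.ivoa.net/rdf/product-type#spectrum"))
```

With short terms from a single known vocabulary, the same check collapses to a plain string comparison on values like `#spectrum`.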
If the other authors say they've weighed those points and found them
outweighed by whatever concern brought up the full URL thing, I'd
still like the changes to semantics in; and the content_qualifier
text could then probably be something like
  Where applicable, concepts from the vocabulary
  http://www.ivoa.net/rdf/product-type should be chosen. In
  contrast to the semantics column, content_qualifier must always
  contain full concept URIs, regardless of whether URIs point into
  product-type or somewhere else.

  As in the semantics case, non-IVOA concept URIs may be used.
  Again, they should resolve to human-readable definitions of the
  meaning and intended usage of the concept.
  As an example, a light curve service might link to a spectrum of
  the object by using #counterpart in the semantics column and
  http://www.ivoa.net/rdf/product-type#spectrum in content_qualifier.

Is that preferable to the proponents of full URIs here? Given it's a
bit odd to have two different recipes, I think it would be great if
someone could donate a rationale for the difference (I can't write
that because I don't see a good reason).
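A minimal sketch of how a client might treat the two columns under the alternative text above, using the light curve example. The datalink/core vocabulary URI is an assumption, formed by analogy with the product-type URI quoted in the text, and the function names are hypothetical.

```python
DATALINK_CORE = "http://www.ivoa.net/rdf/datalink/core"  # assumed vocabulary URI


def expand_semantics(value: str) -> str:
    """Turn a semantics value into a full concept URI.

    Values starting with '#' are relative and resolved against
    datalink/core; anything else is taken to be a full URI already.
    """
    if value.startswith("#"):
        return DATALINK_CORE + value
    return value


def split_content_qualifier(value: str) -> tuple:
    """Split a full concept URI into (vocabulary URI, term).

    Raises ValueError for relative values, which the alternative
    text does not allow in content_qualifier.
    """
    vocabulary, sep, term = value.partition("#")
    if not vocabulary or not sep or not term:
        raise ValueError(f"not a full concept URI: {value!r}")
    return vocabulary, term


# The light curve service linking to a spectrum of the same object.
row = {
    "semantics": "#counterpart",
    "content_qualifier": "http://www.ivoa.net/rdf/product-type#spectrum",
}
print(expand_semantics(row["semantics"]))
print(split_content_qualifier(row["content_qualifier"]))
```

The two functions make the asymmetry visible: one column needs a default vocabulary to be resolved, the other needs to be taken apart before a client can tell which vocabulary it points into.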
|
IIRC, the "undesirable usage" was that if you can use bare product-type terms like "spectrum" and we allow terms from other vocabs, people might use bare terms from them as well, in which case it's just a column where you can put any one word.

I think content_qualifier is a little different than semantics: my understanding of using fully qualified URIs in semantics was that it was for a custom term (extension) but still in the same vocabulary (best example is our #thumbnail child of #preview -- extension of datalink/core rather than different vocab entirely). I don't recall off-hand how an rdf doc says it is an extension of another, but if that's possible I would expect any custom FQ term in semantics to be in a vocab that extends datalink/core. That's not true of content-qualifier.

Anyway, looking at the current text now, it isn't as clear/explicit as I thought and the above from Markus looks fine to me, but I wonder if I am still reading something different into it. I think that content_qualifier could contain URIs from UAT or SimDM or whatever, not just standard product-type and custom extensions, because not all links are "to data products".
|
On Mon, Oct 25, 2021 at 09:08:04AM -0700, Patrick Dowler wrote:
> IIRC, the "undesirable usage" was that if you can use bare
> product-type terms like "spectrum" and we allow terms from other
> vocabs, people might use bare terms from them as well, in which
> case it's just a column where you can put any one word.
While I'm sure people will put all kinds of junk into the field as
long as clients don't do anything sensible with it, I think the hash
"marker" has worked quite well as an indicator that you're not
supposed to put any old junk into semantics.
> I think content_qualifier is a little different than semantics: my
> understanding of using fully qualified URIs in semantics was that
> it was for a custom term (extension) but still in the same
> vocabulary (best example is our #thumbnail child of #preview --
> extension of datalink/core rather than different vocab entirely). I
No. RDF as such doesn't have much of a notion of a "vocabulary"; it
just gives rules for interpreting triples of URIs, and is rather
relaxed about how to group these URIs.
By giving rules for what RDF resource URIs in the VO should look
like, we in the VO have our specific idea of what "a" vocabulary is;
it's basically all the concepts in one of our RDF/desise files. If
you write some URI not starting with "the" vocabulary URI, the
corresponding concept is not "in" that vocabulary.
But really, that distinction has practical relevance only
insofar as clients can be expected to do smart things with terms in the
vocabulary (because they can easily retrieve label, description, and
relationships for them), while for now they can't do that for anything
else, whether or not these concepts are supposed to be related to
concepts in the "core" vocabulary.
We *could* in Vocabularies 2.1 give rules for how people could host
their own IVOA-compliant vocabularies and how clients should deal
with them. But I didn't do that in 2.0 on purpose: It'll be hard
enough to make clients pick up our Semantics tech without the
vagaries of having to pull stuff from all over the web and having to
deal with... loosely... curated semantic resources.
> don't recall off-hand how an rdf doc says it is an extension of
> another, but if that's possible I would expect any custom FQ term
> in semantics to be in a vocab that extends datalink/core. That's
> not true of content-qualifier
Again, no, there is no formal or informal requirement that some
custom concept you put into semantics has any relationship to
something in datalink/core, and indeed there is no defined way to
declare such relationships.
> Anyway, looking at the current text now, it isn't as clear/explicit
> as I thought and **the above from Markus looks fine to me**, but I
> wonder if I am still reading something different into it. I think
> that content_qualifier could contain URIs from UAT or SimDM or
> whatever, not just standard product-type and custom extensions,
> because not all links are "to data products".
It certainly would help if we had a clear scenario for that,
ideally of the form: "A datalink service operator wants to declare X
on Y so that a client does Z. They therefore put the URI of concept
X' from Vocabulary V into content_qualifier." Does such a thing
exist somewhere? Does anyone perhaps even do that already?
I'd gladly amend the PR with text in that direction (also for my own
sake, because so far I find all of that so cloudy that I wonder if I
can consider the implementation requirement as satisfied for
content_qualifier in its current shape) -- and it might even provide
enough of a rationale for handling content_qualifier differently from
semantics in case we really want to go back to the
no-default-vocabulary text.
|
+1. I definitely prefer this version to the one in PR #71 and to the initial one I wrote.
We don't want to "close the future" by giving a special rule in favor of data-product vocabulary.
|
On Tue, Oct 26, 2021 at 05:29:48AM -0700, Bonnarel wrote:
> > Again, they should resolve to human-readable definitions of the
> > meaning and intended usage of the concept. As an example, a light
> > curve service might link to a spectrum of the object by using
> > #counterpart in the semantics column and
> > http://www.ivoa.net/rdf/product-type#spectrum in
> > content_qualifier.
>
> +1. I definitely prefer this version to the one in PR #71 and
> to the initial one I wrote.
>
> > Is that preferable to the proponents of full URIs here? Given
> > it's a bit odd to have two different recipes, I think it would be
> > great if someone could donate a rationale for the difference (I
> > can't write that because I don't see a good reason).
>
> We don't want to "close the future" by giving a special rule in
> favor of data-product vocabulary.
Well -- we don't in either case, so that doesn't help the decision.
In both cases, people can use arbitrary concept URIs.
The question at hand is: "Do we want to have two different ways of
dealing with vocabularies in one standard because there is an
overriding reason?" And my request was to try and figure out what
the overriding reason back in the DAL running meeting was, because
I'd prefer to explain these reasons if we do have them.
> Imagine in the case of "semantics=documentation" we want to specify
> if it's simple free description, refereed paper, or conference
> proceedings paper. content_qualifier would be the right place to
> specify that I think. We may imagine having a standard vocabulary
> for "documents and papers" in the future.
Sure. But whether or not we define a standard vocabulary for the one
clear use case now, people doing this later would be writing
http://www.ivoa.net/rdf/documentation-type#refereed-paper (say).
There's simply no difference to them.
The difference is for people who have "data products" -- for them,
it's writing #spectrum vs.
http://www.ivoa.net/rdf/product-type#spectrum. And it's perhaps also
for implementors who try to do something with content_qualifier and
who, with just #spectrum, have a slightly simpler time (e.g., no
headache as to whether or not a part of the string needs to be
compared case-insensitively).
Which doesn't make a *big* difference, but I'd not want to make
people write the noticeably more unwieldy full URIs and deal with the
difference to semantics just because of some misunderstanding.
|
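hmmm. Since RDF has no notion of a vocabulary and therefore an extension, if I use `http://www.opencadc.org/rdf/foo#bag` in `content_qualifier` there is no implied sense that this is a custom product-type or a custom astronomical object type or anything else.

Substitute `http://ivoa.net/rdf/vospace#container` for `bag` and it would be a real use case; also content_type `text/xml` would not convey enough information. Also, we could drop the RFE for VOTable to allow content param in the mimetype and just put `#datalink` into content_qualifier for recursive datalink.

The other aspect where short `#term` and full `http://ivoa.net/rdf/{vocab}#term` comes into play for me is the VEP process. I had been (in semantics) using FQ uris for prototype terms, but VEP requires that the term be demonstrated in use. That's manageable for me because the terms are in s/w, not (eg) in the database directly. But I wonder: if using a new term is as simple as `create VEP && start using term` (and be prepared to change use, of course) then that removes one use of FQ uris. How bad would it be if we said that any term in any ivoa vocab could be used in short form? That seems like it would cover > 98% of use cases. And I could see making a service to resolve `#term` to `http://ivoa.net/rdf/{v}#term` (which in principle would have to allow for multiple returns in some cases).

If this doesn't sound crazy, why not allow it? s/w will still only do things automatically if it recognises the
|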
On Wed, Oct 27, 2021 at 11:17:05AM -0700, Patrick Dowler wrote:
> hmmm. Since RDF has no notion of a vocabulary and therefore an
> extension, if I use `http://www.opencadc.org/rdf/foo#bag` in
> `content_qualifier` there is no implied sense that this is a custom
> product-type or a custom astronomical object type or anything else.
Not by RDF itself, and not by current VocInVO. But that is, really,
the reason why I suspect we're doing our client writers a favour if
we say "get vocabulary X and try to interpret the terms that way,
while being graceful when there's a full URI and hence the thing is
not in X".
Only with that vocabulary can clients do all the magic of inserting
labels and exploiting hierarchy at least for the well-known terms.
We *can*, if we really need it, expand this to "vocabulary X and Y"
(for very few vocabularies, because in consequence these must be
checked for identifier clashes). Or we can say "also get vocabulary
Y, but be aware that concepts from that will always come as full
URIs" (which I'd recommend).
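The "get vocabulary X, but be graceful about full URIs" recipe could look roughly like this on the client side. This is a sketch only: the `PRODUCT_TYPE_TERMS` excerpt, its labels, and the `wider` field are made up for illustration and are not the actual content of product-type; a real client would load the vocabulary from its published form.

```python
PRODUCT_TYPE = "http://www.ivoa.net/rdf/product-type"

# Hypothetical excerpt of the vocabulary, keyed by term identifier.
PRODUCT_TYPE_TERMS = {
    "spectrum": {"label": "Spectrum", "wider": []},
    "timeseries": {"label": "Time series", "wider": []},
}


def describe(value: str) -> dict:
    """Return display information for a content_qualifier value.

    Short '#term' values and full URIs into product-type are looked up
    in the local copy of the vocabulary; everything else is passed
    through unchanged, without label or hierarchy.
    """
    if value.startswith("#"):
        term = value[1:]
    elif value.startswith(PRODUCT_TYPE + "#"):
        term = value[len(PRODUCT_TYPE) + 1:]
    else:
        # A concept from somewhere else: no network access, no magic.
        return {"uri": value, "label": value, "known": False}

    entry = PRODUCT_TYPE_TERMS.get(term)
    uri = f"{PRODUCT_TYPE}#{term}"
    if entry is None:
        return {"uri": uri, "label": term, "known": False}
    return {"uri": uri, "label": entry["label"], "known": True}


print(describe("#spectrum"))
print(describe("http://ivoa.net/rdf/vospace#container"))
```

Only the first call gets a proper label; the second is passed through untouched, which is the "graceful" part of the recipe.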
And of course there's some value in doing "custom contracts" between
services and specialised clients using "singleton" concept URIs as in
your vospace#container example -- but as long as we don't require
clients to pull semantic resources from all over the net (and I'm
sure we don't want that), once you put in arbitrary URIs, 90% of the
magic is gone.
> Substitute `http://ivoa.net/rdf/vospace#container` for `bag` and it
> would be a real use case; also content_type `text/xml` would not
> convey enough information. Also, we could drop the RFE for VOTable
> to allow content param in the mimetype and just put `#datalink`
> into content_qualifier for recursive datalink.
Hm... Do we do clients a favour if we do that? Suppose I have an
object, and there's a spectrum and a time series attached to it, both
of which are described through datalink documents. Wouldn't a client
still want to know whether to send the link to a spectral or a time
series client?
This would be different if we expected generic "datalink clients".
But this is becoming so speculative that I'd suggest we ought to wait
until someone actually wants to do anything like that. And why they
want that.
> The other aspect where short `#term` and full
> `http://ivoa.net/rdf/{vocab}#term` comes into play for me is the
> VEP process. I had been (in semantics) using FQ uris for prototype
> terms, but VEP requires that the term be demonstrated in use.
> That's manageable for me because the terms are in s/w, not (eg) in
> the database directly. But I wonder: if using a new term is as
> simple as `create VEP && start using term` (and be prepared to
> change use, of course) then that removes one use of FQ uris. How
Right. That was the intent.
> bad would it be if we said that any term in any ivoa vocab could be
> used in short form? That seems like it would cover > 98% of use
No, that won't work. A client cannot be expected to pull all the
vocabularies to figure out its label, description, and relationships,
and I certainly don't want to require that different vocabularies
cannot use the same identifier.
> cases. And I could see making a service to resolve `#term` to
> `http://ivoa.net/rdf/{v}#term` (which in principle would have to
> allow for multiple returns in some cases).
...in which case a client is totally in the rain. Would it show all
the labels? Guess which relationships to use? Also, that service
would again require clients to access network resources while doing
semantics, which I'm sure we want to avoid if at all possible.
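The ambiguity can be made concrete with a small sketch. The two vocabularies and their terms below are entirely hypothetical; the point is only that a resolver for short terms across several vocabularies has to return a list, and the client then has no basis for choosing among the candidates.

```python
# Hypothetical vocabularies that happen to share the identifier "image".
VOCABULARIES = {
    "http://example.org/rdf/vocab-a": {"image", "spectrum"},
    "http://example.org/rdf/vocab-b": {"image", "catalog"},
}


def resolve_short_term(short: str) -> list:
    """Return every full concept URI that a short '#term' could stand for."""
    if not short.startswith("#"):
        raise ValueError(f"not a short term: {short!r}")
    term = short[1:]
    return sorted(
        f"{vocab_uri}#{term}"
        for vocab_uri, terms in VOCABULARIES.items()
        if term in terms
    )


print(resolve_short_term("#spectrum"))  # one candidate
print(resolve_short_term("#image"))     # two candidates: which label to show?
```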
Frankly: My impression is that this discussion is another instance of
where we introduce a feature with the server side in mind, and as
long as no client actually consumes the stuff, and there are hazy
additional use cases in the air, it's really hard to pin down
requirements and limitations. Which makes it really hard to know
what will make the lives of future clients hard and what wouldn't.
Given that situation, I'd again say "let's concentrate on the use
case we understand to a certain degree and make that work well".
That's the "find an appropriate SAMP client", and for that, it's
reasonable to recommend to clients "Get product-type and work with
it; but be aware that there can be other stuff in that field". It's
kind of working for semantics, and I've not yet seen a reason in this
discussion why it shouldn't work for content_qualifier.
|
Corresponds to changes from issue ivoa-std#67 and PR ivoa-std#71.
Clarification of the meaning and use of semantics and content_qualifier.