File and URL should be designated disjoint classes #536

Closed
15 tasks done
ajnelson-nist opened this issue Jun 13, 2023 · 1 comment · Fixed by #539, #538 or #553

Comments

@ajnelson-nist
Contributor

ajnelson-nist commented Jun 13, 2023

Disclaimer

Participation by NIST in the creation of the documentation of mentioned software is not intended to imply a recommendation or endorsement by the National Institute of Standards and Technology, nor is it intended to imply that any specific software is necessarily the best available for the purpose.

Background

In the 2023-04-18 Ontology Committees meeting, the OCs discussed Issue 534, which in brief is about whether three observable:ObservableObject subclasses that are currently unrelated to one another could be used together to represent downloading a file from a URL with an expectation of certain hashes being computable.

One of the points that came out of the discussion was a general agreement that observable:File and observable:URL should be disjoint classes.

No commentary was made on how observable:ContentData relates or doesn't relate to either of those classes.

There was also no suggestion of a superclass of observable:File or observable:URL that would be a more appropriate disjointedness target. However, the belief is that this specific disjointedness designation would be compatible with future modeling refinements.

Requirements

Requirement 1

UCO must prevent a user from designating a node as both an observable:File and observable:URL.

Risk / Benefit analysis

Benefits

  • This aligns with Ontology Committee members' intuitions.

Risks

  1. New disjointedness designations would need to be added as SHACL shapes with sh:Warning severity for UCO 1.x.0, and could be designated sh:Violation severity only in a future major release. This is believed low-risk, as the practice is being exercised in other proposals currently.
  2. This restriction on typing does nothing to resolve whether it is appropriate to continue duck-typing an individual node as both file-like and URL-like by giving the node an observable:FileFacet and an observable:URLFacet. UCO Facets still permit this, and no policy in English, OWL, or SHACL disallows it.
  3. This proposal sidesteps the original question of how to associate "Expected" hashes with a URL that is expected to provide a file.
  4. This restriction lacks a stated modeling rationale beyond the OCs' intuition. The discussion in the meeting included asides like "A URL is more an address, or locator, which a file isn't." While this aligns with intuition, for reasons unclear to the proposer, observable:URL is not currently a subclass of observable:Address. Was this an oversight? If so, is it appropriate to add to UCO these statements: observable:Address owl:disjointWith observable:File . and observable:URL rdfs:subClassOf observable:Address .?

Competencies demonstrated

Competency 1

A user is trying to represent a downloadable file. (This is compiled and excerpted from the same example data in #534.)

<https://files.pythonhosted.org/packages/d4/f9/28260b3e9335605ac2093779e9780acaaba2c0794a47a53822a0c98e52d9/case_utils-0.10.0-py3-none-any.whl>
	a
		uco-observable:ContentData ,
		uco-observable:File ,
		uco-observable:URL
		;
	uco-core:hasFacet
		kb:ContentDataFacet-2e1a9cee-1353-471d-b318-92fc9da7280b ,
		kb:FileFacet-82fd5577-bed0-4f7f-ba3f-08d3583c2efb ,
		kb:URLFacet-a78e2688-44b8-4eb9-b474-33c5e2b3c32a
		;
	.

kb:ContentDataFacet-2e1a9cee-1353-471d-b318-92fc9da7280b
	a uco-observable:ContentDataFacet ;
	uco-observable:hash kb:Hash-cb51e845-086c-43a7-99ef-6d44569e2143 ;
	uco-observable:sizeInBytes 537812 ;
	.

kb:FileFacet-82fd5577-bed0-4f7f-ba3f-08d3583c2efb
	a uco-observable:FileFacet ;
	uco-observable:fileName "case_utils-0.10.0-py3-none-any.whl" ;
	uco-observable:sizeInBytes 537812 ;
	.

kb:Hash-cb51e845-086c-43a7-99ef-6d44569e2143
	a uco-types:Hash ;
	uco-types:hashMethod "SHA256"^^uco-vocabulary:HashNameVocab ;
	uco-types:hashValue "daf617d96b1dc74b2953f82067365b1858cbe0e9d4a9d2659091f23951129bc1"^^xsd:hexBinary ;
	.

kb:URLFacet-a78e2688-44b8-4eb9-b474-33c5e2b3c32a
	a uco-observable:URLFacet ;
	uco-observable:fullValue "https://files.pythonhosted.org/packages/d4/f9/28260b3e9335605ac2093779e9780acaaba2c0794a47a53822a0c98e52d9/case_utils-0.10.0-py3-none-any.whl" ;
	.

Competency Question 1.1

Is this conformant UCO data? Should it be?

Result 1.1

In UCO 1.2.0, yes, this is conformant. Per this proposal, it should not be, because the URL should not be considered to be a file. This situation can be flagged by adding this constraint to observable:File:

observable:File
	sh:not [
		a sh:NodeShape ;
		sh:class observable:URL ;
	] ;
	.

(That constraint would work, but in an oversimplified manner; the solution description section provides a fuller implementation and rationale.)
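
To make the intent of that constraint concrete, here is a minimal sketch in plain Python (standard library only) that emulates the check over a set of triples. It is not SHACL and not part of UCO: the function name `find_file_url_conflicts`, the tuple-based triple representation, and the shortened `kb:` node names are assumptions made for illustration. It also looks only at explicitly asserted types, whereas sh:class would additionally follow rdfs:subClassOf statements.

```python
RDF_TYPE = "rdf:type"
FILE = "observable:File"
URL = "observable:URL"


def find_file_url_conflicts(triples):
    """Return, sorted, the nodes typed as both observable:File and observable:URL."""
    types_by_node = {}
    for subject, predicate, obj in triples:
        if predicate == RDF_TYPE:
            types_by_node.setdefault(subject, set()).add(obj)
    return sorted(
        node
        for node, types in types_by_node.items()
        if FILE in types and URL in types
    )


# Hypothetical data, loosely modeled on the node in Competency 1.
triples = {
    ("kb:download-1", RDF_TYPE, "observable:ContentData"),
    ("kb:download-1", RDF_TYPE, FILE),
    ("kb:download-1", RDF_TYPE, URL),
    ("kb:file-1", RDF_TYPE, FILE),
    ("kb:url-1", RDF_TYPE, URL),
}

conflicts = find_file_url_conflicts(triples)
print(conflicts)  # ['kb:download-1']
```

Only the node carrying both types is reported; nodes typed solely as a File or solely as a URL pass.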

Competency Question 1.2

Before any download action takes place from that files.pythonhosted.org URL, what is the association between the hash daf617d... and the URL https://files.pythonhosted.org/packages/d4/f9/28260b...?

Result 1.2

The answer to this question is out of scope of this proposal.

Suggestions are welcome, but likely need to be part of future proposal(s). The proposer has in mind a potential solution based on Qualities that might also be of interest to the Adversary Engagement Ontology.

Solution suggestion

First, designate with OWL that observable:File and observable:URL are disjoint by adding this one triple:

observable:File
	owl:disjointWith observable:URL ;
	.

Then, add a new shape specialized to the pairwise disjointedness of observable:File and observable:URL:

observable:File-disjointWith-URL-shape
	a sh:NodeShape ;
	sh:message "observable:File and observable:URL are disjoint classes."@en ;
	sh:not [
		a sh:NodeShape ;
		sh:class observable:URL ;
	] ;
	sh:targetClass observable:File ;
	.

Solution discussion

The reasons for adding a shape specialized to the pair are (1) shape performance and (2) deprecation management.

First, on shape performance: It is possible to use a general-purpose "Find all disjoint-set members" SPARQL query that would work across all OWL usage. One has been used in CASE-Corpora for some months, defined here, and it has helped find modeling errors while requiring only a single owl:disjointWith statement to be added to an ontology. However, using that general-purpose check requires some degree of inferencing (graph expansion), either RDFS- or OWL-based. Its speed also depends on the SPARQL engine's performance.
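
The dependence on inferencing can be illustrated with a small sketch in plain Python. This is not the CASE-Corpora query; it is an assumed, simplified stand-in in which `find_disjointness_conflicts`, the tuple-based triples, and the subclass `ex:DownloadedFile` are all hypothetical. The point is that a node typed only with a subclass of observable:File is caught only after its types are expanded along rdfs:subClassOf.

```python
from itertools import combinations

RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"
DISJOINT_WITH = "owl:disjointWith"


def superclasses(cls, parents):
    """Return cls plus every class reachable through rdfs:subClassOf."""
    seen = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def find_disjointness_conflicts(triples):
    """Return sorted (node, class_a, class_b) for nodes in two disjoint classes."""
    parents, asserted, disjoint = {}, {}, set()
    for s, p, o in triples:
        if p == SUBCLASS_OF:
            parents.setdefault(s, set()).add(o)
        elif p == RDF_TYPE:
            asserted.setdefault(s, set()).add(o)
        elif p == DISJOINT_WITH:
            disjoint.add(frozenset((s, o)))  # disjointness is symmetric
    conflicts = []
    for node, classes in asserted.items():
        # Graph expansion: add the inferred superclasses of each asserted type.
        expanded = set()
        for cls in classes:
            expanded |= superclasses(cls, parents)
        for a, b in combinations(sorted(expanded), 2):
            if frozenset((a, b)) in disjoint:
                conflicts.append((node, a, b))
    return sorted(conflicts)


triples = {
    ("observable:File", DISJOINT_WITH, "observable:URL"),
    ("ex:DownloadedFile", SUBCLASS_OF, "observable:File"),  # hypothetical class
    ("kb:node-1", RDF_TYPE, "ex:DownloadedFile"),
    ("kb:node-1", RDF_TYPE, "observable:URL"),
}

result = find_disjointness_conflicts(triples)
print(result)  # [('kb:node-1', 'observable:File', 'observable:URL')]
```

Without the expansion step, `kb:node-1` would not be reported, because it is never directly typed as observable:File.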

Second, on deprecation management: Recently, CDO shapes repositories have begun to explore potential concurrent usage of other ontologies with UCO. The Friend-of-a-Friend shapes repository, used in the UCO FOAF Profile, handles these disjointedness statements, which are all of the disjointWith occurrences in FOAF:

foaf:Document
	owl:disjointWith
		foaf:Organization ,
		foaf:Project
		;
	.

foaf:Organization
	owl:disjointWith
		foaf:Document ,
		foaf:Person
		;
	.

foaf:Person
	owl:disjointWith
		foaf:Organization ,
		foaf:Project
		;
	.

foaf:Project
	owl:disjointWith
		foaf:Document ,
		foaf:Person
		;
	.

Note that not all the classes mentioned are disjoint with all of the other classes. For instance, it is conformant with FOAF to have a node that is both a foaf:Organization and foaf:Project, despite both those classes being disjoint with foaf:Document.
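
The non-transitive nature of these statements can be checked with a short sketch in plain Python. The pairs below are transcribed from the FOAF statements quoted above; the function name `violated_pairs` and the set-of-strings representation of a node's types are assumptions for illustration.

```python
from itertools import combinations

# The owl:disjointWith statements quoted above, stored as unordered pairs.
FOAF_DISJOINT = {
    frozenset(("foaf:Document", "foaf:Organization")),
    frozenset(("foaf:Document", "foaf:Project")),
    frozenset(("foaf:Organization", "foaf:Person")),
    frozenset(("foaf:Person", "foaf:Project")),
}


def violated_pairs(node_types):
    """Return the disjoint class pairs that a node's set of types violates."""
    return [
        (a, b)
        for a, b in combinations(sorted(node_types), 2)
        if frozenset((a, b)) in FOAF_DISJOINT
    ]


# Organization and Project are each disjoint with Document, but not with each other.
print(violated_pairs({"foaf:Organization", "foaf:Project"}))  # []
print(violated_pairs({"foaf:Document", "foaf:Organization"}))
```

The second call reports the single pair of foaf:Document and foaf:Organization.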

An initial draft of the shape to represent Documents being disjoint with Organizations and Projects looked like this:

sh-foaf:Document-disjointedness-shape
	a sh:NodeShape ;
	sh:message "foaf:Document is a disjoint class with foaf:Organization and foaf:Project."@en ;
	sh:not [
		a sh:NodeShape ;
		sh:or (
			[
				a sh:NodeShape ;
				sh:class foaf:Organization ;
			]
			[
				a sh:NodeShape ;
				sh:class foaf:Project ;
			]
		) ;
	] ;
	sh:targetClass foaf:Document ;
	.

(The nested sh:or is because SHACL requires that a single sh:NodeShape not have two values of sh:not.)

Instead, these shapes were implemented, copied here:

sh-foaf:Document-disjointWith-Organization-shape
	a sh:NodeShape ;
	sh:message "foaf:Document and foaf:Organization are disjoint classes."@en ;
	sh:not [
		a sh:NodeShape ;
		sh:class foaf:Organization ;
	] ;
	sh:targetClass foaf:Document ;
	.

sh-foaf:Document-disjointWith-Project-shape
	a sh:NodeShape ;
	sh:message "foaf:Document and foaf:Project are disjoint classes."@en ;
	sh:not [
		a sh:NodeShape ;
		sh:class foaf:Project ;
	] ;
	sh:targetClass foaf:Document ;
	.

The reasons were:

  • The shape with sh:or ends up obscuring which of the classes, Organization or Project, triggered the violation. An sh:message cannot be fruitfully embedded deeper in the sh:or tree. That is, the deeper message does not display in the SHACL validation report. (This was, at least, the proposer's experience using pyshacl. A functioning demonstration rebutting this belief of SHACL incapability is welcome.) So for specificity, the specialized shapes were implemented.
    • Related to not being able to nest sh:message: Piling the sh:not into a general shape targeting the class (such as UCO does) would leave disjointedness violations having a description message that is basically a repetition of the Turtle-encoded SHACL. It is likely to be a better user experience to provide a short natural-language sentence, especially versus a sh:not, around a sh:or, around several shapes describing classes and possibly complements of classes.
  • The specialized shapes also enable documenting deprecation of the pair's disjointedness (i.e., making it OK for a node to be both classes again) at an IRI. With the sh:or style, deprecation could be documented with an rdfs:comment, but nothing would require that comment to be recorded. A specialized shape would at least leave its IRI in place (per UCO policy on retaining IRIs), even if the shape explicitly does nothing. It's a fair debate which is the "better" style, but the proposer believes separate IRIs are more compatible with UCO policy and historic record-keeping.
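
The message-specificity argument can be sketched in plain Python. This does not reproduce pyshacl's behavior or any SHACL engine's report format; it only mimics the difference between one combined shape and one shape per pair, using the sh:message strings quoted above. The function names `report_combined` and `report_pairwise` are hypothetical.

```python
COMBINED_MESSAGE = (
    "foaf:Document is a disjoint class with foaf:Organization and foaf:Project."
)

PAIRWISE_MESSAGES = {
    "foaf:Organization": "foaf:Document and foaf:Organization are disjoint classes.",
    "foaf:Project": "foaf:Document and foaf:Project are disjoint classes.",
}


def report_combined(node_types):
    """One shape with sh:or inside sh:not: a single, non-specific message."""
    if "foaf:Document" in node_types and node_types & set(PAIRWISE_MESSAGES):
        return [COMBINED_MESSAGE]
    return []


def report_pairwise(node_types):
    """One shape per class pair: one message per violated pair."""
    if "foaf:Document" not in node_types:
        return []
    return [
        message
        for other, message in sorted(PAIRWISE_MESSAGES.items())
        if other in node_types
    ]


node_types = {"foaf:Document", "foaf:Project"}
print(report_combined(node_types))
print(report_pairwise(node_types))
```

For a node typed as both foaf:Document and foaf:Project, the combined form names both candidate classes without saying which one applied, while the pairwise form names foaf:Project only.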

In summary, these will be added:

  • The triple observable:File owl:disjointWith observable:URL .
  • The node shape observable:File-disjointWith-URL-shape.
    • This shape will bear a sh:severity of sh:Warning until UCO 2.0.0, unless a further delay is requested.

Coordination

  • Tracking in Jira ticket OC-293
  • Administrative review completed, proposal announced to Ontology Committees (OCs) on 2023-06-13
  • Requirements to be discussed in OC meeting, 2023-06-20
  • Requirements Review vote occurred, passing, on 2023-06-20
  • Requirements development phase completed.
  • Solution announced to OCs on 2023-07-03
  • Solutions Approval to be discussed in OC meeting, 2023-07-20
  • Solutions Approval vote occurred, passing, on 2023-07-20
  • Solutions development phase completed.
  • Backwards-compatible implementation merged into develop for the next release
  • develop state with backwards-compatible implementation tracked by CASE develop branch (for prerelease delivery on CASE website)
  • develop state with backwards-compatible implementation merged into develop-2.0.0
  • Backwards-incompatible implementation merged into develop-2.0.0
  • develop-2.0.0 state with backwards-incompatible implementation tracked by CASE develop-2.0.0 branch (for prerelease delivery on CASE website)
  • Milestone linked
  • Documentation logged in pending release page
@sbarnum
Contributor

sbarnum commented Jun 20, 2023

This makes sense to me

ajnelson-nist added a commit that referenced this issue Jun 29, 2023
A follow-on patch will regenerate Make-managed files.

References:
* #536

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit that referenced this issue Jun 29, 2023
References:
* #536

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/CASE-Archive that referenced this issue Jun 29, 2023
No effects were observed on Make-managed files.

References:
* ucoProject/UCO#536

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to ucoProject/UCO-Archive that referenced this issue Jun 29, 2023
A follow-on patch will regenerate Make-managed files.

References:
* ucoProject/UCO#536

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to ucoProject/UCO-Archive that referenced this issue Jun 29, 2023
References:
* ucoProject/UCO#536

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/CASE-Archive that referenced this issue Jun 29, 2023
No effects were observed on Make-managed files.

References:
* ucoProject/UCO#536

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/casework.github.io that referenced this issue Jun 29, 2023
A follow-on patch will regenerate Make-managed files.

References:
* ucoProject/UCO#536

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/casework.github.io that referenced this issue Jun 29, 2023
References:
* ucoProject/UCO#536

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/CASE-Examples that referenced this issue Jun 29, 2023
No effects were observed on Make-managed files.

References:
* ucoProject/UCO#536

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/CASE-Corpora that referenced this issue Jul 7, 2023
This patch defines a `observable:File`-like object to separate
`observable:URL`s from `observable:File`s, reflecting a design decision
from the UCO and CASE Ontology Committees related to UCO Issue 536.

To reflect new usage, the Digital Corpora supplemental-graph script is
adapted to now separate `Facet` assignment between `drafting:S3Object`s
and `observable:URL`s.  The Android 10 CASE supplemental graph has also
been updated to reflect the new `Facet` IRIs.

A follow-on patch will regenerate Make-managed files.

References:
* ucoProject/UCO#536

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/CASE-Corpora that referenced this issue Jul 7, 2023
No effects were observed on Make-managed files.

References:
* ucoProject/UCO#536

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to ajnelson-nist/CASE-Examples-QC that referenced this issue Jul 7, 2023
References:
* ucoProject/UCO#536

Signed-off-by: Alex Nelson <[email protected]>
@ajnelson-nist ajnelson-nist linked a pull request Aug 14, 2023 that will close this issue
4 tasks
ajnelson-nist added a commit to casework/CASE that referenced this issue Aug 17, 2023
No effects were observed on Make-managed files.

References:
* ucoProject/UCO#536

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/casework.github.io that referenced this issue Oct 20, 2023
A follow-on patch will regenerate Make-managed files.

References:
* ucoProject/UCO#536

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/casework.github.io that referenced this issue Oct 20, 2023
References:
* ucoProject/UCO#536

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/casework.github.io that referenced this issue Oct 23, 2023
A follow-on patch will regenerate Make-managed files.

References:
* ucoProject/UCO#536

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/casework.github.io that referenced this issue Oct 23, 2023
References:
* ucoProject/UCO#536

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to ucoProject/ucoproject.github.io that referenced this issue Nov 28, 2023
References:
* ucoProject/UCO#536

Signed-off-by: Alex Nelson <[email protected]>