UCO should perform OWL 2 DL review with SHACL-SPARQL #406

Closed
11 tasks done
ajnelson-nist opened this issue Jun 30, 2022 · 3 comments · Fixed by #412

Comments

ajnelson-nist commented Jun 30, 2022

Background

Over the years of UCO's development, many questions have come up about whether its usage of OWL 2 DL is correct. Development since the UCO and CASE prototypes has proceeded without an OWL review mechanism. Such a mechanism could catch issues ranging from the elementary, such as whether the Turtle syntax is correct (resolved once a syntax normalizer was adopted for Continuous Integration), to the advanced, such as whether the ontology defines unsatisfiable classes (i.e. classes that constrain themselves, intentionally or otherwise, to always be empty).

Some engines have been tried to determine UCO's OWL 2 DL conformance, but they have frequently proven incompatible in one way or another with UCO's usage of SHACL. Most pointedly, SHACL is not defined as an OWL ontology in any way that exercises OWL classes or properties, so OWL validation of a SHACL ontology halts because it considers sh:-prefixed concepts to be incompletely defined.

UCO needs some constraints from OWL 2 DL, such as ensuring disjointness of object-properties from datatype-properties. There are also significant node constraints within OWL 2 DL that have proven difficult to determine even after several read-throughs of specifications, and misunderstanding those constraints could accidentally move a UCO graph from OWL 2 DL into OWL Full where behaviors are undefined. To wit, prior considerations for ontology versioning and for reification of triples both encountered significant strategic revisions after finding part of the intended strategy was disallowed in OWL 2 DL.

SHACL provides SPARQL-based mechanisms (in SHACL-SPARQL; see examples) to identify triple combinations that should not appear in an ontology-graph or data-graph. UCO should make best-effort usage of SPARQL-based constraints to validate its OWL usage.
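
As an illustration of the kind of triple combination such a constraint would flag, here is a minimal sketch of the object-property versus datatype-property disjointness check mentioned above. It is not UCO's shape. Plain Python tuples stand in for an RDF graph so the example has no dependencies, and the `ex:` names are hypothetical. In a SHACL-SPARQL shape the same pattern would be written as a SELECT query over the two `rdf:type` triples.

```python
# Each triple is a (subject, predicate, object) tuple of prefixed names.
# This stands in for an RDF graph; the ex: terms are hypothetical.
RDF_TYPE = "rdf:type"
OWL_OBJECT_PROPERTY = "owl:ObjectProperty"
OWL_DATATYPE_PROPERTY = "owl:DatatypeProperty"


def find_doubly_typed_properties(triples):
    """Return subjects typed as both an object property and a datatype property."""
    object_props = {
        s for s, p, o in triples if p == RDF_TYPE and o == OWL_OBJECT_PROPERTY
    }
    datatype_props = {
        s for s, p, o in triples if p == RDF_TYPE and o == OWL_DATATYPE_PROPERTY
    }
    # The intersection is the set of properties breaking disjointness.
    return sorted(object_props & datatype_props)


example_triples = [
    ("ex:hasOwner", RDF_TYPE, OWL_OBJECT_PROPERTY),
    ("ex:size", RDF_TYPE, OWL_DATATYPE_PROPERTY),
    ("ex:label", RDF_TYPE, OWL_OBJECT_PROPERTY),
    ("ex:label", RDF_TYPE, OWL_DATATYPE_PROPERTY),
]

violations = find_doubly_typed_properties(example_triples)
print(violations)
```

Only `ex:label` carries both types, so it is the only node reported.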

Requirements

Requirement 1

UCO must be able to validate its conformance against OWL 2 DL, in at least partial degree.

Requirement 2

Extensions to UCO (such as ontology revisions under draft outside of this Git repository, and private extensions) must be able to use UCO's OWL 2 DL conformance-review mechanism.

Requirement 3

The transitive closure of UCO's imports must be testable with at least the same OWL 2 DL stringency as is applied to UCO.

Risk / Benefit analysis

Benefits

  • Definition of OWL 2 DL conformance in SHACL shapes adds a review mechanism that is compatible with UCO's usage of both OWL and SHACL.
  • Recent proposals (such as CASE's AnalyticInference proposal, and a paused proposal reviewing UCO's syntax of enumerant-based datatypes) were significantly slowed by early attempts to exercise OWL mechanisms. Having mechanically-reviewed rules will reduce confusion in the design and implementation of new proposals.
  • Review with OWL-focused SHACL shapes will help UCO measure risk of new adoptions of ontologies.

Risks

  • The goal of this proposal is NOT to implement all of OWL 2 DL in SHACL. Full OWL 2 DL review needs to handle operations like expansion of abstract class definitions, identification of constraints that reduce to empty sets, and inconsistency declarations like recognizing when an empty set is also asserted to have a member. It's not clear if this is possible with SHACL and SPARQL.
  • It is possible an effort to validate OWL 2 DL with SHACL (to the maximal extent possible) exists. The proposer has not been able to locate such an effort.
  • Most of the OWL-focused constraints seem to require SHACL-SPARQL to implement. While these may seem expensive for review, they will only infrequently (if ever) run in the "ABox" graphs of users' knowledge bases - that is, portions defining concrete individuals, rather than classes and properties. So, their estimated impact on SHACL validation is only expected to be felt in unit testing for "TBox" (class/property/datatype) focused graphs like what is in the UCO Git repository. (The JSON-LD samples under tests/examples/ in this repository are examples of "ABox" graphs.)
  • UCO CP-100 took a shortcut with rdf:List for the purpose of easing maintainability of OWL enumerant-based datatypes and UCO's semi-open vocabularies needing to be able to reference member lists in SHACL shapes. This shortcut was called out as a known act of delaying an OWL 2 DL-conformant implementation. For better or worse, the SHACL shapes accompanying this proposal flag that as an error, inducing the need to undo that shortcut. This causes two risks:
    • Test timing - This will at least double parallel-testing time (i.e. make -j), and triple non-parallel testing time (make without -j, as the CI runs it), because rdf-toolkit takes an extensive amount of time to sort long rdf:Lists, especially those in the vocabulary namespace, and they will now be duplicated in the observable namespace. This does not currently cause a risk of timeouts on GitHub Actions, as the default timeout is currently 6 hours.
    • List consistency - So long as UCO uses this current semi-open vocabulary design, an additional list-review mechanism needs to be deployed to ensure vocabulary members copied into SHACL match with members as they're recorded in rdfs:Datatypes.
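
The list-consistency risk can be pictured with a small sketch. This is not the unit test in the repository. It assumes the members have already been extracted from the graph into two dictionaries, one for the rdfs:Datatype definitions and one for the SHACL shapes, and the vocabulary names and members shown are hypothetical. It also compares members as sets, which is an assumption: order and duplicates are ignored.

```python
def find_list_mismatches(datatype_members, shape_members):
    """Compare vocabulary members recorded in two places.

    Both arguments map a vocabulary name to its list of members.
    Returns a dict mapping each mismatched vocabulary name to a pair:
    (members only in the datatype, members only in the shape).
    """
    mismatches = {}
    for name in sorted(set(datatype_members) | set(shape_members)):
        in_datatype = set(datatype_members.get(name, []))
        in_shape = set(shape_members.get(name, []))
        if in_datatype != in_shape:
            mismatches[name] = (
                sorted(in_datatype - in_shape),
                sorted(in_shape - in_datatype),
            )
    return mismatches


# Hypothetical extracted members; real names would come from the ontology.
datatype_members = {
    "vocabulary:ExampleVocab": ["alpha", "beta", "gamma"],
    "vocabulary:OtherVocab": ["x", "y"],
}
shape_members = {
    "vocabulary:ExampleVocab": ["alpha", "beta"],
    "vocabulary:OtherVocab": ["y", "x"],
}

mismatches = find_list_mismatches(datatype_members, shape_members)
print(mismatches)
```

Here the shape copy of `vocabulary:ExampleVocab` has fallen behind by one member, while `vocabulary:OtherVocab` matches despite the different ordering.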

Competencies demonstrated

Competency 1

As part of CI testing, UCO can now review its conformance with OWL 2 DL.

Competency Question 1.1

What does UCO define as best-effort conformant with OWL 2 DL?

Result 1.1

A review of the uco-owl namespace shows shapes that quote and link the OWL 2 specification.

Competency Question 1.2

How does UCO test that its (TBox) ontology is conformant with OWL 2 DL?

Result 1.2

Within the CI, a monolithic build of UCO is constructed, combining all of the Turtle files under the ontology/ directory. Before that file is syntax-normalized, pyshacl is used to review the combined file with the uco-owl namespace's shapes. See tests/Makefile, target uco_monolithic.ttl.
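
For readers who want to try the same review outside of Make, the following sketch assembles a pyshacl command line. The file name `owl_shapes.ttl` is a hypothetical stand-in for wherever the uco-owl shapes are stored, and the actual recipe in tests/Makefile may pass different options. The `--shacl`, `--format`, and `--allow-warnings` options are part of pyshacl's command-line interface.

```python
import shlex


def build_review_command(data_path, shapes_path, allow_warnings=False):
    """Build an argument list for reviewing a graph with pyshacl.

    data_path: the graph to review (e.g. the monolithic build).
    shapes_path: the SHACL shapes graph (hypothetical path in this sketch).
    allow_warnings: if True, sh:Warning results do not fail the review.
    """
    command = ["pyshacl", "--shacl", shapes_path, "--format", "human"]
    if allow_warnings:
        command.append("--allow-warnings")
    command.append(data_path)
    return command


command = build_review_command("uco_monolithic.ttl", "owl_shapes.ttl")
# Show the command as it would be typed in a shell.
print(shlex.join(command))
```

The returned list can be handed to `subprocess.run(command, check=True)` so that a failing review surfaces as a failed CI step.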

Competency Question 1.3

What other ontologies can be reviewed with the uco-owl namespace?

Result 1.3

The uco-owl namespace tests conformance with OWL 2 DL, as well as some implications for SHACL shapes, such as confirming that DatatypeProperties used in PropertyShapes aren't constrained to match non-Literals. This can apply to ontologies that are more focused on TBoxes (classes/properties/datatypes), or broader ABox knowledge-bases such as tool output mapped into UCO.
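
A rough sketch of that second kind of check follows. It is an illustration only: which constraints the uco-owl shape actually inspects is an assumption here (this sketch looks at `sh:class` and non-literal `sh:nodeKind` values), plain tuples stand in for an RDF graph, and all `ex:` names are hypothetical.

```python
RDF_TYPE = "rdf:type"
# Node kinds that can never be satisfied by a literal value.
NON_LITERAL_NODE_KINDS = {"sh:IRI", "sh:BlankNode", "sh:BlankNodeOrIRI"}


def find_conflicting_property_shapes(triples):
    """Return property shapes whose sh:path is a datatype property
    but which require non-literal values."""
    datatype_properties = {
        s for s, p, o in triples if p == RDF_TYPE and o == "owl:DatatypeProperty"
    }
    # Map each property shape to the property named by its sh:path.
    paths = {s: o for s, p, o in triples if p == "sh:path"}
    conflicts = set()
    for s, p, o in triples:
        if paths.get(s) not in datatype_properties:
            continue
        if p == "sh:class" or (p == "sh:nodeKind" and o in NON_LITERAL_NODE_KINDS):
            conflicts.add(s)
    return sorted(conflicts)


example_triples = [
    ("ex:size", RDF_TYPE, "owl:DatatypeProperty"),
    ("ex:owner", RDF_TYPE, "owl:ObjectProperty"),
    ("ex:size-shape", "sh:path", "ex:size"),
    ("ex:size-shape", "sh:datatype", "xsd:integer"),
    ("ex:bad-shape", "sh:path", "ex:size"),
    ("ex:bad-shape", "sh:class", "ex:Thing"),
    ("ex:owner-shape", "sh:path", "ex:owner"),
    ("ex:owner-shape", "sh:nodeKind", "sh:IRI"),
]

conflicts = find_conflicting_property_shapes(example_triples)
print(conflicts)
```

`ex:bad-shape` is reported because it asks values of a datatype property to be class instances. `ex:owner-shape` is left alone, since its path is an object property.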

Support for the broader ABox-oriented knowledge bases currently consists of reviewing the transitive closure of ontology imports, and owl:Axioms used for assertion-annotations. See especially the shapes pertaining to owl:Axiom, owl:ontologyIRI, and owl:versionIRI.
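
To make "transitive closure of imports" concrete, here is a small sketch. It assumes the owl:imports statements have already been collected into a dictionary, and the ontology names are hypothetical. It only computes which ontologies would need review and does not load or validate them.

```python
def imports_closure(root, direct_imports):
    """Return every ontology reachable from root by following imports.

    direct_imports maps an ontology IRI to the IRIs it imports directly.
    The root itself is included, and import cycles are tolerated.
    """
    closure = set()
    pending = [root]
    while pending:
        current = pending.pop()
        if current in closure:
            continue  # Already visited; this also breaks cycles.
        closure.add(current)
        pending.extend(direct_imports.get(current, ()))
    return closure


# Hypothetical import graph, including a cycle between ex:b and ex:a.
direct_imports = {
    "ex:kb": ["ex:a"],
    "ex:a": ["ex:b", "ex:c"],
    "ex:b": ["ex:a"],
    "ex:unrelated": ["ex:c"],
}

closure = imports_closure("ex:kb", direct_imports)
print(sorted(closure))
```

Each member of the resulting set would then be reviewed with the same shapes as the root graph.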

Solution suggestion

  • Add UCO-OWL namespace, IRI https://ontology.unifiedcyberontology.org/owl, prefix uco-owl:.
  • Define SHACL shapes that include citations (using the generic rdfs:seeAlso) to OWL 2 documentation.
    • Use default sh:severity (sh:Violation) for "MUST NOT" pattern matches.
    • Use sh:severity sh:Warning for "SHOULD NOT" pattern matches.
  • Revert assignment of IRIs for rdf:Lists done to ease semi-open vocabulary synchronization.
  • Add unit test for semi-open vocabulary synchronization.
  • Add PASS and XFAIL JSON-LD samples for each uco-owl: shape.
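
The rdf:List item above concerns list nodes identified by IRIs rather than blank nodes. A minimal sketch of detecting them is given below. It is not the uco-owl shape: plain tuples stand in for an RDF graph, blank nodes are represented by the conventional `_:` prefix, and the `ex:` name is hypothetical.

```python
RDF_NIL = "rdf:nil"


def find_iri_list_nodes(triples):
    """Return list nodes that are identified by an IRI instead of a blank node."""
    list_nodes = set()
    for s, p, o in triples:
        # Any subject of rdf:first or rdf:rest is a list node.
        if p in ("rdf:first", "rdf:rest"):
            list_nodes.add(s)
        # The tail of a list is also a list node, unless it is rdf:nil.
        if p == "rdf:rest" and o != RDF_NIL:
            list_nodes.add(o)
    return sorted(node for node in list_nodes if not node.startswith("_:"))


example_triples = [
    # A list whose head was given an IRI (the pattern to be reverted).
    ("ex:members", "rdf:first", '"alpha"'),
    ("ex:members", "rdf:rest", "_:b1"),
    ("_:b1", "rdf:first", '"beta"'),
    ("_:b1", "rdf:rest", RDF_NIL),
    # A list made only of blank nodes.
    ("_:c0", "rdf:first", '"gamma"'),
    ("_:c0", "rdf:rest", RDF_NIL),
]

iri_list_nodes = find_iri_list_nodes(example_triples)
print(iri_list_nodes)
```

Only the head `ex:members` is flagged. The second list passes because every node in it is a blank node.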

Coordination

  • Tracking in Jira ticket OC-157
  • Administrative review completed, proposal announced to Ontology Committees (OCs) on 2022-06-29
  • Requirements to be discussed in OC meeting, 2022-07-12
  • Requirements Review vote occurred, passing, on 2022-07-12
  • Requirements development phase completed.
  • Solution announced to OCs on 2022-07-22
  • Solutions Approval to be discussed in OC meeting, 2022-07-28.
  • Solutions Approval vote occurred, passing, on 2022-08-09
  • Solutions development phase completed.
  • Implementation merged into develop
  • Milestone linked
  • Documentation logged in pending release page
@ajnelson-nist

PR 407 is posted to help with review, but it will be replaced with another patch series after tomorrow's meeting.

ajnelson-nist added a commit to casework/CASE-Implementation-PROV-O that referenced this issue Jul 8, 2022
This follows the general pattern of recent UCO import-review shapes
files, for the Collections Ontology (389) and OWL (406).

References:
* ucoProject/UCO#389
* ucoProject/UCO#406

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit that referenced this issue Jul 23, 2022
This patch undoes an engineering convenience put in place as part of UCO
CP-100.  RDF Lists that were part of semi-open vocabularies were given
IRIs, so they could be referenced for OWL datatype definitions and for
SHACL membership testing.  This was acknowledged as an incompatibility
with OWL 2 DL, which requires that RDF Lists be identified as blank
nodes.  The concepts were intended to remain until an OWL test mechanism
would identify this error.

A test mechanism is now under development as part of UCO Issue 406, and
correctly flags IRI-identified RDF lists.  Hence, this patch undoes the
change.

To ensure the RDF lists are kept in sync. across their duplicate
locations, a Python unit test has been added to confirm list-equality.

References:
* [UCO OC-12] (CP-100) UCO's idea of "Open vocabulary" does not agree
  with its implementation with owl:oneOf
* #406

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit that referenced this issue Jul 23, 2022
A follow-on patch will refresh Make-managed files.

References:
* #406

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit that referenced this issue Jul 23, 2022
References:
* #406

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit that referenced this issue Jul 23, 2022
A draft version of this patch series assisted in reviewing Issue 389.

References:
* #389
* #406

Signed-off-by: Alex Nelson <[email protected]>
@ajnelson-nist ajnelson-nist linked a pull request Jul 28, 2022 that will close this issue
7 tasks
@frederich-stine frederich-stine mentioned this issue Aug 3, 2022
10 tasks
ajnelson-nist added a commit that referenced this issue Aug 8, 2022
This test builds on the PR for Issue 406, and will fail CI as it is
currently filed.  The failure is an intentional demonstration of
non-conformance.  This test will need to be merged into another branch
that had applied the syntax fix.

References:
* #406
* #435

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/CASE-Examples that referenced this issue Aug 9, 2022
References:
ucoProject/UCO#406
Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/casework.github.io that referenced this issue Aug 11, 2022
References:
* ucoProject/UCO#406

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit that referenced this issue Aug 11, 2022
The semi-open vocabulary pattern is in conformance with the OWL
enforcement designed in Issue 406.

The imported concepts match the state of CASE-Examples PR 90.

References:
* [UCO OC-119] (CP-43) Represent recoverability of unallocated files
* casework/CASE-Examples#90
* #406

Signed-off-by: Alex Nelson <[email protected]>
@ajnelson-nist ajnelson-nist added this to the UCO 1.0.0 milestone Aug 11, 2022
ajnelson-nist added a commit that referenced this issue Aug 12, 2022
This typing error was flagged by the OWL SHACL review mechanism.

References:
* #375
* #406

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit that referenced this issue Aug 12, 2022
This typing error was flagged by the OWL SHACL review mechanism.

References:
* #375
* #406

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit that referenced this issue Aug 12, 2022
This typing error was flagged by the OWL SHACL review mechanism.

References:
* #375
* #406

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/casework.github.io that referenced this issue Aug 15, 2022
A follow-on patch will regenerate Make-managed files.

References:
* ucoProject/UCO#401
* ucoProject/UCO#406

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/casework.github.io that referenced this issue Aug 15, 2022
ajnelson-nist added a commit to casework/CASE-Examples that referenced this issue Aug 15, 2022
A follow-on patch will regenerate Make-managed files.

References:
* ucoProject/UCO#401
* ucoProject/UCO#406

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/CASE-Examples that referenced this issue Aug 15, 2022
References:
* ucoProject/UCO#406

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/casework.github.io that referenced this issue Aug 15, 2022
ajnelson-nist added a commit that referenced this issue Aug 18, 2022
Somewhere in the chain between core rdflib, pyshacl, and
rdf-toolkit.jar, the normalized Turtle content sometimes picks up
redundant list artifacts that vary between runs.  The anonymous list cut
in this patch has been seen to waver in how much of it is left as a
detached blank node.

This patch removes the orphaned list.  Guidance needs to be provided
that, as long as this bug in the tool stack persists, the Make-managed
generated output should not have it Git-tracked.

References:
* #406

Signed-off-by: Alex Nelson <[email protected]>
@ajnelson-nist

An objection to this proposal's Solution was made in a committee meeting, and has been logged here.

ajnelson-nist added a commit that referenced this issue Aug 19, 2022
SHACL Specification Section 5.2.1 specifies a "viral" behavior of
`sh:declare` throughout an OWL transitive closure.  This patch removes
usage of `sh:declare` as a matter of lack of authority for non-UCO
prefixes.  It just so happens the only place this was used was in the
introduction of the OWL SHACL review mechanisms of Issue 406.

A follow-on patch will regenerate Make-managed files.

References:
* #406
* #457

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit that referenced this issue Aug 19, 2022
References:
* #406
* #457

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/CASE that referenced this issue Aug 19, 2022
This is a downstream application of two proposals:

* UCO CP-100 implemented the suggested-value enforcement pattern for
  semi-open vocabularies.
  This was a part of UCO 0.8.0.
* UCO Issue 406 adjusted the implementation pattern from UCO CP-100 to
  account for an OWL requirement on `rdf:List` usage.
  This has been approved for UCO 1.0.0.

A third proposal adjusting the CASE vocabulary namespace, UCO Issue 435,
would also apply, but has not had an approval vote yet.  Due to vote and
other Git logistics, that will be handled separately.

References:
* [UCO OC-12] (CP-100) UCO's idea of "Open vocabulary" does not agree
  with its implementation with owl:oneOf
* ucoProject/UCO#406
* ucoProject/UCO#435

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/CASE that referenced this issue Aug 19, 2022
This is a downstream application of two proposals:

* UCO CP-100 implemented the suggested-value enforcement pattern for
  semi-open vocabularies.
  This was a part of UCO 0.8.0.
* UCO Issue 406 adjusted the implementation pattern from UCO CP-100 to
  account for an OWL requirement on `rdf:List` usage.
  This has been approved for UCO 1.0.0.

A third proposal adjusting the CASE vocabulary namespace, UCO Issue 435,
would also apply, but has not had an approval vote yet.  Due to vote and
other Git logistics, that will be handled separately.

No effects were observed on Make-managed files.

References:
* [UCO OC-12] (CP-100) UCO's idea of "Open vocabulary" does not agree
  with its implementation with owl:oneOf
* ucoProject/UCO#406
* ucoProject/UCO#435

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/CASE that referenced this issue Aug 20, 2022
This is a downstream adoption of UCO Issue 406.

The Git submodule is updated to the earliest state necessary to make use
of the OWL SHACL tests.

No effects were observed on Make-managed files.

References:
* ucoProject/UCO#406

Signed-off-by: Alex Nelson <[email protected]>
@ajnelson-nist

For the OCs' awareness - there is another effect of adding OWL review using pyshacl. It is noted in this comment.

ajnelson-nist added a commit that referenced this issue Aug 31, 2022
The SHACL import-review ontologies didn't get `rdfs:label`s assigned
when they were created.  This patch adds them to fit style practice with
other UCO ontology files.

No effects were observed on Make-managed files.

References:
* #390
* #406

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/CASE-Utilities-Python that referenced this issue Sep 2, 2022
RDFS and OWL are receiving aliases for in-common spelling in adopters'
code.  OWL also specifically got further support in some UCO issues.

This patch also adds a `Namespace` for the import of the Collections
Ontology, and the new UCO namespace `configuration`.

References:
* ucoProject/UCO#389
* ucoProject/UCO#406
* ucoProject/UCO#432
* ucoProject/UCO#437

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/CASE-Utilities-Python that referenced this issue Sep 2, 2022
RDFS and OWL are receiving aliases for in-common spelling in adopters'
code.  OWL also specifically got further support in some UCO issues.

This patch also adds a `Namespace` for the import of the Collections
Ontology, and the new UCO namespace `configuration`.

References:
* ucoProject/UCO#389
* ucoProject/UCO#406
* ucoProject/UCO#432
* ucoProject/UCO#437

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/CASE-Corpora that referenced this issue Sep 2, 2022
One potential bug has been flagged with this shape, implemented in UCO
Issue 406:
`uco-owl:ObjectProperty-shacl-constraints-shape`

The `sh:PropertyShape` raising the bug has been given an IRI in order to
link a deactivation rationale.

A new shapes file `debug.ttl` has been added to disable that shape until
a test is written to confirm the CASE-Corpora shape is correct.

`Facet`s that were blank nodes have been given IRIs, per the
implementation of UCO Issue 430.  New `sh:Info`-severity violations are
reported for some URLs treated in the "URL as an `rdfs:Resource`" manner,
which will not be given UUID endings.  `case_validate` is called with
`--allow-warnings`, but is intended to be called with `--allow-infos`;
that will have to wait for `case-utils` Issue 70 to resolve.

Imports of CASE and UCO ontologies now use their `owl:versionIRI`s,
implemented in UCO Issue 437.

A follow-on patch will regenerate Make-managed files.

References:
* casework/CASE-Utilities-Python#70
* ucoProject/UCO#406
* ucoProject/UCO#430
* ucoProject/UCO#437

Signed-off-by: Alex Nelson <[email protected]>
ajnelson-nist added a commit to casework/CASE-Corpora that referenced this issue Mar 15, 2023
These issues were flagged by the UCO OWL SHACL shapes.

References:
* ucoProject/UCO#406

Signed-off-by: Alex Nelson <[email protected]>