Base Schema with fragment / hash in parsed turtle files leads to incorrect IRI resolution for some parsers #47

Open
Lenostatos opened this issue Jun 2, 2024 · 2 comments


@Lenostatos

Hello,

I tried to parse the Turtle serialization of your RDF Schema ontology at https://databus.dbpedia.org/ontologies/w3.org/2000--01--rdf-schema/2020.06.10-215336/2000--01--rdf-schema_type=parsed.ttl with a library that uses the N3.js parser.

There, I ran into a problem with the base IRI resolution. I opened an issue with the maintainers of that library, and it seems that there might be an error in the RDF Schema Turtle file: rdfjs-base/parser-n3#15

The problem is at the beginning of the Turtle code:

@base <http://www.w3.org/2000/01/rdf-schema#> .
@prefix rdf: <../../1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <> .
@prefix owl: <../../2002/07/owl#> .
@prefix dc: <http://purl.org/dc/elements/1.1/> .

<>
    dc:title "The RDF Schema vocabulary (RDFS)" ;
    a owl:Ontology ;
    rdfs:seeAlso <rdf-schema-more> .

rdfs:Class
    a rdfs:Class ;
    rdfs:comment "The class of classes." ;
    rdfs:isDefinedBy <> ;
    rdfs:label "Class" ;
    rdfs:subClassOf rdfs:Resource .

The hash symbol (#) at the end of the base IRI is apparently stripped by the parser (?) and the resulting triples then contain invalid IRIs:
[image: screenshot of the resulting triples]
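
The stripping matches the reference-resolution rules of RFC 3986 (section 5.2), which Turtle uses for relative IRIs: the fragment of the base IRI is never carried over into a resolved reference. The following is a minimal Python sketch of those rules, reduced to the reference forms that appear in the preamble above. The `resolve` and `remove_dot_segments` helpers are hypothetical names written for this illustration; they are not part of N3.js, rapper, or RDFLib, and they deliberately ignore queries in the reference, network-path references, and other edge cases.

```python
from urllib.parse import urlsplit

BASE = "http://www.w3.org/2000/01/rdf-schema#"


def remove_dot_segments(path: str) -> str:
    """Collapse "." and ".." segments (simplified; hypothetical helper)."""
    out = []
    for seg in path.split("/"):
        if seg == "..":
            # Never pop the leading empty segment that stands for the root "/".
            if len(out) > 1:
                out.pop()
        elif seg != ".":
            out.append(seg)
    return "/".join(out)


def resolve(base: str, ref: str) -> str:
    """Simplified RFC 3986 section 5.2 resolution (hypothetical helper)."""
    # Split the fragment off by hand so an empty fragment ("...#") survives.
    ref_rest, has_hash, ref_frag = ref.partition("#")
    if urlsplit(ref_rest).scheme:
        return ref  # already absolute, nothing to resolve
    b = urlsplit(base)  # b.fragment is intentionally never used below
    if ref_rest == "":
        path, query = b.path, b.query  # empty reference: keep base path and query
    elif ref_rest.startswith("/"):
        path, query = remove_dot_segments(ref_rest), ""
    else:
        base_dir = b.path[: b.path.rfind("/") + 1]  # drop the last base segment
        path, query = remove_dot_segments(base_dir + ref_rest), ""
    result = f"{b.scheme}://{b.netloc}{path}"
    if query:
        result += "?" + query
    if has_hash:
        result += "#" + ref_frag
    return result


rdfs_ns = resolve(BASE, "")
print(rdfs_ns)                                            # http://www.w3.org/2000/01/rdf-schema
print(rdfs_ns + "Class")                                  # http://www.w3.org/2000/01/rdf-schemaClass
print(resolve(BASE, "../../1999/02/22-rdf-syntax-ns#"))   # http://www.w3.org/1999/02/22-rdf-syntax-ns#
print(resolve(BASE, "rdf-schema-more"))                   # http://www.w3.org/2000/01/rdf-schema-more
```

Under these rules `@prefix rdfs: <>` expands to the base without the trailing `#`, so `rdfs:Class` becomes an IRI with no separator before `Class`, while the `rdf:` and `owl:` prefixes resolve as intended because their references carry their own `#`.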

Unfortunately, I don't have time right now to look more deeply into whether your Turtle file or the N3.js implementation is correct, but I at least wanted to let you know about the issue.

@JJ-Author
Collaborator

JJ-Author commented Jun 5, 2024

Really interesting feedback, thank you very much.
We use the Raptor utility rapper for parsing, and it creates this kind of prefix preamble. Reusing the base IRI when defining the rdfs prefix does add more complexity than is actually needed, so for more reliable parsing I think it would be better not to have these relative IRIs in the prefix definitions. But even after fixing this, the first 3 triples were still left broken...
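
As an illustration of a preamble without relative IRIs, here is a minimal Python sketch that writes every prefix out in full and rejects anything that would still depend on `@base`. The `absolute_preamble` helper and the `NAMESPACES` table are hypothetical and written only for this example; this is not how rapper generates its output, and it does not address the `<>` subject and object IRIs in the first triples.

```python
from urllib.parse import urlsplit

# Intended namespaces for the preamble quoted in this thread, written out in full.
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def absolute_preamble(namespaces: dict) -> str:
    """Emit @prefix lines, refusing any IRI that is not absolute (hypothetical helper)."""
    lines = []
    for prefix, iri in namespaces.items():
        # An IRI without a scheme would have to be resolved against @base.
        if not urlsplit(iri).scheme:
            raise ValueError(f"prefix {prefix!r} has a relative IRI: {iri!r}")
        lines.append(f"@prefix {prefix}: <{iri}> .")
    return "\n".join(lines)


print(absolute_preamble(NAMESPACES))
```

With absolute prefix IRIs, the expansion of `rdfs:Class` no longer depends on how a parser treats the fragment of the base IRI.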

I tried it with a tool based on RDFLib (https://rdftools.ga.gov.au/convert) and it works as expected, but this also has the same issue.
I will have a deeper look again to see what is actually correct and what could be done about it.

Workaround at the moment:

  • use the parsed N-Triples files from Archivo instead, since they can be read by Turtle parsers as well

@JJ-Author JJ-Author changed the title The RDF Schema turtle seems to be erroneous Base Schema with fragment / hash leads to incorrect IRI resolution for some parsers Jun 5, 2024
@JJ-Author JJ-Author changed the title Base Schema with fragment / hash leads to incorrect IRI resolution for some parsers Base Schema with fragment / hash in parsed turtle files leads to incorrect IRI resolution for some parsers Jun 5, 2024
@Lenostatos
Author

Thank you very much for looking into this @JJ-Author ! And also for the tip with the .nt files. That really helps 😄
