Prefixes not used and full IRI displayed in generated data #233

Open
nicolastoira opened this issue Mar 14, 2024 · 1 comment
Labels
enhancement New feature or request question Further information is requested upstream Issue is related to another component then the RMLMapper

Comments

@nicolastoira
nicolastoira commented Mar 14, 2024

I'm converting JSON files to Turtle files with the RMLMapper. My mapping file declares a long list of prefixes that should be used in the generated output. In general this works: the generated data shows the IRIs in prefixed form. In some cases, however, the prefix is not used and the full IRI appears in the generated data. I was therefore wondering whether there is some internal logic that rejects certain prefixes but not others.

For example I have the following prefixes. The first one is correctly replaced while the second is not:

@prefix snomed: <http://snomed.info/id/> .
@prefix obi: <http://purl.obolibrary.org/obo/OBI_> .

Test data:

{
    "content": {
        "sphn:Assay": [
            {
                "sphn:hasCode": {
                    "termid": "OBI-0002188",
                    "iri": "http://purl.obolibrary.org/obo/OBI_0002188",
                    "sourceConceptID": "9a9e4310-8fb8-4bab-a875-4cd37cbd7025"
                }
            },
            {
                "sphn:hasCode": {
                    "termid": "SNOMED-CT-1149430001",
                    "iri": "http://snomed.info/id/1149430001",
                    "sourceConceptID": "9a9e4310-8fb8-4bab-a875-4cd37cbd7025"
                }
            }
        ]
    }
}

Generated data:

resource:PROVIDER-sphn-Assay-9a9e4310-8fb8-4bab-a875-4cd37cbd7025-sphn-Code-OBI-0002188
  a <http://purl.obolibrary.org/obo/OBI_0002188> .

resource:PROVIDER-sphn-Assay-9a9e4310-8fb8-4bab-a875-4cd37cbd7025-sphn-Code-SNOMED-CT-1149430001
  a snomed:1149430001 .

The mapping RML logic is the following:

:sphnAssay_sphnhasCode_rangesphnTerminology a rr:TriplesMap ;
    rml:logicalSource [ rml:iterator "$.content.sphn:Assay[*].sphn:hasCode" ;
            rml:referenceFormulation ql:JSONPath ;
            rml:source "patient_data_input.json" ] ;
    rr:predicateObjectMap [ rr:objectMap [ rml:reference "iri" ;
                    rr:termType rr:IRI ] ;
            rr:predicate rdf:type ] ;
    rr:subjectMap [ rr:template "resource:PROVIDER-sphn-Assay-{sourceConceptID}-sphn-Code-{termid}" ] .

As you can see, even if the prefix is defined in the RML mapping file, we get <http://purl.obolibrary.org/obo/OBI_0002188> while the expected result should be obi:0002188. If I modify the prefix to something like this @prefix obi: <http://purl.obolibrary.org/obo/OBI/> . and change the input data to "iri": "http://purl.obolibrary.org/obo/OBI/0002188" it works as expected.

Do you see any issues with the prefix definition or is there any logic in the RML mapper that blocks the correct replacement of the namespace prefix? Thank you.

@DylanVanAssche
Contributor

Hi!

Do you see any issues with the prefix definition or is there any logic in the RML mapper that blocks the correct replacement of the namespace prefix? Thank you.

With the RML mapping you provided, you seem to be using a Turtle shortcut inside an rr:template: "resource:PROVIDER-sphn-Assay-{sourceConceptID}-sphn-Code-{termid}". This won't work for other RDF serializations, because the template value is not a proper IRI.
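
A minimal sketch of a subject map whose template is an absolute IRI. The namespace http://example.org/resource/ and the mapping prefix `:` are placeholders, because the IRI bound to resource: is not shown in this issue; substitute your own.

```turtle
@prefix rr: <http://www.w3.org/ns/r2rml#> .
@prefix :   <http://example.org/mapping#> .   # hypothetical mapping namespace

# Only the subject map differs from the mapping in the issue.
# ASSUMPTION: http://example.org/resource/ stands in for whatever
# namespace the resource: prefix is bound to.
:sphnAssay_sphnhasCode_rangesphnTerminology rr:subjectMap [
    rr:template "http://example.org/resource/PROVIDER-sphn-Assay-{sourceConceptID}-sphn-Code-{termid}"
] .
```

With an absolute IRI in the template, the subject no longer depends on a Turtle prefix declaration being present in the output.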

we get <http://purl.obolibrary.org/obo/OBI_0002188> while the expected result should be obi:0002188.

The RDF library inside the RMLMapper generates Turtle in its own way and does not always use the shortcuts that the Turtle language makes available. Unfortunately, this comes down to the Turtle specification: it has no proper way to say 'use shortcut X' or 'expand Y'. It allows these shortcuts and leaves it up to the implementation to pick one.
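
To illustrate, the two statements below are different Turtle spellings of the same triple, and a serializer is free to emit either one. The ex: namespace and the ex:assay subject are hypothetical and only there to make the fragment complete.

```turtle
@prefix obi: <http://purl.obolibrary.org/obo/OBI_> .
@prefix ex:  <http://example.org/> .   # hypothetical namespace

# Abbreviated form, using the obi: prefix
ex:assay a obi:0002188 .

# Full form; parses to exactly the same triple as the line above
ex:assay a <http://purl.obolibrary.org/obo/OBI_0002188> .
```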

If I modify the prefix to something like this @prefix obi: <http://purl.obolibrary.org/obo/OBI/> . and change the input data to "iri": "http://purl.obolibrary.org/obo/OBI/0002188" it works as expected.

Sometimes the RDF library picks up the original prefixes from the mapping, but not always. It depends on how it resolves the RDF triples, which we have no control over. That said, I find this prefix a bit unusual:

@prefix obi: <http://purl.obolibrary.org/obo/OBI_> .

Here OBI_ is part of the prefix IRI itself, which is a bit unusual. Normally a prefix IRI ends with # or /.
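
As a sketch of the more conventional layout, a prefix that stops at the last / keeps OBI_ in the local name, so the input data can stay unchanged. The prefix name obo: and the ex: subject are assumptions. Whether the RMLMapper's serializer actually abbreviates with it is not confirmed in this thread; it is only consistent with the observation above that a prefix ending in / was picked up.

```turtle
@prefix obo: <http://purl.obolibrary.org/obo/> .
@prefix ex:  <http://example.org/> .   # hypothetical namespace

# Same IRI as http://purl.obolibrary.org/obo/OBI_0002188,
# written with a prefix that ends in "/"
ex:assay a obo:OBI_0002188 .
```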

@DylanVanAssche DylanVanAssche added enhancement New feature or request question Further information is requested upstream Issue is related to another component then the RMLMapper labels Mar 15, 2024