Misaligned reference links in full text #35
Comments
Hi @dennlinger, thanks for your bug report. We are already aware of this bug but have not been able to fix it yet (see openlegaldata/legal-reference-extraction#1). If the original text without any annotations would help you, we could provide it as an additional field in the API response. Best,
Hi Malte, The test cases provided in I think the feature is extremely helpful if working properly, and could potentially be extended, if you are willing to accept contributions on this issue. Best,
Contributions are always welcome! I'll try to update the API accordingly within the next week.
The decision content which is currently available via the API does not contain any annotations. Thus, it should not be affected by the reference extraction bug. The API serializer returns the For the UI, all annotations are added later (see https://github.com/openlegaldata/oldp/blob/master/oldp/apps/cases/models.py#L186-L209).
After running some tests (for example on this document), it seems the references are misaligned because of an HTML offset: special characters such as "ö" are replaced with HTML entities such as "&ouml;". The reference markers are positioned as if they applied to the plain text, without accounting for the extra characters these entities introduce, which results in the misalignment.
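To illustrate the offset problem described above, here is a minimal Python sketch. It is not the code used in oldp or legal-reference-extraction. The helper names (`escape_char`, `escape_with_offsets`, `map_span`), the sample sentence, and the escaping rule (numeric entities for all non-ASCII characters) are assumptions made for the example. The idea is to record, while escaping, where each plain-text character starts in the escaped string, and then translate reference offsets through that map.

```python
import html


def escape_char(ch: str) -> str:
    """Escape one character (assumed rule): &, <, > become named entities,
    any non-ASCII character becomes a numeric entity such as &#246;."""
    return html.escape(ch, quote=False).encode("ascii", "xmlcharrefreplace").decode("ascii")


def escape_with_offsets(text: str):
    """Return (escaped_text, starts), where starts[i] is the index in
    escaped_text at which plain character i begins. The final entry
    starts[len(text)] equals len(escaped_text), so end offsets map too."""
    starts, parts, pos = [], [], 0
    for ch in text:
        starts.append(pos)
        esc = escape_char(ch)
        parts.append(esc)
        pos += len(esc)
    starts.append(pos)
    return "".join(parts), starts


def map_span(starts, start: int, end: int):
    """Translate a plain-text span [start, end) into escaped-text offsets."""
    return starts[start], starts[end]


# Hypothetical example: a reference found at plain-text offsets 6..13.
plain = "Gemäß § 5 BGB gilt"
ref_start, ref_end = 6, 13

escaped, starts = escape_with_offsets(plain)

# Applying plain-text offsets directly to the escaped text cuts into entities.
naive = escaped[ref_start:ref_end]

# Mapping the offsets first recovers the intended span.
html_start, html_end = map_span(starts, ref_start, ref_end)
fixed = escaped[html_start:html_end]

print(repr(plain[ref_start:ref_end]))
print(repr(naive))
print(repr(html.unescape(fixed)))
```

Each escaped character before the reference shifts the marker further from its target, which would explain why documents with many umlauts or "§" signs drift the most. An alternative design is to insert the markers into the plain text first and escape afterwards, which avoids the offset map entirely.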
Hi @fchrubasik & @dennlinger, thanks again for your contribution! The last few months have been really busy over here, so I only managed to deploy your changes to production today. I'm really sorry for that! I'm currently reprocessing all our documents with the changes (that might take 10 hours or so). Did you end up doing anything with the citation data? Best,
Hi, thanks for incorporating the changes! Cheers,
Not sure where to follow up with this, but it seems the references are still misaligned on the live server. Did we miss anything in the original bugfix that might still cause this misalignment?
The case mentioned in the issue seems to have all references correct (https://de.openlegaldata.io/case/bag-2019-07-11-6-azr-4017). Do you have an example of references that are still misaligned?
I was specifically looking at the most recent "Urteil" at the time of writing (https://de.openlegaldata.io/case/bverwg-2020-08-06-6-b-1120). Great to see that the original issue is fixed, though!
OK. Then let's reopen this one.
For some of the decisions (e.g., this one), the references are not aligned at all with the corresponding occurrences in the text.
Is there any way to work with the data prior to the annotation (as it is available through the JSON), to potentially help with investigating this?