Fix generation of multiple bibliography sections #12309

Draft
wants to merge 3 commits into base: main

subhramit
Member

Fixes one part of the issue #12262
Also discussed in #11712 (comment)

Changes

Now, when a switch is made from one style "family" to another (CSL->JStyles or JStyles->CSL) in the same document where an existing bibliography of the old family is already present, the next entry cited in the document with the new style will no longer create a separate bibliography section (which led to two independent bibliography sections earlier).

That is, the bibliography section will also switch, and populate entries only pertaining to the currently selected family.

bib_sample.mp4

More details (user-facing):

Since the style parsing and reference marks of JStyles and CSL are different, only the entries cited with CSL will be present in the bibliography when CSL is selected, and only the entries cited with JStyle will be present when JStyle is selected. No old information is lost, provided the citations are not deleted from the text, so that they can be parsed again to regenerate the corresponding style's bibliography list.

More info (for developers):

The implementation uses the same text section name for both the CSL and JStyle bibliographies, so that their refresh functions can detect each other's section and clear it. This obviously hints that a lot of the implementation could be merged into a single class (which is the case with many aspects of JStyle and CSL - the OO backend could be unified to a large extent). However, since the code for the two families is written very differently, their classes are currently kept separate, with each family's classes grouped together for easier modification (at least from the perspective of CSL - which seems to be more important now).
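
A minimal sketch of the shared-section idea, assuming a toy in-memory document. `FakeDocument`, `refresh` and the value of `SHARED_SECTION_NAME` are hypothetical placeholders, not the names used in JabRef. Because both families write under one name, whichever refresh runs last replaces the other family's section instead of adding a second one.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a document: section name -> rendered bibliography lines.
class FakeDocument {
    final Map<String, List<String>> sections = new LinkedHashMap<>();
}

class BibliographySectionSketch {
    // Placeholder value; the point is only that both families use the same name.
    static final String SHARED_SECTION_NAME = "SHARED_BIB_SECTION";

    // Either family's refresh clears whatever lives under the shared name,
    // then repopulates it from scratch with its own rendered entries.
    static void refresh(FakeDocument doc, List<String> renderedEntries) {
        doc.sections.remove(SHARED_SECTION_NAME);
        doc.sections.put(SHARED_SECTION_NAME, List.copyOf(renderedEntries));
    }

    public static void main(String[] args) {
        FakeDocument doc = new FakeDocument();
        refresh(doc, List.of("[1] Entry rendered by the JStyle family"));
        refresh(doc, List.of("[1] Entry rendered by the CSL family"));
        // Only one bibliography section remains after the switch.
        System.out.println(doc.sections.size());
    }
}
```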

Mandatory checks

  • I own the copyright of the code submitted and I licence it under the MIT license
  • Change in CHANGELOG.md described in a way that is understandable for the average user (if change is visible to the user)
  • Tests created for changes (if applicable)
  • Manually tested changed features in running JabRef (always required)
  • Screenshots added in PR description (for UI changes)
  • Checked developer's documentation: Is the information available and up to date? If not, I outlined it in this pull request.
  • Checked documentation: Is the information available and up to date? If not, I created an issue at https://github.com/JabRef/user-documentation/issues or, even better, I submitted a pull request to the documentation repository.

@Siedlerchr
Member

This will just make existing CSL style usage in the document obsolete?

@subhramit
Member Author

subhramit commented Dec 17, 2024

This will just make existing CSL style usage in the document obsolete?

Unless the style is switched back to it. Then, the existing CSL citations will be parsed again, so not completely obsolete, but if never switched back - yes (since they have a different reference mark format).

Same is true for JStyle usage being turned obsolete.

@subhramit
Member Author

subhramit commented Dec 17, 2024

Unless the style is switched back to it. Then, the existing CSL citations will be parsed again, so not completely obsolete, but if never switched back - yes (since they have a different reference mark format).

This gives the idea - there is a way to take into account all the entries present - but for that we will need to change the reference mark format of jstyles so that they can be parsed together with CSL as well - so the bibliography section will have all references in the currently selected style, regardless of the citations having different style families.
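
A rough sketch of what "parsed together" could look like, assuming two made-up mark formats. The `CSLMARK_...` and `JSMARK_...` patterns below are illustrative placeholders and not JabRef's actual reference mark formats. The idea is only that one pass over all mark names yields one set of citation keys for a single bibliography.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UnifiedMarkParserSketch {
    // Both patterns are hypothetical; real formats would replace them.
    private static final Pattern CSL_LIKE = Pattern.compile("^CSLMARK_(\\S+) ID_\\d+$");
    private static final Pattern JSTYLE_LIKE = Pattern.compile("^JSMARK_\\d+_(\\S+)$");

    // Collects citation keys in document order, whichever family created the mark.
    static Set<String> citedKeys(List<String> markNames) {
        Set<String> keys = new LinkedHashSet<>();
        for (String name : markNames) {
            for (Pattern pattern : List.of(CSL_LIKE, JSTYLE_LIKE)) {
                Matcher matcher = pattern.matcher(name);
                if (matcher.matches()) {
                    keys.add(matcher.group(1));
                    break;
                }
            }
        }
        return keys;
    }

    public static void main(String[] args) {
        System.out.println(citedKeys(List.of(
                "CSLMARK_Knuth1984 ID_1",
                "JSMARK_0_Lamport1994",
                "some_unrelated_mark")));
    }
}
```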

@subhramit
Member Author

subhramit commented Dec 17, 2024

This gives the idea - there is a way to take into account all the entries present - but for that we will need to change the reference mark format of jstyles so that they can be parsed together with CSL as well - so the bibliography section will have all references in the currently selected style, regardless of the citations having different style families.

Further unification possible - on switching style families, the citations should also be modified into the current style (which currently happens "within" style families). This one's a good project, but depends on how useful this will be.

@subhramit
Member Author

subhramit commented Dec 17, 2024

Further unification possible - on switching style families, the citations should also be modified into the current style (which currently happens "within" style families). This one's a good project, but depends on how useful this will be.

Ref. for myself:
CSL:
image
JStyles:
image

@subhramit
Member Author

subhramit commented Dec 18, 2024

Decision on whether or not to merge this one remains for now.
From a user's perspective - what would be less irritating?

Two bibliography sections

Workaround - delete the old section, re-cite the old entries with the new style along with the new entries and refresh bibliography.

One bibliography section with only fresh entries

Workaround - re-cite the old entries with the new style along with the new entries, then refresh bibliography.

@koppor
Member

koppor commented Dec 18, 2024

Since the style parsing and reference marks of JStyles and CSL are different, only the entries cited with CSL will be present in the bibliography when CSL is selected, and only the entries cited with JStyle will be present when JStyle is selected. No old information is lost, provided the citations are not deleted from the text, so that they can be parsed again to regenerate the corresponding style's bibliography list.

Please don't. One bibliography. The style should not determine which entries are part of some bibliography. As a user, I choose a style, and that style should be used throughout the paper - especially the bibliography.

@koppor
Member

koppor commented Dec 18, 2024

Ref. for myself:

Is this the chance to convert all references to the Zotero format? Now that you are learning about identifiers in other programs, chances are high that another identifier format can be used?

@subhramit
Member Author

subhramit commented Dec 18, 2024

@koppor this will be a long one, read when you have time.

Current focus

One bibliography.

That would make the merging of this PR relevant. The issue that would remain is:

The style should not determine which entries are part of some bibliography. As a user, I choose a style, and that style should be used throughout the paper - especially the bibliography.

With only this PR, this would be considered the responsibility of a user to choose one style type/"family" (CSL or JStyle) uniformly throughout the paper (that is how these styles are supposed to be used anyway - even within a style family, the user should use one style consistently). But then one may argue that if that is properly done by the user, there will not be two bibliography sections in the first place.

If the user messes up and realizes that they had to use another style family (hopefully, the user detects such an obvious error very early in their work by looking at the citations or the bibliography), they have a simple enough workaround, as I mentioned above: re-cite the old entries with the new style too, then refresh the bibliography.

If we want JabRef to take care of it, we have to unify the reference mark format, and hence have to take a decision.

One option is to migrate CSL to the JStyle format. This is not too hard, as I haven't lost context of the classes I wrote: I will have to change the structure of assignment and parsing in CSLReferenceMark, CSLReferenceMarkManager, etc., and fix inter-dependencies that could possibly break how I finally edited numbers in the document once their positions are swapped or some of them deleted (subhramit#22, #11712). One thorough round of testing after that, and that would be it. Considering my other commitments, I think I would be able to complete this by the time 6.0-beta or 6.0 stable is released.
Pro - easier to do
Con - the current CSL format was closer to Zotero's (chosen for future cross-software compatibility. More on this below.)

The other option would be migrating JStyle to the CSL format. This would be hard and a lot more time-consuming, as the code there is very different, with hard-to-navigate logic and dependencies.
Pro - don't have to touch CSL, so the net format remains closer to Zotero for future extension.
Con - will be hard and take a lot more time.

The other part

Is this the chance to convert all references to the Zotero format? Now that you are learning about identifiers in other programs, chances are high that another identifier format can be used?

I learnt about identifiers in other programs when I was deciding what to choose for JabRef's CSL integration, back during GSoC itself. That is why the format we use for CSL is very similar to:
https://github.com/zotero/zotero-libreoffice-integration/blob/39a4b0586c110d9ed42561295bc36e4e0383c793/build/source/org/zotero/integration/ooo/comp/ReferenceMark.java#L301

There are differences, such as:

  1. The reference marks begin with ZOTERO_ITEM CSL_CITATION... instead of JR_... or JABREF_... (then how would Mendeley detect that? I don't know as of now - never got time to delve deep into that. Maybe their code allows parsing of references beginning with ZOTERO...? Or we may be misinterpreting something**).
  2. They draw from a string of pre-set characters to build a 10-letter random string, led by "RND...", at the end of the citation (https://github.com/zotero/zotero-libreoffice-integration/blob/39a4b0586c110d9ed42561295bc36e4e0383c793/build/source/org/zotero/integration/ooo/comp/Document.java#L928-L933),
    whereas we use an 8-letter CUID directly.
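
For illustration, the "RND" suffix scheme described in point 2 can be sketched as follows. This is not a copy of the linked Zotero code: the alphabet below and the names `ALPHABET` and `randomSuffix` are assumptions, and the linked source defines its own character set.

```java
import java.security.SecureRandom;

class RandomSuffixSketch {
    // Assumed alphabet for the sketch; the real pre-set characters live in the linked source.
    private static final String ALPHABET =
            "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    private static final SecureRandom RANDOM = new SecureRandom();

    // Builds "RND" followed by `length` characters drawn uniformly from ALPHABET.
    static String randomSuffix(int length) {
        StringBuilder suffix = new StringBuilder("RND");
        for (int i = 0; i < length; i++) {
            suffix.append(ALPHABET.charAt(RANDOM.nextInt(ALPHABET.length())));
        }
        return suffix.toString();
    }

    public static void main(String[] args) {
        // Prints "RND" followed by 10 random characters; the value differs on every run.
        System.out.println(randomSuffix(10));
    }
}
```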

But wait, these still seem pretty easy to change - where's the catch?

It is how we identify the entries. We use their citation keys. These are generated in a particular way, which can differ from software to software. In the case of Zotero, I haven't been able to expand the reference mark, but it appears to me as if Zotero doesn't use citation keys to identify entries:
image

They use a JSON format, with the first field being the "citationID", which, on inspection, cannot be a citation key as it begins with a small "w" (no field of the entry has that).
image

I tried searching the code for "citationId" but didn't find any results.
** I never asked for more clarification on the statement in the GSoC project - "Zotero and Mendeley can read each other's citations" - as I was running low on time. The statement can mean that they read each other's LibreOffice citations, or it could mean that they can read each other's .bib files, which doesn't have much to do with LibreOffice or reference marks (this seems too hard to match from the analysis above). So for entries of JabRef to be readable by other software, maybe we need to start at BibEntry.java, as @calixtus and @Siedlerchr once suggested. Taking our current active manpower into consideration, I would advise against touching such a core class before 6.0, as it could open up a Pandora's box full of unforeseen issues.

Maybe this can be a GSoC project this year?

@koppor
Member

koppor commented Dec 20, 2024

With only this PR, this would be considered the responsibility of a user to choose one style type/"family" (CSL or JStyle) uniformly throughout the paper (

I did not pay attention to the development of JStyle and GSoC, because I thought the following was obvious:

  1. I wrote a paper to conference A having style A
  2. I submit the paper
  3. Paper gets reviewed
  4. Paper gets rejected
  5. I address review comments
  6. I rewrite for conference B having style B.

In step 6, the switch from style A to style B is a one-line change (values are example values)

-\bibliographystyle{plain}
+\bibliographystyle{IEEEtranN}

This is the main use case! Everything else is an addon. An example of an addon is a bibliography per chapter.

The main use case has to work. Other use cases have lower priority.

By "work" I mean that it should be a single action of the user to change the style.

@subhramit
Member Author

subhramit commented Dec 20, 2024

The main use case has to work. Other use cases have lower priority.

Got it. Hadn't considered this. @ThiloteE had suggested that one would use LaTeX for writing papers, and LibreOffice for "less intense" use cases like a Bachelor's thesis or a report. But in case they use it for papers, what you mention is a valid use case.

Let me think this over. For editing citations, it's gonna be the way we ended up editing the numbers, but this time editing the style along with it as well. Editing the text was the most brain-wracking part, so I'll think of ways to extend that. We won't need to update the content of the reference mark this time, so it might be simpler than I am assuming right now.

For the bibliography, style A -> style B migration already works (since the bibliography is repopulated from scratch with each entry), except if one is a JStyle and the other is CSL. Fixing that would be linear effort (plan mentioned in my previous comment).

@subhramit
Member Author

subhramit commented Dec 20, 2024

@subhramit for myself -> have a global optional to detect changes in used styles. If new style detected on next cite, use updateText (currently for numbers) with a secondary condition to update older citations as well. Will need citationstylegenerator for each new style, for each old entry.
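
A minimal sketch of that note, under stated assumptions: `StyleSwitchSketch`, `Citation`, `cite` and the `renderer` function are hypothetical stand-ins (the renderer plays the role of a per-style citation generator) and do not correspond to existing JabRef classes or to updateText. The document remembers the last style used; when the next cite arrives with a different style, every older citation is re-rendered before the new one is added.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

class StyleSwitchSketch {
    // Hypothetical citation: the entry key plus the text currently shown in the document.
    static final class Citation {
        final String key;
        String text;

        Citation(String key) {
            this.key = key;
        }
    }

    private final List<Citation> citations = new ArrayList<>();
    private String lastStyle; // null until the first cite

    // renderer maps (style, key) -> citation text for that style.
    void cite(String key, String style, BiFunction<String, String, String> renderer) {
        boolean styleChanged = lastStyle != null && !lastStyle.equals(style);
        if (styleChanged) {
            // Re-render every older citation in the newly detected style.
            for (Citation old : citations) {
                old.text = renderer.apply(style, old.key);
            }
        }
        Citation added = new Citation(key);
        added.text = renderer.apply(style, key);
        citations.add(added);
        lastStyle = style;
    }

    List<String> texts() {
        List<String> result = new ArrayList<>();
        for (Citation citation : citations) {
            result.add(citation.text);
        }
        return result;
    }

    public static void main(String[] args) {
        StyleSwitchSketch doc = new StyleSwitchSketch();
        BiFunction<String, String, String> renderer = (style, key) -> style + ":" + key;
        doc.cite("Knuth1984", "styleA", renderer);
        doc.cite("Lamport1994", "styleB", renderer);
        System.out.println(doc.texts());
    }
}
```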

subhramit marked this pull request as draft on December 22, 2024, 08:21
@koppor
Member

koppor commented Dec 23, 2024

Got it. Hadn't considered this. @ThiloteE had suggested that one would use LaTeX for writing papers, and LibreOffice for "less intense" use cases like a Bachelor's thesis or a report. But in case they use it for papers, what you mention is a valid use case.

A Bachelor's thesis must also have the same citation style throughout the thesis.

Steps here are:

  1. Choose a style A
  2. Draft a thesis with citations
  3. Talk to supervisor
  4. Supervisor says: Use style B
