Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TaxonomyField update shouldn't use the database unless necessary #8704

Open
BenedekFarkas opened this issue Jun 27, 2023 · 3 comments
Open

Comments

@BenedekFarkas
Copy link
Member

When a content item is saved that has TaxonomyFields, TaxonomyFieldDriver will call TaxonomyFieldService.UpdateTerms (unless there's a validation error). In that function, all the TermContentItem entries associated with the field are dropped and then new ones are created based on the Terms selected in the editor. However, this happens even if the selected term(s) didn't change.

Suggested changes:

  1. TaxonomyField should store the list of its selected term IDs serialized in an infoset property. This would help, because when you already have the content item/part in memory, we don't need to go to the DB to find out which terms are selected.
  2. When UpdateTerms is called, the list of already selected terms IDs is compared to the would-be term IDs, then:
    1. Do nothing, because they contain the same IDs.
    2. Some were deselected, so remove those TermContentItem entries + remove them from the infoset list.
    3. Some new ones were selected, so add those TermContentItem entries + add them to the infoset list. (this can happen together with the previous one)

What do you think? CC: @MatteoPiovanelli-Laser @sebastienros

@MatteoPiovanelli-Laser
Copy link
Contributor

I'll do a bit of a stream of consciousness as I try to remember how TaxonomyFields worked, by putting here information I'm gleaning by re-reading the code.

Those IDs (for the terms selected in a TaxonomyField) are NOT already serialized (for either published and latest versions of a ContentItem) in the StringFieldIndexRecord, that is used for queries/projections. OnLoading the terms in the field are populated (in its LazyField) off the records for the TermsPart for the ContentItem. Those records are also used for projections. This is unlike, for example, ContentPickerField.

ITaxonomyService.UpdateTerms: given the list of terms we selected, we want this method to make sure that the TermsPart for the ContentItem we are processing ends up having, for a specific field, only TermContentItems for those terms. Moreover, this method should take all the terms that were unselected to fire off a processing task. Currently, that doesn't work properly, to be entirely honest:

  • Start with a ContentItem with selected Term 'A'. Publish it.
  • Change selected terms (e.g. unselect 'A' and select 'B'). Save the content as a Draft (don't publish it).
    The records are updated anyway, and when displaying the content it's going to show the information for the Draft TaxonomyField.

It's possible that one way to fix this would be to make the TermsPartRecord a ContentPartVersionRecord. That would also take care of the need to keep track of the changes to the list of terms in the UpdateTerms method, since we'd have new records for the new version. This wouldn't be adding more stuff to the DB than it is now on update, but it would not delete TermContentItem records attached to previous versions: fewer operations, but in the end a larger table.
It would complicate the step done OnPublished to call the task that counts selected ContentItems for each term, but only slightly: we would compare the current version with the previous one to also account for unselected terms, rather than have a separate invocation of the task in the UpdateTerms method.
That "new" versioned part could on principle also have the Ids in its infoset, but it would have to query for all corresponding records anyway, and we wouldn't need to compare the list of Ids to a previous version.

@BenedekFarkas
Copy link
Member Author

BenedekFarkas commented Jun 29, 2023

I wasn't really considering versioning, but I agree that it's weird that saving a draft item with different terms selected is immediately reflected in the published version as well.

My immediate concern here is that DB changes happen even when they aren't needed: When you save a Taxonomy Field without changes, the DB operations play out as if you unselected all the terms, saved it and then re-selected the same terms and saved it again, which is wasteful.

@MatteoPiovanelli-Laser
Copy link
Contributor

Yes. On the other hand, if we moved to a versioned record, that's sort of the default behaviour: the records for the new version, if no change has happened, end up being duplicates of the previous ones (except for the relevant Ids). But we would not be deleting a record just to add it back.

@sebastienros sebastienros added this to the Orchard Future Versions milestone Jun 29, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants