Allow multiple collections #376

Open
gforcada opened this issue Jun 10, 2024 · 2 comments

Comments

@gforcada
Member

Solr is great, but it has a few downsides:

  • if you upgrade to a new version: you must reindex all your content
  • if you change the schema: you must reindex all your content
  • ...

... and that hurts especially when reindexing the complete website takes a sizeable amount of time (for us, around 24 hours).

💡 One mitigation strategy we have been using is to make the changes in a non-production environment and, as soon as a critical amount of content has been reindexed, move the Solr data from non-production to production and finish the reindexing there.

Another strategy that I read about somewhere (probably in the Solr docs) is to configure a second, parallel collection, do the full reindex there (while the existing collection is still being used), and once the reindexing has caught up, switch them over ✨
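To make the switch-over step concrete, here is a minimal sketch. It assumes Solr runs in SolrCloud mode and that clients query an alias rather than a collection name directly; the alias `plone` and the collection `plone_v2` are made-up names for illustration. It only builds the Collections API `CREATEALIAS` request URL, which repoints an existing alias when called again with a new target. Nothing here is part of collective.solr.

```python
from urllib.parse import urlencode


def build_alias_swap_url(solr_base_url, alias, target_collection):
    """Return the Collections API URL that (re)points `alias` at `target_collection`.

    Assumption: SolrCloud mode, where CREATEALIAS on an existing alias
    replaces its target. The names passed in are hypothetical.
    """
    params = urlencode(
        {
            "action": "CREATEALIAS",
            "name": alias,
            "collections": target_collection,
        }
    )
    return f"{solr_base_url.rstrip('/')}/admin/collections?{params}"


# Hypothetical names: "plone" is the alias clients query,
# "plone_v2" is the freshly reindexed collection.
url = build_alias_swap_url("http://localhost:8983/solr", "plone", "plone_v2")
print(url)
```

Sending a GET request to that URL after the new collection has caught up would perform the switch. On a standalone (non-cloud) Solr, the CoreAdmin `SWAP` action plays a similar role for cores.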

Would that be something that could be done within collective.solr? 🤔

@davisagli
Member

@gforcada I was thinking about the same thing, but haven't had a chance to work on it. I think a key thing to solve is making sure that the indexing of the new collection has a way to catch up with changes to any documents that are modified during the reindex process.
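One possible shape for that catch-up step, as a rough sketch only: while the full reindex runs, every change is also recorded in a small log keyed by document id, and the log is replayed against the new collection until it is empty. `CatchUpLog`, `catch_up`, and the `index_doc`/`delete_doc` callables are hypothetical names, not collective.solr APIs, and the sketch is not thread-safe or persistent.

```python
class CatchUpLog:
    """Remember which documents changed while a full reindex is running."""

    def __init__(self):
        # uid -> "index" or "delete"; the last recorded operation wins.
        self._pending = {}

    def record(self, uid, operation):
        if operation not in ("index", "delete"):
            raise ValueError(f"unknown operation: {operation!r}")
        self._pending[uid] = operation

    def drain(self):
        """Return all pending changes and start a fresh, empty log."""
        pending, self._pending = self._pending, {}
        return pending


def catch_up(log, index_doc, delete_doc):
    """Replay pending changes until none are left; return how many were replayed."""
    replayed = 0
    while True:
        batch = log.drain()
        if not batch:
            return replayed
        for uid, operation in batch.items():
            if operation == "index":
                index_doc(uid)
            else:
                delete_doc(uid)
            replayed += 1


# Usage with stand-in callables that just collect the uids they receive.
log = CatchUpLog()
log.record("doc-1", "index")
log.record("doc-2", "index")
log.record("doc-2", "delete")  # later change to the same document wins

indexed, deleted = [], []
count = catch_up(log, indexed.append, deleted.append)
```

The loop drains repeatedly because new changes can arrive while a batch is being replayed. The switch to the new collection would only happen once a drain comes back empty.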

@gforcada
Member Author

😖 sorry, way too many things on my plate as of late 🙃

On second thought, the two-collection solution does not fix the first problem, upgrading to a new version, as you cannot have two different Solr versions on the same server...

So, would allowing multiple Solr instances to be configured be the solution here? 🤔

We are probably approaching this the wrong way; Solr itself must have some tooling around that...
