doubt about mangleSearchableText/isSimpleSearch functions #181
Comments
I’ve tried to dig into the history of this. IIRC I introduced this as searches for something like This change was a workaround so that single words containing numbers weren’t considered simple anymore and, as a result, didn’t get the automatic wildcard treatment. Instead they were used literally as-is. At least at the time of this change, a Solr query for Looking at the history of the default schema generated by collective.recipe.solr, I made some related changes a couple of months after this workaround. Of specific note are collective/collective.recipe.solrinstance@a8361b0#diff-e9704fdb9716ae57883502dd0b393d72 and collective/collective.recipe.solrinstance@5a57413#diff-e9704fdb9716ae57883502dd0b393d72. Before those changes the text field used the WordDelimiterFilterFactory without specifying a value for its splitOnNumerics attribute. The default value for this is 1, meaning true. This is what led to the splitting of Plone5 into two separate terms. After those two commits, the WordDelimiterFilterFactory is used again, but with an explicit splitOnNumerics=0, meaning it no longer splits on numbers.

With all this said, I think the real fix here was the schema change. The workaround in isSimpleSearch can probably be removed again. But I’m not involved in this project anymore, so I can’t make a decision on that.

As a side note, the automatic wildcard treatment for simple searches was always questionable. It was behavior we preserved from the old ZCTextIndex implementation. But I cannot say whether it actually matches user expectations, or whether there is a better way to deal with this in modern Solr.
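To make the splitOnNumerics effect described above concrete, here is a toy Python sketch. It is not Solr's WordDelimiterFilterFactory and does not reproduce its other rules; the function name `split_on_numerics` and its `split` flag are made up for illustration, and the sketch only mimics breaking a token at letter/digit boundaries.

```python
import re


def split_on_numerics(token, split=True):
    """Toy stand-in for splitting a token at letter/digit boundaries.

    Hypothetical helper for illustration only; the real Solr filter
    does considerably more than this.
    """
    if not split:
        # Roughly the splitOnNumerics=0 case: the token stays in one piece.
        return [token]
    # Roughly the splitOnNumerics=1 case: alternate runs of digits and non-digits.
    return re.findall(r"\d+|\D+", token)


print(split_on_numerics("Plone5"))               # one term per run
print(split_on_numerics("Plone5", split=False))  # a single term
```

With splitting enabled, "Plone5" ends up as two terms, which is the situation the workaround in isSimpleSearch was reacting to.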
@tisto what is your opinion on this?
@mamico my feeling is that this is something we could solve in a better way at the Solr level. However, I currently don't really have time to dig deeper into that.
here:
https://github.com/collective/collective.solr/blob/master/src/collective/solr/mangler.py#L74
mangleSearchableText returns the raw SearchableText value if isSimpleSearch is not True, which is the case when the value ends with a digit:
https://github.com/collective/collective.solr/blob/master/src/collective/solr/utils.py#L116
So if you search for "something about Plone5", the "mangle" machinery behaves very differently (in a wrong way IMHO) than it does for "something about Plone".
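A simplified Python sketch of the behavior described above, to show the asymmetry. This is not the actual collective.solr code: `is_simple_search` and `mangle_searchable_text` are hypothetical stand-ins, the "ends with a digit" rule is taken from the description in this issue, and appending `*` to each word is only an assumed approximation of the automatic wildcard treatment.

```python
import re

_ENDS_WITH_DIGIT = re.compile(r"\d$")
_WORDS_ONLY = re.compile(r"^[\w\s]+$", re.UNICODE)


def is_simple_search(term):
    """Hypothetical check: plain words only, and not ending with a digit."""
    term = term.strip()
    if not term or _ENDS_WITH_DIGIT.search(term):
        return False
    return bool(_WORDS_ONLY.match(term))


def mangle_searchable_text(term):
    """Hypothetical mangler: wildcard simple searches, pass others through."""
    if not is_simple_search(term):
        # Non-simple searches are returned unchanged (used literally).
        return term
    # Assumed wildcard treatment: append '*' to every word.
    return " ".join(word + "*" for word in term.split())


print(mangle_searchable_text("something about Plone"))
print(mangle_searchable_text("something about Plone5"))
```

Under these assumptions the first query gets a wildcard on every word, while the second is passed through unchanged just because of the trailing digit.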