Sort by a combination of fields #1802
Comments
Multiple sort keys come in handy when some of the document hits have missing or equal sort values for earlier keys. Here is an example test case that could help.
This doesn't seem to be quite what I was looking for. As far as I can tell, it sorts based only on age, whereas what I was looking for is something like sorting by the product of the relevance score and a field in the document. FWIW, I finally came up with this, which seems to work:

// UpdateVisitor keeps the longest term seen for each field, which should be
// the full-precision (shift 0) prefix-coded value.
func (so *RankedSort) UpdateVisitor(field string, term []byte) {
	switch field {
	case "pRank":
		if len(term) > len(so.pRank) {
			so.pRank = make([]byte, len(term))
			copy(so.pRank, term)
		}
	case "hRank":
		if len(term) > len(so.hRank) {
			so.hRank = make([]byte, len(term))
			copy(so.hRank, term)
		}
	}
}

// Value decodes both ranks, resets the buffers for the next document, and
// returns the combined score as a sortable prefix-coded string.
func (so *RankedSort) Value(a *search.DocumentMatch) string {
	// Decoding errors are ignored, so a missing field decodes as 0.
	prp, _ := numeric.PrefixCoded(so.pRank).Int64()
	// Int64ToFloat64 is the inverse of Float64ToInt64 and also handles negatives.
	pr := numeric.Int64ToFloat64(prp)
	hrp, _ := numeric.PrefixCoded(so.hRank).Int64()
	hr := numeric.Int64ToFloat64(hrp)
	so.pRank = so.pRank[:0]
	so.hRank = so.hRank[:0]
	score := numeric.Float64ToInt64((a.Score + 1) * (pr + 1) * (hr + 1))
	return string(numeric.MustNewPrefixCodedInt64(score, 0))
}

(I'm pretty sure the way I take the absolute value of the score and multiply it by other values is technically incorrect, since the score does not have any bounds that I'm aware of; I would love any suggestions on that front too.)
Ah, I see. For custom sorting, we've provided an API on the SearchRequest that you could leverage. I suppose you're registering your …
The internal score determination is based on the tf-idf algorithm. Boosting is the only way we allow users to adjust or influence the score generated for your document hits.
I'm using SortByCustom. I didn't know about SetSortFunc. It looks like it's more concerned with the implementation of the sort than with scoring, or did I get that wrong? Are you saying that what I'm doing is borderline unsupported? Boosting is not going to work for my use case, because I want the scores to come partially from relevance and partially from the rank of each item. This is a small search engine, and pRank is actually PageRank; the score*rank multiplication was the best I could come up with to achieve what I wanted. It does seem to be mostly working. I just don't like that I'm multiplying by the score, whose dimensions and bounds I don't know, and indeed I think I read somewhere that it's not even meant to be comparable between different searches.
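One possible way to address the unbounded-score concern is to squash the score into a fixed range before multiplying, so that no single factor can dominate. The sketch below is only an illustration under stated assumptions: `squash` and `combined` are hypothetical helpers, not part of bleve, the `s/(s+1)` mapping is one arbitrary choice among many, and the two rank values are assumed to already lie in [0, 1].

```go
package main

import "fmt"

// squash maps a non-negative, unbounded score into [0, 1).
// Hypothetical helper: the s/(s+1) curve is an assumption, not a bleve feature.
func squash(score float64) float64 {
	if score < 0 {
		score = 0 // treat negative scores as "no relevance"
	}
	return score / (score + 1)
}

// combined mirrors the (score+1)*(pRank+1)*(hRank+1) formula from the thread,
// but with the score squashed first. Ranks are assumed to be in [0, 1].
func combined(score, pRank, hRank float64) float64 {
	return (squash(score) + 1) * (pRank + 1) * (hRank + 1)
}

func main() {
	// A very large raw score can no longer swamp the rank factors:
	// every factor stays within [1, 2].
	fmt.Println(combined(0.3, 0.9, 0.5))
	fmt.Println(combined(1e6, 0.1, 0.1))
}
```

The squashing keeps the ordering of scores within a single search, but it does not make scores from different searches comparable.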
I'm trying to implement custom sorting based on a combination of _score and two other numeric fields. I've looked at the implementation of SortField and the like but I'm still not quite sure how to proceed.
What I've gathered so far is that I would need to extract the values from the terms provided to UpdateVisitor, convert them to a number and store them, and then in the Value function combine them with score as I wish and then encode them back somehow.
I haven't been able to figure out how the encoding/decoding part works. Is there a resource or an example you can point me to?
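As a rough illustration of the encode/decode idea asked about above, the following standard-library sketch maps a float64 to an int64 whose integer order matches the float order, and then to a byte string whose lexicographic order matches too. It is not bleve's actual prefix-coded format: the function names and the fixed 8-byte big-endian layout are assumptions made for the example, and only the sign-flip trick is meant to carry over.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// sortableInt64 maps a float64 to an int64 with the same ordering
// (hypothetical helper; NaN is not handled).
func sortableInt64(f float64) int64 {
	bits := int64(math.Float64bits(f))
	if bits < 0 {
		bits ^= 0x7fffffffffffffff // reverse the order of negative values
	}
	return bits
}

// fromSortableInt64 is the inverse of sortableInt64.
func fromSortableInt64(i int64) float64 {
	if i < 0 {
		i ^= 0x7fffffffffffffff
	}
	return math.Float64frombits(uint64(i))
}

// encodeKey returns an 8-byte string that compares bytewise in numeric order.
// Flipping the sign bit places negative values before positive ones.
func encodeKey(f float64) string {
	var buf [8]byte
	binary.BigEndian.PutUint64(buf[:], uint64(sortableInt64(f))^(1<<63))
	return string(buf[:])
}

// decodeKey reverses encodeKey; it expects exactly the 8 bytes encodeKey wrote.
func decodeKey(s string) float64 {
	u := binary.BigEndian.Uint64([]byte(s)) ^ (1 << 63)
	return fromSortableInt64(int64(u))
}

func main() {
	for _, f := range []float64{-2, -1, 0.5, 3} {
		key := encodeKey(f)
		fmt.Printf("%v -> %x -> %v\n", f, key, decodeKey(key))
	}
}
```

In a custom sort, the decode step would correspond to what happens with the terms received in UpdateVisitor, and the encode step to the string returned from Value.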