html search error #1894

Open
mywander opened this issue Oct 26, 2023 · 1 comment

@mywander

I am indexing HTML content, but search returns unexpected results:
`msg := []struct {
	Id   string
	Body string
}{
	{Id: "1", Body: "You trusted bbs server all proxies"},
	{Id: "2", Body: "this is NOT bbs server safe"},
}

tmpIndexPath := createTmpIndexPath(t)
defer cleanupTmpIndexPath(t, tmpIndexPath)
idxMapping := bleve.NewIndexMapping()
idxMapping.DefaultAnalyzer = "en"
if err := idxMapping.AddCustomAnalyzer("custom-html", map[string]interface{}{
	"type":         custom.Name,
	"tokenizer":    "unicode",
	"char_filters": []interface{}{html_char_filter.Name},
}); err != nil {
	t.Fatal(err)
}

fm := mapping.NewTextFieldMapping()
fm.Analyzer = "custom-html"

idxMapping.DefaultMapping.AddFieldMappingsAt("Body", fm)
idx, err := bleve.New(tmpIndexPath, idxMapping)
if err != nil {
	t.Fatal(err)
}

defer func() {
	err = idx.Close()
	if err != nil {
		t.Fatal(err)
	}
}()
for _, v := range msg {
	idx.Index(v.Id, v)
}

keywords := []string{"bbs", "server"}
for _, v := range keywords {
	query := bleve.NewQueryStringQuery(v)
	searchRequest := bleve.NewSearchRequest(query)
	searchResult, err := idx.Search(searchRequest)
	if err != nil {
		panic(err)
	}
	if searchResult.Hits.Len() > 0 {
		fmt.Println("Search ", v, " found ", searchResult.Hits.Len())
	} else {
		fmt.Println("Search ", v, " Not found!")
	}
}`

Searching for "bbs" returns no results, but searching for "server" returns 2 results.

@CascadingRadium
Member

Hello,

The problem occurs because your query string does not name a field, so it searches the default field (`_all`). The analyzer applied to the query text is therefore the default analyzer, which you have set to "en", rather than the "custom-html" analyzer that was used to index "Body". To resolve this, include the "Body" field in your query string; the query will then be analyzed with the custom analyzer you configured for "Body" in the default mapping.
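
To illustrate why such a mismatch can hide "bbs" while still finding "server", here is a minimal standard-library sketch. It does not use bleve and is not bleve's implementation. The `<p>` markup in the sample text, the regular expression, and the helpers `indexAnalyze`, `queryAnalyze`, `naiveStem`, and `matches` are all assumptions made for illustration. The toy stemmer only stands in for the stemming an English analyzer is presumed to perform.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// tagPattern is a deliberately simple stand-in for an HTML character filter
// (assumption: real filters handle far more cases than this).
var tagPattern = regexp.MustCompile(`<[^>]*>`)

// indexAnalyze imitates an index-time analyzer that strips tags and splits
// on whitespace, with no lowercasing and no stemming.
func indexAnalyze(text string) []string {
	return strings.Fields(tagPattern.ReplaceAllString(text, " "))
}

// naiveStem is a toy stemmer: it drops a single trailing "s" (but not "ss").
func naiveStem(term string) string {
	if len(term) > 2 && strings.HasSuffix(term, "s") && !strings.HasSuffix(term, "ss") {
		return term[:len(term)-1]
	}
	return term
}

// queryAnalyze imitates a query-time analyzer that lowercases and stems.
func queryAnalyze(text string) []string {
	var out []string
	for _, w := range strings.Fields(strings.ToLower(text)) {
		out = append(out, naiveStem(w))
	}
	return out
}

// matches reports whether any analyzed query term equals an indexed term.
func matches(indexed []string, query string) bool {
	set := make(map[string]bool, len(indexed))
	for _, t := range indexed {
		set[t] = true
	}
	for _, q := range queryAnalyze(query) {
		if set[q] {
			return true
		}
	}
	return false
}

func main() {
	indexed := indexAnalyze("<p>You trusted bbs server all proxies</p>")
	for _, kw := range []string{"bbs", "server"} {
		fmt.Printf("%s: query terms %v, match=%v\n", kw, queryAnalyze(kw), matches(indexed, kw))
	}
}
```

In this toy model the query term "bbs" becomes "bb", which is not among the indexed terms, while "server" is left unchanged and matches.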

Here's the updated code:

tmpIndexPath := createTmpIndexPath(t)
defer cleanupTmpIndexPath(t, tmpIndexPath)
idxMapping := NewIndexMapping()
idxMapping.DefaultAnalyzer = "en"
if err := idxMapping.AddCustomAnalyzer("custom-html", map[string]interface{}{
	"type":         custom.Name,
	"tokenizer":    "unicode",
	"char_filters": []interface{}{html_char_filter.Name},
}); err != nil {
	t.Fatal(err)
}
msg := []struct {
	Id   string
	Body string
}{
	{
		Id:   "1",
		Body: "You trusted bbs server all proxies",
	},
	{
		Id:   "2",
		Body: "this is NOT bbs server safe",
	},
}

fm := mapping.NewTextFieldMapping()
fm.Analyzer = "custom-html"

idxMapping.DefaultMapping.AddFieldMappingsAt("Body", fm)
idx, err := New(tmpIndexPath, idxMapping)
if err != nil {
	t.Fatal(err)
}

defer func() {
	err = idx.Close()
	if err != nil {
		t.Fatal(err)
	}
}()
for _, v := range msg {
	// Fail the test if a document cannot be indexed.
	if err := idx.Index(v.Id, v); err != nil {
		t.Fatal(err)
	}
}

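// Prefix each keyword with the field name so the query is analyzed
// with the analyzer configured for "Body" instead of the default one.
keywords := []string{"Body:bbs", "Body:server"}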
for _, v := range keywords {
	query := NewQueryStringQuery(v)
	searchRequest := NewSearchRequest(query)
	searchResult, err := idx.Search(searchRequest)
	if err != nil {
		t.Fatal(err)
	}
	if searchResult.Hits.Len() > 0 {
		fmt.Println("Search "+v+" found ", searchResult.Hits.Len())
	} else {
		fmt.Println("Search " + v + " Not found!")
	}
}
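
The effect of the `Body:` prefix can be sketched as a simple lookup: a query without a field falls back to `_all` and the default analyzer, while a field-scoped query picks up that field's analyzer. This is a hedged, standard-library illustration only. The `fieldAnalyzers` map and the helpers `splitQuery` and `analyzerFor` are hypothetical names, not bleve APIs, and bleve's actual query string syntax supports much more than a single `field:term` pair.

```go
package main

import (
	"fmt"
	"strings"
)

// fieldAnalyzers mirrors the mapping in the snippet above: only "Body"
// has an explicitly configured analyzer (hypothetical representation).
var fieldAnalyzers = map[string]string{"Body": "custom-html"}

// defaultAnalyzer mirrors idxMapping.DefaultAnalyzer in the snippet above.
const defaultAnalyzer = "en"

// splitQuery splits "Field:term" into its parts. A query without a field
// prefix is treated as targeting the "_all" field.
func splitQuery(q string) (field, term string) {
	if i := strings.Index(q, ":"); i >= 0 {
		return q[:i], q[i+1:]
	}
	return "_all", q
}

// analyzerFor returns the analyzer name configured for a field, falling
// back to the default analyzer when the field has none.
func analyzerFor(field string) string {
	if name, ok := fieldAnalyzers[field]; ok {
		return name
	}
	return defaultAnalyzer
}

func main() {
	for _, q := range []string{"bbs", "Body:bbs"} {
		field, term := splitQuery(q)
		fmt.Printf("%q -> field=%s term=%s analyzer=%s\n", q, field, term, analyzerFor(field))
	}
}
```

Under these assumptions, "bbs" resolves to the `_all` field with the "en" analyzer, and "Body:bbs" resolves to the "Body" field with the "custom-html" analyzer.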
