Searching all-English content returns no data #72

Open
akincol4544 opened this issue Feb 14, 2023 · 3 comments
Comments

@akincol4544

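I added this content: "the Hypertext Transfer Protocol HTTP is an application layer protocol in the Internet protocol". Searching for "layer" returns no data, and searching for "protocol" returns nothing either. Debug is on by default, and the console shows no errors.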

@xiaoyaoking

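I'd give up on it. I tried it, and the tokenizer this search uses is not great. Many words never get split out as separate tokens, so you cannot find them by searching.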

@jingkang99

image
The spaces in the English text are ignored, so it is not tokenized correctly.

@jingkang99

The tokenizer is jieba, so this can be debugged.
In tokenizer.go:

// Remove all punctuation
text = utils.RemovePunctuation(text)

// Remove all spaces
//text = utils.RemoveSpace(text)
fmt.Println("----------------> remove space\n")

Removing the spaces is what breaks English tokenization.
With that line commented out, it tokenizes correctly.
image
