[Bug]: The TEXT_MATCH method did not achieve the expected results as described in the documentation. #39388

Open
1 task done
cnzackliu opened this issue Jan 17, 2025 · 2 comments
Assignees
Labels
kind/bug Issues or changes related a bug triage/needs-information Indicates an issue needs more information in order to work on it.

Comments

@cnzackliu

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: 2.5.0
- Deployment mode(standalone or cluster): standalone
- MQ type(rocksmq, pulsar or kafka): default   
- SDK version(e.g. pymilvus v2.0.0rc2): 2.5.3
- OS(Ubuntu or CentOS): Rocky Linux
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

The code shown in the documentation

Use text match
Image

The code I wrote

    # Build one TEXT_MATCH clause per keyword and join them with "and".
    query_parts = []
    for keyword in keywords:
        query_parts.append(f"TEXT_MATCH(text, '{keyword}')")
    filter_expr = ' and '.join(query_parts)  # renamed to avoid shadowing the built-in filter()
    print(filter_expr)
    query_vectors = model.encode([keywords])
    print(f'search {milvus_collection_name}')
    res = client.search(
        collection_name=milvus_collection_name,
        data=query_vectors,  # Query vectors
        anns_field='vector', # Vector field name
        # limit=10,  # number of returned entities
        filter=filter_expr,  # scalar filter applied alongside the vector search
        output_fields=["text"],  # specifies fields to be returned
        search_params={'params':{'radius':0.8}}
    )

My code result

I want to search for documents that contain both "2栋-仓库213" and "乙方租客".
However, the search also returned content I did not want (for example, "11栋-108A").

Image

Is this a bug? If not, how should I deal with it?

Expected Behavior

No response

Steps To Reproduce

Milvus Log

No response

Anything else?

No response

@cnzackliu cnzackliu added kind/bug Issues or changes related a bug needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jan 17, 2025
@yanliang567
Contributor

@cnzackliu I think this is expected: the search text "2栋-仓库213" is parsed into ["2", "栋", "仓库", "213"], so any text that matches any one of the parsed tokens is returned. I guess you want phrase match, which is on the roadmap for the next release. Please stay tuned.

/assign @cnzackliu
/unassign
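
The behavior described in this reply can be illustrated with a small pure-Python simulation. This is only a sketch and does not call Milvus: `toy_tokenize`, `matches_any`, and `matches_all` are hypothetical helpers, and the tokenizer is a crude stand-in (digit runs, Latin-letter runs, single CJK characters) rather than the analyzer Milvus actually uses, which would keep a word such as "仓库" together. The point is only that with "match any token" semantics, a shared token such as "栋" is enough for "11栋-108A" to be returned.

```python
import re

# Hypothetical stand-in tokenizer, NOT the Milvus analyzer:
# digit runs, Latin-letter runs, and single CJK characters become tokens.
TOKEN_RE = re.compile(r"\d+|[A-Za-z]+|[\u4e00-\u9fff]")


def toy_tokenize(text):
    """Split text into tokens, dropping punctuation such as '-'."""
    return TOKEN_RE.findall(text)


def matches_any(doc, query):
    """OR semantics: the document matches if it shares at least one token with the query."""
    doc_tokens = set(toy_tokenize(doc))
    return any(tok in doc_tokens for tok in toy_tokenize(query))


def matches_all(doc, query):
    """Stricter variant: every query token must appear in the document."""
    doc_tokens = set(toy_tokenize(doc))
    return all(tok in doc_tokens for tok in toy_tokenize(query))


query = "2栋-仓库213"
unwanted_doc = "11栋-108A"

print(toy_tokenize(query))
print(toy_tokenize(unwanted_doc))
print(matches_any(unwanted_doc, query))  # shared token "栋" makes this a match
print(matches_all(unwanted_doc, query))
```

Note that even the stricter "all tokens" variant ignores token order and adjacency, so it is still weaker than the phrase match mentioned above.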

@yanliang567 yanliang567 added triage/needs-information Indicates an issue needs more information in order to work on it. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Jan 17, 2025
@cnzackliu
Author


I have already processed the string and generated a new expression, which is the one on the second line of the picture: TEXT_MATCH(text, "2栋-仓库231") and TEXT_MATCH(text, "租客") and TEXT_MATCH(text, "乙方"). Will it still be split into ["2", "栋", "仓库", "213"] like this?
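
If each TEXT_MATCH clause tokenizes its own query string, as the earlier reply from @yanliang567 suggests, then joining clauses with "and" only requires that every clause matches at least one of its tokens. Until phrase match is available, one possible workaround is sketched below under stated assumptions: `build_all_tokens_filter` and `keep_exact_substrings` are hypothetical helpers, the token list is copied from the reply rather than produced by the real analyzer, and `hits` is a simplified list of dicts, not the actual pymilvus result structure. The idea is to narrow the search by requiring every token, then check the exact strings on the client side.

```python
def build_all_tokens_filter(field, tokens):
    """Build a filter with one single-token TEXT_MATCH clause per token, joined with 'and'."""
    clauses = [f"TEXT_MATCH({field}, '{tok}')" for tok in tokens]
    return " and ".join(clauses)


def keep_exact_substrings(hits, field, phrases):
    """Client-side post-filter: keep hits whose text contains every phrase verbatim."""
    return [hit for hit in hits if all(p in hit[field] for p in phrases)]


# Tokens as listed in the reply; the real analyzer output may differ.
tokens = ["2", "栋", "仓库", "213"]
filter_expr = build_all_tokens_filter("text", tokens)
print(filter_expr)

# Simplified, made-up hits for illustration only.
hits = [
    {"text": "2栋-仓库213 乙方租客"},
    {"text": "11栋-108A 乙方租客"},
]
kept = keep_exact_substrings(hits, "text", ["2栋-仓库213", "乙方租客"])
print(kept)
```

The post-filter runs after the search returns, so it can leave fewer results than the requested limit; whether that trade-off is acceptable depends on the use case.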

2 participants