<Community>: Add Initial Support for TiDB Vector Store #15796

Merged (34 commits) on Mar 8, 2024

Contributor

@IANTHEREAL IANTHEREAL commented Jan 10, 2024

This pull request introduces initial support for the TiDB vector store. The current version is basic, laying the foundation for the vector store integration. While this implementation provides the essential features, we plan to expand and improve the TiDB vector store support with additional enhancements in future updates.

Upcoming Enhancements:

  • Support for Vector Index Creation: To enhance the efficiency and performance of the vector store.
  • Support for max marginal relevance search.
  • Customized Table Structure Support: Recognizing the need for flexibility, we plan for more tailored and efficient data store solutions.
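
For readers unfamiliar with the max marginal relevance (MMR) search listed above, here is a minimal, library-independent sketch of the selection rule. It is not part of this PR and is not the TiDBVectorStore API; the names `cosine_similarity` and `mmr_select` and the `lambda_mult` parameter are assumptions chosen for illustration.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def mmr_select(
    query_vec: Sequence[float],
    candidate_vecs: Sequence[Sequence[float]],
    k: int = 2,
    lambda_mult: float = 0.5,
) -> List[int]:
    """Greedily pick up to k candidate indices (hypothetical helper).

    Each step maximizes
        lambda_mult * sim(query, candidate)
        - (1 - lambda_mult) * max sim(candidate, already selected)
    so lambda_mult=1.0 is plain relevance ranking and smaller values
    penalize near-duplicates of already selected items more strongly.
    """
    selected: List[int] = []
    remaining = list(range(len(candidate_vecs)))
    while remaining and len(selected) < k:
        best_idx = remaining[0]
        best_score = -math.inf
        for i in remaining:
            relevance = cosine_similarity(query_vec, candidate_vecs[i])
            redundancy = max(
                (cosine_similarity(candidate_vecs[i], candidate_vecs[j]) for j in selected),
                default=0.0,
            )
            score = lambda_mult * relevance - (1.0 - lambda_mult) * redundancy
            if score > best_score:
                best_idx, best_score = i, score
        selected.append(best_idx)
        remaining.remove(best_idx)
    return selected


# Toy vectors: candidates 0 and 1 are near-duplicates, candidate 2 is different.
query = [1.0, 0.2]
candidates = [[1.0, 0.0], [0.9, 0.1], [0.6, 0.8]]
diverse = mmr_select(query, candidates, k=2, lambda_mult=0.5)
relevance_only = mmr_select(query, candidates, k=2, lambda_mult=1.0)
```

With `lambda_mult=1.0` the two near-duplicate candidates are returned; lowering it trades some relevance for diversity, which is the behavior an MMR search is meant to provide.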

Simple use case example

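from typing import List, Tuple

from langchain.docstore.document import Document
from langchain_community.vectorstores import TiDBVectorStore
from langchain_openai import OpenAIEmbeddings

# Embedding model; OpenAIEmbeddings reads OPENAI_API_KEY from the environment.
embeddings = OpenAIEmbeddings()
# Placeholder connection URL (assumed SQLAlchemy-style format); substitute your own values.
tidb_connection_url = "mysql+pymysql://<USER>:<PASSWORD>@<HOST>:4000/<DATABASE>"

# Embed the texts and store them in a TiDB table.
db = TiDBVectorStore.from_texts(
    embedding=embeddings,
    texts=['Andrew likes eating oranges', 'Alexandra is from England', 'Ketanji Brown Jackson is a judge'],
    table_name="tidb_vector_langchain",
    connection_string=tidb_connection_url,
    distance_strategy="cosine",
)

# Retrieve the documents closest to the query, together with their scores.
query = "Can you tell me about Alexandra?"
docs_with_score: List[Tuple[Document, float]] = db.similarity_search_with_score(query)
for doc, score in docs_with_score:
    print("-" * 80)
    print("Score: ", score)
    print(doc.page_content)
    print("-" * 80)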
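
To help interpret the printed scores, the sketch below shows what a cosine distance is, assuming `distance_strategy="cosine"` means a distance of 1 minus cosine similarity (so a smaller value means a closer match). Whether the store reports exactly this value is an assumption worth checking against the TiDBVectorStore documentation; `cosine_distance` is an illustrative helper, not part of the library.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cosine similarity, in [0, 2]; 0 means same direction."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


# Toy 2-D "embeddings" (real embeddings have hundreds of dimensions or more).
same_direction = cosine_distance([1.0, 0.0], [2.0, 0.0])
orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])
opposite = cosine_distance([1.0, 0.0], [-1.0, 0.0])
```

Because cosine distance ignores vector length, scaling an embedding does not change the result; only the direction matters.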

@baskaryan
Collaborator

@IANTHEREAL feel free to ping when the PR is ready for review

@IANTHEREAL
Contributor Author

IANTHEREAL commented Jan 19, 2024

@baskaryan A preview version of the TiDB Vector feature is expected to be released around the end of January. I will make the final adjustments then and invite you to review it 😊

@IANTHEREAL IANTHEREAL marked this pull request as ready for review March 4, 2024 11:01
@dosubot dosubot bot added the following labels Mar 4, 2024: size:XXL (This PR changes 1000+ lines, ignoring generated files.), Ɑ: vector store (Related to vector store module), 🔌: openai (Primarily related to OpenAI integrations), 🤖:enhancement (A large net-new component, integration, or chain. Use sparingly. The largest features)
@IANTHEREAL
Contributor Author

Thank you for your patience with this long-standing PR. I'm excited to share that the TiDB Vector feature is now in the release pipeline, with most tests completed. We're targeting a release in the last two weeks of March.

The feature has also been deployed in our staging environment for this integration review with LangChain. I'll share access credentials with reviewers shortly. Your review would be greatly appreciated. Thanks in advance! @baskaryan @hwchase17

@IANTHEREAL
Contributor Author

Code conflicts have been piling up recently. If possible, please help review this PR. @baskaryan @hwchase17

I've also DM'd @baskaryan the access details for the TiDB Vector staging environment. I hope they're useful for reviewing this PR.

@AV25242

AV25242 commented Mar 7, 2024

This is great. @baskaryan, can you please assist?

@dosubot dosubot bot added the lgtm label (PR looks good. Use to confirm that a PR is ready for merging.) Mar 8, 2024
@baskaryan baskaryan merged commit 390ef6a into langchain-ai:master Mar 8, 2024
61 checks passed
@IANTHEREAL IANTHEREAL deleted the tidb_vectorestore branch March 8, 2024 01:25
@IANTHEREAL
Contributor Author

IANTHEREAL commented Mar 8, 2024

Thanks for reviewing @baskaryan @AV25242

gkorland pushed a commit to FalkorDB/langchain that referenced this pull request Mar 30, 2024
…n-ai#15796)

hinthornw pushed a commit that referenced this pull request Apr 26, 2024
Labels
  • 🤖:enhancement (A large net-new component, integration, or chain. Use sparingly. The largest features)
  • lgtm (PR looks good. Use to confirm that a PR is ready for merging.)
  • 🔌: openai (Primarily related to OpenAI integrations)
  • size:XXL (This PR changes 1000+ lines, ignoring generated files.)
  • Ɑ: vector store (Related to vector store module)
4 participants