Expose show_progress_bar option for Hugging Face vectorizer #232

Closed
antonum opened this issue Oct 9, 2024 · 2 comments
Labels
enhancement New feature or request

Comments

@antonum
Contributor

antonum commented Oct 9, 2024

Currently, hf.embed() and hf.embed_many() always display the tqdm progress bar while building embeddings, even if it's just for one embedding.

from redisvl.utils.vectorize import HFTextVectorizer

# Embed a sentence
hf = HFTextVectorizer(model="sentence-transformers/all-MiniLM-L6-v2")
test = hf.embed("This is a test sentence.")

Ask: expose the show_progress_bar parameter of the underlying Hugging Face model's encode() call, so callers can choose whether to display a progress bar.

from sentence_transformers import SentenceTransformer
model = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')
model.encode(["This is a test sentence."], show_progress_bar=False)

So one can do:

# Embed a sentence
hf = HFTextVectorizer(model="sentence-transformers/all-MiniLM-L6-v2")
test = hf.embed("This is a test sentence.", show_progress_bar=False)
justin-cechmanek added the enhancement label Oct 10, 2024
@tylerhutcherson
Collaborator

We can accomplish this by exposing optional **kwargs on .embed() and .embed_many() (and their async variants). Each vectorizer would be responsible for parsing and passing through kwargs as needed. @antonum is this something you might be able to contribute if you're looking for a little OSS action ;) ?
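
For illustration, the pass-through idea could look roughly like the sketch below. This is not redisvl's actual implementation: `FakeEncoder` and `SketchVectorizer` are hypothetical names, and the encoder is a stand-in that only mimics the shape of a `model.encode(sentences, show_progress_bar=...)` call so the forwarding can be inspected without downloading a model.

```python
from typing import Any, Dict, List


class FakeEncoder:
    """Hypothetical stand-in for a SentenceTransformer-style model."""

    def __init__(self) -> None:
        self.last_kwargs: Dict[str, Any] = {}

    def encode(
        self, sentences: List[str], show_progress_bar: bool = True, **kwargs: Any
    ) -> List[List[float]]:
        # Record the options received so the pass-through can be checked.
        self.last_kwargs = {"show_progress_bar": show_progress_bar, **kwargs}
        # Dummy 2-dimensional "embedding": text length plus a constant.
        return [[float(len(s)), 0.0] for s in sentences]


class SketchVectorizer:
    """Hypothetical vectorizer; NOT redisvl's HFTextVectorizer."""

    def __init__(self, client: FakeEncoder) -> None:
        self._client = client

    def embed(self, text: str, **kwargs: Any) -> List[float]:
        # Forward any extra keyword arguments to the underlying encode() call.
        return self._client.encode([text], **kwargs)[0]

    def embed_many(self, texts: List[str], **kwargs: Any) -> List[List[float]]:
        # Same forwarding for the batch variant.
        return self._client.encode(texts, **kwargs)


encoder = FakeEncoder()
vectorizer = SketchVectorizer(encoder)
vectorizer.embed("This is a test sentence.", show_progress_bar=False)
```

Because the keyword arguments are forwarded untouched, callers who pass nothing keep the encoder's own default, which matches the "default behavior as before" goal discussed in this thread.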

tylerhutcherson pushed a commit that referenced this issue Oct 11, 2024
Address #232

Now **kwargs are being passed from `hf.embed()` and `hf.embed_many()` to
the underlying `model.encode()`

```
from redisvl.utils.vectorize import HFTextVectorizer

hf = HFTextVectorizer(model="sentence-transformers/all-MiniLM-L6-v2")
# Embed a sentence
test = hf.embed("This is a test sentence.", show_progress_bar=True)  # progress bar would show
test = hf.embed("This is a test sentence.")  # progress bar would show (default behavior as before)
test = hf.embed("This is a test sentence.", show_progress_bar=False)  # progress bar would NOT show

# Print the first 10 values of the embedding
print(test[:10])
```
@tylerhutcherson
Collaborator

Solved thanks to #236
