Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace Map<String,Object> with IntObjectHashMap for KnnVectorsReader #13763

Merged
merged 7 commits into from
Oct 31, 2024

Conversation

bugmakerrrrrr
Copy link
Contributor

Description

Following up on #13686.

@@ -223,7 +224,8 @@ public void checkIntegrity() throws IOException {

@Override
public FloatVectorValues getFloatVectorValues(String field) throws IOException {
FieldEntry fieldEntry = fields.get(field);
final FieldInfo info = fieldInfos.fieldInfo(field);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couldn't final FieldInfo info be null? (we should check for null, see #13641 which should be merged soon).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is also what I want to discuss. You can ref to my comment below, if we all agree to check for null, I can fix this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion, the most important thing is to remove the leniency we have in some places, like if (fieldEntry == null) return EMPTY_VECTOR_VALUES, which could hide bugs. Checking for null explicitly is not required, though it's obviously nice.

if (knnVectorsReader == null) {
final FieldInfo info = fieldInfos.fieldInfo(field);
KnnVectorsReader reader;
if (info == null || (reader = fields.get(info.number)) == null) {
return null;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

According to the Javadoc of KnnVectorsReader#getFloatVectorValues:

Returns the FloatVectorValues for the given field. The behavior is undefined if the given field doesn't have KNN vectors enabled on its FieldInfo. The return value is never null.

Maybe we should return an empty instance if the target field doesn't exists in the segment or we should correct the Javadoc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The javadocs are correct. AssertingKnnVectorsReader checks for this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

forgive me if i misunderstand something, but the javadoc says that the return value is never null, and we do return null in PerFieldKnnVectorsFormat. If we check the field info in caller, AssertingKnnVectorsReader will always get the non-null value from the delegate.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's probably not worded as well as it could, it wants to say that the return value is never null when the field has vector enabled on its field infos.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These checks on info==null and reader == null are not hurting, but they are not needed either since KnnVectorsFormat#getFloatVectorValues should not be called when the field doesn't have vectors enabled on its infos.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, the current implementation will throw a NullPointerException if we don't perform a check. We may address this in a followup PR, if desired.

@bugmakerrrrrr
Copy link
Contributor Author

@benwtrent @jpountz I have merged main branch into this one, can I get a review on this?

Copy link
Contributor

@ChrisHegarty ChrisHegarty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awaiting result of discussion #13805

Copy link

github-actions bot commented Oct 4, 2024

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

@github-actions github-actions bot added the Stale label Oct 4, 2024
@bugmakerrrrrr
Copy link
Contributor Author

hi @jpountz ,since we have moved to lucene 10, should we merge this and add back #13686?

@github-actions github-actions bot removed the Stale label Oct 15, 2024
@jpountz
Copy link
Contributor

jpountz commented Oct 15, 2024

Yes, let's add this in 10.1.

@jpountz
Copy link
Contributor

jpountz commented Oct 15, 2024

Would you like to make this PR up-to-date and open a new one for the other change (a cherry-pick isn't clean due to other changes)?

@bugmakerrrrrr
Copy link
Contributor Author

Would you like to make this PR up-to-date and open a new one for the other change (a cherry-pick isn't clean due to other changes)?

@jpountz sure, I‘d be happy to do this.

@bugmakerrrrrr
Copy link
Contributor Author

@jpountz I have merged main branch, PTAL:)

@jpountz jpountz merged commit 494b160 into apache:main Oct 31, 2024
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants