Add Bow histogram computation #26
Conversation
… can train a vocabulary of words. Based on SIFT descriptors
Hey @niosus, would you have time to review this one?
A solid PR, @ovysotska! I left a couple of nitpicks here and there, but feel free to ignore them. I think the code is easy to read and the intent is clear. An added plus for having a couple of tests.
return trainImageFiles


def rescaleImageIfNeeded(image):
nit: very minor, but you seem to have docs for the other functions and not for these. Maybe add docstrings to these too?
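Just to illustrate the nit, here is a rough sketch of what a docstring on `rescaleImageIfNeeded` could look like. The size threshold, the naive stride-based subsampling, and the `maxSide` parameter are illustrative assumptions, not taken from the PR:

```python
import numpy as np


def rescaleImageIfNeeded(image, maxSide=1000):
    """Downscale the image so that its longest side is at most maxSide pixels.

    Returns the input unchanged if it is already small enough. Uses naive
    integer-stride subsampling for illustration; maxSide is an assumed default.
    """
    longest = max(image.shape[0], image.shape[1])
    if longest <= maxSide:
        return image
    step = -(-longest // maxSide)  # ceiling division
    return image[::step, ::step]
```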
src/python/bow/bow.py
Outdated
def computeIDF(descriptorsPerImage, clusters):
    """Compute inverse document frequency (IDF). Here means in how many images does the word occur.
Here means in how many images does the word occur.
nit: This reads a bit strange. What did you mean to say here?
I meant that the IDF (inverse document frequency) here stands for the number of images (not documents) in which the word occurs.
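For what it's worth, that interpretation can be sketched roughly like this. The function and variable names are illustrative, not the PR's actual implementation, and `assignWord` is an assumed helper mapping a descriptor to its nearest cluster index:

```python
import numpy as np


def computeIDF(descriptorsPerImage, clusters, assignWord):
    """Compute inverse document frequency, where a 'document' is an image.

    For each visual word, count in how many images it occurs at least once,
    then take idf = log(numImages / numImagesContainingWord).
    """
    numImages = len(descriptorsPerImage)
    imageCounts = np.zeros(len(clusters))
    for descriptors in descriptorsPerImage:
        # Deduplicate per image: a word counts once per image it appears in.
        wordsInImage = {assignWord(d, clusters) for d in descriptors}
        for word in wordsInImage:
            imageCounts[word] += 1
    # Guard against division by zero for words that never occur.
    return np.log(numImages / np.maximum(imageCounts, 1))
```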
src/python/bow/bow.py
Outdated
descriptors = [
    descriptor for descriptors in descriptorsPerImage for descriptor in descriptors
]
descriptors = np.array(descriptors)
I don't really know how to improve this, but the naming seems to be overloaded here. The word descriptors is very overused. Maybe use more descriptive names?
It is the best I could come up with. I also don't know now what that is supposed to be :)
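One option, purely as an illustration: let NumPy do the flattening in one step and give the result a distinct name, so `descriptors` is not reused for both nesting levels. The example data below is hypothetical:

```python
import numpy as np

# Hypothetical example data: two "images" with 128-D SIFT-like descriptors.
descriptorsPerImage = [np.zeros((3, 128)), np.ones((2, 128))]

# np.concatenate flattens the per-image arrays in one call, and the
# name 'allDescriptors' distinguishes the result from the per-image lists.
allDescriptors = np.concatenate(descriptorsPerImage, axis=0)
```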
src/python/bow/bow.py
Outdated
idfs = computeIDF(descriptorsPerImage, words)

plt.bar(range(0, len(idfs)), idfs)
plt.savefig("idf_" + str(kDefaultClusterSize) + ".png")
nit: Do you want to always save these images? Or should it be hidden behind some flag?
Nah, not really important. I'll just remove it and reduce the dependencies list.
src/python/bow/bow.py
Outdated
def getVocabulary(imageTrainFolder, vocabularyFile):
    if vocabularyFile is not None:
nit: you can simplify this to:
if vocabularyFile and vocabularyFile.exists():
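A minimal sketch of how that simplification could look in context, assuming `vocabularyFile` is a `pathlib.Path` (or `None`); `loadVocabulary` and `trainVocabulary` are stand-ins for the PR's actual helpers, not its real function names:

```python
from pathlib import Path


def getVocabulary(imageTrainFolder, vocabularyFile, loadVocabulary, trainVocabulary):
    """Load the vocabulary from vocabularyFile if it exists, else train a new one.

    vocabularyFile may be None; a falsy value or a missing file falls
    through to training. Helper functions are hypothetical stand-ins.
    """
    if vocabularyFile and vocabularyFile.exists():
        return loadVocabulary(vocabularyFile)
    return trainVocabulary(imageTrainFolder)
```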
This PR adds the Python code to compute Bag of Words (BoW) histograms.
In testing mode, the code takes a folder with images and stores the np.array of BoW features in the specified file.
The computed features can then be used further in the current codebase: first turn them into proto files, after which the full pipeline for image_sequence_localizer can be used as usual.