Deep Ranking-based ImageSimilarity will be developed as a plugin for ImageSpace. It follows the Deep Ranking paper (https://users.eecs.northwestern.edu/~jwa368/pdfs/deep_ranking.pdf).
Libraries -
pip3 install urllib3
pip3 install numpy
pip3 install https://download.pytorch.org/whl/cu90/torch-1.0.0-cp36-cp36m-win_amd64.whl
pip3 install torchvision
pip3 install scikit-image
pip3 install pandas
pip3 install tqdm
pip3 install tensorflow-gpu  # or, without a GPU: pip3 install tensorflow
pip3 install keras
Training DeepRanking requires image triplets (a query image, a positive image, and a negative image), which can be found here. Download QUERY_AND_TRIPLETS.TXT and store it in a folder.
After downloading QUERY_AND_TRIPLETS.TXT, run retrieve_folder.py in the same folder. This process takes about 30 minutes.
The script parses the text file, retrieves each triplet, and stores it in a folder. It repeats this process for all 5033 triplets.
Run triplet_sampler.py with the following arguments:
python triplet_sampler.py --input_directory ./ --output_directory ./ --num_pos_images 10 --num_neg_images 10
The triplet sampler is used here to augment the dataset. This technique is used because the relevance score mentioned in the paper is not publicly available. The sampler pairs two similar images (from one folder) with positive/negative images from another folder.
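The sampling idea described above can be sketched as follows. This is a minimal illustration, not the code in triplet_sampler.py. The function name `sample_triplets` and the `folders` mapping are hypothetical. Reading `num_pos_images` and `num_neg_images` as per-query counts is an assumption, and so is treating images from the same folder as similar and images from any other folder as negatives.

```python
import random
from typing import Dict, List, Optional, Tuple


def sample_triplets(
    folders: Dict[str, List[str]],
    num_pos_images: int,
    num_neg_images: int,
    seed: Optional[int] = None,
) -> List[Tuple[str, str, str]]:
    """Build (query, positive, negative) triplets from per-folder image lists.

    Assumption: `folders` maps a folder name to the image paths it contains.
    """
    rng = random.Random(seed)
    names = sorted(folders)
    triplets = []
    for name in names:
        images = folders[name]
        # Negatives are drawn from every other folder (an assumption).
        others = [img for other in names if other != name for img in folders[other]]
        if len(images) < 2 or not others:
            continue  # need at least one positive and one negative
        for query in images:
            candidates = [img for img in images if img != query]
            positives = rng.sample(candidates, min(num_pos_images, len(candidates)))
            for positive in positives:
                negatives = rng.sample(others, min(num_neg_images, len(others)))
                for negative in negatives:
                    triplets.append((query, positive, negative))
    return triplets
```

With this reading, each query image yields up to `num_pos_images * num_neg_images` triplets, which is how a small set of folders can be expanded into a much larger training set.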
deep_ranking.py models the architecture given in the paper. The current implementation just prints the layers of the P, Q, and R networks. Also, tripletLoss is preferred over hinge loss, as it is more suitable for our use case.
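For reference, a triplet loss in its common margin-based form can be sketched in plain Python as below. This is not taken from deep_ranking.py. The function names, the default `margin` of 1.0, and the use of squared Euclidean distance are assumptions made for illustration.

```python
from typing import Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triplet_loss(
    query: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 1.0,  # assumed value, not taken from the project
) -> float:
    """Zero once the negative is at least `margin` farther from the query
    than the positive; otherwise grows with the size of the violation."""
    pos_dist = squared_distance(query, positive)
    neg_dist = squared_distance(query, negative)
    return max(0.0, pos_dist - neg_dist + margin)
```

The loss only penalizes triplets whose ordering is wrong or too close to call, so well-separated triplets contribute nothing to the gradient.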
A link to download the trained model can be found here. Put both files in one directory, then provide image 1 and image 2 for the model to compare.
python get_distance.py --model ./deepranking-v2-150000.h5 --image1 0positive.jpg --image2 0query.jpg
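Comparing two images comes down to a distance between their embedding vectors. The sketch below assumes the embeddings are already available as plain lists of floats and that Euclidean distance is the metric; the metric actually used by get_distance.py is not stated here. `embedding_distance` and `rank_by_distance` are hypothetical helper names.

```python
import math
from typing import Dict, List, Sequence, Tuple


def embedding_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two embeddings of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rank_by_distance(
    query: Sequence[float],
    candidates: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Return (name, distance) pairs sorted from nearest to farthest."""
    scored = [(name, embedding_distance(query, emb)) for name, emb in candidates.items()]
    return sorted(scored, key=lambda item: item[1])
```

A smaller distance means the two embeddings are closer, that is, the images are judged more similar.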
- retrieve_folder.py - Python script that retrieves triplets from query_and_triplets.txt and stores them in different folders.
- folder.py - If you have downloaded the entire dataset without using retrieve_folder.py and want to add the images to folders, use this simple script.
- deep_ranking.py - Implemented model P,Q,R from paper using PyTorch along with optimizer and loss function.
- triplet_sampler.py - Implemented a simple randomized sampler to make different triplets out of existing 5033 triplets.
- DataLoader - Implement a dataloader to pass triplets to model.
- get_distance.py - To calculate distance between embeddings of images.
- Docker image for easy installation and integration with image space.
ImageSpace is developed by the JPL-Kitware team funded through the DARPA Memex program.
- Dr. Chris Mattmann, JPL.
- Purvak Lapsiya, USC.
This project is licensed under the Apache License, version 2.0.