Exploring Deep Learning Models for Webcam-Based Hand Gesture Navigation for Enhanced Accessibility
SRA Eklavya 2023
- The aim of our project is to develop an intuitive and accessible navigation system that leverages deep learning and computer vision to enable users to control digital devices through natural hand gestures captured by webcams.
We have implemented three deep learning models for gesture recognition and navigation:
For gesture recognition and navigation, we used a VGG-7 architecture combined with a running-averages model for background subtraction. We used the OpenCV function `accumulateWeighted` to compute the running average of the incoming frames.
cv2.accumulateWeighted(src, dst, alpha)
The motion detection class we used in our project lives in `GestureDetection/BgEliminationAndMotionDetection.py`.
This article (Running Average Model – Background Subtraction) explains the running-averages approach for gesture recognition very clearly.
We have created our own dataset to implement this model.
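Below is a minimal sketch of this background-subtraction loop; the blur kernel, `ALPHA`, and the threshold value are illustrative choices, not the exact values from `BgEliminationAndMotionDetection.py`:

```python
import cv2

cap = cv2.VideoCapture(0)
background = None  # running average of the scene (float32)
ALPHA = 0.5        # weight given to each new frame (assumed value)

while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (7, 7), 0)

    if background is None:
        # Seed the background model with the first frame
        background = gray.astype("float32")
        continue

    # Update the running average: dst = (1 - alpha) * dst + alpha * src
    cv2.accumulateWeighted(gray, background, ALPHA)

    # Differencing against the background isolates the moving hand
    diff = cv2.absdiff(gray, cv2.convertScaleAbs(background))
    _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)

    cv2.imshow("Hand mask", mask)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```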
We used MobileNet pretrained weights from the TensorFlow model zoo to implement our YOLO model. We manually labeled the hand detection dataset using LabelImg. The dataset has 60 images of 4 different gestures (15 each).
Although we trained our YOLO model on only 60 images, it gave us great results in real time.
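For reference, inference with a model exported via the TensorFlow Object Detection API typically looks like the sketch below; the model path and confidence threshold here are placeholders, not our actual values:

```python
import cv2
import numpy as np
import tensorflow as tf

# Placeholder path to a SavedModel exported with the TF Object Detection API
detect_fn = tf.saved_model.load("exported_model/saved_model")

cap = cv2.VideoCapture(0)
ret, frame = cap.read()
cap.release()

# The exported model expects a uint8 batch of shape [1, H, W, 3] in RGB order
rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
input_tensor = tf.convert_to_tensor(rgb[np.newaxis, ...], dtype=tf.uint8)
detections = detect_fn(input_tensor)

# Boxes come back normalized as [ymin, xmin, ymax, xmax]
boxes = detections["detection_boxes"][0].numpy()
scores = detections["detection_scores"][0].numpy()
classes = detections["detection_classes"][0].numpy().astype(int)
for box, score, cls in zip(boxes, scores, classes):
    if score > 0.5:  # assumed confidence threshold
        print(f"gesture class {cls} at {box}, confidence {score:.2f}")
```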
We created a motion detection model for gesture recognition using the Jester dataset. The model consists of 10 Conv3D layers and 3 LSTM layers, extracting spatio-temporal features for motion detection. We implemented it in both TensorFlow and PyTorch. You can refer to the paper Attention in Convolutional LSTM for Gesture Recognition to learn more about Conv3D+LSTM implementations.
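Here is a shortened Keras sketch of the Conv3D+LSTM idea; it has fewer layers than our actual 10 Conv3D + 3 LSTM network, and the clip shape and layer widths are illustrative:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Input: a clip of 16 frames, each 64x64 grayscale (assumed shape)
model = models.Sequential([
    layers.Input(shape=(16, 64, 64, 1)),
    # Conv3D blocks extract spatio-temporal features from the clip
    layers.Conv3D(32, (3, 3, 3), padding="same", activation="relu"),
    layers.MaxPooling3D(pool_size=(1, 2, 2)),
    layers.Conv3D(64, (3, 3, 3), padding="same", activation="relu"),
    layers.MaxPooling3D(pool_size=(2, 2, 2)),
    # Collapse the spatial dims so each time step becomes a feature vector
    layers.Reshape((8, -1)),
    # LSTM layers model how the gesture evolves over time
    layers.LSTM(128, return_sequences=True),
    layers.LSTM(64),
    layers.Dense(27, activation="softmax"),  # Jester defines 27 gesture classes
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```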
.
├── 3b1b notes
│   ├── Deep Learning
│   └── Linear Algebra
├── Coursera Notes
│   ├── Course_1 Neural Networks and Deep Learning (Coursera)
│   ├── Course_2 Improving Deep Neural Networks
│   └── Course_4 Convolutional Neural Networks
├── Create_Dataset
│   ├── PreProcessingData.py
│   └── detect.py
├── GestureDetection
│   └── BgEliminationAndMotionDetection.py
├── Hand Detection Using OpenCV
│   ├── Background_subtractor_hand_detection.py
│   └── Skin_Segmentation.py
├── Keras_Models
│   ├── 3DCNN_LSTM.ipynb
│   ├── 3DCNN_LSTM_Pytorch.ipynb
│   ├── GestureWiseMaverick_Masking.ipynb
│   ├── GestureWiseMaverick_NoMasking.ipynb
│   └── Yolo_MobileNet.ipynb
├── MNIST From Scratch Using Jax and Numpy
│   ├── JAX_4L_Autodiff_MNIST_IMPLEMENTATION.ipynb
│   ├── JAX_4L_Without_Autodiff.ipynb
│   ├── NumPy_2L.ipynb
│   └── NumPy_4L.ipynb
├── README.md
├── ResNet-34
│   ├── Assets
│   ├── ResNets_34.ipynb
│   └── Residual model paper.pdf
├── Saved_Models
│   ├── 1
│   ├── 2
│   ├── 3
│   ├── 4
│   └── 5
└── environment.yml
For the running-averages approach, we created our own dataset consisting of 14,000 images of 11 different hand gestures. We have uploaded the dataset to Kaggle along with a sample notebook.
Running.Averages.Results.mp4
Results with the YOLO object detection model using MobileNet pretrained weights (trained on only 60 images)
Yolo.Detection.Results.mp4
Demo.mp4
- Gaming: Implement gesture-based controls in gaming applications to provide a more immersive and interactive experience, allowing players to control in-game actions through hand movements.
- Accessibility Tools: Create accessibility tools that empower individuals with disabilities to control computers, mobile devices, and applications using hand gestures, enhancing their digital independence.
- Educational Platforms: Develop interactive educational platforms where teachers and students engage with digital content, presentations, and simulations using gestures, fostering more engaging and immersive learning experiences.
- Human-Robot Interaction: Improve human-robot interaction by enabling robots to understand and respond to human gestures, making collaborative tasks more intuitive and efficient.
- Ubuntu 18.04 or above
- TensorFlow Object Detection API
- Conda installed on system
Start by cloning the repo in a directory of your choice
git clone https://github.com/AryanNanda17/GestureSense
Navigate to the project directory. For example, if you cloned it to your Desktop:
cd Desktop/GestureSense
Create a virtual environment with the required dependencies
conda env create --name envname -f environment.yml
Switch environment
conda activate envname
Create a dataset of your choice
cd Create_Dataset
Here you can specify the path where the dataset should be collected and the name of your gesture label. To see the available options, run:
python3 detect.py -h
For example:
- To choose an image path:
  python3 detect.py -p "Your Path Here"
- To choose a gesture label:
  python3 detect.py -l GestureName
Next, we pass this dataset through another script for masking. To do that, run:
python3 PreProcessingData.py "Image_Path"
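As a rough illustration, the masking pass might look like the sketch below; the Otsu-threshold approach and file layout here are assumptions, and the actual `PreProcessingData.py` may differ:

```python
import cv2
import glob
import os
import sys

# Hypothetical masking pass over a directory of captured gesture images
image_dir = sys.argv[1]
out_dir = os.path.join(image_dir, "masked")
os.makedirs(out_dir, exist_ok=True)

for path in glob.glob(os.path.join(image_dir, "*.jpg")):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    img = cv2.GaussianBlur(img, (5, 5), 0)
    # Otsu's method picks a threshold separating the hand from the background
    _, mask = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    cv2.imwrite(os.path.join(out_dir, os.path.basename(path)), mask)
```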
Now you can train your own model on the selected gestures using the VGG-7 architecture, and export it after training; a sketch of what that might look like follows.
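Here is a minimal sketch of a small VGG-style classifier and its export, assuming 100x100 masked grayscale inputs and 11 gesture classes; the layer sizes and save path are illustrative, not our exact VGG-7 configuration:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 11  # e.g. the 11 gestures from our dataset

# A small VGG-style stack of stacked 3x3 convolutions and pooling
model = models.Sequential([
    layers.Input(shape=(100, 100, 1)),  # assumed masked-image size
    layers.Conv2D(32, 3, padding="same", activation="relu"),
    layers.Conv2D(32, 3, padding="same", activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, padding="same", activation="relu"),
    layers.Conv2D(64, 3, padding="same", activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Train on the masked dataset, then export in SavedModel format
# model.fit(train_ds, validation_data=val_ds, epochs=20)
model.save("Saved_Models/my_vgg7")  # illustrative export path
```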
Run the running-averages background-subtraction model. First, navigate to its directory:
cd ~/Desktop/GestureSense/GestureDetection
Now, Run the following command
python your_script.py -m /path/to/your/model -c 1finger 2finger 3finger C ThumbRight fingersclosein italydown kitli pinky spreadoutpalm yoyo
This completes the setup of the running-averages background-subtraction model.
- Attention Mechanism Integration: Incorporate attention mechanisms into the CONV3D+LSTM model to improve its ability to focus on relevant features in gesture sequences, enhancing accuracy.
- Mouse Control with YOLO Object Detection API: Implement mouse control functionality using the YOLO object detection API, allowing users to manipulate their computers using gesture-based control with high accuracy.
- Interactive Web Platform Development: Create an interactive web platform that provides users with a user-friendly interface to access and utilize the gesture control system. This platform should be compatible with various browsers and operating systems.
- SRA VJTI Eklavya 2023
A special thanks to our mentors for this project:
See the LICENSE file for the license used in this project.