Exploring Deep Learning Models for Webcam-Based Hand Gesture Navigation for Enhanced Accessibility
SRA Eklavya 2023
- The aim of our project is to develop an intuitive and accessible navigation system that leverages deep learning and computer vision to enable users to control digital devices through natural hand gestures captured by webcams.
We have implemented three deep learning models for gesture recognition and navigation:
For gesture recognition and navigation, we used a VGG-7 architecture combined with a running-averages model for background subtraction. We used the OpenCV function `accumulateWeighted` to compute the running average of the incoming frames.
cv2.accumulateWeighted(src, dst, alpha)
The motion detection class we used in our project lives in `GestureDetection/BgEliminationAndMotionDetection.py`.
This article (Running Average Model – Background Subtraction) explains the running-averages approach for gesture recognition very clearly.
We have created our own dataset to implement this model.
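Below is a minimal sketch of this background-subtraction loop; the blur kernel, `ALPHA`, and the threshold value are illustrative choices, not the exact values from `BgEliminationAndMotionDetection.py`:

```python
import cv2

cap = cv2.VideoCapture(0)
background = None  # running average of the scene (float32)
ALPHA = 0.5        # weight given to each new frame (assumed value)

while True:
    ret, frame = cap.read()
    if not ret:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (7, 7), 0)

    if background is None:
        # Seed the background model with the first frame
        background = gray.astype("float32")
        continue

    # Update the running average: dst = (1 - alpha) * dst + alpha * src
    cv2.accumulateWeighted(gray, background, ALPHA)

    # Differencing against the background isolates the moving hand
    diff = cv2.absdiff(gray, cv2.convertScaleAbs(background))
    _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)

    cv2.imshow("Hand mask", mask)
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()
```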
We used MobileNet pretrained weights from the TensorFlow model zoo to implement our YOLO model. We manually labeled the hand detection dataset using LabelImg. The dataset has 60 images of 4 different gestures (15 each).
Although we trained our YOLO model on only 60 images, it gave us great results in real time.
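For reference, inference with a model exported via the TensorFlow Object Detection API typically looks like the sketch below; the model path and confidence threshold here are placeholders, not our actual values:

```python
import cv2
import numpy as np
import tensorflow as tf

# Placeholder path to a SavedModel exported with the TF Object Detection API
detect_fn = tf.saved_model.load("exported_model/saved_model")

cap = cv2.VideoCapture(0)
ret, frame = cap.read()
cap.release()

# The exported model expects a uint8 batch of shape [1, H, W, 3] in RGB order
rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
input_tensor = tf.convert_to_tensor(rgb[np.newaxis, ...], dtype=tf.uint8)
detections = detect_fn(input_tensor)

# Boxes come back normalized as [ymin, xmin, ymax, xmax]
boxes = detections["detection_boxes"][0].numpy()
scores = detections["detection_scores"][0].numpy()
classes = detections["detection_classes"][0].numpy().astype(int)
for box, score, cls in zip(boxes, scores, classes):
    if score > 0.5:  # assumed confidence threshold
        print(f"gesture class {cls} at {box}, confidence {score:.2f}")
```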
We created a motion detection model for gesture recognition using the Jester dataset. The model consists of 10 Conv3D layers and 3 LSTM layers, extracting spatio-temporal features for motion detection. We implemented it in both TensorFlow and PyTorch. You can refer to the paper Attention in Convolutional LSTM for Gesture Recognition to learn more about Conv3D+LSTM implementations.
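Here is a shortened Keras sketch of the Conv3D+LSTM idea; it has fewer layers than our actual 10 Conv3D + 3 LSTM network, and the clip shape and layer widths are illustrative:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Input: a clip of 16 frames, each 64x64 grayscale (assumed shape)
model = models.Sequential([
    layers.Input(shape=(16, 64, 64, 1)),
    # Conv3D blocks extract spatio-temporal features from the clip
    layers.Conv3D(32, (3, 3, 3), padding="same", activation="relu"),
    layers.MaxPooling3D(pool_size=(1, 2, 2)),
    layers.Conv3D(64, (3, 3, 3), padding="same", activation="relu"),
    layers.MaxPooling3D(pool_size=(2, 2, 2)),
    # Collapse the spatial dims so each time step becomes a feature vector
    layers.Reshape((8, -1)),
    # LSTM layers model how the gesture evolves over time
    layers.LSTM(128, return_sequences=True),
    layers.LSTM(64),
    layers.Dense(27, activation="softmax"),  # Jester defines 27 gesture classes
])
model.compile(optimizer="adam", loss="categorical_crossentropy",
              metrics=["accuracy"])
model.summary()
```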
.
├── 3b1b notes
│   ├── Deep Learning
│   └── Linear Algebra
├── Coursera Notes
│   ├── Course_1 Neural Networks and Deep Learning (Coursera)
│   ├── Course_2 Improving Deep Neural Networks
│   └── Course_4 Convolutional Neural Networks
├── Create_Dataset
│   ├── PreProcessingData.py
│   └── detect.py
├── GestureDetection
│   └── BgEliminationAndMotionDetection.py
├── Hand Detection Using OpenCV
│   ├── Background_subtractor_hand_detection.py
│   └── Skin_Segmentation.py
├── Keras_Models
│   ├── 3DCNN_LSTM.ipynb
│   ├── 3DCNN_LSTM_Pytorch.ipynb
│   ├── GestureWiseMaverick_Masking.ipynb
│   ├── GestureWiseMaverick_NoMasking.ipynb
│   └── Yolo_MobileNet.ipynb
├── MNIST From Scratch Using Jax and Numpy
│   ├── JAX_4L_Autodiff_MNIST_IMPLEMENTATION.ipynb
│   ├── JAX_4L_Without_Autodiff.ipynb
│   ├── NumPy_2L.ipynb
│   └── NumPy_4L.ipynb
├── README.md
├── ResNet-34
│   ├── Assets
│   ├── ResNets_34.ipynb
│   └── Residual model paper.pdf
├── Saved_Models
│   ├── 1
│   ├── 2
│   ├── 3
│   ├── 4
│   └── 5
└── environment.yml
For the running-averages approach, we created our own dataset consisting of 14,000 images of 11 different hand gestures. We have uploaded the dataset to Kaggle along with a sample notebook.
Running.Averages.Results.mp4
Results with the YOLO object detection model using MobileNet pretrained weights (trained on only 60 images)
Yolo.Detection.Results.mp4
Demo.mp4
- Gaming: Implement gesture-based controls in gaming applications to provide a more immersive and interactive experience, allowing players to control in-game actions through hand movements.
- Accessibility Tools: Create accessibility tools that empower individuals with disabilities to control computers, mobile devices, and applications using hand gestures, enhancing their digital independence.
- Educational Platforms: Develop interactive educational platforms where teachers and students engage with digital content, presentations, and simulations using gestures, fostering more engaging and immersive learning experiences.
- Human-Robot Interaction: Improve human-robot interaction by enabling robots to understand and respond to human gestures, making collaborative tasks more intuitive and efficient.
- Ubuntu 18.04 or above
- TensorFlow Object Detection API
- Conda installed on system
Start by cloning the repo in a directory of your choice
git clone https://github.com/AryanNanda17/GestureSense
Navigate to the project directory. For example, if you cloned it to your Desktop:
cd Desktop/GestureSense
Create a virtual environment with the required dependencies
conda env create --name envname -f environment.yml
Switch environment
conda activate envname
Create a dataset of your choice
cd Create_Dataset
Here you can specify the path where the dataset should be collected and the name of your gesture label. To see the available options, run:
python3 detect.py -h
For example:
- To choose an image path:
  python3 detect.py -p "Your Path Here"
- To choose a gesture label:
  python3 detect.py -l GestureName
Next, we pass this dataset through another script for masking. To do that, run:
python3 PreProcessingData.py "Image_Path"
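As a rough illustration, the masking pass might look like the sketch below; the Otsu-threshold approach and file layout here are assumptions, and the actual `PreProcessingData.py` may differ:

```python
import cv2
import glob
import os
import sys

# Hypothetical masking pass over a directory of captured gesture images
image_dir = sys.argv[1]
out_dir = os.path.join(image_dir, "masked")
os.makedirs(out_dir, exist_ok=True)

for path in glob.glob(os.path.join(image_dir, "*.jpg")):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    img = cv2.GaussianBlur(img, (5, 5), 0)
    # Otsu's method picks a threshold separating the hand from the background
    _, mask = cv2.threshold(img, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    cv2.imwrite(os.path.join(out_dir, os.path.basename(path)), mask)
```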
Now you can train your own model on the selected gestures using the VGG-7 architecture, and export it after training; a sketch of what that might look like follows.
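Here is a minimal sketch of a small VGG-style classifier and its export, assuming 100x100 masked grayscale inputs and 11 gesture classes; the layer sizes and save path are illustrative, not our exact VGG-7 configuration:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

NUM_CLASSES = 11  # e.g. the 11 gestures from our dataset

# A small VGG-style stack of stacked 3x3 convolutions and pooling
model = models.Sequential([
    layers.Input(shape=(100, 100, 1)),  # assumed masked-image size
    layers.Conv2D(32, 3, padding="same", activation="relu"),
    layers.Conv2D(32, 3, padding="same", activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, padding="same", activation="relu"),
    layers.Conv2D(64, 3, padding="same", activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(NUM_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Train on the masked dataset, then export in SavedModel format
# model.fit(train_ds, validation_data=val_ds, epochs=20)
model.save("Saved_Models/my_vgg7")  # illustrative export path
```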
Run the running-averages background-subtraction model. First, navigate to its directory:
cd ~/Desktop/GestureSense/GestureDetection
Now, Run the following command
python your_script.py -m /path/to/your/model -c 1finger 2finger 3finger C ThumbRight fingersclosein italydown kitli pinky spreadoutpalm yoyo
This completes the setup of the running-averages background-subtraction model.
- Attention Mechanism Integration: Incorporate attention mechanisms into the CONV3D+LSTM model to improve its ability to focus on relevant features in gesture sequences, enhancing accuracy.
- Mouse Control with YOLO Object Detection API: Implement mouse control functionality using the YOLO object detection API, allowing users to manipulate their computers using gesture-based control with high accuracy.
- Interactive Web Platform Development: Create an interactive web platform that provides users with a user-friendly interface to access and utilize the gesture control system. This platform should be compatible with various browsers and operating systems.
- SRA VJTI Eklavya 2023
A special thanks to our mentors for this project:
See the LICENSE file for the license used in this project.