Instance segmentation with PixelLib is based on MaskRCNN framework.
Download the mask rcnn model from here
Code to implement instance segmentation of videos
import pixellib
from pixellib.instance import instance_segmentation
segment_video = instance_segmentation()
segment_video.load_model("mask_rcnn_coco.h5")
segment_video.process_video("video_path", frames_per_second= 20, output_video_name="path_to_outputvideo")
import pixellib
from pixellib.instance
segment_video = instance_segmentation()
We imported in the class for performing instance segmentation and created an instance of the class.
segment_video.load_model("mask_rcnn_coco.h5")
We loaded the maskrcnn model trained on coco dataset to perform instance segmentation and it can be downloaded from here.
segment_video.process_video("sample_video2.mp4", frames_per_second = 20, output_video_name = "output_video.mp4")
We called the function to perform segmentation on the video file.
It takes the following parameters:-
video_path: the path to the video file we want to perform segmentation on.
frames_per_second: this is parameter to set the number.of frames per second for the output video file. In this case it is set to 15 i.e the saved video file will have 15 frames per second.
output_video_name: the saved segmented video. The output video will be saved in your current working directory.
import pixellib
from pixellib.instance import instance_segmentation
segment_video = instance_segmentation()
segment_video.load_model("mask_rcnn_coco.h5")
segment_video.process_video("sample_video.mp4", frames_per_second= 15, output_video_name="output_video.mp4")
Output video
We can perform instance segmentation with object detection by setting the parameter show_bboxes to true.
import pixellib
from pixellib.instance import instance_segmentation
segment_video = instance_segmentation()
segment_video.load_model("mask_rcnn_coco.h5")
segment_video.process_video("sample_video2.mp4", show_bboxes = True, frames_per_second= 15, output_video_name="output_video.mp4")
Output video with bounding boxes
PixelLib supports the extraction of objects in videos and camera feeds.
import pixellib
from pixellib.instance import instance_segmentation
segment_video = instance_segmentation()
segment_video.load_model("mask_rcnn_coco.h5")
segment_video.process_video("sample.mp4", show_bboxes=True, extract_segmented_objects=True,
save_extracted_objects=True, frames_per_second= 5, output_video_name="output.mp4")
segment_video.process_video("sample.mp4", show_bboxes=True, extract_segmented_objects=True,save_extracted_objects=True, frames_per_second= 5, output_video_name="output.mp4")
It still the same code except we introduced new parameters in the process_video which are:
extract_segmented_objects: this is the parameter that tells the function to extract the objects segmented in the image. It is set to true.
save_extracted_objects: this is an optional parameter for saving the extracted segmented objects.
The pre-trained coco model used detects 80 classes of objects. PixelLib has made it possible to filter out unused detections and detect the classes you want.
target_classes = segment_video.select_target_classes(person=True)
segment_video.process_video("sample.mp4", show_bboxes=True, segment_target_classes= target_classes, extract_segmented_objects=True,save_extracted_objects=True, frames_per_second= 5, output_video_name="output.mp4")
The target class for detection is set to person and we were able to segment only the people in the video.
target_classes = segment_video.select_target_classes(car = True)
segment_video.process_video("sample.mp4", show_bboxes=True, segment_target_classes= target_classes, extract_segmented_objects=True,save_extracted_objects=True, frames_per_second= 5, output_video_name="output.mp4")
The target class for detection is set to car and we were able to segment only the cars in the video.
import pixellib
from pixellib.instance import instance_segmentation
segment_video = instance_segmentation()
segment_video.load_model("mask_rcnn_coco.h5")
target_classes = segment_video.select_target_classes(person=True)
segment_video.process_video("sample.mp4", show_bboxes=True, segment_target_classes= target_classes, extract_segmented_objects=True,
save_extracted_objects=True, frames_per_second= 5, output_video_name="output.mp4")
We can use the same model to perform semantic segmentation on camera. This can be done by few modifications to the code used to process video file.
import pixellib
from pixellib.instance import instance_segmentation
import cv2
capture = cv2.VideoCapture(0)
segment_video = instance_segmentation()
segment_video.load_model("mask_rcnn_coco.h5")
segment_video.process_camera(capture, frames_per_second= 10, output_video_name="output_video.mp4", show_frames= True,
frame_name= "frame")
import cv2
capture = cv2.VideoCapture(0)
We imported cv2 and included the code to capture camera frames.
segment_video.process_camera(capture, show_bboxes = True, frames_per_second = 15, output_video_name = "output_video.mp4", show_frames = True, frame_name = "frame")
In the code for performing segmentation, we replaced the video filepath to capture i.e we are going to process a stream camera frames instead of a video file.We added extra parameters for the purpose of showing the camera frames.
show_frames this parameter handles showing of segmented camera frames and press q to exit.
frame_name this is the name given to the shown camera's frame.
A demo by me showing the output of pixelLib's instance segmentation on camera's feeds using MASK-RCNN model.
This is the modified code below to filter unused detections and detect a target class in a live camera feed.
import pixellib
from pixellib.instance import instance_segmentation
import cv2
capture = cv2.VideoCapture(0)
segment_video = instance_segmentation()
segment_video.load_model("mask_rcnn_coco.h5")
target_classes = segment_video.select_target_classes(person=True)
segment_video.process_camera(capture, segment_target_classes=target_classes, frames_per_second= 10, output_video_name="output_video.mp4", show_frames= True,frame_name= "frame")
import pixellib
from pixellib.instance import instance_segmentation
import cv2
capture = cv2.VideoCapture(0)
segment_video = instance_segmentation()
segment_video.load_model("mask_rcnn_coco.h5")
target_classes = segment_video.select_target_classes(person=True)
segment_video.process_camera(capture, segment_target_classes=target_classes, frames_per_second= 10, extract_segmented_objects=True,
save_extracted_objects=True,output_video_name="output_video.mp4", show_frames= True,frame_name= "frame")
PixelLib now supports the ability to adjust the speed of detection according to a user's needs. The inference speed with a minimal reduction in the accuracy of detection. There are three main parameters that control the speed of detection.
average fast rapid
By default the detection speed is about 1 second for a processing a single image using Nvidia GeForce 1650. Using Average Detection Mode
import pixellib
from pixellib.instance import instance_segmentation
import cv2
capture = cv2.VideoCapture(0)
segment_video = instance_segmentation(infer_speed = "average")
segment_video.load_model("mask_rcnn_coco.h5")
segment_video.process_camera(capture, frames_per_second= 10, output_video_name="output_video.mp4", show_frames= True,
frame_name= "frame")
In the modified code above within the class instance_segmentation we introduced a new parameter infer_speed which determines the speed of detection and it was set to average. The average value reduces the detection to half of its original speed, the detection speed would become 0.5 seconds for processing a single image.
Using fast Detection Mode
import pixellib
from pixellib.instance import instance_segmentation
import cv2
capture = cv2.VideoCapture(0)
segment_video = instance_segmentation(infer_speed = "fast")
segment_video.load_model("mask_rcnn_coco.h5")
segment_video.process_camera(capture, frames_per_second= 10, output_video_name="output_video.mp4", show_frames= True,
frame_name= "frame")
In the code above we replaced the infer_speed value to fast and the speed of detection is about 0.35 seconds for processing a single image.
Using rapid Detection Mode
import pixellib
from pixellib.instance import instance_segmentation
import cv2
capture = cv2.VideoCapture(0)
segment_video = instance_segmentation(infer_speed = "rapid")
segment_video.load_model("mask_rcnn_coco.h5")
segment_video.process_camera(capture, frames_per_second= 10, output_video_name="output_video.mp4", show_frames= True,
frame_name= "frame")
In the code above we replaced the infer_speed value to rapid the fastest the detection mode. The speed of detection becomes 0.25 seconds for processing a single image. The rapid detection speed mode achieves 4fps.