
Usage

Introduction

This maskrcnn repository acts as both a ROS package and a Python package whose scripts can also be run standalone. If you want to run the full ROS node and generate predictions as a video stream, see Full Setup. If you just want to run an individual component of the project, see Scripts.

Full Setup

If you want to use the package in its full form, here is the setup guide:
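
As a rough sketch, assuming a standard catkin workspace and a launch file named food_detector.launch (both the workspace path and the launch-file name are assumptions; check the repository for the actual entry point):

```bash
# Build and source the workspace (standard catkin setup)
cd ~/catkin_ws
catkin build food_detector
source devel/setup.bash

# Launch the detector node (launch-file name is hypothetical)
roslaunch food_detector food_detector.launch
```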

This will start publishing image markers and segmentation predictions, which can be viewed through rviz.

Note: You will also need to run rosrun tf static_transform_publisher 0.0 0.0 0.0 0.0 0.0 0.0 1.0 camera_color_optical_frame map 1000 if you are not already publishing this transform or a similar one. In general, some transform has to be specified for the destination frame.
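
For reference, the same command with its arguments annotated (the argument order follows the standard tf static_transform_publisher signature):

```bash
# x y z qx qy qz qw frame_id child_frame_id period_in_ms
rosrun tf static_transform_publisher 0.0 0.0 0.0 0.0 0.0 0.0 1.0 \
    camera_color_optical_frame map 1000
```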

Note: If you do not have the MaskRCNN model checkpoint, you will need to download the .pth file from here and place it in the maskrcnn Python package. A setting in src/food_detector/ada_feeding_demo_config.py lets you point to the model checkpoint.

This and other settings can be altered in the same file; note that the maskrcnn settings are at the bottom.
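
As a hypothetical sketch of what the checkpoint setting might look like (the variable name is an assumption; check the actual names in the file):

```python
# Hypothetical excerpt from ada_feeding_demo_config.py -- the variable name
# is an assumption; the real maskrcnn settings sit at the bottom of the file.
checkpoint_path = '/path/to/maskrcnn/model.pth'  # the downloaded .pth checkpoint
```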

Scripts

There are a few discrete pieces of this project that are useful on their own for various tasks. For a more detailed description of how these scripts work, check out their respective links within the documentation page.

  • src/train.py can be used to train the MaskRCNN model given a set of data. This script uses src/dataset.py to build a dataset of training files and then runs a standard gradient descent loop to update a MaskRCNN model, saving the model every few epochs; a hedged sketch of this loop appears after this list.
  • scripts/annotate.py carries the bulk of this project's work. This script is responsible for loading and propagating segmentations as described in the project overview document.
  • scripts/correct_masks.py uses masks generated by the above script, along with color images and SAM predictions, to correct segmentation masks as described in the project overview document.
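
As referenced above, here is a hedged sketch of the kind of training loop src/train.py runs, assuming the standard torchvision MaskRCNN API; the dataset interface, class count, and save interval are all assumptions, not the repository's actual code:

```python
import torch
import torchvision

# Assumption: src/dataset.py exposes something like a torch Dataset of
# (image, target) pairs; this import and interface are hypothetical.
from dataset import build_dataset

def collate(batch):
    # Detection batches are lists of variable-sized images/targets
    return tuple(zip(*batch))

def train(num_epochs=10, save_every=2):
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    # num_classes=2 (background + food) is an assumption
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(num_classes=2)
    model.to(device).train()

    loader = torch.utils.data.DataLoader(
        build_dataset(), batch_size=2, shuffle=True, collate_fn=collate)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)

    for epoch in range(num_epochs):
        for images, targets in loader:
            images = [img.to(device) for img in images]
            targets = [{k: v.to(device) for k, v in t.items()} for t in targets]

            # In train mode, torchvision detection models return a loss dict
            loss_dict = model(images, targets)
            loss = sum(loss_dict.values())

            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        # Save a checkpoint every few epochs (interval is an assumption)
        if (epoch + 1) % save_every == 0:
            torch.save(model.state_dict(), f'maskrcnn_epoch_{epoch + 1}.pth')
```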