Apple Detection and Depth Estimation Pipeline

Refer to these links to understand the code:

https://github.com/facebookresearch/sam2

https://pytorch.org/hub/intelisl_midas_v2/

  1. Depth Estimation

    • Objective: Generate a depth map for the input image to estimate the relative distances of objects.
    • Process:

    1. Load a pre-trained depth estimation model (e.g., MiDaS Small for faster inference).
    2. Preprocess the image:
       • Resize to match the model’s input dimensions.
       • Normalize pixel values to [0, 1].
    3. Run the depth estimation model to generate the depth map.
    4. Optionally invert the depth values to make closer objects brighter.
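
A minimal sketch of this step, following the usage shown on the MiDaS PyTorch Hub page linked above; the input path "apples.jpg" is a placeholder. The Hub transforms take care of the resizing and normalization described in step 2.

```python
import cv2
import torch

# Load MiDaS Small for faster inference, plus its companion transforms.
midas = torch.hub.load("intel-isl/MiDaS", "MiDaS_small")
midas.eval()
transforms = torch.hub.load("intel-isl/MiDaS", "transforms")
transform = transforms.small_transform

img = cv2.imread("apples.jpg")               # placeholder input path
img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

with torch.no_grad():
    prediction = midas(transform(img))
    # Resize the prediction back to the original image resolution.
    prediction = torch.nn.functional.interpolate(
        prediction.unsqueeze(1),
        size=img.shape[:2],
        mode="bicubic",
        align_corners=False,
    ).squeeze()

depth_map = prediction.cpu().numpy()
```
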
  2. SAM Mask Generation

    • Objective: Segment the image into object masks using the Segment Anything Model (SAM).
    • Process:

    1. Load the SAM model and configure it for automatic mask generation.
    2. Feed the input image into the model to generate segmentation masks.
    3. Each mask includes attributes like segmentation, bbox, and area.
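
A minimal sketch using the automatic mask generator from the sam2 repository linked above; the config and checkpoint paths are assumptions that depend on which SAM 2 checkpoint you download.

```python
from sam2.build_sam import build_sam2
from sam2.automatic_mask_generator import SAM2AutomaticMaskGenerator

# Assumed paths: adjust to the config/checkpoint you actually downloaded.
sam2_model = build_sam2(
    "configs/sam2.1/sam2.1_hiera_t.yaml",
    "checkpoints/sam2.1_hiera_tiny.pt",
    device="cpu",  # or "cuda" if available
)
mask_generator = SAM2AutomaticMaskGenerator(sam2_model)

# `img` is the RGB numpy array from the previous step.
masks = mask_generator.generate(img)

# Each entry is a dict with keys such as "segmentation" (boolean HxW array),
# "bbox" ([x, y, w, h]), and "area".
```
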
  3. YOLO Object Detection

    • Objective: Detect apples in the image and validate SAM-generated masks.
    • Process:

    1. Load a pre-trained YOLO model.
    2. Run YOLO on the input image to detect objects.
    3. Extract bounding boxes (bbox) and confidence scores for each detected apple.
    4. Filter detections using a confidence threshold (e.g., conf=0.2).
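
This README does not pin down a particular YOLO implementation; the sketch below assumes a COCO-pretrained Ultralytics model (class index 47 is "apple" in COCO).

```python
from ultralytics import YOLO

model = YOLO("yolov8n.pt")                  # assumption: any COCO-pretrained weights
results = model("apples.jpg", conf=0.2)     # confidence threshold from this README

apple_boxes = []
for r in results:
    for box in r.boxes:
        if int(box.cls[0]) == 47:           # COCO class index 47 = "apple"
            apple_boxes.append(box.xyxy[0].cpu().numpy())  # [x1, y1, x2, y2]
```
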
  4. Filter SAM Masks with YOLO Results

    • Objective: Retain only the SAM masks that overlap with YOLO-detected apple bounding boxes.
    • Process:

    1. Convert SAM masks to bounding boxes.
    2. Scale YOLO bounding boxes if necessary to match SAM mask resolution.
    3. Calculate the Intersection over Union (IoU) between each SAM mask and YOLO bounding box.
    4. Retain SAM masks with IoU ≥ threshold (e.g., 0.5).
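
A sketch of the filtering step; it assumes YOLO and SAM were run on the same image resolution, so no box rescaling is needed.

```python
import numpy as np

def xywh_to_xyxy(b):
    # SAM reports "bbox" as [x, y, w, h]; convert to [x1, y1, x2, y2].
    x, y, w, h = b
    return np.array([x, y, x + w, y + h], dtype=float)

def iou(a, b):
    # Intersection over Union of two [x1, y1, x2, y2] boxes.
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

IOU_THRESHOLD = 0.5  # threshold from this README
filtered_masks = [
    m for m in masks
    if any(iou(xywh_to_xyxy(m["bbox"]), yb) >= IOU_THRESHOLD for yb in apple_boxes)
]
```
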
  5. Median Depth Comparison

    • Objective: Determine the relative distances of the retained SAM masks based on depth values.
    • Process:

    1. For each filtered SAM mask:
       • Apply the mask to the depth map.
       • Extract depth values corresponding to the mask area.
       • Calculate the median depth for the mask.
    2. Compare the median depths:
       • Identify masks with the smallest (nearest) or largest (farthest) median depth.
    3. Highlight the selected masks:
       • Use color overlays to visualize the selected masks on the original image.
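
A sketch of the comparison, building on depth_map and filtered_masks from the earlier steps. Note that MiDaS predicts relative inverse depth, so whether the nearest apple has the smallest or the largest median value depends on whether you inverted the map in step 1.

```python
import numpy as np

# Median depth inside each retained mask.
medians = [np.median(depth_map[m["segmentation"]]) for m in filtered_masks]

# Assuming smaller values mean nearer (switch to argmax for the opposite convention).
nearest = filtered_masks[int(np.argmin(medians))]

# Red overlay on the nearest apple, blended with the original RGB image.
overlay = img.copy()
overlay[nearest["segmentation"]] = (255, 0, 0)
highlighted = (0.5 * img + 0.5 * overlay).astype(np.uint8)
```
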

Pipeline Summary

1.	Input Image: The original image is used as input for both YOLO and SAM.
2.	Depth Map: Generated using MiDaS to estimate distances.
3.	SAM Masks: Automatically segmented using SAM.
4.	YOLO Filtering: YOLO bounding boxes are used to validate SAM masks.
5.	Median Depth: Filtered masks are compared using their median depth values.

Key Parameters to Tune

•	YOLO Confidence Threshold (conf): Lower it to catch more apples at the cost of false positives; raise it to keep only high-confidence detections.
•	IoU Threshold: Controls how closely a SAM mask's bounding box must match a YOLO detection for the mask to be retained.
•	Depth Map Inversion: Invert depth values so that closer objects appear brighter, if needed for visualization.
