Refer to these links to understand the code:
https://github.com/facebookresearch/sam2
https://pytorch.org/hub/intelisl_midas_v2/
Depth Estimation
• Objective: Generate a depth map for the input image to estimate the relative distances of objects.
• Process:
- Load a pre-trained depth estimation model (e.g., MiDaS Small for faster inference).
- Preprocess the image:
  • Resize to match the model’s input dimensions.
  • Normalize pixel values to [0, 1].
- Run the depth estimation model to generate the depth map.
- Optionally invert the depth values to make closer objects brighter.
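The normalization and optional inversion steps can be sketched in NumPy. This is a minimal sketch on a toy array; in the real pipeline the values would come from the MiDaS model output, and the hub-provided transforms handle the resizing and normalization of the input image:

```python
import numpy as np

def normalize_depth(depth: np.ndarray) -> np.ndarray:
    """Scale raw depth predictions into [0, 1] for visualization."""
    d_min, d_max = depth.min(), depth.max()
    return (depth - d_min) / (d_max - d_min + 1e-8)

def invert_depth(depth01: np.ndarray) -> np.ndarray:
    """Flip normalized values so that closer objects appear brighter,
    assuming the convention used in these notes (smaller value = closer)."""
    return 1.0 - depth01

# Toy 2x2 "depth map": smaller value = closer in this convention.
depth = np.array([[0.0, 2.0],
                  [4.0, 8.0]])
norm = normalize_depth(depth)       # values scaled into [0, 1]
bright_near = invert_depth(norm)    # nearest pixel is now the brightest
```

The small epsilon in the denominator guards against division by zero on a constant depth map.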
SAM Mask Generation
• Objective: Segment the image into object masks using the Segment Anything Model (SAM).
• Process:
- Load the SAM model and configure it for automatic mask generation.
- Feed the input image into the model to generate segmentation masks.
- Each mask includes attributes like segmentation, bbox, and area.
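The bbox and area attributes are derivable from the segmentation array itself, which is useful when you need them in a consistent format later in the pipeline. A sketch on a synthetic boolean mask (the (x, y, w, h) box format here is an assumption matching SAM's automatic mask generator output):

```python
import numpy as np

def mask_to_bbox(segmentation: np.ndarray):
    """Convert a boolean segmentation mask to an (x, y, w, h) bounding box."""
    ys, xs = np.where(segmentation)
    if len(xs) == 0:
        return None  # empty mask has no bounding box
    x0, y0 = xs.min(), ys.min()
    x1, y1 = xs.max(), ys.max()
    return (int(x0), int(y0), int(x1 - x0 + 1), int(y1 - y0 + 1))

# Toy mask: a 2x3 block of True pixels inside a 6x6 image.
mask = np.zeros((6, 6), dtype=bool)
mask[1:3, 2:5] = True
bbox = mask_to_bbox(mask)   # x, y of top-left corner, then width, height
area = int(mask.sum())      # pixel count, matching SAM's 'area' attribute
```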
YOLO Object Detection
• Objective: Detect apples in the image and validate SAM-generated masks.
• Process:
- Load a pre-trained YOLO model.
- Run YOLO on the input image to detect objects.
- Extract bounding boxes (bbox) and confidence scores for each detected apple.
- Filter detections using a confidence threshold (e.g., conf=0.2).
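The class and confidence filtering can be sketched in plain Python. The tuple layout below is an assumption about how detections might be unpacked from a YOLO results object, not the library's native format:

```python
# Hypothetical detections extracted from a YOLO result:
# (bbox in xyxy format, confidence score, class name).
detections = [
    ((10, 10, 50, 50), 0.85, "apple"),
    ((60, 20, 90, 55), 0.15, "apple"),   # below threshold, dropped
    ((5, 70, 40, 95), 0.30, "orange"),   # wrong class, dropped
]

CONF_THRESHOLD = 0.2  # matches the conf=0.2 example above

apples = [
    (bbox, conf) for bbox, conf, cls in detections
    if cls == "apple" and conf >= CONF_THRESHOLD
]
# Only the first detection survives both filters.
```

Note that Ultralytics YOLO can also apply the confidence threshold at inference time via its `conf` argument, in which case this post-filter only needs to handle the class check.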
Filter SAM Masks with YOLO Results
• Objective: Retain only the SAM masks that overlap with YOLO-detected apple bounding boxes.
• Process:
- Convert SAM masks to bounding boxes.
- Scale YOLO bounding boxes if necessary to match SAM mask resolution.
- Calculate the Intersection over Union (IoU) between each SAM mask and YOLO bounding box.
- Retain SAM masks with IoU ≥ threshold (e.g., 0.5).
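The IoU computation and threshold filter can be sketched as follows, assuming both box sets have already been converted to the same (x0, y0, x1, y1) format and resolution:

```python
def iou_xyxy(a, b):
    """Intersection over Union of two boxes in (x0, y0, x1, y1) format."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

IOU_THRESHOLD = 0.5

# Toy data: the first SAM box overlaps the YOLO detection, the second does not.
sam_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
yolo_boxes = [(1, 1, 11, 11)]

kept = [
    sb for sb in sam_boxes
    if any(iou_xyxy(sb, yb) >= IOU_THRESHOLD for yb in yolo_boxes)
]
```

A mask is retained if it clears the threshold against any YOLO box, so one detection can validate several overlapping masks.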
Median Depth Comparison
• Objective: Determine the relative distances of the retained SAM masks based on depth values.
• Process:
- For each filtered SAM mask:
  • Apply the mask to the depth map.
  • Extract depth values corresponding to the mask area.
  • Calculate the median depth for the mask.
- Compare the median depths:
  • Identify masks with the smallest (nearest) or largest (farthest) median depth.
- Highlight the selected masks:
  • Use color overlays to visualize the selected masks on the original image.
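The median-depth comparison can be sketched with boolean indexing in NumPy. This toy example assumes the convention used above, where smaller depth values mean nearer objects:

```python
import numpy as np

# Toy depth map: a near region (values around 1) and a far region (around 9).
depth_map = np.array([[1.0, 1.0, 9.0],
                      [1.0, 2.0, 9.0],
                      [8.0, 9.0, 9.0]])

# Two boolean masks covering the near and far regions respectively.
masks = [
    np.array([[1, 1, 0], [1, 1, 0], [0, 0, 0]], dtype=bool),
    np.array([[0, 0, 1], [0, 0, 1], [0, 1, 1]], dtype=bool),
]

# depth_map[m] pulls out only the depth values under each mask.
medians = [float(np.median(depth_map[m])) for m in masks]
nearest = int(np.argmin(medians))   # index of the mask with the smallest median
```

The median is preferred over the mean here because a few outlier depth pixels along a mask boundary shift the mean but barely affect the median.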
Pipeline Summary
1. Input Image: The original image is used as input for both YOLO and SAM.
2. Depth Map: Generated using MiDaS to estimate distances.
3. SAM Masks: Automatically segmented using SAM.
4. YOLO Filtering: YOLO bounding boxes are used to validate SAM masks.
5. Median Depth: Filtered masks are compared using their median depth values.
Key Parameters to Tune
• YOLO Confidence Threshold (conf): Lower it to catch more apples at the risk of false positives; raise it for cleaner but fewer detections.
• IoU Threshold: Controls how closely SAM masks must match YOLO detections.
• Depth Map Inversion: Invert depth values for better visualization, if needed.