diff --git a/.dockerignore b/.dockerignore
index b694934..1d17dae 100644
--- a/.dockerignore
+++ b/.dockerignore
@@ -1 +1 @@
-.venv
\ No newline at end of file
+.venv
diff --git a/.github/ISSUE_TEMPLATE/bugs.yaml b/.github/ISSUE_TEMPLATE/bugs.yaml
index acead36..21b736a 100644
--- a/.github/ISSUE_TEMPLATE/bugs.yaml
+++ b/.github/ISSUE_TEMPLATE/bugs.yaml
@@ -7,12 +7,12 @@ body:
     attributes:
       value: |
         ## Instructions To Reproduce the 🐛 Bug:
-        
+
         1. Background explanation
 
   - type: textarea
     attributes:
-      label: Full runnable code or full changes you made:
+      label: "Full runnable code or full changes you made:"
       description: Please provide the code or changes that led to the bug.
       placeholder: |
         ```
@@ -22,7 +22,7 @@ body:
 
   - type: textarea
     attributes:
-      label: What exact command you ran:
+      label: "What exact command you ran:"
       description: Describe the exact command you ran that triggered the bug.
     validations:
       required: true
@@ -51,4 +51,4 @@ body:
       description: Indicate your environment details.
       options:
         - label: "I'm using the latest version!"
-        - label: "It's not a user-side mistake!"
\ No newline at end of file
+        - label: "It's not a user-side mistake!"
diff --git a/.github/ISSUE_TEMPLATE/documentation.yaml b/.github/ISSUE_TEMPLATE/documentation.yaml
index f583259..ea40997 100644
--- a/.github/ISSUE_TEMPLATE/documentation.yaml
+++ b/.github/ISSUE_TEMPLATE/documentation.yaml
@@ -6,7 +6,7 @@ body:
     attributes:
       value: |
         ## 📚 Documentation Issue
-        
+
         This issue category is for problems about existing documentation, not for asking how-to questions.
 
   - type: input
diff --git a/.github/ISSUE_TEMPLATE/feature-request.yaml b/.github/ISSUE_TEMPLATE/feature-request.yaml
index 8303a6b..a4d5dfe 100644
--- a/.github/ISSUE_TEMPLATE/feature-request.yaml
+++ b/.github/ISSUE_TEMPLATE/feature-request.yaml
@@ -6,7 +6,7 @@ body:
     attributes:
       value: |
         ## 🚀 Feature
-        
+
         A clear and concise description of the feature proposal.
 
   - type: textarea
@@ -25,6 +25,6 @@ body:
     attributes:
       value: |
         ## Note
-        
+
         We only consider adding new features if they are relevant to this library.
         Consider if this new feature deserves to be here or should be a new library.
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
index e538be2..0d2cdbb 100644
--- a/.pre-commit-config.yaml
+++ b/.pre-commit-config.yaml
@@ -63,4 +63,4 @@ repos:
         additional_dependencies:
           - mdformat-gfm
           - mdformat_frontmatter
-        exclude: CHANGELOG.md
\ No newline at end of file
+        exclude: CHANGELOG.md
diff --git a/Dockerfile b/Dockerfile
index 63f549c..62f1b9f 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -21,4 +21,4 @@ RUN pip install -r requirements.txt
 EXPOSE 8000
 
 # Entry point
-CMD ["python3", "app.py"]
\ No newline at end of file
+CMD ["python3", "app.py"]
diff --git a/README.md b/README.md
index 0df9e52..3a2fa7e 100644
--- a/README.md
+++ b/README.md
@@ -43,7 +43,6 @@ docker build --tag lang-segment-anything:latest .
 docker run --gpus all -p 8000:8000 lang-segment-anything:latest
 ```
 
-
 ### Usage
 
 To run the gradio APP:
@@ -69,7 +68,6 @@ results = model.predict([image_pil], [text_prompt])
 
 ![fruits.png](/assets/outputs/fruits.png)
 
-
 ## Acknowledgments
 
 This project is based on/used the following repositories:
diff --git a/lang_sam/lang_sam.py b/lang_sam/lang_sam.py
index b3729b0..3c319d0 100644
--- a/lang_sam/lang_sam.py
+++ b/lang_sam/lang_sam.py
@@ -20,8 +20,7 @@ def predict(
         box_threshold: float = 0.3,
         text_threshold: float = 0.25,
     ):
-        """
-        Predicts masks for given images and text prompts using GDINO and SAM models.
+        """Predicts masks for given images and text prompts using GDINO and SAM models.
 
         Parameters:
             images_pil (list[Image.Image]): List of input images.
diff --git a/lang_sam/models/gdino.py b/lang_sam/models/gdino.py
index 4522a90..86c5b04 100644
--- a/lang_sam/models/gdino.py
+++ b/lang_sam/models/gdino.py
@@ -1,7 +1,7 @@
-import numpy as np
 import torch
 from PIL import Image
 from transformers import AutoModelForZeroShotObjectDetection, AutoProcessor
+
 from lang_sam.models.utils import get_device_type
 
 device_type = get_device_type()
diff --git a/lang_sam/models/sam.py b/lang_sam/models/sam.py
index 220352e..4ee9e9f 100644
--- a/lang_sam/models/sam.py
+++ b/lang_sam/models/sam.py
@@ -5,6 +5,7 @@
 from omegaconf import OmegaConf
 from sam2.automatic_mask_generator import SAM2AutomaticMaskGenerator
 from sam2.sam2_image_predictor import SAM2ImagePredictor
+
 from lang_sam.models.utils import get_device_type
 
 DEVICE = torch.device(get_device_type())
diff --git a/lang_sam/models/utils.py b/lang_sam/models/utils.py
index bb4f003..d822d39 100644
--- a/lang_sam/models/utils.py
+++ b/lang_sam/models/utils.py
@@ -1,4 +1,5 @@
 import logging
+
 import torch
 
 
diff --git a/lang_sam/server.py b/lang_sam/server.py
index fa545ff..e460af5 100644
--- a/lang_sam/server.py
+++ b/lang_sam/server.py
@@ -18,8 +18,7 @@ def setup(self, device: str) -> None:
         print("LangSAM model initialized.")
 
     def decode_request(self, request) -> dict:
-        """
-        Decode the incoming request to extract parameters and image bytes.
+        """Decode the incoming request to extract parameters and image bytes.
 
         Assumes the request is sent as multipart/form-data with fields:
         - sam_type: str
@@ -50,15 +49,17 @@ def decode_request(self, request) -> dict:
         }
 
     def predict(self, inputs: dict) -> dict:
-        """
-        Perform prediction using the LangSAM model.
+        """Perform prediction using the LangSAM model.
 
         Yields:
             dict: Contains the processed output image.
         """
         print("Starting prediction with parameters:")
         print(
-            f"sam_type: {inputs['sam_type']}, box_threshold: {inputs['box_threshold']}, text_threshold: {inputs['text_threshold']}, text_prompt: {inputs['text_prompt']}"
+            f"sam_type: {inputs['sam_type']}, \
+            box_threshold: {inputs['box_threshold']}, \
+            text_threshold: {inputs['text_threshold']}, \
+            text_prompt: {inputs['text_prompt']}"
         )
 
         if inputs["sam_type"] != self.model.sam_type:
@@ -96,8 +97,7 @@ def predict(self, inputs: dict) -> dict:
         return {"output_image": output_image}
 
     def encode_response(self, output: dict) -> Response:
-        """
-        Encode the prediction result into an HTTP response.
+        """Encode the prediction result into an HTTP response.
 
         Returns:
             Response: Contains the processed image in PNG format.
diff --git a/requirements.txt b/requirements.txt
index bcb5276..26def0e 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -11,4 +11,4 @@ uvloop==0.20.0
 --extra-index-url https://download.pytorch.org/whl/cu124
 torch==2.4.1
 --extra-index-url https://download.pytorch.org/whl/cu124
-torchvision==0.19.1
\ No newline at end of file
+torchvision==0.19.1
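
Review note on the `lang_sam/server.py` hunk in `predict`: the rewrapped f-string uses backslash line continuations inside the string literal. In Python, a backslash-newline inside a non-raw string joins the lines but keeps the next line's leading indentation as part of the string, so the logged message is probably no longer identical to the original single-line version. The sketch below illustrates the difference and one possible alternative using adjacent f-string literals. The `inputs` values are made-up placeholders, not taken from the project.

```python
# Hypothetical stand-in for the request parameters; values are illustrative only.
inputs = {
    "sam_type": "example_sam_type",
    "box_threshold": 0.3,
    "text_threshold": 0.25,
}

# Backslash continuation inside the literal: the indentation of the
# following line becomes part of the resulting string.
continued = f"sam_type: {inputs['sam_type']}, \
    box_threshold: {inputs['box_threshold']}"

# Adjacent f-string literals inside parentheses are concatenated by the
# compiler, so no extra whitespace is introduced.
joined = (
    f"sam_type: {inputs['sam_type']}, "
    f"box_threshold: {inputs['box_threshold']}"
)

print(repr(continued))
print(repr(joined))
```

Splitting the message into adjacent literals keeps each source line short while leaving the printed text the same as the original one-line f-string.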