This is the Git repository for the CS611 ML Engineering project on Airbus ship detection.
- Wong Songhan
- Koh Enyong
- Arnold Ng
- Gabriel Quek
In this project, we tackle the problem of identifying ships in satellite images. We see three main applications for this problem:
- Maritime Traffic Management – Improves general situational awareness, especially for small vessels not covered by AIS
- Maritime Surveillance & Policing – For detection and tracking of vessels with AIS turned off, which may be engaged in illegal activity
- Naval Warfare – An additional source of intelligence for detecting enemy locations
The dataset was retrieved from Kaggle. Attributes of the dataset:
- 192,556 images from the Airbus Ship Detection Challenge
- Each image may contain multiple ships
- Labels are run-length encoded (RLE) for compression and need to be converted to single-channel mask images
Visit the Kaggle competition page for more information.
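As a rough illustration of the label conversion mentioned above, the sketch below decodes an RLE string into a single-channel binary mask. It assumes the challenge's convention of 1-indexed `start length` pairs, with pixels numbered top to bottom and then left to right, and 768×768 images. The names `rle_decode` and `combine_masks` are illustrative and are not taken from this repo.

```python
import numpy as np


def rle_decode(rle, shape=(768, 768)):
    """Decode one RLE string into a binary mask of shape (height, width).

    Assumes 1-indexed "start length" pairs in column-major pixel order.
    A missing label (e.g. NaN for an image without ships) gives an empty mask.
    """
    flat = np.zeros(shape[0] * shape[1], dtype=np.uint8)
    if isinstance(rle, str) and rle.strip():
        tokens = np.asarray(rle.split(), dtype=int)
        starts = tokens[0::2] - 1  # convert 1-indexed starts to 0-indexed
        lengths = tokens[1::2]
        for start, length in zip(starts, lengths):
            flat[start:start + length] = 1
    # order="F" fills the mask column by column (top to bottom, then left to right)
    return flat.reshape(shape, order="F")


def combine_masks(rles, shape=(768, 768)):
    """Merge the per-ship RLE strings of one image into a single mask."""
    combined = np.zeros(shape, dtype=np.uint8)
    for rle in rles:
        combined = np.maximum(combined, rle_decode(rle, shape))
    return combined
```

Because each image may contain several ships, each with its own RLE string, `combine_masks` takes the pixel-wise maximum so that overlapping or adjacent ships still produce a valid binary mask.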
Below are the components of our entire pipeline:
We build and explore the model interactively against the input dataset, so that we understand the dataset and the problem well before training the model and building the pipeline components.
Because the input dataset and the problem itself are complex, preprocessing is essential to give the pipeline good input data.
In this section, we create a component that computes the data statistics.
We build the model training component used by the overall pipeline. It is deployed as part of the CI/CD process, which retrains the model when certain triggers fire.
We build the evaluation component, which evaluates the trained model and outputs metrics.
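The README does not say which metrics the evaluation component reports. As a minimal sketch, assuming the model outputs binary segmentation masks, the two functions below compute intersection-over-union (IoU) and the Dice coefficient, which are common choices for this kind of task. The function name `iou_and_dice` is hypothetical and may not match `eval_component.py`.

```python
import numpy as np


def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two binary masks of the same shape.

    Assumption: two empty masks count as a perfect match (1.0, 1.0),
    which suits images that contain no ships.
    """
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    intersection = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        return 1.0, 1.0
    return float(intersection / union), float(2 * intersection / total)
```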
The model is deployed to Vertex AI, which serves it through an endpoint.
We string the pipeline together, alongside test components that ensure every component is in order before the pipeline is pushed to the Vertex AI platform.
Using the data statistics generated in Step (3), this notebook assesses new data for train-serve drift.
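One simple way to think about such a drift check is sketched below. It is an assumption for illustration, not the notebook's actual method: it summarizes each dataset by the mean and standard deviation of per-image brightness, then flags drift when the new mean lies more than `threshold` baseline standard deviations away from the training mean. The names `image_brightness_stats`, `drift_score`, and `has_drift`, and the threshold of 3.0, are all hypothetical.

```python
import numpy as np


def image_brightness_stats(images):
    """Summarize a collection of images by their per-image mean pixel value."""
    per_image = np.array([np.asarray(img, dtype=np.float64).mean() for img in images])
    return {
        "mean": float(per_image.mean()),
        "std": float(per_image.std()),
        "count": int(per_image.size),
    }


def drift_score(baseline, new, eps=1e-12):
    """Shift of the new mean, in units of the baseline standard deviation."""
    return abs(new["mean"] - baseline["mean"]) / (baseline["std"] + eps)


def has_drift(baseline, new, threshold=3.0):
    """Flag drift when the standardized mean shift exceeds the threshold."""
    return drift_score(baseline, new) > threshold


# Toy example with constant-valued 2x2 "images"
train_stats = image_brightness_stats([np.full((2, 2), v) for v in (100, 110, 120)])
serve_stats = image_brightness_stats([np.full((2, 2), v) for v in (200, 210)])
print(has_drift(train_stats, serve_stats))
```

A production check would usually compare full feature distributions rather than a single summary statistic, but the structure is the same: compute statistics once on the training data, then compare each batch of new data against them.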
This notebook demonstrates calling the endpoint's RESTful API, which returns a model prediction for a given input image.
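The request such a demo sends can be sketched as follows. The URL pattern and the top-level `instances` field follow the Vertex AI `predict` REST method. The instance layout (`image_bytes` with a base64 payload) is an assumption, since the real layout depends on how the deployed model's serving signature was defined. `build_predict_request` is a hypothetical helper, and the project, region, and endpoint values are placeholders.

```python
import base64
import json


def build_predict_request(project, region, endpoint_id, image_bytes):
    """Build the URL and JSON body for a Vertex AI endpoint prediction call."""
    url = (
        f"https://{region}-aiplatform.googleapis.com/v1/projects/{project}"
        f"/locations/{region}/endpoints/{endpoint_id}:predict"
    )
    # Assumed instance format: one base64-encoded image per instance
    instance = {"image_bytes": {"b64": base64.b64encode(image_bytes).decode("utf-8")}}
    body = json.dumps({"instances": [instance]})
    return url, body


# Placeholder values, for illustration only
url, body = build_predict_request("my-project", "us-central1", "1234567890", b"raw image bytes")
print(url)
```

To actually send the request, POST `body` to `url` with a `Content-Type: application/json` header and an `Authorization: Bearer <token>` header carrying a valid Google Cloud access token.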
├── LICENSE
├── README.md <- The top-level README
├── build
├── config <- config file for GCP resource
├── provision <- terraform config for GCP resource startup
├── Dockerfile <- Docker file for custom model trainer
├── saved_models <- Trained and serialized model data (for exploratory)
│
├── references <- Data dictionaries, manuals, and all other explanatory materials.
│
├── reports <- Generated analysis as HTML, PDF, LaTeX, etc.
│ └── figures <- Generated graphics and figures to be used in reporting
│
├── src <- Source code for use in this project.
│ ├── __init__.py <- Makes src a Python module
│ │
│ ├── evaluation <- Scripts to generate model evaluation component
│ │ └── eval_component.py
│ │
│ │
│ ├── model_training <- Scripts for custom model training
│ │
│ ├── models <- Preprocessing scripts
│ │
│ └── utils <- Common util scripts for data ingest and pre-processing
│ ├── common.py
│ └── dataset.py
│
└── tox.ini <- tox file with settings for running tox; see tox.readthedocs.io
Project based on the cookiecutter data science project template. #cookiecutterdatascience