Train and save the model on the iris dataset:
python dev.model.py > ./local_artifacts/train_log.txt 2>&1
Run inference tests on the iris dataset and on custom inputs:
python dev.inference.py > ./local_artifacts/inference_log.txt 2>&1
Start the server:
uvicorn serve.inference_api:app --reload
Predict Setosa
curl -X POST "http://localhost:8000/predict" \
-H "Content-Type: application/json" \
-d '{
"sepal_length": 5.1,
"sepal_width": 3.5,
"petal_length": 1.4,
"petal_width": 0.2
}'
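For scripted testing, the same request can be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_predict_request` and `predict` are hypothetical and not part of this repo, and the base URL assumes the default uvicorn address used above.

```python
import json
import urllib.request

# Feature names and values are copied from the curl example above.
SETOSA_SAMPLE = {
    "sepal_length": 5.1,
    "sepal_width": 3.5,
    "petal_length": 1.4,
    "petal_width": 0.2,
}


def build_predict_request(features, base_url="http://localhost:8000"):
    """Build (but do not send) a JSON POST request for the /predict endpoint."""
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(features, base_url="http://localhost:8000", timeout=10.0):
    """Send the request and return the decoded JSON response as a dict."""
    request = build_predict_request(features, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (requires the server to be running):
#   print(predict(SETOSA_SAMPLE))
```

Passing a different `base_url` should let the same helper target the Docker container or the deployed app.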
Build and run the Docker image:
sudo docker build -t iris .
sudo docker run -d -p 8000:8000 iris
Test the container with the same curl request as above.
CPU Version:
flyctl deploy --remote-only --config cpu.fly.toml --dockerfile ./Dockerfile.cpu
GPU Version:
flyctl deploy --remote-only --config gpu.fly.toml --dockerfile ./Dockerfile.gpu
Health Check:
curl -X GET "https://good-old-iris-model.fly.dev/env" \
-H "Content-Type: application/json"
Expected Response:
{ "env_hf_token": true, "env_hf_repo": true, "is_gpu_available": false }
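A deployment script could check the decoded `/env` response automatically. The sketch below is an assumption about how one might do that: `check_env` is a hypothetical helper, and it assumes that `true` for `env_hf_token` and `env_hf_repo` means the corresponding settings are configured on the server.

```python
def check_env(status, require_gpu=False):
    """Return a list of problems found in a decoded /env response.

    An empty list means every checked flag is true.
    """
    problems = []
    # Assumption: a true value means the corresponding setting is configured.
    for key in ("env_hf_token", "env_hf_repo"):
        if status.get(key) is not True:
            problems.append(f"{key} is not set")
    # The GPU flag is only enforced when the caller expects a GPU machine.
    if require_gpu and status.get("is_gpu_available") is not True:
        problems.append("GPU is not available")
    return problems


# The expected CPU response shown above passes the default check.
cpu_status = {"env_hf_token": True, "env_hf_repo": True, "is_gpu_available": False}
print(check_env(cpu_status))
```

For the GPU deployment, calling `check_env(status, require_gpu=True)` would flag a machine that came up without a GPU.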
Model Inference:
Predict Virginica
curl -X POST "https://good-old-iris-model.fly.dev/predict" \
-H "Content-Type: application/json" \
-d '{
"sepal_length": 7.7,
"sepal_width": 3.8,
"petal_length": 6.7,
"petal_width": 2.2
}'
Predict Setosa
curl -X POST "https://good-old-iris-model.fly.dev/predict" \
-H "Content-Type: application/json" \
-d '{
"sepal_length": 5.1,
"sepal_width": 3.5,
"petal_length": 1.4,
"petal_width": 0.2
}'
Example response (for the Virginica request; exact values may differ):
{
"predicted_class": 2,
"predicted_class_name": "virginica",
"confidence": 0.54,
"probabilities": {
"setosa": 0.1,
"versicolor": 0.36,
"virginica": 0.54
}
}
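The fields in the example response appear to be related: `confidence` matches the largest entry in `probabilities`, and `predicted_class_name` is the key of that entry. A minimal sketch of a consistency check follows, assuming that relationship holds for every response; `find_inconsistencies` is a hypothetical helper and the tolerance is an arbitrary choice to allow for rounded values.

```python
import math

# Copied from the example response above.
EXAMPLE_RESPONSE = {
    "predicted_class": 2,
    "predicted_class_name": "virginica",
    "confidence": 0.54,
    "probabilities": {"setosa": 0.1, "versicolor": 0.36, "virginica": 0.54},
}


def find_inconsistencies(response, tolerance=0.01):
    """Return a list of problems; an empty list means the response looks consistent."""
    problems = []
    probabilities = response["probabilities"]

    # Class probabilities should sum to 1, up to rounding.
    total = sum(probabilities.values())
    if not math.isclose(total, 1.0, abs_tol=tolerance):
        problems.append(f"probabilities sum to {total:.3f}, not 1")

    # The predicted class name should be the most probable class.
    top_name = max(probabilities, key=probabilities.get)
    if response["predicted_class_name"] != top_name:
        problems.append(f"predicted_class_name is not the top class ({top_name})")

    # The reported confidence should equal the top probability.
    if not math.isclose(response["confidence"], probabilities[top_name], abs_tol=tolerance):
        problems.append("confidence does not match the top probability")

    return problems


print(find_inconsistencies(EXAMPLE_RESPONSE))
```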
- GPU deployment requires a manual account review by Fly.io. Until the account is approved, the deploy fails with:
✖ Failed: error creating a new machine: failed to launch VM: Your organization is not allowed to use GPU machines. Please contact [email protected]
Please contact [email protected] (Request ID: 01JGHZ65QWG5FNV757MFYJTBQ7-iad) (Trace ID: 7c92684e1b16fd263ee75b2b5b34e7e9)
See forum discussion.
- CI/CD works for both the CPU and GPU versions, but GPU machine creation fails because of the Fly.io restriction above.