This Docker Compose configuration provides a complete setup for running local AI models with Ollama behind a web interface. It is designed to be accessible remotely, with Cloudflare integration for enhanced security. To run it, you will need:

- Supported NVIDIA GPU
- NVIDIA Container Toolkit
- Docker Compose
The Compose file defines three services:

- **Open WebUI**
  - **Image:** `ghcr.io/open-webui/open-webui:main`
  - **Function:** Serves as the web interface for interacting with the Ollama AI models.
  - **Customization:** Adjust `OLLAMA_API_BASE_URL` to match the internal network URL of the `ollama` service. If running `ollama` on the Docker host, comment out the existing `OLLAMA_API_BASE_URL` and use the provided alternative.
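
A sketch of how this service might be declared in the Compose file; the service name `open-webui`, the `OLLAMA_API_BASE_URL` value, and the container data path are assumptions to adapt to your actual file, while the host port matches the `http://localhost:8080` address used below:

```yaml
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    ports:
      - "8080:8080"                                   # web UI at http://localhost:8080
    environment:
      - OLLAMA_API_BASE_URL=http://ollama:11434/api   # internal URL of the ollama service (assumed value)
    volumes:
      - open-webui:/app/backend/data                  # assumed data path; persists UI data across restarts
    depends_on:
      - ollama
    restart: unless-stopped
```
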
- **Ollama**
  - **Image:** `ollama/ollama`
  - **Function:** Acts as the AI model server, with the capability to utilize NVIDIA GPUs for model inference.
  - **GPU Utilization:** Configured to use NVIDIA GPUs to ensure efficient model inference. Verify your system's compatibility.
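
A sketch of how the GPU reservation can be expressed with Compose's `deploy.resources.reservations.devices` key, which relies on the NVIDIA Container Toolkit listed in the prerequisites; the model-directory mount and `count: all` are assumptions:

```yaml
services:
  ollama:
    image: ollama/ollama
    volumes:
      - ollama:/root/.ollama      # assumed model directory; persists downloaded models
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all          # reserve all available GPUs; adjust as needed
              capabilities: [gpu]
    restart: unless-stopped
```
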
- **Cloudflare Tunnel**
  - **Image:** `cloudflare/cloudflared:latest`
  - **Function:** Provides a secure tunnel to the web UI via Cloudflare, enhancing remote access security.
  - **Note:** The tunnel runs in demo mode by default, so the public URL changes each time you restart it unless you create an account with Cloudflare.
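
A sketch of the tunnel service in demo mode: `cloudflared tunnel --url <target>` opens a temporary quick tunnel, and the target here assumes the web UI is reachable inside the Compose network as `open-webui` on port 8080:

```yaml
services:
  tunnel:
    image: cloudflare/cloudflared:latest
    command: tunnel --url http://open-webui:8080   # quick tunnel; the public URL changes on every restart
    depends_on:
      - open-webui
    restart: unless-stopped
```
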
- **Volumes:** Two volumes, `ollama` and `open-webui`, are defined for data persistence across container restarts.
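
The corresponding top-level declaration looks roughly like this (the service sketches above show where each volume is assumed to be mounted):

```yaml
volumes:
  ollama:        # model data for the ollama service
  open-webui:    # application data for the web UI
```
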
- **Environment Variables:** Ensure `OLLAMA_API_BASE_URL` is correctly set. Use the `host.docker.internal` address if `ollama` runs on the Docker host.
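
The host-mode switch described above might look roughly like this fragment of the web UI's `environment` block; the exact values are assumptions, using Ollama's default port 11434:

```yaml
environment:
  # ollama running as a Compose service on the same internal network:
  - OLLAMA_API_BASE_URL=http://ollama:11434/api
  # ollama running directly on the Docker host: comment out the line above
  # and use the host.docker.internal address instead, for example:
  # - OLLAMA_API_BASE_URL=http://host.docker.internal:11434/api
```
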
- **Deployment:**
  - Run `docker compose up -d` to start the services in detached mode.
- **Accessing the Web UI:**
  - Directly via `http://localhost:8080` if local access is sufficient.
  - Through the Cloudflare Tunnel URL printed in the Docker logs; run `docker compose logs tunnel` to find the URL for remote access.