
Run a complete local LLM stack with one command
Harbor is a containerized CLI toolkit that brings a complete pre-wired LLM stack with hundreds of services to your local machine. It handles Docker Compose orchestration, configuration, and cross-service connectivity so you can run LLM backends like Ollama, llama.cpp, or vLLM alongside frontends like Open WebUI and supporting services like SearXNG, ComfyUI, and Langfuse, all with minimal configuration.
Spin up a complete local LLM stack with a single harbor up command — backends, frontends, and services all pre-wired together
Supports major inference engines (Ollama, llama.cpp, vLLM, TGI, LiteLLM, TabbyAPI, Aphrodite, SGLang) plus dozens of frontends and utilities
Services are automatically connected: running SearXNG enables Web RAG in Open WebUI, and Speaches provides OpenAI-compatible STT and TTS out of the box
Includes a companion desktop app with a clean interface for managing services, configurations, and workflows
Modular enhancement system that adds capabilities like caching, routing, and middleware to your LLM stack
Built-in benchmarking tool for evaluating and comparing LLM performance across different backends and models
Host cache is shared and reused across services — Hugging Face models, Ollama weights, and other artifacts are downloaded once
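A typical session might look like the sketch below. The exact service names and subcommands assume Harbor's documented CLI conventions; treat this as illustrative rather than authoritative:

```shell
# Start the default stack (e.g. an Ollama backend plus the Open WebUI frontend).
harbor up

# Add services to the running stack; Harbor wires them together automatically,
# so SearXNG becomes a web-search source for Open WebUI.
harbor up searxng speaches

# Open the default frontend in a browser, then tear the stack down.
harbor open
harbor down
```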
Set up a complete local AI development environment with inference engines, frontends, and supporting services for model experimentation
Quickly spin up and compare different LLM backends, frontends, and tooling to find the best combination for your workflow
Run AI models and services entirely locally for privacy-sensitive work without sending data to external APIs
Rapidly prototype applications using local LLMs with pre-connected RAG, image generation, voice chat, and code assistance services
Access service CLIs (hf, ollama, etc.) directly via Docker without local installation of each tool
Built on Docker Compose, works across Linux, macOS, and Windows with consistent behavior
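The CLI passthrough mentioned above might be used like this. Command names assume Harbor's aliases for containerized tools, and the model identifier is a hypothetical example:

```shell
# Run each tool's CLI inside its container; nothing extra is installed on the host.
# Downloaded weights land in the shared host cache, so other services reuse them.
harbor hf download microsoft/phi-2   # hypothetical model id for illustration
harbor ollama list
```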