When you start running AI agents in production, costs can spiral quickly. A busy agent hitting GPT-4 for every request can rack up hundreds of euros a month without you noticing. Here’s how we keep costs under control while maintaining reliability.
The core principle: cheap by default
The most important decision we made early on was this: expensive models are never the default. Most requests don’t need GPT-4 or Claude Opus. They need a fast, capable model that can follow instructions and handle straightforward tasks.
We use a routing strategy:
- Simple tasks → cheap fast model (Haiku, GPT-4o-mini)
- Reasoning tasks → mid-tier model (Sonnet, GPT-4o)
- Complex analysis → premium model (Opus, GPT-4)
Only the third category costs real money. And it’s rarely triggered.
The infrastructure
We run everything on a single Hetzner VPS — €7/month for a machine that handles our current load comfortably. On it we run:
- OpenClaw — our AI agent gateway, handles routing and model selection
- n8n — workflow automation, self-hosted
- Nginx — serves our static sites
- Docker — keeps everything containerised and easy to manage
Total infrastructure cost: ~€15-20/month including the VPS, domain, and a small buffer.
The AI API cost
This varies by usage, but with smart routing most clients stay under €30/month in AI API costs. We use OpenRouter which lets us switch models instantly and gives us one bill for all providers.
The lesson
Self-hosting isn’t for everyone. But if you’re building AI systems professionally, understanding the infrastructure layer saves you money and gives you control. Start simple, optimise as you grow.