Running a Company with 20+ Claude Code Agents — 3 Months of Production Data
Three months ago I started replacing human roles with Claude Code agents. Not as a side experiment — as the actual operating model of a real company. Here's what I've learned, including what failed spectacularly.
The short version: It works. But not because of the model. Because of the architecture.
What "AI-operated" actually means
When I say the company is AI-operated, I mean this literally: there are four departments (ICT, Marketing, Operations, Strategy), and each has a Department Head agent that runs 24/7 in its own session on a VPS. These agents spawn worker agents for specific tasks, monitor results, and escalate to me (the human founder) only when something requires a decision I've explicitly reserved for myself.
My daily involvement: reviewing a handful of Telegram messages, approving social media posts (I still keep a human-in-the-loop for anything public), and making occasional strategic decisions. That's it. Everything else runs.
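To make the escalation rule concrete, here is a minimal sketch of how "escalate only reserved decisions" could be expressed. It is an illustration, not the production logic: the `TaskResult` type, the `route` and `founder_inbox` functions, and the decision labels in `RESERVED_FOR_HUMAN` are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass

# Hypothetical labels for decisions the founder keeps for himself.
# The post only names public posts and strategic decisions.
RESERVED_FOR_HUMAN = {"publish_public_content", "strategic_decision"}


@dataclass
class TaskResult:
    department: str
    decision_type: str
    summary: str


def route(result: TaskResult) -> str:
    """Return 'escalate' for reserved decisions, otherwise 'handle'."""
    if result.decision_type in RESERVED_FOR_HUMAN:
        return "escalate"
    return "handle"


def founder_inbox(results: list[TaskResult]) -> list[str]:
    """Collect the short messages that would reach the human founder."""
    return [
        f"[{r.department}] {r.summary}"
        for r in results
        if route(r) == "escalate"
    ]
```

Everything that does not match a reserved decision type stays inside the department, which is what keeps the founder's inbox down to a handful of messages.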
The tech stack (boring, on purpose)
- Claude Code: Anthropic's terminal-based coding agent. Each agent runs as its own session.
- 3 VPS instances: €20-25/month each (Hetzner/Netcup). CEO + Strategy on one, ICT + Marketing on another, Operations + n8n on the third.
- n8n: self-hosted workflow automation. Handles trial email sequences, social media scheduling, and Supabase writes.
- Supabase: self-hosted Postgres for user data and agent memory
- Qdrant: vector database for agent long-term memory (what worked, what failed)
- systemd + tmux: keeping agents alive. Not Kubernetes. Not Docker Swarm. tmux.
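Total infrastructure cost: ~€10/day including Anthropic API usage. The three VPS instances account for roughly €2-2.50/day of that (€60-75/month). A small human team would cost roughly 100x more.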
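For readers wondering what "keeping agents alive" with tmux can look like, the following is a rough sketch under stated assumptions. The session names and the launch command in `AGENT_SESSIONS` are hypothetical (the post does not show them), and `ensure_sessions` is an illustrative helper, not the setup actually used. It relies only on two standard tmux subcommands, `has-session` and `new-session`.

```python
import subprocess
from typing import Callable, Sequence

# Hypothetical session names and launch command; adjust to your own setup.
AGENT_SESSIONS = {
    "ict-head": "claude",
    "marketing-head": "claude",
}

Runner = Callable[[Sequence[str]], int]


def run(cmd: Sequence[str]) -> int:
    """Run a command and return its exit code."""
    return subprocess.run(cmd, check=False).returncode


def ensure_sessions(sessions: dict[str, str], runner: Runner = run) -> list[str]:
    """Start a detached tmux session for every agent that is not running.

    Returns the names of the sessions that had to be (re)started.
    """
    restarted = []
    for name, command in sessions.items():
        # `tmux has-session` exits non-zero when no matching session exists.
        if runner(["tmux", "has-session", "-t", name]) != 0:
            runner(["tmux", "new-session", "-d", "-s", name, command])
            restarted.append(name)
    return restarted
```

One plausible arrangement is to have a systemd service or timer call `ensure_sessions(AGENT_SESSIONS)` periodically, so that a crashed session comes back without manual intervention. The `runner` parameter exists so the logic can be exercised without tmux installed.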
The hierarchy that actually works
The first architecture I tried was flat: one big agent with access to everything. It failed within a week. The context window bloated, the agent started making inconsistent decisions, and there was no way to debug which "version" of the agent made a given choice.
What works is a strict hierarchy:
CEO Agent (Opus 4.6)
├── ICT Department Head (Sonnet 4.6)
│   ├── DevOps Group Lead
│   │   ├── Server Monitor Worker (Haiku 4.5)
│   │   └── Deploy Worker (Haiku 4.5)
│   └── Development Group Lead
│       └── Code Worker (Sonnet 4.6)
├── Marketing Department Head (Sonnet 4.6)
│   ├── Social Media Worker
│   └── Blog Writer Worker
├── Operations Department Head (Sonnet 4.6)
│   ├── Email Triage Worker
│   └── Bookkeeping Worker
└── Strategy Department Head (Sonnet 4.6)
    ├── KPI Monitor Worker
    └── Research Worker
Each level uses a different model tier: Opus for decisions, Sonnet for coordination, Haiku for execution (the Code Worker is the one exception and runs on Sonnet). This keeps costs rational: you don't need Opus to restart a service.
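The hierarchy can also be written down as plain data, which makes the tier split easy to inspect. The sketch below mirrors the tree above; the `HIERARCHY` structure and `count_by_tier` helper are illustrative names, and agents whose model the post does not state are recorded as `None` rather than guessed.

```python
from collections import Counter

# name -> (model tier or None if the post does not state it, children)
HIERARCHY = {
    "CEO Agent": ("Opus 4.6", {
        "ICT Department Head": ("Sonnet 4.6", {
            "DevOps Group Lead": (None, {
                "Server Monitor Worker": ("Haiku 4.5", {}),
                "Deploy Worker": ("Haiku 4.5", {}),
            }),
            "Development Group Lead": (None, {
                "Code Worker": ("Sonnet 4.6", {}),
            }),
        }),
        "Marketing Department Head": ("Sonnet 4.6", {
            "Social Media Worker": (None, {}),
            "Blog Writer Worker": (None, {}),
        }),
        "Operations Department Head": ("Sonnet 4.6", {
            "Email Triage Worker": (None, {}),
            "Bookkeeping Worker": (None, {}),
        }),
        "Strategy Department Head": ("Sonnet 4.6", {
            "KPI Monitor Worker": (None, {}),
            "Research Worker": (None, {}),
        }),
    }),
}


def count_by_tier(tree: dict) -> Counter:
    """Walk the tree recursively and count agents per model tier."""
    counts = Counter()
    for _name, (tier, children) in tree.items():
        counts[tier or "unspecified"] += 1
        counts.update(count_by_tier(children))
    return counts
```

Calling `count_by_tier(HIERARCHY)` shows a single Opus agent at the top and the cheaper tiers everywhere below it, which is the cost argument in data form.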
TEAM_COMMAND.md — how agents know what to do
Every agent has a TEAM_COMMAND.md file that defines its role, responsibilities, and — critically — what it's not allowed to do. This file is the agent's constitution. It's also checked into git.
Example: the Social Media Worker knows it generates posts, formats them, and submits them for human review. It does not post directly. It does not touch other departments. It does not spend more than €2/day on API calls. These aren't suggestions — they're constraints enforced by the prompt.
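The post describes these limits as enforced by the prompt. As a complementary illustration only, the same rules could be mirrored in a small code-level guard. This is an assumption on my part, not something the source describes: `AgentPolicy`, its fields, and the action names are hypothetical, and only the three permitted activities and the €2/day cap come from the text.

```python
from dataclasses import dataclass


@dataclass
class AgentPolicy:
    """Hypothetical structured mirror of one TEAM_COMMAND.md file."""
    role: str
    allowed_actions: set[str]
    daily_budget_eur: float
    spent_today_eur: float = 0.0

    def check(self, action: str, cost_eur: float = 0.0) -> tuple[bool, str]:
        """Allow an action only if it is whitelisted and within budget."""
        if action not in self.allowed_actions:
            return False, f"{self.role} may not perform '{action}'"
        if self.spent_today_eur + cost_eur > self.daily_budget_eur:
            return False, f"{self.role} would exceed its daily budget"
        # Record the spend only when the action is actually allowed.
        self.spent_today_eur += cost_eur
        return True, "ok"


# Values taken from the Social Media Worker example in the text.
social_media_worker = AgentPolicy(
    role="Social Media Worker",
    allowed_actions={"generate_post", "format_post", "submit_for_review"},
    daily_budget_eur=2.0,
)
```

With this policy, `social_media_worker.check("publish_post")` is rejected because publishing is not on the whitelist, matching the rule that the worker never posts directly.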
The failures (the actually useful part)
What actually generates value
After three months, the highest-ROI automations are not the impressive-sounding ones:
- Email triage: 8 hours/week saved. Not glamorous. Huge impact.
- Social media: 290+ Instagram followers from zero, automated end to end except for my approval of each post. One agent, running continuously.
- KPI monitoring: Hourly checks with automatic Telegram alerts. I catch problems before they compound.
- Blog writing: 8 blog posts in 4 weeks, each SEO-optimized. None required more than 5 minutes of my time.
The flashy stuff — automated trading signals, complex multi-step reasoning — generated less value per hour invested than boring process automation.
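The KPI monitoring item is simple enough to sketch. The example below is a minimal version under stated assumptions: the KPI names and values in `THRESHOLDS` are invented for illustration (the post does not say what is tracked), and `find_breaches` and `build_alert_request` are hypothetical helpers. The only external interface used is the Telegram Bot API's `sendMessage` method.

```python
import json
import urllib.request

# Hypothetical KPI names and minimum values; replace with your own.
THRESHOLDS = {"signup_conversion_pct": 2.0, "uptime_pct": 99.0}


def find_breaches(metrics: dict[str, float]) -> list[str]:
    """Return one alert line per KPI that fell below its minimum."""
    return [
        f"{name} = {value} (minimum {THRESHOLDS[name]})"
        for name, value in sorted(metrics.items())
        if name in THRESHOLDS and value < THRESHOLDS[name]
    ]


def build_alert_request(
    token: str, chat_id: str, lines: list[str]
) -> urllib.request.Request:
    """Build (but do not send) a Telegram Bot API sendMessage request."""
    body = json.dumps(
        {"chat_id": chat_id, "text": "KPI alert:\n" + "\n".join(lines)}
    )
    return urllib.request.Request(
        f"https://api.telegram.org/bot{token}/sendMessage",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

An hourly job would call `find_breaches` on the latest metrics and, only if the list is non-empty, pass the result of `build_alert_request` to `urllib.request.urlopen`. Sending nothing when all KPIs are healthy is what keeps the alert channel quiet enough to be worth reading.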
What I'd tell myself 3 months ago
The course
I documented all of this — the architecture, the TEAM_COMMAND.md templates, the n8n workflows, the failure patterns — in a course. Not a cleaned-up version. The actual files we use in production, with commentary on what we changed and why.
Five modules: Agent Architecture → Claude Code CLI → Prompting for Agents → Tool Use & MCP → Multi-Agent Systems. 35 interactive activities. Audio narration. The full thing.