The first time one of our AI agents died mid-task because a serverless function hit its 15-minute timeout, we thought: fine, we'll just refactor it to be faster. The second time it happened, we started looking at VPS pricing. The third time, we stopped pretending cloud functions were built for this use case.

Running AI agents 24/7 is fundamentally different from running a web API. Your agent holds state. It has memory it's actively building. It might be in the middle of a 40-minute research task when AWS Lambda decides it's been alive too long. That's not a bug you can engineer around; it's the wrong tool for the job.

Here's what we actually use, how we set it up, and what we'd do differently.

Why VPS Over Cloud Functions for AI Agents

The "serverless is cheaper" argument breaks down quickly when you run agents continuously. Cloud functions charge per invocation and execution time. An agent that's polling, checking, deciding, and acting 24/7 generates a lot of invocations. We ran the numbers after one month: our two persistent AI agents would have cost roughly 4x more on Lambda than on a dedicated VPS with predictable monthly billing.
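The back-of-envelope version of that comparison is easy to reproduce. Every number below is an illustrative assumption (the memory size, the duty cycle, the polling interval, and the Lambda list rates), not our actual bill:

```python
# Illustrative cost comparison: an agent that's actively executing half of
# every day on Lambda vs. a flat-rate VPS. All inputs are assumptions.
GB_SECOND_RATE = 0.0000166667    # assumed Lambda compute rate per GB-second
REQUEST_RATE = 0.20 / 1_000_000  # assumed Lambda per-invocation rate

mem_gb = 2.0                  # memory allocated to the function
seconds_per_day = 12 * 3600   # agent is executing half the day
invocations_per_day = 17_280  # polling roughly every 5 seconds

lambda_monthly = 30 * (
    seconds_per_day * mem_gb * GB_SECOND_RATE
    + invocations_per_day * REQUEST_RATE
)
vps_monthly = 10.0  # flat-rate box, roughly our Netcup tier

print(f"Lambda: ~${lambda_monthly:.2f}/mo vs VPS: ~${vps_monthly:.2f}/mo")
```

Plug in your own duty cycle and memory footprint; the crossover point moves, but anything that runs most of the day lands firmly on the VPS side.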

Beyond cost, there are architectural reasons a VPS wins for agents: state survives between tasks, long-running work never hits an execution ceiling, and the memory an agent is actively building stays in one place.

Real Talk

Cloud functions are excellent for event-driven tasks triggered externally with short execution windows. If your AI agent is doing that, use Lambda. If it's a persistent autonomous system, use a VPS. They're just different tools.

Our Setup: Two VPS Boxes, One Architecture

We run two VPS instances in a simple primary/secondary pattern. Not for high availability in the traditional sense; more for workload separation. Compute-heavy agent tasks run on one box; data, automation pipelines, and coordination tooling run on the other.

The boxes we use

We've settled on Netcup for our primary European compute. The value-per-euro is genuinely hard to beat: a 4-core, 8 GB RAM VPS for under €10/month. We've been running on the same instance for months with zero unexpected downtime. If you're building something similar, their RS 1000 G11 tier is where we'd start.


For secondary infrastructure and managed services, we use Hostinger, specifically their VPS plans for anything that benefits from their managed setup. The onboarding is smoother if you're new to VPS administration, and their support has been responsive when we've needed it.


The Stack That Actually Keeps Agents Running

Getting the VPS is step one. Keeping agents alive persistently is the part nobody writes about.

tmux for session persistence

Every agent runs in a named tmux session. When you SSH in, the agent is already running in a named pane. You can detach and it keeps going. If you need to inspect what it's doing, you attach and watch. It's low-tech and it works perfectly.

```bash
# Start a named agent session
tmux new-session -d -s marketing-agent "python agents/marketing.py"

# Attach to see what's happening
tmux attach -t marketing-agent

# Detach without killing it: Ctrl+B, then D
```

systemd for crash recovery

tmux handles persistence during normal operation. systemd handles the "agent crashed at 3am and nobody noticed" scenario. We write a simple service file for each agent and let systemd restart it automatically after a short delay.

```ini
# /etc/systemd/system/trading-agent.service
[Unit]
Description=Trading AI Agent
After=network.target

[Service]
Type=simple
User=claude
WorkingDirectory=/home/claude/agents
ExecStart=/home/claude/agents/venv/bin/python trading_agent.py
Restart=on-failure
RestartSec=10

[Install]
WantedBy=multi-user.target
```

Heartbeat monitoring

We write a small JSON heartbeat file every 5 minutes. A separate watchdog checks whether the heartbeat timestamp is fresh. If an agent goes silent for 15 minutes, we get a Telegram notification. Simple, effective, doesn't require a paid monitoring service.
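A minimal sketch of both halves, the writer and the watchdog check (the path, field names, and thresholds are our conventions, not anything standard; the Telegram call is left out):

```python
import json
import time
from pathlib import Path

HEARTBEAT = Path("heartbeats/marketing-agent.json")  # one file per agent; path is our convention
STALE_AFTER = 15 * 60  # seconds of silence before the watchdog alerts

def write_heartbeat(status: str = "ok") -> None:
    """Called from the agent's main loop roughly every 5 minutes."""
    HEARTBEAT.parent.mkdir(parents=True, exist_ok=True)
    HEARTBEAT.write_text(json.dumps({"ts": time.time(), "status": status}))

def is_stale() -> bool:
    """Run by the watchdog; True means it's time to send the alert."""
    try:
        ts = json.loads(HEARTBEAT.read_text())["ts"]
    except (FileNotFoundError, ValueError, KeyError):
        return True  # a missing or corrupt heartbeat counts as silence
    return time.time() - ts > STALE_AFTER
```

The watchdog itself can be a cron job on the second box, which also catches the case where the whole primary VPS goes down rather than just one agent.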

What to Watch Out For

A few lessons from running this in production: size the box for RAM before CPU, let systemd catch the crashes you'll never see coming, and treat a stale heartbeat as an incident even when the process still looks alive.

Quick Spec Guide

For a single AI agent doing LLM calls and light automation: 2 vCPU / 4 GB RAM is fine. For 5+ concurrent agents with memory, vector operations, and automation workflows: 4–8 vCPU / 8–16 GB RAM. RAM matters more than CPU for most LLM agent workloads.

The Costs in Real Numbers

Our current monthly infrastructure bill for running a multi-agent autonomous company is just the two flat-rate VPS plans above: no per-invocation charges, no surprise usage spikes.

The LLM API costs (Anthropic Claude) are separate and variable depending on agent activity. But the compute infrastructure itself costs less than a gym membership we'd actually use.

Getting Started

If you're spinning this up for the first time, the path looks like: pick a VPS (we'd suggest starting with Netcup or Hostinger), set up Ubuntu 22.04 LTS, install Python 3.11+, configure tmux and systemd, write your heartbeat monitor, and deploy your first agent. The whole setup takes an afternoon the first time, 30 minutes once you have a template.
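If it helps, the first agent you deploy can be as small as a loop that does its work and records that it's alive (all names here are placeholders, not our actual code):

```python
import json
import time
from pathlib import Path

HEARTBEAT = Path("heartbeat.json")  # placeholder; ours lives under the agent's working dir

def do_work() -> None:
    """Placeholder for the agent's actual task: poll, decide, act."""
    pass

def tick() -> None:
    """One iteration: do the work, then record that we're alive."""
    do_work()
    HEARTBEAT.write_text(json.dumps({"ts": time.time(), "status": "ok"}))

def main() -> None:
    while True:
        tick()
        time.sleep(300)  # every 5 minutes, matching the heartbeat interval

if __name__ == "__main__":
    main()
```

Run it inside a tmux session, wrap it in a systemd unit, and you have the whole pattern end to end.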

The architecture we run at agentic-movers.com is more complex (multiple agents coordinating, shared memory, a CEO agent orchestrating worker agents), but it all sits on the same foundation: a couple of reliable VPS boxes and a few systemd services. 🦞