Inside OpenClaw — why every user gets their own sandboxed container
Most AI products multi-tenant a single process and call it a day. We give every user their own Docker container with zero public egress. Here is why that mattered.
When you start building an AI assistant, the obvious move is to multi-tenant. One server, one process, all users share it. It is cheap, it is fast to ship, and it is what most products do.
We did not do that. Every account gets its own sandboxed container — we call it OpenClaw — that boots on demand, runs the agent loop in isolation, and hibernates when idle. Free users stay in Auto mode with a tiny credit budget; paid users get more usage and manual controls.
This was not a small choice. Containers are heavier, slower to cold-start, and harder to operate than a shared process. We did it anyway because the security and product properties you get back are worth more than the operational cost.
The isolation shape
An OpenClaw container is locked down hard. Every Linux capability is dropped, the root filesystem is read-only, the process runs as a non-root user, and new privileges are forbidden. Memory is capped at 1 GiB with no swap, CPU at one core, and the PID limit is 256 so a runaway loop cannot fork-bomb the host.
```yaml
CapDrop: ALL
User: 1000:1000
ReadonlyRootfs: true
SecurityOpt: [no-new-privileges]
PidsLimit: 256
Memory: 1024MB
CpuLimit: 1.0
Tmpfs:
  /tmp: rw,nosuid,nodev,noexec
```

But the part I am proudest of is the network. The container sits on a Docker network with `internal: true`. There is no route to the public internet. Zero. The container literally cannot reach the outside world.
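The whole shape fits in a single Docker API call. Here is a minimal sketch using the Docker Go SDK; the image name, network name, and container naming scheme are illustrative, not our real ones.

```go
package openclaw

import (
	"context"

	"github.com/docker/docker/api/types/container"
	"github.com/docker/docker/api/types/network"
	"github.com/docker/docker/client"
)

// createSandbox boots one locked-down container for one user.
// "openclaw-agent" and "openclaw-internal" are hypothetical names.
func createSandbox(ctx context.Context, cli *client.Client, userID string) (string, error) {
	pids := int64(256)
	resp, err := cli.ContainerCreate(ctx,
		&container.Config{
			Image: "openclaw-agent:latest", // illustrative image name
			User:  "1000:1000",             // never root
		},
		&container.HostConfig{
			CapDrop:        []string{"ALL"},
			ReadonlyRootfs: true,
			SecurityOpt:    []string{"no-new-privileges"},
			Tmpfs:          map[string]string{"/tmp": "rw,nosuid,nodev,noexec"},
			Resources: container.Resources{
				Memory:     1 << 30,       // 1 GiB hard cap
				MemorySwap: 1 << 30,       // equal to Memory, i.e. no swap
				NanoCPUs:   1_000_000_000, // one core
				PidsLimit:  &pids,         // no fork bombs
			},
		},
		&network.NetworkingConfig{
			EndpointsConfig: map[string]*network.EndpointSettings{
				// This network is created once, elsewhere, with Internal: true,
				// which is what removes any route to the public internet.
				"openclaw-internal": {},
			},
		},
		nil, "openclaw-"+userID)
	if err != nil {
		return "", err
	}
	return resp.ID, nil
}
```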
Why not just multi-tenant?
Three reasons, in increasing order of importance.
Blast radius
In a shared process, a memory leak, a crash, or a misbehaving tool call hits everyone. With per-user containers, a bad message in user A's session cannot touch user B. The crash-loop guard kills a container after five rapid restarts; the inactivity watchdog hibernates it after sixty minutes idle. One user's bad day does not become everyone's outage.
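Both guards are a few lines of state per container. A sketch of the logic, with the restart window assumed; the post says "rapid" without pinning down how rapid, and the kill and hibernate hooks are hypothetical:

```go
package openclaw

import "time"

const (
	maxRestarts   = 5                // crash-loop threshold from the post
	restartWindow = 2 * time.Minute  // assumption: what counts as "rapid"
	idleTimeout   = 60 * time.Minute // hibernate after an hour idle
)

type guard struct {
	restarts     []time.Time // restart timestamps inside the window
	lastActivity time.Time   // updated on every user message
}

// onRestart records a restart and reports whether the container has
// crash-looped: five or more restarts inside the window means kill it
// rather than restart it again.
func (g *guard) onRestart(now time.Time) bool {
	cutoff := now.Add(-restartWindow)
	kept := g.restarts[:0]
	for _, t := range g.restarts {
		if t.After(cutoff) {
			kept = append(kept, t)
		}
	}
	g.restarts = append(kept, now)
	return len(g.restarts) >= maxRestarts
}

// shouldHibernate is polled by the watchdog loop.
func (g *guard) shouldHibernate(now time.Time) bool {
	return now.Sub(g.lastActivity) >= idleTimeout
}
```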
Per-user persona memory
The container is the natural unit for personalisation. Each user's persona, their long-term memories, their preferences — all of that lives in mounted volumes that only their container can read. A shared process would have to do this in a database with careful row-level checks, and one bad query means one user reading another user's memories. Containers make the boundary physical.
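In Docker terms, the boundary is just which mounts a container is created with. A sketch with illustrative volume names and targets; nothing enforces the isolation except that no other container is ever handed these mounts:

```go
package openclaw

import "github.com/docker/docker/api/types/mount"

// personaMounts builds the per-user volume list passed into HostConfig.Mounts
// at create time. Volume names and mount targets are illustrative.
func personaMounts(userID string) []mount.Mount {
	return []mount.Mount{
		{Type: mount.TypeVolume, Source: "openclaw-persona-" + userID, Target: "/data/persona"},
		{Type: mount.TypeVolume, Source: "openclaw-memory-" + userID, Target: "/data/memory"},
	}
}
```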
Clean billing boundary
Every LLM call from the container goes through our credit-proxy. The proxy pre-deducts ten credits, runs the call, then finalises the actual cost — refunding any surplus to the user's ledger. Streaming requests are rejected at the proxy with a 400, so the credit accounting cannot drift. The container has no ability to bypass this. It physically cannot reach OpenRouter directly because the network does not let it.
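Here is a minimal sketch of the proxy's accounting path. The ten-credit hold and the 400 on streaming are the real rules from above; the `Ledger` interface, the `X-OpenClaw-User` header, and the `forward` helper that relays to OpenRouter and reports the actual cost are all hypothetical stand-ins:

```go
package openclaw

import (
	"bytes"
	"encoding/json"
	"io"
	"net/http"
)

const preDeduct = 10 // credits held up front on every call

// Ledger is a hypothetical interface over the credit store.
type Ledger interface {
	Hold(userID string, credits int) error    // pre-deduct; fail if balance too low
	Finalize(userID string, held, actual int) // refund held minus actual when positive
}

// creditProxy wraps a forward function that relays the request upstream and
// returns the response body plus the actual credit cost it extracted.
func creditProxy(ledger Ledger, forward func(*http.Request) ([]byte, int, error)) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		raw, _ := io.ReadAll(r.Body)
		var body struct {
			Stream bool `json:"stream"`
		}
		_ = json.Unmarshal(raw, &body)
		if body.Stream {
			// A streamed response has no final usage total at deduction
			// time, so the accounting could drift. Reject outright.
			http.Error(w, "streaming not supported", http.StatusBadRequest)
			return
		}
		userID := r.Header.Get("X-OpenClaw-User") // hypothetical identity header
		if err := ledger.Hold(userID, preDeduct); err != nil {
			http.Error(w, "insufficient credits", http.StatusPaymentRequired)
			return
		}
		r.Body = io.NopCloser(bytes.NewReader(raw)) // restore body for forwarding
		resp, actual, err := forward(r)
		if err != nil {
			ledger.Finalize(userID, preDeduct, 0) // full refund on upstream failure
			http.Error(w, "upstream error", http.StatusBadGateway)
			return
		}
		ledger.Finalize(userID, preDeduct, actual) // refund any surplus
		w.Write(resp)
	}
}
```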
The cost
Containers cold-start in a few hundred milliseconds. We pre-warm a small pool and hibernate idle ones, so the latency tax is real but bounded — most messages hit a warm container. The orchestrator runs on a single VPS today and scales horizontally when we need to.
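The warm pool itself can be as small as a buffered channel: a filler goroutine keeps it topped up, and a message takes a warm container if one is ready or pays the cold start otherwise. A sketch, assuming a hypothetical boot function:

```go
package openclaw

// warmPool holds containers that are booted but not yet assigned to a user.
type warmPool struct {
	warm chan string // container IDs, booted and idle
	boot func() (string, error)
}

func newWarmPool(size int, boot func() (string, error)) *warmPool {
	p := &warmPool{warm: make(chan string, size), boot: boot}
	go func() {
		for {
			id, err := boot()
			if err != nil {
				continue // real code would back off and log
			}
			p.warm <- id // blocks once the pool is full
		}
	}()
	return p
}

// acquire prefers a warm container and falls back to a cold start.
func (p *warmPool) acquire() (string, error) {
	select {
	case id := <-p.warm:
		return id, nil // warm hit: no cold-start latency
	default:
		return p.boot() // pool empty: pay the few hundred milliseconds
	}
}
```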
The cost we accepted was operational complexity. We monitor each container, restart crashed ones, hibernate idle ones, and clean up state. None of this is novel — Kubernetes does it at scale every day — but we kept it deliberately small and Docker-Compose-shaped so we can reason about every moving piece.
For a product where a single prompt-injection bug could leak someone's OAuth tokens, this trade was easy. Isolation first. Speed second.