This HN comment really hits home on some thinking I’ve been doing recently:

Am I missing something? Why is everyone talking about sandboxes when it comes to OpenClaw? To me it’s like giving your dog a stack of important documents, then being worried he might eat them, so you put the dog in a crate, together with the documents. I thought the whole problem with that idea was that in order for the agent to be useful, you have to connect it to your calendar, your e-mail provider and other services so it can do stuff on your behalf, but also creating chaos and destruction.

Inherently, there is no real security model for Openclaw. There is an agent that can do whatever it wants with a set of credentials in a VM somewhere. It can leak those credentials, sell them on the dark web, or edit its own SOUL.md.

Okay, so what should the response be? AI is certainly not worthless, and we shouldn't let only the people willing to forgo all sense of security be the ones to reap its benefits. The proper response is to put governance and controls in place for agents so that even the tin-foil hatters among us feel comfortable adopting AI. We've drawn the picture of what doesn't work; what does the picture that works look like?

It looks more complicated than Openclaw in a box. Let me explain why that complexity is load-bearing.

Gateway

Admins need a single chokepoint to control access. This isn’t complicated — ideally it’s a stateless, high-performance proxy — but it has to exist. The gateway is where you put your kill switch. If an agent goes rogue, you don’t want to hunt down 47 different service connections and revoke them one by one. You flip one switch. It’s also where you get observability. You cannot govern what you cannot see. Every tool call, every LLM invocation, every MCP request flows through one place, and now you have somewhere to enforce rate limits, apply guardrails, and emit traces. None of the other pieces work without this.
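To make the chokepoint concrete, here is a minimal sketch of a gateway in Python. Everything is hypothetical: the class name, the `kill()` switch, the in-memory trace list (a real deployment would emit to a trace backend), and the per-minute rate window are illustrative, not any real product's API.

```python
import itertools
import time

class Gateway:
    """Single chokepoint: every tool call passes through here.

    Hypothetical sketch. In production this would be a stateless proxy;
    state here is in-memory only to keep the example self-contained.
    """

    def __init__(self, rate_limit_per_min=60):
        self.killed = False
        self.rate_limit = rate_limit_per_min
        self.calls_this_window = 0
        self.window_start = time.time()
        self.traces = []  # in production: emit spans to a trace backend
        self._trace_ids = itertools.count(1)

    def kill(self):
        """The one switch: cuts off every connection at once."""
        self.killed = True

    def call_tool(self, agent_id, tool, fn, *args):
        """Forward a tool call, enforcing the kill switch and rate limit,
        and recording a trace for every invocation."""
        if self.killed:
            raise PermissionError("gateway kill switch engaged")
        now = time.time()
        if now - self.window_start >= 60:
            self.window_start, self.calls_this_window = now, 0
        if self.calls_this_window >= self.rate_limit:
            raise RuntimeError("rate limit exceeded")
        self.calls_this_window += 1
        trace_id = next(self._trace_ids)
        result = fn(*args)
        self.traces.append({"trace_id": trace_id, "agent": agent_id,
                            "tool": tool, "args": args})
        return result
```

The point of the shape, not the code: because every call funnels through `call_tool`, the kill switch, the rate limit, and the trace all live in exactly one place.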

Token Vault

Don’t give the dog your keys.

When Openclaw connects to your calendar or Salesforce, it holds credentials. Probably in a config file. Probably with more permissions than it actually needs. That’s the dog with your documents. The sandbox doesn’t help because the credentials are already inside it.

A token vault handles credentials out-of-band. The agent never holds your Salesforce token directly. When it needs to take an action, the gateway requests a short-lived, scoped token from the vault for exactly that operation. When the operation completes, the token expires. There’s nothing to exfiltrate.
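A sketch of that issue/expire lifecycle, assuming a vault the gateway calls per-operation. The class and method names are made up for illustration; a real vault would also verify the caller and broker the upstream credential.

```python
import secrets
import time

class TokenVault:
    """Out-of-band credential handling, sketched. The long-lived
    Salesforce/calendar secret would live only inside the vault; the
    agent side only ever sees short-lived, single-scope tokens."""

    def __init__(self, ttl_seconds=60):
        self.ttl = ttl_seconds
        self._live = {}  # token -> (scope, expiry timestamp)

    def issue(self, scope):
        """Mint a short-lived token scoped to exactly one operation."""
        token = secrets.token_urlsafe(16)
        self._live[token] = (scope, time.time() + self.ttl)
        return token

    def check(self, token, scope):
        """A token is valid only for its issued scope, only until expiry."""
        entry = self._live.get(token)
        if entry is None:
            return False
        issued_scope, expiry = entry
        if time.time() > expiry:
            del self._live[token]  # an expired token is worthless loot
            return False
        return issued_scope == scope
```

Note what exfiltration buys an attacker here: a token that dies in sixty seconds and works for one scope, instead of a standing credential in a config file.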

This also unlocks patterns that are otherwise impossible. Many enterprise systems (Salesforce, ServiceNow) don’t support service accounts at all. They only support user-based auth. On-Behalf-Of (OBO) flows through a token vault let an agent act in the context of a real user, with that user’s actual permissions, without ever directly holding their credentials. You can’t build a real multi-tenant agent without this.
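The standard plumbing behind most OBO flows is OAuth 2.0 Token Exchange (RFC 8693). A sketch of the request the vault would send to the identity provider's token endpoint; the audience and scope values are illustrative, and real providers vary in which token types they accept.

```python
def build_obo_request(user_subject_token, audience, scope):
    """Build an OAuth 2.0 Token Exchange (RFC 8693) request body.
    Values are illustrative; endpoints and scopes vary by provider."""
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        # proof of the real user's identity, e.g. their session JWT
        "subject_token": user_subject_token,
        "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
        # the downstream system the agent wants to reach as that user
        "audience": audience,
        # narrowed to exactly what this one operation needs
        "scope": scope,
    }
```

The vault POSTs this on the agent's behalf and hands back only the short-lived response token, so the agent acts with the user's real permissions without ever touching the user's credential.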

Sandboxed Compute & Storage

Sandboxes aren't wrong. They solve the wrong problem when you've already given the agent your keys, but they absolutely still belong in the picture.

Agents are genuinely good at bash. Give them a real shell environment and they can do remarkable things. The key distinction is between “shell access to the host machine” and “a purpose-built sandboxed environment with exactly the tools they need.”

A proper sandbox provides isolated compute that can't reach your cloud metadata APIs, your Kubernetes control plane, or anything on your internal network without explicit permission through the gateway. We ran a full adversarial audit of ours, testing DNS exfiltration, AWS IMDSv2 access, Kubernetes service account token theft, shell escapes, and filesystem access. Every vector came back blocked, not through obscurity but through defense in depth, with network, platform, container, and runtime layers enforced independently.
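The default-deny egress posture can be sketched as a policy check. The real enforcement lives in network policy, not application code, and the blocked CIDRs here are common defaults rather than a complete list, but the shape is the point: sensitive ranges are unreachable unless the gateway grants an explicit exception.

```python
import ipaddress

# Destinations a sandboxed agent should never reach directly.
# These CIDRs are illustrative defaults, not a complete policy.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("169.254.169.254/32"),  # cloud metadata (IMDS)
    ipaddress.ip_network("10.0.0.0/8"),          # internal / VPC ranges
    ipaddress.ip_network("172.16.0.0/12"),       # often k8s service CIDRs
    ipaddress.ip_network("192.168.0.0/16"),
]

def egress_allowed(dest_ip, allowlist=()):
    """Default-deny for sensitive ranges. 'Explicit permission through
    the gateway' is modeled as a plain allowlist of destination IPs."""
    ip = ipaddress.ip_address(dest_ip)
    if dest_ip in allowlist:
        return True
    return not any(ip in net for net in BLOCKED_NETWORKS)
```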

The sandbox also gives agents somewhere to live between runs. Persistent storage means agents can accumulate memory over time, share organization-wide tools and best practices, and build up skills without requiring an ever-growing context window to be passed around.
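What "somewhere to live between runs" can look like, minimally: an append-only note file on the sandbox's persistent volume instead of an ever-growing context window. The file layout and class are hypothetical.

```python
import json
from pathlib import Path

class AgentMemory:
    """Run-to-run persistence, sketched as append-only JSON lines on
    the sandbox's persistent storage. Layout is illustrative."""

    def __init__(self, root):
        self.path = Path(root) / "memory.jsonl"
        self.path.parent.mkdir(parents=True, exist_ok=True)

    def remember(self, note):
        """Append a note; survives the end of this run."""
        with self.path.open("a") as f:
            f.write(json.dumps({"note": note}) + "\n")

    def recall(self):
        """A later run reads back everything earlier runs learned."""
        if not self.path.exists():
            return []
        with self.path.open() as f:
            return [json.loads(line)["note"] for line in f]
```

Because the storage lives inside the sandbox, accumulated memory stays subject to the same isolation and egress controls as everything else the agent touches.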

Audit Log and Transcripts

You want to know why the agent did a thing, not just what it did.

This is what makes the kill switch actually usable. If you stop an agent mid-run, you need to understand what it had already done and whether any damage assessment is needed. It’s what makes performance reviews possible: was this agent actually making good decisions, or was it going off-script in ways you’d want to catch? And it’s what makes debugging feasible at all when something goes wrong.

Full transcripts: inputs, outputs, tool calls, token usage, the agent’s reasoning chain. All correlated with trace IDs tied back to the gateway. Without this, you’re flying blind.
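One workable shape for such a transcript is an append-only event stream where every line carries the gateway's trace ID. The field names here are illustrative, not a fixed schema.

```python
import json
import time

def transcript_event(trace_id, agent_id, kind, payload):
    """One line of an append-only transcript, as a JSON string.
    `kind` distinguishes inputs, outputs, tool calls, reasoning steps,
    and token usage; field names are illustrative."""
    return json.dumps({
        "trace_id": trace_id,   # correlates back to the gateway trace
        "agent": agent_id,
        "ts": time.time(),
        "kind": kind,           # e.g. "input", "output", "tool_call"
        "payload": payload,
    })
```

Filtering the stream by one trace ID reconstructs a run end to end, which is exactly what you need after flipping the kill switch mid-run.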

Summary

None of this is an argument that Openclaw is bad. I’ve used it. If you’re a developer running it on a dedicated machine with limited access and scope, the threat model is manageable. You’re not running it on company cloud infrastructure. You’re not giving it access to production systems. The documents and the dog are both yours.

The problem shows up when organizations try to scale that model. When the IT team decides “just run it in a VM” for each department. When someone decides the sandbox is sufficient governance for production use. It isn’t. The threat model is completely different at that point.

Gateway → Token Vault → Sandboxed Compute → Audit Trail.

That’s the minimum required to give anyone (developer, security team, CIO, etc.) actual control over their agents. Once you have it, you stop worrying about the dog, because you stopped giving it your keys in the first place.