cd /news/ai-safety/where-openclaw-security-is-heading · home topics ai-safety article
[ARTICLE · art-22898] src=openclaw.ai pub= topic=ai-safety verified=true sentiment=· neutral

Where OpenClaw Security Is Heading

OpenClaw Security is developing new protections for its AI personal assistant platform, including filesystem boundary controls called fs-safe and a network egress routing layer called Proxyline. The platform aims to address security risks from agentic systems that can read files, run commands, and access networks on a user's machine, with fs-safe preventing path traversal bugs and Proxyline enforcing network policies at the egress point. These measures are part of OpenClaw's effort to become a trusted, auditable AI assistant that balances powerful capabilities with security boundaries.

read8 min publishedMay 15, 2026

Our goal is for OpenClaw to become a trusted way to run a powerful AI personal assistant.

OpenClaw can read files, run commands, install plugins, talk to the network, and act on a real machine for a real user. Power like that is easy to describe as dangerous. The concern is fair. Powerful does not have to mean blind, unbounded, or impossible to audit.

Some of this has landed. Some is rolling out. Some is still in flight. Some is research. I want to be clear about the difference, because posts that blur those lines mislead readers.

Filesystem boundaries and fs-safe #

OpenClaw runs on your machine. That means it can touch your documents, your codebases, and your photos.

The filesystem risk people usually reach for first is path traversal. That risk is real, but it is also only one symptom of a bigger class of bugs: unclear boundaries. Code thinks it is writing inside one root, then a symlink, absolute path, archive extraction, or sloppy join makes it cross another.

fs-safe is one answer to that. It is the set of

safe filesystem patternsOpenClaw had already been growing, pulled into a shared library so core code, plugins, and adjacent services can use the same root-bounded primitives.

It is not a sandbox. A plugin that is allowed to run arbitrary shell commands can still do arbitrary shell-command things. fs-safe

protects against boundary-crossing bugs in filesystem code.

Writing inside a plugin workspace should work. Traversal and absolute-path writes outside that workspace should fail. Plugin authors should not have to reimplement those checks.

The next step is making these primitives the expected pattern for plugins on ClawHub too. Bypassing them is not automatically malicious, but it is security-relevant. Over time, that kind of choice should count against a plugin’s trust posture.

The safest filesystem call is still the one we do not make. That is the security motivation behind the in-flight SQLite runtime-state refactor. Sessions, transcripts, scheduler state, and plugin state belong in a typed database with clear ownership and transactions, not loose files. Moving runtime state into SQLite removes whole categories of filesystem access from the runtime path.

Network egress and Proxyline #

Agentic systems make SSRF harder than it is in a normal web service. In a normal service, user-controlled URLs are often the exception. In an agent runtime, user-controlled or model-influenced URLs are normal product behavior. “Fetch this URL because someone, or something, asked for it” is normal work.

We started with the obvious approach: validate the URL before fetching it. That is not enough. Validation resolves DNS, the fetch resolves DNS again, and the answer can change between the two. A host that pointed at a public IP during validation can point at a metadata endpoint by the time the request leaves.

The fix has to move closer to egress.

Proxyline is our Node-process routing layer for that. It installs process-global routing for Node networking surfaces and sends traffic through the proxy you configured. The configured proxy is where the connect-time policy should live: block metadata addresses, private ranges, loopback canaries, and whatever else your environment needs blocked.

Proxyline routes. The proxy enforces.

It also gives operators observability. If you already run a managed proxy, you can route OpenClaw through it and watch destinations, rates, and blocked attempts from infrastructure you already trust.

Proxyline is not a perfect cage around every possible byte. Raw sockets, native modules, unusual transports, early-captured agents, and non-OpenClaw child processes can still bypass a Node-level guardrail. But for ordinary OpenClaw network paths, moving the control point from “a wrapper remembered to validate this URL” to “egress flows through a proxy policy” is a much better shape.

The validation path is simple: example.com

should pass, a loopback canary should fail, and openclaw proxy validate should prove the configured route behaves that way.

Plugin trust on ClawHub #

ClawHub has to be the authority for plugin trust and provenance when a plugin comes from ClawHub. OpenClaw should consume those signals during install and update, rather than rely only on local inspection after the fact.

The ClawHub pipeline is a mix of signals: ClawScan, VirusTotal, static analysis, metadata checks, source provenance, and manual moderation. None of those is magic. Scanners are noisy in different ways, and a pipeline that screams about everything teaches users to ignore it.

That is where ClawHub can do something a local install flow cannot. It can attach trust evidence to a specific package version. It can say this release is clean, suspicious, held, quarantined, revoked, or malicious. It can block downloads for malicious or quarantined releases. It can show users what changed and why.

Plugins can come from GitHub, a private registry, or a file someone sends you. That is not going away, and OpenClaw should not pretend users do not own their own machines.

What we can do is make the safe path better. Publish on ClawHub. Get scanned. Attach evidence. Let users weigh that evidence before install.

We are also exploring higher-trust tiers above the baseline: official packages, trusted publishers, and packages held to stricter review expectations. For plugins that live outside ClawHub, we want scanning to reach them too, but the exact product shape still needs work.

If ClawHub marks a release as malicious and quarantined, the ClawHub install path should refuse it. That is the bar.

Command approvals and prompt fatigue #

Prompts arrive faster than anyone can read them. After a few minutes, users flip on YOLO mode so work can continue. At that point the prompts train the user to stop reading.

Fixing this means fewer prompts, and better prompts.

The accuracy part starts with parsing. String matching is not enough. If an allowlist or blocklist only sees the outer command, wrappers become a bypass. A policy that understands rm

but cannot see inside bash -c "rm -rf ~/something"

is not a policy users should trust.

The shell approval path now evaluates inner command chains for common shell -c

wrappers. If the inner chain contains an executable that is not allowed, the wrapper should not make it safe. The command highlighter also uses Tree-sitter to surface what OpenClaw found inside wrappers.

PowerShell has its own traps; we fail closed for forms we do not understand, and broader support is on the roadmap.

Parsing is the easier half. The harder half is deciding when to ask.

A static approval policy either prompts on everything that might be risky, or relies on a fixed allow/deny list that cannot tell whether a command fits the current task.

The question users actually care about: did I want this to happen?

That is why we are experimenting with contextual approval. The goal is not “never prompt.” The goal is that prompts mean something — and when they do, the user should stop and read.

For OpenAI users we expose Auto Review, a codex-specific feature which replaces manual approval at the sandbox boundary with a separate reviewer agent.

The same approval requests can be forwarded into chat surfaces, so the operator does not have to keep a local terminal in view.

Slack can render native Block Kit approval prompts when interactivity is enabled. Exec approvals use channels.slack.execApprovals.*

; approvers come from channels.slack.execApprovals.approvers

or commands.ownerAllowFrom

.

{
  commands: {
    ownerAllowFrom: ["slack:U12345678"],
  },
  channels: {
    slack: {
      execApprovals: {
        enabled: "auto",
        target: "dm",
      },
    },
  },
}

Telegram supports approver DMs and optional prompts in the originating chat or forum topic. Inline approval buttons require channels.telegram.capabilities.inlineButtons

to allow the target surface.

{
  commands: {
    ownerAllowFrom: ["telegram:123456789"],
  },
  channels: {
    telegram: {
      execApprovals: {
        enabled: "auto",
        approvers: ["123456789"],
        target: "dm",
      },
      capabilities: {
        inlineButtons: "all",
      },
    },
  },
}

iMessage approval reactions landed too. A Like tapback maps to allow-once

, a Dislike tapback maps to deny

, and allow-always

remains an explicit /approve <id> allow-always

reply. The reacting handle must be an explicit approver, and OpenClaw ignores cross-device self-tapbacks so the bot cannot approve itself.

{
  approvals: {
    exec: {
      enabled: true,
      mode: "targets",
      targets: [
        { channel: "imessage", to: "+15555550123" },
      ],
    },
  },
  channels: {
    imessage: {
      allowFrom: ["+15555550123"],
    },
  },
}

The full forwarding model lives in Exec approvals - advanced, with channel-specific details in the Slack, Telegram, and iMessage docs.

Static analysis #

OpenClaw has had a lot of GitHub Security Advisories. The first job was plugging holes. The next job is making sure the same bug class does not come back.

After an advisory is patched, it is tempting to call it done. A GHSA is evidence about a bug class, not just one bug. The question after triage is: can we find all the code that looks like this?

For that, we use OpenGrep with a precise rulepack. Each rule is tied to an advisory, report, or review finding. The baseline goal is regression detection: if the same vulnerable shape returns, CI catches it before review does. The better goal is variant detection: catch nearby versions of the same mistake.

Precision is everything. A noisy rule is worse than no rule, because it teaches the team to ignore the channel.

Today the checked-in precise OpenGrep rulepack has 148 rules. It runs on PR diffs, and the full scan can be run manually. New patched advisories become candidates for new rules.

CodeQL runs alongside for deeper semantic coverage. It is slower and noisier to maintain; we use both.

What This Means for OpenClaw Users #

OpenClaw is not becoming less powerful. We are making the boundaries easier to see and defend.

We are not going to promise risk-free agents. Anyone promising that is selling something, or has not shipped enough yet.

What we can promise is the direction. OpenClaw can stay powerful while becoming more defensible. That is what we are building.

── more in #ai-safety 4 stories · sorted by recency
sponsored brought to you by zahid.host 4,200+ EU-deployed projects
reading about agents? ship yours in a single git push.

Run your AI side-project on zahid.host

EU-based hosting, git-push deploys, automatic HTTPS, no cold starts. Free tier with a custom domain — perfect for shipping the agent you just read about.

$git push zahid main
Live at https://your-agent.zahid.host
Get free account → Pricing
from €0/mo · no card required
LIVE [news/where-openclaw-secur…] indexed:0 read:8min 2026-05-15 ·