6 May 2026 15min read

Sandboxing an AI Harness on macOS

In the previous post we gave superpowers to the harness in the form of a bash tool. Even as a pure learning exercise, it was very eye-opening. It made this little Swift harness feel much closer to the real thing.

And that is exactly why I don’t want to hand-wave the next part away.

We haven’t spent the time to make it production-ready, and we won’t because that’s not the purpose of this series. There are no carefully tuned prompts. No tailored rules for different models. No approval flows. No guardrails worth the name. It is just a tiny learning project.

But learning projects need to teach real lessons, and one of them is that giving an LLM shell access changes the nature of the problem immediately. It is no longer just “can the model inspect the project better?” It is also “what can this thing touch now, and what did I accidentally give it access to?”

So before happily moving on with the harness, I wanted to stop and look directly at that danger.

The dangers of free execution

It’s easy to see what power we have given to the LLM:

<Turn 1>
You: Inspect this project and the surrounding projects so you can understand the environment better.
<Step 1>
tool-call: TOOL_CALL {"name":"bash","arguments":{"command":"pwd && echo '--- ROOT ---' && ls -la && echo '--- PARENT ---' && ls -la .. && echo '--- GRANDPARENT ---' && ls -la ../../"}}
tool-success: bash -> {"stdout":"\/Users\/me\/code\/swiftagentharness\n--- ROOT --...
<Step 2>
Assistant: I inspected the current project and nearby directories.
Current project:
- `swiftagentharness`
...
Sibling projects in `../`:
...
Higher-level directories in `../../`:
...

We can see how even with the workspace guidance in the prompt, the model can easily use bash to escape the workspace.

It’s not hard to imagine what would happen if the model decided to use rm or inspect files that should be totally private, like credentials.

What are the options, then?

A common answer to this is to bring out the cannons and use Docker to create a fully virtual, controlled environment so the AI can’t harm anything outside of it. And although that is the prevailing mentality across many developer communities, it’s not the only solution, nor the best or most elegant one.

So just for the sake of exploration and learning, let’s try something else. Let’s use native operating system tooling to sandbox execution. After all, if the entire revolution of mobile apps could happen with system sandboxes, surely we can benefit from the same work.

What other harnesses do

Before going further, I wanted to see whether this was one of those cases where I was overthinking it and everybody else had already moved on to some obvious better approach.

The short answer is no. The pattern I kept seeing was much more pragmatic.

OpenAI’s own local shell guide explicitly warns that running arbitrary shell commands can be dangerous and says you should sandbox execution or add strict allow and deny lists. That already tells us something important. Even if your product is all about agentic shell use, the default assumption is still that free execution is too much.

Claude Code’s sandboxing docs are probably the clearest public explanation of the approach. On macOS they use Seatbelt. On Linux they use bubblewrap. Filesystem access and network access are treated as two different problems, and all child processes inherit the same restrictions.

Anthropic also extracted that work into an open-source project, sandbox-runtime. Their README is very explicit about what it does: native OS sandboxing primitives plus proxy-based network filtering. That last bit is worth noticing. The serious versions of this don’t stop at filesystem restrictions.

Goose Desktop’s macOS sandbox docs show a very similar shape. Again, native sandboxing on macOS, plus extra machinery around network control.

And even when looking at OpenAI Codex in the wild, you can see the same reality leak through. OpenAI points to Codex CLI as a reference implementation from the shell guide, and their issue tracker has real macOS reports that mention the Seatbelt sandbox directly, like this network-related issue.

So the pattern seems pretty clear:

  • If you want the strongest isolation, people reach for containers.
  • If you want native process sandboxing on macOS, people still seem to reach for Seatbelt through sandbox-exec.
  • If you care about network too, the more serious tools add a proxy on top.

For me that is enough indication that it is worth spending some time understanding Seatbelt and native sandboxing, and seeing how it can improve our tiny harness.

Seatbelt

Seatbelt is the low-level sandboxing facility in macOS. It is one of the pieces behind the strong sandboxing model Apple has used for years across macOS and iOS. So when we talk about app sandboxes and processes being limited by the operating system, this is the machinery that is under the hood.

That was already a good sign for this little experiment. If the whole mobile app world has been able to rely on OS sandboxing for such a long time, there is probably something useful here for us too, even if our use case is much smaller and much messier than a normal app. Apple’s own documentation around App Sandbox and the macOS App Sandbox configuration is mostly focused on proper app development, entitlements, containers, helper tools, and that whole world. That is not exactly our situation, but it tells us the underlying model is very real.

The slightly confusing part is that there is a difference between the modern Apple recommendation and the practical tool we can use from a little CLI harness.

The modern recommendation is App Sandbox. That is the official, entitlement-based way to sandbox a macOS app. But our harness is not a signed app with entitlements and containers. It is just a command-line tool that spawns /bin/bash.

That is where sandbox-exec comes in.

sandbox-exec is a command-line wrapper around that lower-level sandbox facility. You give it a policy, then tell it which command to run, and it launches that command under those restrictions. That is why it keeps showing up in agent harnesses and AI tools on macOS. It fits the shape of the problem much better than App Sandbox does.

The idea is very simple:

sandbox-exec -p '(version 1) ...rules here...' /bin/bash -lc 'pwd'

That -p argument is an inline sandbox profile. You can also load profiles from files, but for our tiny harness an inline profile is probably enough.
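
To make the two options concrete, here is a minimal sketch of how the argument list for sandbox-exec could be built in Swift. The -p (inline profile string) and -f (profile file) flags are the ones documented in the sandbox-exec man page. The ProfileSource type and the sandboxExecArguments function are hypothetical names invented for this illustration, and the "(allow default)" profile is only a placeholder that restricts nothing.

```swift
// Hypothetical helper: these names are not part of the harness.
enum ProfileSource {
    case inline(String)   // handed to sandbox-exec with -p as a profile string
    case file(String)     // handed to sandbox-exec with -f as a path to a .sb file
}

/// Builds the arguments that follow `/usr/bin/sandbox-exec` on the command line.
func sandboxExecArguments(
    profile: ProfileSource,
    command: String,
    commandArguments: [String] = []
) -> [String] {
    let profileArguments: [String]
    switch profile {
    case .inline(let text):
        profileArguments = ["-p", text]
    case .file(let path):
        profileArguments = ["-f", path]
    }
    // The command to sandbox, and its own arguments, come after the profile.
    return profileArguments + [command] + commandArguments
}

// Placeholder profile: "(allow default)" restricts nothing.
let inlineArguments = sandboxExecArguments(
    profile: .inline("(version 1) (allow default)"),
    command: "/bin/bash",
    commandArguments: ["-c", "pwd"]
)
print(inlineArguments)
```

Either way the shape is the same: the profile comes first, then the real command and its arguments. That is also what makes the inline variant convenient for a harness that generates the profile per workspace.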

And this matters too: child processes inherit the sandbox. That means if we launch bash under Seatbelt restrictions, and then bash launches git, swift, ls, or anything else, those processes keep running inside the same boundary. That is exactly what we want. It would be quite useless to sandbox only the first command and then accidentally let the rest of the process tree escape 🤣.

There is one awkward detail though. Apple marks sandbox-exec as deprecated in the man page. So this is not me saying “this is the shiny modern blessed API.” It is more like “this is the old but still practical door into Seatbelt for command-line process sandboxing.” For our purposes, and also looking at what other harnesses are doing, that is still good enough to learn something real from it.

Let’s wire it into the harness

We already had a bash tool that used Subprocess.run. So we don’t have to redesign the whole harness around sandboxing. We just need a small change that can prove the idea.

Let’s start with preparing the sandbox as a custom method on Executable:

extension Executable {
    func runSandboxed<
        Input: InputProtocol,
        Output: OutputProtocol,
        Error: ErrorOutputProtocol
    >(
        arguments: [String] = [],
        environment: Environment = .inherit,
        workingDirectory: FilePath? = nil,
        input: Input = .none,
        output: Output,
        error: Error = .discarded,
        sandboxProfile: String
    ) async throws -> ExecutionRecord<Output, Error> {
        let resolvedExecutable = try self.resolveExecutablePath(in: environment)
        let wrappedArguments = ["-p", sandboxProfile, resolvedExecutable.string] + arguments

        return try await Subprocess.run(
            .path("/usr/bin/sandbox-exec"),
            arguments: Arguments(wrappedArguments),
            environment: environment,
            workingDirectory: workingDirectory,
            input: input,
            output: output,
            error: error
        )
    }
}

This part is actually quite simple. We are still just spawning a process, but with the difference that instead of spawning /bin/bash directly, we spawn sandbox-exec, give it the Seatbelt profile, and then tell it to launch the real executable inside that sandbox.

Then the bash tool can just call this instead of the original Subprocess.run. I’ve added an argument to the harness so the sandbox is only used when desired, mostly to compare the two behaviors while learning.

let result = if sandboxEnabled {
    try await executable.runSandboxed(
        arguments: ["-c", command],
        workingDirectory: workingDirectory,
        output: capturedIO,
        error: capturedIO,
        sandboxProfile: SeatbeltProfile.workspaceBoundary(for: workspaceRoot)
    )
} else {
    try await Subprocess.run(
        executable,
        arguments: ["-c", command],
        workingDirectory: workingDirectory,
        output: capturedIO,
        error: capturedIO
    )
}

That shape was exactly what I wanted. The harness is still the same little harness. The bash tool is still the same bash tool. We are not replacing it with a different concept. We are just giving it a stricter execution path when sandboxing is enabled.

Just FYI, in my final implementation I also included a sandbox subcommand that runs commands directly inside the sandbox without going through the LLM at all. That ended up being really handy for testing the profile itself without having to convince the LLM to do what I needed.

But of course, the interesting part is the profile.

I’m not gonna lie. I’m not an expert on sandboxing so what I’ve got working probably has flaws, but after some trial and error it seems to work for a few use cases. And that’s all I need for the learning process :)

For example, the first version was too strict and basically useless. Even pwd and ls were crashing. So this turned into a very practical exercise in seeing what macOS tools actually need to function under Seatbelt.

<Turn 1>
You: List the files in this project and summarize what it is. Use bash.
<Step 1>
tool-call: TOOL_CALL {"name":"bash","arguments":{"command":"ls -la"}}
tool-success: bash -> {"stdout":"","stderr":"","exit_code":"signaled(6)"}

The key unlock was importing Apple’s built-in system.sb base profile and then layering our own file restrictions on top. After that, the profile actually started becoming useful.

(version 1)
(deny default)
(import "system.sb")

;; Start from a deny-all profile, then re-allow only the minimum needed
;; for "run bash inside the workspace" to work at all.
(allow process-exec)
(allow process-fork)
(allow signal (target self))
(allow sysctl-read)
(allow mach-lookup)

(allow file-read*
    (subpath "/Applications")
    (subpath "/bin")
    (subpath "/dev")
    (subpath "/System")
    (subpath "/usr")
    (subpath "/var/folders")
    (subpath "/private/var/folders")
    (subpath "/var/select")
    (subpath "/private/var/select")
    (subpath "<workspace>")
    (subpath "<sandbox-temp-directory>"))

(allow file-read-metadata
    (literal "<workspace-parent>"))

(allow file-write*
    (subpath "/var/folders")
    (subpath "/private/var/folders")
    (subpath "<workspace>")
    (subpath "<sandbox-temp-directory>"))

(deny file-write*
    (subpath "<workspace>/.git"))

It’s all quite self-explanatory, but here is my own description of what’s happening:

  1. We deny everything by default, and then enable only what we want.
  2. We import Apple’s base profile. This is what makes pretty much everything work :D
  3. We allow a few process-related rules so we can actually call tools.
  4. We allow reads on the system locations that bash, ls, git, swift, and friends need in practice, plus the workspace and the dedicated sandbox temp directory.
  5. We allow metadata reads on the parent of the workspace, because even something like ls -la peeks at .. while listing the current directory.
  6. We allow writes in the workspace and the dedicated sandbox temp directory, including the /var/folders and /private/var/folders aliases that some commands use under the hood.
  7. We deny writes to the .git folder. This was just to try out the ruleset.
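
The code earlier calls SeatbeltProfile.workspaceBoundary(for:) without showing its body, so here is a minimal sketch of how such a function might fill in the placeholders. Several things are assumptions: the workspace is taken as an absolute path string, the temp directory is passed as a second parameter, and the quoted and parentDirectory helpers are invented for this illustration. The escaping assumes SBPL string literals use the usual backslash escapes for quotes and backslashes. Only the workspace-specific rules are rendered here; the process rules and system read paths from the full profile above are left out to keep it short, so this string alone would not be enough to run bash.

```swift
import Foundation

// Hypothetical sketch: the real implementation is not shown in this post.
enum SeatbeltProfile {
    /// Wraps a path in double quotes, escaping backslashes and quotes
    /// (assumption: SBPL strings accept these escapes).
    static func quoted(_ path: String) -> String {
        let escaped = path
            .replacingOccurrences(of: "\\", with: "\\\\")
            .replacingOccurrences(of: "\"", with: "\\\"")
        return "\"\(escaped)\""
    }

    /// Parent of an absolute path, e.g. "/a/b/c" -> "/a/b".
    static func parentDirectory(of path: String) -> String {
        let components = path.split(separator: "/").dropLast()
        return "/" + components.joined(separator: "/")
    }

    /// Renders only the workspace-specific part of the profile.
    static func workspaceBoundary(for workspace: String, tempDirectory: String) -> String {
        let workspaceRule = quoted(workspace)
        let tempRule = quoted(tempDirectory)
        let parentRule = quoted(parentDirectory(of: workspace))
        let gitRule = quoted(workspace + "/.git")
        return """
        (version 1)
        (deny default)
        (import "system.sb")
        (allow file-read* (subpath \(workspaceRule)) (subpath \(tempRule)))
        (allow file-read-metadata (literal \(parentRule)))
        (allow file-write* (subpath \(workspaceRule)) (subpath \(tempRule)))
        (deny file-write* (subpath \(gitRule)))
        """
    }
}

let profile = SeatbeltProfile.workspaceBoundary(
    for: "/Users/me/code/swiftagentharness",
    tempDirectory: "/private/var/folders/example/sandbox-tmp"
)
print(profile)
```

Escaping matters more than it looks: a workspace path containing a double quote would otherwise end the string literal early and change the meaning of the rule.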

Note that because we’re making a custom subfolder in the temporary directory, we have to update the bash tool instructions to include that path. Otherwise the LLM won’t have any clue where it can write temporary files.
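
One way to keep the policy and the prompt in sync is to derive both from the same value. The sketch below does that under a few assumptions. The directory layout mirrors the swiftagentharness-sandbox/<workspace-hash> shape that shows up in the transcripts, but the hash function (FNV-1a here), the helper names, and the wording of the tool description are all invented for illustration.

```swift
import Foundation

// Assumption: FNV-1a is a stand-in. The post does not say how <workspace-hash> is computed.
func fnv1aHex(_ text: String) -> String {
    var hash: UInt64 = 0xcbf29ce484222325      // FNV-1a 64-bit offset basis
    for byte in text.utf8 {
        hash ^= UInt64(byte)
        hash = hash &* 0x100000001b3           // FNV 64-bit prime, wrapping multiply
    }
    return String(hash, radix: 16)
}

/// Hypothetical helper: one dedicated temp directory per workspace.
func sandboxTempDirectory(forWorkspace workspace: String, systemTemp: String) -> String {
    // NSTemporaryDirectory() usually ends with "/", so normalize before joining.
    let base = systemTemp.hasSuffix("/") ? String(systemTemp.dropLast()) : systemTemp
    return "\(base)/swiftagentharness-sandbox/\(fnv1aHex(workspace))"
}

/// Hypothetical wording for the bash tool description sent to the model.
func bashToolDescription(workspace: String, tempDirectory: String) -> String {
    return """
    Run a bash command inside a sandbox.
    You can read and write files inside \(workspace).
    Write temporary files only inside \(tempDirectory).
    Writes anywhere else will fail with "Operation not permitted".
    """
}

let workspace = "/Users/me/code/swiftagentharness"
let tempDirectory = sandboxTempDirectory(
    forWorkspace: workspace,
    systemTemp: NSTemporaryDirectory()
)
print(bashToolDescription(workspace: workspace, tempDirectory: tempDirectory))
```

The same tempDirectory value would go into the profile and into the prompt. Tools like mktemp pick their location from TMPDIR, so the subprocess environment would likely need TMPDIR pointed at this directory too.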

The nice part is that once this was in place we could actually test the boundary in a very concrete way.

<Turn 1>
You: run ls and pwd
<Step 1>
tool-call: TOOL_CALL {"name":"bash","arguments":{"command":"pwd && ls"}}
tool-success: bash -> {"stdout":"/Users/me/code/swiftagentharness
AGENTS.md
Package.resolved
...","stderr":"","exit_code":"exited(0)"}
<Step 2>
Assistant: /Users/me/code/swiftagentharness
AGENTS.md
...
<Turn 2>
You: do it at the home folder
<Step 1>
tool-call: TOOL_CALL {"name":"bash","arguments":{"command":"cd ~ && pwd && ls"}}
tool-success: bash -> {"stdout":"/Users/me
","stderr":"cd: error retrieving current directory: getcwd: cannot access parent directories: Operation not permitted
ls: .: Operation not permitted
","exit_code":"exited(1)"}
<Step 2>
Assistant: Tried in `~`, but permissions prevented listing it.
Output: `/Users/me`
Errors:
- `getcwd: cannot access parent directories: Operation not permitted`
- `ls: .: Operation not permitted`
<Turn 3>
You: write it
<Step 1>
tool-call: TOOL_CALL {"name":"bash","arguments":{"command":"printf 'workspace write test\n' > scratch.txt && ls -l scratch.txt && cat scratch.txt"}}
tool-success: bash -> {"stdout":"-rw-r--r--@ 1 me staff 21 ... scratch.txt
workspace write test
","stderr":"","exit_code":"exited(0)"}
<Step 2>
Assistant: Wrote `scratch.txt` in the workspace successfully.
<Turn 4>
You: prove that writing into .git failed
<Step 1>
tool-call: TOOL_CALL {"name":"bash","arguments":{"command":"printf 'git dir write test\n' > .git/scratch-test 2>&1; status=$?; if [ $status -eq 0 ]; then echo WRITE_OK; else echo WRITE_FAILED; fi; exit $status"}}
tool-success: bash -> {"stdout":"WRITE_FAILED
","stderr":"/bin/bash: .git/scratch-test: Operation not permitted
","exit_code":"exited(1)"}
<Step 2>
Assistant: Writing into `.git` failed with `Operation not permitted`.
<Turn 5>
You: prove that writing ../sandbox-escape.txt failed
<Step 1>
tool-call: TOOL_CALL {"name":"bash","arguments":{"command":"printf 'escape test\n' > ../sandbox-escape.txt 2>&1; status=$?; if [ $status -eq 0 ]; then echo WRITE_OK; else echo WRITE_FAILED; fi; exit $status"}}
tool-success: bash -> {"stdout":"WRITE_FAILED
","stderr":"/bin/bash: ../sandbox-escape.txt: Operation not permitted
","exit_code":"exited(1)"}
<Step 2>
Assistant: Writing `../sandbox-escape.txt` failed with `Operation not permitted`.
<Turn 0>
You: write a random file in a temporary file
<Step 1>
tool-call: TOOL_CALL {"name":"bash","arguments":{"command":"tmpfile=$(mktemp); head -c 32 /dev/urandom | base64 > \"$tmpfile\"; printf '%s' \"$tmpfile\""}}
tool-success: bash -> {"stdout":"/var/folders/.../swiftagentharness-sandbox/<workspace-hash>/tmp.sjpO8WrRyF","stderr":"","exit_code":"exited(0)"}
<Step 2>
Assistant: Created a temporary file with random content at `/var/folders/.../swiftagentharness-sandbox/<workspace-hash>/tmp.sjpO8WrRyF`

It’s actually pretty cool to see that with so little we already have a more secure harness. Now there are some restrictions and safety around what we let the AI do. It still has bash superpowers, but now with a bit more responsibility.

What this does not solve

Now, we need to be honest about what this is and what it is not. This does not make the harness “safe” in any absolute sense. It just makes it safer.

The profile still has to allow a bunch of system paths so normal commands can even run. That means this is not some perfect tiny prison where the process only knows about the repo. It is a constrained execution environment, sure, but it is not perfect isolation.

Also, this is not solving network access. Some of the more serious tools pair sandboxing with proxy-based network control. We are not doing that here. So this post is not a guide to production-grade AI isolation. It is a guide to learning how native sandboxing on macOS can help, and where it stops helping. Just be careful out there.

Sandboxing is also not a substitute for judgment. A good harness still needs clear instructions, approvals where appropriate, sane defaults, and probably other layers around command execution. If you want the strongest isolation, containers and VMs still exist for a reason.

And let’s be honest, even with all of that, with enough tools and computing power, LLMs could probably cause trouble in any environment, so this is just about minimizing those risks at the obvious levels.

Good enough to learn something

We’ve gone from some obscure features in popular harnesses and a “trust me bro” attitude to an actual investigation into the implications of letting an AI execute commands, and into one of the possible mitigations. That’s very valuable.

Now I understand a bit better what macOS sandboxing actually looks like in practice. I understand the gap between App Sandbox and sandbox-exec. I understand why other harnesses still use Seatbelt. I understand that the policy itself is only half the story, and that the tool prompt also has to describe the environment correctly. And most importantly, I understand that “workspace root” and “security boundary” are not the same thing unless we do the work to make them the same.

So this is probably not good enough to ship blindly. Not good enough to call secure. But definitely good enough to learn something real.

And that is good enough for me.
