Series: Build an Agent Harness

10 May 2026 · 15 min read

Teaching Skills to an AI Harness

Skills are everywhere right now. People talk about them as if they were some magical incantation that will make AI more powerful while also keeping it under control.

But as with pretty much everything else in this space, once you look closely, it is much less mystical than it sounds. It is mostly markdown.

If you already understand how AGENTS.md works, then skills are not that different. They are just a more structured and reusable way of doing a very similar thing.

So the question I wanted to answer was simple. What does it actually take to teach skills to a tiny AI harness?

So what is a skill

In its most basic form, a skill is just a markdown file containing a reusable prompt for the LLM. Just as with AGENTS.md, the goal is to fill the context with useful information so the back and forth with the model improves.

In practice, a skill has a bit more to it. It also carries some metadata and can bundle extra files. There is a standard for this, which is honestly great, because it means we are not talking about one product’s private feature but about something compatible across harnesses.

The spec describes a skill as a folder with a markdown file called SKILL.md. That file contains the actual instructions in markdown, plus a bit of metadata like a name and a description as front matter.

That extra metadata is one of the most important aspects of skills, since it lets harnesses follow a progressive disclosure principle. Every skill is discovered up front, but only its name and description are loaded. The full instructions are loaded only when the skill is actually needed, so the LLM context is not filled with unnecessary tokens. Progressive disclosure is one of my favorite ideas here, unsurprisingly, since it’s one of the core tenets of Swift language design.

It is funny too, because progressive disclosure is one of those principles that make so much sense for humans. Don’t dump everything at once. Show the right thing at the right moment. And it is interesting to see the same instinct apply to LLMs too.

Discovery

The first actual implementation step is to find the skills themselves. The standard and the harnesses I looked at support two locations, so I wanted to implement the same. That lets me test the harness right away with the real skills I’m using everywhere else.

There are two main locations where skills can live:

a repo-local .agents/skills folder, for workflows that belong to the project
a user-level ~/.agents/skills folder, for global workflows available everywhere

But discovery is not only finding the right files and folders. It is also policy. And although the security of loading skills is not something we will get into here, it’s a can of worms that is worth considering. A repo-local skill that I wrote for the project is not the same thing as a skill I dumped into my home folder. So the harness has to decide not only where to scan, but also what kind of instructions it is willing to trust and surface automatically.

For our purposes, the actual code ended up being a small dedicated type: a catalog that gives the harness specific knowledge of, and control over, the skills it found.

struct SkillCatalog {
    let skills: [SkillDefinition]
    let workspaceSkillCount: Int
    let userSkillCount: Int

    static func load(workspaceRoot: URL, homeDirectory: URL = URL(fileURLWithPath: NSHomeDirectory())) -> Self {
        let workspaceSkills = loadSkills(
            in: workspaceRoot.appendingPathComponent(".agents/skills", isDirectory: true),
            source: .workspace
        )
        let userSkills = loadSkills(
            in: homeDirectory.appendingPathComponent(".agents/skills", isDirectory: true),
            source: .user
        )

        var discoveredByName = [String: SkillDefinition]()

        // Workspace skills load first, so they win name collisions.
        for skill in workspaceSkills + userSkills {
            let key = skill.name.lowercased()
            guard discoveredByName[key] == nil else { continue }
            discoveredByName[key] = skill
        }

        return SkillCatalog(
            skills: discoveredByName.values.sorted { ... },
            workspaceSkillCount: workspaceSkills.count,
            userSkillCount: userSkills.count
        )
    }
}

The loader walks both directories recursively, looks for files named exactly SKILL.md, and builds an in-memory catalog.

Turning each file it finds into a catalog entry looks like this:

private static func loadSkill(at url: URL, source: SkillSource) -> SkillDefinition? {
    guard let content = try? String(contentsOf: url, encoding: .utf8) else {
        return nil
    }

    let parsed = parseSkillFile(content)
    let fallbackName = url.deletingLastPathComponent().lastPathComponent
    let name = parsed.metadata["name"]?.nonEmpty ?? fallbackName
    let description = parsed.metadata["description"]?.nonEmpty ?? "No description provided."

    return SkillDefinition(
        name: trimQuotes(from: name),
        description: trimQuotes(from: description),
        path: url,
        source: source
    )
}

Back in SkillCatalog.load, the loop that merges the skills just skips entries whose name already exists. Since workspace skills are loaded first, that means they win collisions. If a repository wants to provide its own version of a skill, that should override my personal default for that session.
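The post does not show the recursive walk itself, so here is a minimal sketch of how it could look. The names `SkillFile` and `findSkillFiles(in:source:)` are made up for this example and are not the harness's real API. The only Foundation pieces it leans on are `FileManager.enumerator(at:includingPropertiesForKeys:)` and `URL.resourceValues(forKeys:)`.

```swift
import Foundation

// Hypothetical stand-ins for the post's types, reduced to what the walk needs.
enum SkillSource { case workspace, user }

struct SkillFile {
    let url: URL
    let source: SkillSource
}

/// Recursively collects every regular file named exactly `SKILL.md` under `root`.
/// Returns an empty array when the folder is missing or cannot be enumerated.
func findSkillFiles(in root: URL, source: SkillSource) -> [SkillFile] {
    guard let enumerator = FileManager.default.enumerator(
        at: root,
        includingPropertiesForKeys: [.isRegularFileKey]
    ) else {
        return []
    }

    var found = [SkillFile]()
    for case let url as URL in enumerator {
        // The file name must match exactly; `skill.md` or `SKILL.md.bak` are ignored.
        guard url.lastPathComponent == "SKILL.md" else { continue }
        let values = try? url.resourceValues(forKeys: [.isRegularFileKey])
        guard values?.isRegularFile == true else { continue }
        found.append(SkillFile(url: url, source: source))
    }

    // Enumeration order is not guaranteed, so sort for stable startup output.
    return found.sorted { $0.url.path < $1.url.path }
}
```

Each `SkillFile` would then be handed to something like `loadSkill(at:source:)` to produce the catalog entry.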

I also made the harness print what it discovered at startup, just like it already did for AGENTS.md. That way I can immediately verify that discovery is working before even asking the model to do anything:

Loaded workspace skills (2)
Loaded user skills (29)
Discovered skills: harness-super-test, harness-swift-helper, agent-browser, ..., find-skills, ...
Loaded AGENTS.md (11 lines)
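The loadSkill function above calls two small helpers that are not shown, `nonEmpty` and `trimQuotes(from:)`. Their real implementations are not in the post, so the versions below are only a guess at what they might do, based on how they are called.

```swift
import Foundation

extension String {
    /// Returns nil when the string is empty or only whitespace,
    /// so that `??` can supply a fallback value.
    var nonEmpty: String? {
        let trimmed = trimmingCharacters(in: .whitespacesAndNewlines)
        return trimmed.isEmpty ? nil : trimmed
    }
}

/// Strips one pair of matching single or double quotes around a value.
/// Front matter values are often written quoted, e.g. `name: "find-skills"`.
func trimQuotes(from value: String) -> String {
    let trimmed = value.trimmingCharacters(in: .whitespaces)
    guard trimmed.count >= 2,
          let first = trimmed.first,
          let last = trimmed.last,
          first == last,
          first == "\"" || first == "'" else {
        // Unquoted or unbalanced: leave the value alone.
        return trimmed
    }
    return String(trimmed.dropFirst().dropLast())
}
```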

Parsing

Once the harness finds the skills, it has to parse them. The good part is that there is nothing special about it. If you’ve ever worked with a markdown-based static site generator, you will be familiar with YAML front matter. That’s what needs to be extracted, just the name and description. Maybe validate them a bit. But at this stage the harness does not keep the rest of the markdown in memory. It only keeps the catalog metadata and the path to the file.

Each discovered SKILL.md gets loaded as text. Then the harness looks for the front matter delimiters, parses the little metadata block, extracts name and description, and stores just enough information to build the catalog.

private static func parseSkillFile(_ content: String) -> (metadata: [String: String], instructions: String) {
    let lines = content.components(separatedBy: .newlines)

    // No front matter: the whole file is the instruction body.
    guard lines.first == "---" else {
        return ([:], content.trimmingCharacters(in: .whitespacesAndNewlines))
    }

    var metadata = [String: String]()

    // Read simple `key: value` pairs until the closing delimiter.
    for line in lines.dropFirst() {
        if line == "---" { break }
        // Split on the first colon only, so values may contain colons.
        guard let separator = line.firstIndex(of: ":") else { continue }

        let key = line[..<separator].trimmingCharacters(in: .whitespaces)
        let value = line[line.index(after: separator)...].trimmingCharacters(in: .whitespaces)

        guard key.isEmpty == false, value.isEmpty == false else { continue }
        metadata[key] = value
    }

    // The returned text still includes the front matter; the catalog only uses the metadata.
    return (metadata, content)
}

We already have skills on disk. We already know what they are. We already know enough to tell the model they exist. But we have not paid the token cost of loading them all. Not yet.
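To make the "not yet" concrete, here is a rough sketch of what deferring the body could look like. `SkillEntry` and `stripFrontMatter(from:)` are names invented for this example. The idea is only that the entry keeps the path, and the markdown is read from disk when someone actually asks for it.

```swift
import Foundation

// Hypothetical catalog entry: metadata lives in memory, the body stays on disk.
struct SkillEntry {
    let name: String
    let description: String
    let path: URL

    /// Reads the file on demand and returns the markdown that follows the front matter.
    func loadInstructions() throws -> String {
        let content = try String(contentsOf: path, encoding: .utf8)
        return stripFrontMatter(from: content)
    }
}

/// Drops a leading `---` ... `---` block when there is one.
/// Text without a complete front matter block is returned as-is (trimmed).
func stripFrontMatter(from content: String) -> String {
    let lines = content.components(separatedBy: .newlines)
    guard lines.first == "---",
          let closing = lines.dropFirst().firstIndex(of: "---") else {
        return content.trimmingCharacters(in: .whitespacesAndNewlines)
    }
    return lines[(closing + 1)...]
        .joined(separator: "\n")
        .trimmingCharacters(in: .whitespacesAndNewlines)
}
```

With something like this, a catalog of a few dozen skills costs a name, a description, and a path each, and the full text of any one of them is paid for only when it is used.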

Updating the system prompt

We now have a catalog with the metadata of the available skills. The harness knows them, but the LLM still doesn’t. So just like with AGENTS.md, we now need to inject the catalog into the system prompt so it is available in the model context.

let catalog = skillCatalog.skills
    .map(\.promptBlock)
    .joined(separator: "\n")

skillsPrompt = """
Available skills:
These skills provide specialized instructions for specific tasks. When a task matches a skill's description, read the `SKILL.md` at the listed location before proceeding.
<available_skills>
\(catalog)
</available_skills>
"""

var promptBlock: String {
    """
    <skill>
    name: \(name)
    description: \(description)
    location: \(path.path)
    </skill>
    """
}

That last field matters more than it may seem.

The location is what lets the model bridge the gap between “I know this skill exists” and “now I want to actually use it”. Without that field, the model would know the skill exists but would still need some other mechanism to get to the file.
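Two small edge cases are worth a thought when building that block: a catalog with no skills, and a description that spans several lines. The sketch below is one possible way to handle both, using a trimmed-down hypothetical `Skill` type and a shortened prompt rather than the harness's real ones.

```swift
import Foundation

// Hypothetical, trimmed-down skill type for illustration only.
struct Skill {
    let name: String
    let description: String
    let path: String

    var promptBlock: String {
        // Collapse newlines so a multi-line description cannot break the block layout.
        let oneLine = description
            .split(whereSeparator: \.isNewline)
            .joined(separator: " ")
        return """
        <skill>
        name: \(name)
        description: \(oneLine)
        location: \(path)
        </skill>
        """
    }
}

/// Returns the catalog section of the system prompt,
/// or nil when there is nothing to advertise.
func skillsPrompt(for skills: [Skill]) -> String? {
    guard skills.isEmpty == false else { return nil }
    let catalog = skills.map(\.promptBlock).joined(separator: "\n")
    return """
    Available skills:
    <available_skills>
    \(catalog)
    </available_skills>
    """
}
```

Returning nil for an empty catalog means a workspace without skills spends no tokens on an empty section.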

And once that prompt change exists, we can finally test the interesting part.

Implicit activation

Once the skills are in context, implicit activation is almost the purest version of the idea. The model sees that a skill exists, sees what it is for, and decides to load it when the task matches. We will look at explicit activation later, but this is the best place to start.

For this harness, that also means something very concrete. I am not going to add a special activate_skill tool right now. If the point of the project is learning, then I would rather keep the mechanism visible. The model sees a skill in the catalog, notices the path, and reads the file with the normal file-reading tool.

A dedicated activation tool is still a valid design, and maybe even the better one for a more serious agent. It gives tighter control, better permissions, and cleaner analytics. But for this harness it would hide the mechanism too early. I want to see the model load the markdown directly.

To make that work in practice, I had to make one small but important change. The harness already knew how to discover user-level skills in ~/.agents/skills, but the original read_file tool and the sandbox both blocked access to that directory because it sits outside the workspace.

So I updated two things:

the read_file tool can now read both the workspace and ~/.agents/skills
the sandbox profile for bash can now read ~/.agents/skills too

let globalSkillsRoot = URL(fileURLWithPath: NSHomeDirectory())
    .appendingPathComponent(".agents/skills", isDirectory: true)
    .standardizedFileURL
let globalSkillsPath = globalSkillsRoot.path

// A path is allowed if it is the root itself or sits below it.
// The trailing "/" keeps a sibling like "/workspace-other" from matching "/workspace".
let isInsideWorkspace = resolvedPath == rootPath || resolvedPath.hasPrefix(rootPath + "/")
let isInsideGlobalSkills = resolvedPath == globalSkillsPath || resolvedPath.hasPrefix(globalSkillsPath + "/")

guard isInsideWorkspace || isInsideGlobalSkills else {
    throw HarnessError.workspaceViolation(path)
}

;; Global skills
(allow file-read*
    (subpath "/Users/me/.agents/skills"))

That keeps the mechanism simple and visible. The model sees a skill in the catalog, notices the path, and reads the markdown like any other file.
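The snippet above starts from an already resolved path. How the harness gets there is not shown, so the following is only a sketch of one way to do it, with invented names (`resolveReadablePath`, `ReadAccessError`). It expands a leading `~`, treats relative paths as workspace-relative, and resolves `..` and symlinks before comparing against the allowed roots.

```swift
import Foundation

enum ReadAccessError: Error {
    case outsideAllowedRoots(String)
}

/// Normalizes `rawPath` and checks that it sits inside the workspace or one of `extraRoots`.
func resolveReadablePath(_ rawPath: String, workspaceRoot: URL, allowedRoots extraRoots: [URL]) throws -> URL {
    // "~/.agents/skills/..." becomes an absolute path under the home folder.
    let expanded = NSString(string: rawPath).expandingTildeInPath

    // Relative paths are interpreted against the workspace root.
    let candidate = expanded.hasPrefix("/")
        ? URL(fileURLWithPath: expanded)
        : workspaceRoot.appendingPathComponent(expanded)

    // Collapse ".." and follow symlinks so the comparison uses the real location.
    let resolved = candidate.standardizedFileURL.resolvingSymlinksInPath().path

    let roots = ([workspaceRoot] + extraRoots).map {
        $0.standardizedFileURL.resolvingSymlinksInPath().path
    }
    let isAllowed = roots.contains { root in
        resolved == root || resolved.hasPrefix(root + "/")
    }
    guard isAllowed else {
        throw ReadAccessError.outsideAllowedRoots(rawPath)
    }
    return URL(fileURLWithPath: resolved)
}
```

Normalizing the roots the same way as the candidate matters: if only one side is resolved, a symlinked home or temp folder can make a legitimate path look like a violation.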

Here is the local skill case:

<Turn 1>
You: Can you use the local harness-super-test skill?
<Step 1>
Assistant: I can use it, and I've loaded the local `harness-super-test` skill instructions. What would you like me to do with it?
tool-call: TOOL_CALL {"name":"read_file","arguments":{"path":".agents/skills/harness-super-test/SKILL.md"}}
tool-success: read_file -> 10 lines
<Step 2>
Assistant: You now have super powers, you are a super harness.
  That phrasing came from the repo-local skill in `.agents/skills/harness-super-test`.

And here is the global skill case:

<Turn 1>
You: Use the find-skills skill to search for the caveman skill, and read the skill file before proceeding.
<Step 1>
tool-call: TOOL_CALL {"name":"read_file","arguments":{"path":"~/.agents/skills/find-skills/SKILL.md"}}
tool-success: read_file -> 89 lines
<Step 2>
tool-call: TOOL_CALL {"name":"bash","arguments":{"command":"npx skills find caveman"}}
tool-success: bash -> command succeeded
<Step 3>
Assistant: I used the `find-skills` skill and searched for `caveman`. I read the local `find-skills` skill first, then followed its workflow. The best match was...

That second example is important because it proves the whole chain in a very ordinary way. The catalog exposes a user-level skill, the model reads the file directly, and that file then guides the next action. There is no hidden magic layer in the middle. It is still just context plus tools.

Extra resources

A nice thing about the skills spec is that it allows a skill to package extra resources. You might have noticed the skill shape: it is not a markdown file named after the skill. It’s a folder named after the skill, with a markdown file generically named SKILL.md inside. This is because there can be more.

The skill folder can contain other subfolders and files, which the main skill file can point to: more resources, reference docs, and even scripts. And again, none of it pollutes the context on load. It is only read when it is needed. This is just progressive disclosure, one level deeper.

The previous local skill was intentionally tiny, just enough to make activation obvious. That was useful, but still a bit toy-like. So I added a second repo-local skill, harness-swift-helper, that feels a bit more like a real workflow.

It has a SKILL.md, a small reference note, and a tiny Swift helper script. That gave me a much better test, because now the skill was not just shaping the answer. It was guiding the model toward a concrete project-specific action using plain files and plain tools.

Here is the transcript:

<Turn 1>
You: Use the harness-swift-helper skill to summarize this package. Read the skill file first, follow its execution note, run the Swift helper, and then tell me the result.
<Step 1>
tool-call: TOOL_CALL {"name":"read_file","arguments":{"path":".agents/skills/harness-swift-helper/SKILL.md"}}
tool-success: read_file -> 15 lines
<Step 2>
tool-call: TOOL_CALL {"name":"read_file","arguments":{"path":".agents/skills/harness-swift-helper/references/RUNNING.md"}}
tool-success: read_file -> 16 lines
<Step 3>
tool-call: TOOL_CALL {"name":"bash","arguments":{"command":"swift .agents/skills/harness-swift-helper/scripts/package_summary.swift"}}
tool-success: bash -> command succeeded
<Step 4>
Assistant: I read the `harness-swift-helper` skill, followed its execution note, ran the Swift helper, and here is the actual output:
  Package: swiftagentharness
  Platform...

That is exactly the kind of behavior I wanted to see. The helper script and the execution note were not sitting in the initial context from the start. The model loaded them only when the skill became relevant, and then used them to complete the task.

And this is yet another proof that skills are nothing magical, just reusable prompts that can easily package extra resources.

Explicit activation

Doing explicit activation, where the user directly invokes a skill, can quickly turn into terminal UI and UX work instead of learning about AI and harnesses. I did not want to go down that rabbit hole here. But it still felt useful to dip our toes into it, if only to prove that explicit activation is also not magical.

So I built the smallest version worth having:

/skill-name [task]

That is it. No autocomplete. No picker. No search. No fancy terminal behavior.

When the harness sees an input line that starts with /, it checks whether the first token matches a discovered skill name. If it does, the harness reads that SKILL.md, injects it into the conversation, and then the loop continues as normal.

conversation.append(Message(role: "user", content: input))
if let explicitSkillActivation = try resolveExplicitSkillActivation(from: input) {
    ui.printLine(
        "skill-load: /\(explicitSkillActivation.skill.name) -> \(explicitSkillActivation.skill.path.path)"
    )
    conversation.append(Message(
        role: "system",
        content: try explicitSkillSystemMessage(for: explicitSkillActivation)
    ))
}

And because I wanted this to be debuggable, the harness also prints a visible confirmation line in the terminal:

skill-load: /harness-swift-helper -> /Users/.../SKILL.md

That matters because we do not have to guess whether the harness explicitly loaded the skill, or whether the model later decided to read the file on its own. We can see it.
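The body of resolveExplicitSkillActivation is not shown, so here is a hedged sketch of what the `/skill-name [task]` parsing could look like. The `Skill` and `ExplicitActivation` types and the extra `skills` parameter are made up for this example, and the case-insensitive match is an assumption that mirrors the lowercased keys used during discovery.

```swift
import Foundation

// Hypothetical minimal types; the harness's real ones carry more fields.
struct Skill {
    let name: String
    let path: URL
}

struct ExplicitActivation {
    let skill: Skill
    let task: String
}

/// Parses `/skill-name [task]`.
/// Returns nil when the input is not a slash command or names no known skill,
/// so the caller can treat it as an ordinary message.
func resolveExplicitActivation(from input: String, skills: [Skill]) -> ExplicitActivation? {
    let trimmed = input.trimmingCharacters(in: .whitespacesAndNewlines)
    guard trimmed.hasPrefix("/") else { return nil }

    // Split once: the first token is the skill name, the rest is the task.
    let parts = trimmed.dropFirst().split(separator: " ", maxSplits: 1, omittingEmptySubsequences: true)
    guard let token = parts.first else { return nil }

    guard let skill = skills.first(where: { $0.name.lowercased() == token.lowercased() }) else {
        return nil
    }

    let task = parts.count > 1 ? parts[1].trimmingCharacters(in: .whitespaces) : ""
    return ExplicitActivation(skill: skill, task: task)
}
```

Falling back to nil for unknown names keeps inputs that merely start with a slash, such as a file path, flowing through the normal loop.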

This is also the useful difference with implicit loading. There, we still depend on the model noticing that a skill is relevant. Here, the user decides first, and the harness guarantees the skill is loaded.

This is an important insight. It reinforces, again, that skills are nothing special, just text dumped into the context. Before, it was the LLM using the read tool to load that text. Now it is the harness doing it manually. After that, the loop just continues as usual. The fact that there are no other changes necessary clarifies a lot.

<Turn 1>
You: /find-skills caveman
skill-load: /find-skills -> ~/.agents/skills/find-skills/SKILL.md
<Step 1>
tool-call: TOOL_CALL {"name":"bash","arguments":{"command":"npx skills find caveman"}}
tool-success: bash -> command succeeded
<Step 2>
Assistant: I found several `caveman`-related skills, including `juliusbrussee/caveman@caveman` and `mattpocock/skills@caveman`.

Notice that there was no read_file tool call. It was not necessary because the skill was already in the context, so the LLM could go straight to bash as the skill instructs. That means explicit invocation can save a round-trip, tokens, and cost.

This also affects the cost and behavior of different harnesses. For example, what happens if in the next turn we ask for something related?

<Turn 2>
You: and can you find the "web-search" one?
<Step 1>
tool-call: TOOL_CALL {"name":"bash","arguments":{"command":"npx skills find web-search"}}
tool-success: bash -> command succeeded
<Step 2>
Assistant: I found several web-search-related skills, including: ...

You can see how the model decided to use the skill again to answer our request. This time nothing had to be loaded, because the instructions were already in the context. In this harness, that happens because the skill stays in the conversation. That is just a decision I made during the implementation. Some harnesses may keep the skill only for the turn that requested it, drop it after a few turns, or reload it every time. These small decisions are part of what makes each harness feel distinct.
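Those alternatives are easy to picture in code. The sketch below is purely illustrative and not how this harness works: it assumes each injected skill message is tagged with the turn it was loaded on (a hypothetical `skillLoadedOnTurn` field) and filters the conversation before every request.

```swift
import Foundation

struct Message {
    let role: String
    let content: String
    // Hypothetical tag: the turn an explicitly loaded skill was injected on, nil for ordinary messages.
    var skillLoadedOnTurn: Int? = nil
}

enum SkillRetention {
    case keepForever          // the skill stays in the conversation
    case currentTurnOnly      // the skill is visible only on the turn that requested it
    case dropAfter(turns: Int)
}

/// Returns the conversation to send on `currentTurn`, with expired skill messages removed.
func prune(_ conversation: [Message], currentTurn: Int, policy: SkillRetention) -> [Message] {
    conversation.filter { message in
        // Ordinary messages are never pruned here.
        guard let loadedOn = message.skillLoadedOnTurn else { return true }
        switch policy {
        case .keepForever:
            return true
        case .currentTurnOnly:
            return loadedOn == currentTurn
        case .dropAfter(let turns):
            return currentTurn - loadedOn < turns
        }
    }
}
```

With `.currentTurnOnly`, the follow-up question in Turn 2 would no longer see the skill text, so the model would have to read the file again or work from what it remembers of the previous exchange.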

Better harnesses usually go much further here. They offer autocomplete, skill listing, filtering, search, and richer terminal interactions. I am intentionally stopping earlier. The point of this post is to demystify skills, not to deep dive into terminal UI. That can be its own journey later.

Back to reality

After walking through discovery, parsing, prompt disclosure, implicit loading, extra resources, and even a tiny explicit activation flow, I think the main lesson is simple.

Skills are not some new magical agent primitive. They are reusable prompts with a folder convention.

They can be great. They are also, at the end of the day, still mostly markdown. So think twice before jumping on the hype train.

That is not diminishing them. If anything, I think that makes them more useful, because now the value is obvious. The harness is not growing a mysterious new organ. We are just giving the model a structured way to discover the right instructions, the right note, the right script, and only load them when they become relevant.

And because of that, all the usual context problems still apply. Pollution. Distraction. Conflicting instructions. Stale workflows. Too much noise. Skills do not escape any of that. They are just a cleaner way of doing context engineering.

The security side matters too. Pulling a random skill from the internet is letting somebody else inject instructions into an agent that may already have access to your files, your shell, your browser, whatever tools you gave it. We should all talk about that more.

So my recommendation is simple. Write your own skills. And if you find a good one online, take the core of the idea, read it carefully, tailor it, make it yours. The nice part is that once you stop treating skills like magic, that becomes trivial. They are mostly markdown, and the LLM can help you write them.

That is the real win for me. Not that skills are fancy. Not that they feel powerful. But that once you understand what they really are, you can shape them on purpose.
