Agent metaSKILLs

Agent MetaSKILLs are small workflow programs for AI agents. Like regular agent SKILLs, they are meant to be automatically loaded on-demand when the user agent determines they are needed.

They are a dynamic extension to a regular agent skill: they sit next to a normal SKILL.md file and help an agent do the parts of a task that should be repeatable, bounded, and easy to inspect.

1. Motivation

Static skills are great when the useful thing is instruction:

When deploying, run the deploy command, check the health endpoint, and report
the result.

But many useful agent workflows are not just instructions. They are loops:

You can describe those loops in Markdown, but then the model has to remember and execute the loop correctly every time. That is fragile. A metaskill makes the loop explicit:

In short: use a static skill for guidance; use a metaskill when the skill needs to run a small, repeatable procedure.

2. A tiny metaSKILL

A metaskill is a skill directory with two files:

skills/
  require-heading/
    SKILL.md
    SKILL.star

SKILL.md describes when the skill should be used and documents the expected input keys:

---
name: require-heading
description: Draft text and retry until it includes a required heading.
metaskill: SKILL.star
metaskill_language: starlark
---

Use this when the user wants a short draft that must include one exact heading.

Input keys:
- `task`: The writing task description.
- `heading`: The exact heading text that must appear in the draft.

SKILL.star contains the workflow:

def run(input):
    task = input["task"]
    heading = input["heading"]

    for i in range(3):
        result = ask(task, {
            "purpose": "draft",
            "max_turns": 4,
        })

        if heading in result["answer"]:
            return {
                "status": "accepted",
                "answer": result["answer"],
                "attempts": i + 1,
            }

        trace("missing-heading", {
            "attempt": i + 1,
            "heading": heading,
        })

        task = (
            "Revise the draft so it includes this exact heading:\n" +
            heading +
            "\n\nPrevious draft:\n" +
            result["answer"]
        )

    return {
        "status": "exhausted",
        "answer": result["answer"],
        "warning": "The required heading was still missing.",
    }

The author writes a normal, small program. The host supplies ask, command, and trace. The program does not get raw filesystem, process, network, or environment access.

3. Package format

A metaskill is still a skill. It MUST live in a directory containing SKILL.md. The dynamic workflow is a program file in the same directory. The conventional program filename is SKILL.star.

skills/
  test-fix/
    SKILL.md
    SKILL.star

SKILL.md is the single source of metadata. It MUST start with YAML frontmatter.

Required fields:

Field Type Meaning
name string Stable skill identifier.
description string Short catalog description used by users and routing models.

Optional fields:

Field Type Meaning
metaskill string Relative path from the skill directory to the program file. Defaults to SKILL.star when that file exists.
metaskill_language string Program language identifier. This version of the specification defines starlark. Defaults to starlark when a metaskill file is present.

Name rules:

Description rules:

Input documentation:

Program file rules:

Hosts MUST ignore unknown metadata fields unless they intentionally implement an extension. Extension fields SHOULD be prefixed, for example x_myhost_feature.

4. Discovery

Skill search locations are host-defined. A portable metaskill package MUST NOT depend on a specific installation directory.

A conforming host discovers metaskills by:

  1. Finding skill directories that contain SKILL.md.
  2. Parsing and validating name and description.
  3. Resolving metaskill, or defaulting to SKILL.star when it exists.
  4. Rejecting metaskill paths that escape the skill directory.
  5. Rejecting unsupported metaskill_language values for execution.
  6. Marking executable metaskills in the skill catalog.

If multiple discovered skills have the same name, the host MUST choose a deterministic winner according to documented precedence rules.

Catalog entries SHOULD stay compact:

- test-fix: Run tests and ask the agent to fix failures. (metaskill: starlark)

5. Activation and execution

Activation and execution are separate.

Activation loads SKILL.md instructions into context. It should not execute code.

Execution runs the metaskill program. It can spend nested model calls, run allowed commands, and cause side effects through the host. A host SHOULD expose execution through an operation named run_metaskill.

If a host supports skill mentions such as $test-fix, those mentions SHOULD activate the skill instructions. They SHOULD NOT automatically execute the metaskill unless the user clearly asked for that behavior or the host has a separate documented policy.

6. Runtime language

This specification defines Starlark, a safe subset of Python, as the portable metaskill language.

A metaskill program MUST define:

def run(input):
    return "done"

The host calls run(input) exactly once per execution. input is a JSON-like dictionary supplied by the caller. It SHOULD contain only strings, numbers, booleans, null values, lists, and dictionaries.

The Starlark environment MUST be hermetic by default:

Hosts MAY support additional languages in the future by adding new metaskill_language values. A metaskill that wants to be portable across hosts SHOULD use starlark.

7. Host API

A metaskill exposes three global functions to the program:

Function Purpose
ask(prompt, opts={}) Run a bounded nested agent turn.
command(argv, opts={}) Run an argv-style command through host policy.
trace(kind, data={}) Append structured trace data.

Portable metaskills MUST use these global functions. A host MAY also expose an equivalent namespaced object such as agent.ask, but that is an extension and MUST NOT be required by portable metaskills.

ask(prompt, opts={})

ask asks the host agent to do one nested piece of work. It uses the host's normal model, tools, sandbox, cancellation, and reporting behavior.

Parameters:

Name Type Meaning
prompt string Prompt for the nested agent turn.
opts.purpose string Optional label for traces and reports.
opts.max_turns integer Optional nested turn cap. Hosts MUST clamp this to a documented maximum.

Return value:

{
  "answer": "text returned by the nested agent",
  "exhausted": false,
  "turns": 6,
  "truncated": false
}

answer MUST be a string. exhausted SHOULD indicate whether the nested turn hit a host limit. turns SHOULD be the number of turns used when known; hosts MAY report the granted cap when exact usage is unavailable. truncated MUST be true when answer was shortened before returning to the metaskill.

command(argv, opts={})

command runs a command through the host's normal command policy. It MUST NOT bypass allowlists, approvals, middleware, sandbox roots, lifecycle hooks, or reporting. Shell strings are not part of the portable API.

Parameters:

Name Type Meaning
argv list of strings Command and arguments. MUST be non-empty.
opts.timeout integer Optional timeout in seconds. Hosts MUST clamp this to a documented maximum.

Return value:

{
  "ok": true,
  "exit_code": 0,
  "result": "stdout and stderr or host error text",
  "truncated": false
}

ok MUST be true only when the command was allowed, started, did not time out, and exited successfully. exit_code SHOULD be the process exit code when known. result MUST be a string. Host-level failures SHOULD use an error: prefix in result. truncated MUST be true when result was shortened before returning to the metaskill.

trace(kind, data={})

trace records what happened. Use it for small facts that help a user or implementer understand the run later.

Parameters:

Name Type Meaning
kind string Short event label.
data dictionary Optional JSON-like event data.

Trace calls MUST NOT consume ask or command budgets. Hosts MUST cap the number and size of trace entries.

8. Budgets and limits

A conforming host MUST enforce finite budgets. Recommended defaults:

Limit Default
Nested ask calls 5
command calls 10
Wall-clock execution timeout 300 seconds
Trace entries 100
Characters per nested answer 20,000
Characters per command result 20,000
Characters in final result string 20,000

The run_metaskill operation MAY allow callers to set max_ask_calls and max_command_calls within host-defined bounds. Hosts SHOULD check cancellation and timeout before every host API call. Hosts SHOULD also interrupt or isolate pure script computation so large loops cannot evade wall-clock limits.

9. Result contract

Metaskill execution returns a string at the host tool boundary.

On success, the string SHOULD contain a short completion header followed by one JSON object:

[Metaskill: test-fix completed]
{"status":"accepted","answer":"Tests pass.","trace":[...]}

The JSON object is the result envelope. The top-level answer is the primary text a calling model should use. Other fields are structured metadata.

run(input) return values are normalized by the host:

Script return value Normalized envelope
None {"status":"ok","answer":""}
string {"status":"ok","answer":<string>}
dictionary The dictionary, with missing status defaulted to ok and missing answer defaulted to an empty string.

Any other return type MUST fail with an error: result.

On failure, the host MUST return a string beginning with error:. Failures include unknown metaskill name, static skill passed to run_metaskill, disallowed external metaskill, unavailable runtime, unreadable program file, syntax errors, runtime errors, exhausted budgets, timeout, cancellation, and invalid return values.

When the success envelope exceeds the host's result limit, the host SHOULD truncate trace first and answer last. Truncation MUST be explicit in the JSON, for example with answer_truncated: true or a trace entry whose kind is truncated.

10. Standard tool schema

Hosts that expose tools to a model SHOULD expose metaskill execution with this JSON-schema-shaped function:

{
  "name": "run_metaskill",
  "description": "Run a dynamic skill workflow by name.",
  "parameters": {
    "type": "object",
    "properties": {
      "name": {
        "type": "string",
        "description": "The metaskill name to execute."
      },
      "input": {
        "type": "object",
        "description": "Input data passed to the metaskill program."
      },
      "max_ask_calls": {
        "type": "integer",
        "minimum": 1,
        "description": "Maximum nested model calls."
      },
      "max_command_calls": {
        "type": "integer",
        "minimum": 0,
        "description": "Maximum command calls."
      }
    },
    "required": ["name", "input"]
  }
}

When the host knows the executable metaskill names, it SHOULD add an enum to the name property.

If the caller passes an empty or missing input, the host SHOULD return a helpful error that includes the skill instructions so the caller can learn the expected keys and retry without an additional round trip.

11. Practical examples

Run tests and ask for fixes

This metaskill runs a test command. If the command fails, it asks the host agent to fix the failure, then tries again.

def run(input):
    test_command = input.get("test_command", ["pytest", "-q"])

    for i in range(3):
        result = command(test_command, {"timeout": 120})

        trace("test-run", {
            "attempt": i + 1,
            "ok": result["ok"],
        })

        if result["ok"]:
            return {
                "status": "accepted",
                "answer": "Tests passed.",
                "attempts": i + 1,
            }

        ask(
            "The tests failed. Fix the problem, then stop.\n\n" +
            result["result"],
            {
                "purpose": "fix-tests",
                "max_turns": 10,
            },
        )

    return {
        "status": "exhausted",
        "answer": result["result"],
        "warning": "Tests still failed after 3 attempts.",
    }

Example input:

{
  "test_command": ["pytest", "tests/test_parser.py", "-q"]
}

Require a checklist

This metaskill asks for a short checklist and retries until all required words appear.

def run(input):
    task = input["task"]
    required = input.get("required", ["install", "test", "rollback"])

    for i in range(3):
        result = ask(task, {
            "purpose": "draft-checklist",
            "max_turns": 4,
        })

        missing = []
        for word in required:
            if word not in result["answer"].lower():
                missing.append(word)

        if not missing:
            return {
                "status": "accepted",
                "answer": result["answer"],
                "attempts": i + 1,
            }

        trace("missing-words", {
            "attempt": i + 1,
            "missing": missing,
        })

        task = (
            "Revise the checklist. It must include these words: " +
            ", ".join(missing) +
            "\n\nPrevious checklist:\n" +
            result["answer"]
        )

    return {
        "status": "exhausted",
        "answer": result["answer"],
        "missing": missing,
    }

Example input:

{
  "task": "Write a deployment checklist for a small web app.",
  "required": ["install", "test", "rollback"]
}

Review a diff once

This metaskill keeps the workflow simple: get a diff through host policy, ask for a focused review, and return the answer.

def run(input):
    diff = command(["git", "diff"], {"timeout": 30})

    if not diff["ok"]:
        return {
            "status": "error",
            "answer": diff["result"],
        }

    review = ask(
        "Review this diff for bugs only. Ignore style nits.\n\n" +
        diff["result"],
        {
            "purpose": "bug-review",
            "max_turns": 6,
        },
    )

    return {
        "status": "ok",
        "answer": review["answer"],
    }

12. Security model

Metaskills are code, so they need an explicit trust model.

Default policy:

Prompt-injection posture:

Sandbox posture:

13. Agent behavior guidelines

An AI agent using a metaskill-capable host SHOULD: