2

Skills — The Modular Capability Layer

Research current as of: April 2026

2.1 What Are Skills?

Skills are the modular capability layer of Claude Code and compatible AI agent systems. They extend what an agent can do by providing specialized instructions, workflows, and domain knowledge packaged as Markdown files. Unlike tools, which give agents new actions (read a file, call an API), skills give agents new knowledge — the procedural understanding of how to approach complex, multi-step tasks.

Definition: What Is a Skill?

A skill is a SKILL.md file containing YAML frontmatter (metadata) and Markdown body (instructions). Skills live in well-known directories, are discovered automatically by the agent runtime, and are loaded into context when relevant. They follow the Agent Skills open standard, released by Anthropic in December 2025 and adopted across the industry.

The core insight behind skills is progressive disclosure. An agent with 100 skills does not load all 100 into its context window at startup. Instead, only the skill name and a short description (~50 tokens each) are present initially. When a user invokes a skill or the agent determines one is relevant, the full instructions are loaded. This keeps context lean while making extensive specialized knowledge available on demand.
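Progressive disclosure can be pictured as a small index that keeps only metadata resident and reads the full file on demand. The following is a minimal sketch under that reading, not the runtime's actual implementation; `SkillEntry` and `SkillIndex` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class SkillEntry:
    """Lightweight index entry: only name and description stay resident."""
    name: str
    description: str
    path: Path
    body: Optional[str] = None  # full instructions, filled in lazily


class SkillIndex:
    """Hypothetical registry illustrating progressive disclosure."""

    def __init__(self) -> None:
        self._entries: dict[str, SkillEntry] = {}

    def register(self, name: str, description: str, path: Path) -> None:
        # At startup only the short metadata is recorded; the file is not read.
        self._entries[name] = SkillEntry(name, description, path)

    def idle_context(self) -> str:
        # What the model would see before any skill is invoked.
        return "\n".join(f"{e.name}: {e.description}" for e in self._entries.values())

    def load(self, name: str) -> str:
        # Read the full SKILL.md only when the skill is actually needed.
        entry = self._entries[name]
        if entry.body is None:
            entry.body = entry.path.read_text(encoding="utf-8")
        return entry.body
```

The idle cost grows with the number of registered skills, while the cost of a full body is paid only for skills that are actually loaded.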

Users interact with skills through slash commands. Typing /commit in Claude Code invokes the commit skill. Typing /deploy staging invokes a deploy skill with "staging" as an argument. But skills also activate autonomously: if a skill's description says "Use when writing tests," the agent loads that skill automatically when it encounters a testing task.

The Merger: Commands and Skills

Claude Code supports two overlapping mechanisms that create slash commands: commands (files in .claude/commands/) and skills (files in .claude/skills/). Both produce entries in the / menu. The distinction is that skills add optional features beyond what commands provide.

In practice, the terms are often used interchangeably. If you need a quick custom command, drop a Markdown file in .claude/commands/. If you need auto-activation, description-based discovery, subagent isolation, or any of the advanced features, use a skill directory with SKILL.md.

The Open Standard and Industry Adoption

Agent Skills were first announced by Anthropic on October 16, 2025, as a feature of Claude Code. On December 18, 2025, the specification was published as an open standard at agentskills.io/specification, establishing a vendor-neutral format for packaging agent capabilities. The decision to open-source the specification was deliberate: skills are more valuable when they are portable across agent platforms.

The adoption curve has been rapid. Within weeks of the open standard release, OpenAI adopted the Agent Skills format for Codex CLI and ChatGPT. Microsoft integrated skills support into VS Code and GitHub Copilot. Cursor, Goose, Amp, and OpenCode added native support. By April 2026, the community had published over 71,000 skills on the agentskills.io registry, and the GitHub repository for community-contributed skills had surpassed 20,000 stars.

  • 71,000+ community skills available
  • 20,000+ GitHub stars
  • 8+ agent platforms with native support
  • ~50 tokens per skill (idle)

The cross-platform compatibility means a skill written for Claude Code works in Cursor, OpenCode, Goose, and any other agent that implements the agentskills.io specification. This portability is a significant advantage over platform-specific plugin systems. A team can invest in building high-quality skills knowing they are not locked into a single AI agent platform.

Skills vs Commands vs Plugins vs Tools

| Aspect | Skills | Commands | Tools | Plugins |
|---|---|---|---|---|
| Nature | Procedural knowledge with metadata | Inline prompt injection | Executable functions with I/O schemas | Vendor-specific extensions |
| Format | SKILL.md with YAML frontmatter | Markdown file (no frontmatter) | Code with JSON Schema definitions | Varies by platform |
| Location | .claude/skills/*/SKILL.md | .claude/commands/*.md | Built-in, MCP servers, or SDK | Platform-dependent |
| Activation | User invocation or auto-trigger by description | User invocation only (slash command) | Agent calls explicitly during execution | Platform-specific hooks |
| Side Effects | None (instructions that guide tool use) | None (prompt text injection) | Can modify files, call APIs, run code | Varies |
| Auto-Activation | Yes (description-based) | No | No (agent decides when to call) | Varies |
| Subagent Support | Yes (context: fork) | No | N/A | Varies |
| Portability | Cross-platform (open standard) | Claude Code only | MCP standard, or platform-specific | Platform-locked |
| Token Cost (idle) | ~50 tokens (name + description) | ~50 tokens (name only) | Schema definition per tool | Varies |

Key Insight: Skills Orchestrate Tools

Skills and tools are complementary, not competitive. A skill provides the knowledge of how to approach a task — which tools to call, in what order, with what parameters, and how to handle edge cases. Tools provide the capability to execute specific operations. The most effective agents combine rich skills (domain expertise) with powerful tools (concrete actions). A deploy skill, for example, instructs the agent to run specific Bash commands, check specific files, and verify specific outputs — but the actual work is done by the Bash, Read, and Grep tools.

2.2 Skill Storage and Discovery

Skills are discovered from four scope levels, each with a distinct purpose and priority. When skill names collide across scopes, higher-priority scopes win. This hierarchy enables enterprises to enforce standards, individuals to customize their workflow, and projects to define domain-specific behavior.

The Four Scope Levels

1. Enterprise (Highest Priority)

Managed through organizational settings. Enterprise administrators push skills to all users via managed configuration. These skills cannot be overridden by lower scopes and enforce organizational policies — coding standards, security review checklists, compliance workflows.

Location: Managed settings (internal API)

2. Personal

User-specific skills stored in the home directory. These follow the user across every project they work on — personal commit conventions, preferred code review patterns, writing style guides. Available in every Claude Code session regardless of which project is open.

Location: ~/.claude/skills/

3. Project

Repository-level skills checked into version control alongside the code they serve. These are the most common scope — project-specific deployment procedures, architecture guidelines, testing conventions. Shared with every developer who clones the repo.

Location: .claude/skills/

4. Plugin (Lowest Priority)

Skills bundled with published packages or plugins. These are consumed automatically when a plugin is installed, providing ready-made expertise for specific frameworks or services. Plugin skills contribute to the agent's capabilities without manual configuration.

Location: <plugin>/skills/

Automatic Discovery Mechanics

Claude Code scans skill directories recursively, supporting nested subdirectories for organizational purposes. A monorepo with packages/auth/.claude/skills/ and packages/billing/.claude/skills/ discovers skills from both locations. Each skill directory must contain a SKILL.md file; this is the only required file. The skill's name comes from the name field in frontmatter, or from the directory name if no name is specified.

Skills are also discovered from directories added via the --add-dir flag. This enables sharing skills across related projects without duplicating files. If you maintain a shared skills repository, you can add it to any project session:

Bash
# Add shared skills from another directory
claude --add-dir ~/shared-skills

# Skills in ~/shared-skills/.claude/skills/ are now discoverable
# They merge with the project's own skills

Directory Structure

A skill directory has one required file and any number of supporting files:

Directory Structure
.claude/skills/
  deploy/
    SKILL.md              # Required: frontmatter + instructions
    templates/
      nginx.conf.template # Supporting file referenced by SKILL.md
      docker-compose.yml  # Supporting file referenced by SKILL.md
    examples/
      staging-deploy.md   # Example referenced by SKILL.md
  code-review/
    SKILL.md              # Another skill
    checklists/
      security.md         # Loaded on demand when SKILL.md references it
      performance.md

Discovery Priority Resolution

When two skills share the same name, the higher-priority scope wins: Enterprise > Personal > Project > Plugin. Within the same scope, the skill closest to the working directory takes precedence (relevant for monorepos). This means an enterprise security-review skill always overrides a project-level one with the same name, ensuring organizational policies cannot be bypassed.
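The cross-scope rule can be modelled as a rank lookup. This is an illustrative sketch only: the `(scope, name, path)` tuples and the `resolve_skills` helper are assumptions made for the example, and the same-scope tie-break by directory distance is not modelled (the first entry seen is kept).

```python
# Scope order as described above: earlier entries win name collisions.
SCOPE_PRIORITY = ["enterprise", "personal", "project", "plugin"]


def resolve_skills(found: list[tuple[str, str, str]]) -> dict[str, str]:
    """Pick one path per skill name from (scope, name, path) tuples."""
    rank = {scope: i for i, scope in enumerate(SCOPE_PRIORITY)}
    winners: dict[str, tuple[int, str]] = {}
    for scope, name, path in found:
        current = winners.get(name)
        # Replace the current winner only if this scope ranks strictly higher.
        if current is None or rank[scope] < current[0]:
            winners[name] = (rank[scope], path)
    return {name: path for name, (_, path) in winners.items()}
```

Given a project-level and an enterprise-level security-review skill, the function returns the enterprise path regardless of the order in which the two were discovered.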

Monorepo Patterns

Monorepos present a unique challenge for skill discovery because different packages within the same repository may need different skills. Claude Code handles this through nested discovery: skill directories at any depth within the project are scanned and merged. Consider a typical monorepo structure:

Monorepo Skill Layout
monorepo/
  .claude/skills/              # Root-level skills (shared across all packages)
    commit-conventions/
      SKILL.md
    ci-pipeline/
      SKILL.md
  packages/
    auth/
      .claude/skills/          # Auth-specific skills
        oauth-flow/
          SKILL.md
    billing/
      .claude/skills/          # Billing-specific skills
        stripe-integration/
          SKILL.md
    frontend/
      .claude/skills/          # Frontend-specific skills
        component-patterns/
          SKILL.md

In this layout, every developer in the monorepo gets the root-level commit-conventions and ci-pipeline skills. Developers working in packages/auth/ additionally get the oauth-flow skill. The agent automatically discovers skills from the working directory upward to the repository root, loading package-specific skills alongside shared ones. If a package skill collides with a root skill by name, the package-level skill wins (closer to the working directory takes priority within the same scope level).
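The upward walk described above can be sketched as a loop over parent directories. The function name and the explicit `repo_root` argument are assumptions for illustration; how the runtime locates the repository root is not specified here.

```python
from pathlib import Path


def skill_dirs_upward(cwd: Path, repo_root: Path) -> list[Path]:
    """Return existing .claude/skills directories from cwd up to repo_root.

    The closest directory comes first, so earlier entries would win
    same-scope name collisions.
    """
    cwd, repo_root = cwd.resolve(), repo_root.resolve()
    if cwd != repo_root and repo_root not in cwd.parents:
        raise ValueError("cwd must be inside repo_root")
    found = []
    current = cwd
    while True:
        candidate = current / ".claude" / "skills"
        if candidate.is_dir():
            found.append(candidate)
        if current == repo_root:
            break
        current = current.parent
    return found
```

Called from packages/auth/ in the layout above, this would yield the auth skills directory followed by the root one; called from a package without its own skills, only the root directory.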

For organizations with shared skill libraries maintained separately from application code, the --add-dir approach provides a clean separation. The shared skills repository can be versioned, tested, and published independently, and individual projects consume them by reference rather than by copy. This pattern is particularly effective when combined with enterprise-managed settings that ensure certain critical skills are always present.

2.3 SKILL.md Format and Frontmatter Reference

Every skill begins with a SKILL.md file. The file has two parts: YAML frontmatter (metadata between --- delimiters) and a Markdown body (the actual instructions). The frontmatter controls how the skill is discovered, when it activates, and how it executes. The body contains the instructions the agent follows.

SKILL.md
---
name: deploy-staging
description: >
  Deploy the application to the staging environment. Use when the user asks
  to deploy, push to staging, or test in a staging environment.
argument-hint: [branch-name]
allowed-tools:
  - Bash
  - Read
  - Grep
disable-model-invocation: true
---

# Deploy to Staging

## Prerequisites
1. Ensure you are on the correct branch
2. Run the test suite: `npm test`
3. Check for uncommitted changes

## Steps
1. Build the application: `npm run build`
2. Run database migrations: `npm run migrate:staging`
3. Deploy via rsync: `rsync -avz dist/ staging.example.com:/app/`
4. Verify deployment: `curl -s https://staging.example.com/health`

## Rollback
If the health check fails, redeploy the previous version:
```
git checkout HEAD~1
npm run build
rsync -avz dist/ staging.example.com:/app/
```
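To make the two-part layout concrete, here is a minimal sketch that separates the frontmatter from the body using only the standard library. `split_skill_file` is a hypothetical helper; it returns the raw YAML text and leaves actual YAML parsing to a dedicated library.

```python
def split_skill_file(text: str) -> tuple[str, str]:
    """Split SKILL.md text into (frontmatter, body).

    The frontmatter is the raw text between the first two '---' lines;
    the body is everything after the closing delimiter.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text  # no frontmatter present
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            frontmatter = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:]).lstrip("\n")
            return frontmatter, body
    raise ValueError("frontmatter opened with '---' but never closed")
```

Only the first closing delimiter is honoured, so a `---` horizontal rule later in the Markdown body is left untouched.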

Complete Frontmatter Reference

The following table documents every frontmatter field available in the Agent Skills specification as of April 2026. Fields marked "Recommended" significantly improve skill behavior but are not strictly required for the file to be valid.

| Field | Required | Type | Description |
|---|---|---|---|
| name | No | string | Display name, becomes the /slash-command. Lowercase letters, hyphens, and digits only. Maximum 64 characters. If omitted, the directory name is used. Nested directories create namespaced names: gsd/plan-phase becomes /gsd:plan-phase. |
| description | Recommended | string | Describes what the skill does and when to use it. This is the primary signal Claude uses for auto-activation. Truncated at 250 characters in the system prompt. Front-load the most important use case in the first sentence. |
| argument-hint | No | string | Displayed during autocomplete to hint at expected arguments. Example: [issue-number], [branch-name], [file-path]. Purely cosmetic; it does not enforce argument structure. |
| disable-model-invocation | No | boolean | When true, only the user can invoke this skill via slash command. Claude will never auto-activate it. Default: false. Use for destructive operations, deployment workflows, or anything that should require explicit human intent. |
| user-invocable | No | boolean | When false, the skill is hidden from the / autocomplete menu. Claude can still auto-activate it based on the description. Default: true. Use for background skills that should activate contextually but not clutter the command palette. |
| allowed-tools | No | list | Tools Claude can use without asking permission while the skill is active. Example: [Bash, Read, Grep]. Does not add new tools; it only pre-approves existing ones. Useful for skills that run automated pipelines where permission prompts would interrupt flow. |
| model | No | string | Model override while the skill is active. Example: claude-sonnet-4-20250514. Allows a lightweight skill to run on a faster model, or a complex reasoning skill to use a more capable model. The override applies only during skill execution. |
| effort | No | string | Effort level for reasoning: low, medium, high, or max. Currently only supported on Opus models. Controls how much compute the model allocates for reasoning before generating output. Higher effort means deeper thinking but more tokens consumed. |
| context | No | string | Execution context. The only supported value is fork, which runs the skill in an isolated subagent with its own context window. The subagent's output is summarized back to the parent. Use for expensive or long-running operations that should not consume the main context. |
| agent | No | string | Specifies which subagent type to use when context: fork is set. Can reference built-in agent types or custom agents defined in .claude/agents/. Controls the personality and capabilities of the forked subagent. |
| hooks | No | object | Hooks scoped to the skill's lifecycle. Can define PreToolUse, PostToolUse, and other hook types that only fire while this skill is active. Enables skill-specific automation without affecting global behavior. |
| paths | No | list | Glob patterns that limit auto-activation. The skill only auto-activates when the user is working with files matching these patterns. Example: ["src/**/*.ts", "tests/**"]. Does not affect manual invocation via slash command. |
| shell | No | string | Shell to use for !`command` substitutions: bash (default) or powershell. Only affects shell injection preprocessing, not the Bash tool itself. |
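The constraints on the name field translate directly into a pattern check. This is a sketch of the rule as stated in the table (lowercase letters, digits, hyphens, at most 64 characters); `is_valid_skill_name` is a hypothetical helper, and the real validator may apply further rules.

```python
import re

# Lowercase letters, digits, and hyphens, 1 to 64 characters, per the table above.
NAME_PATTERN = re.compile(r"[a-z0-9-]{1,64}")


def is_valid_skill_name(name: str) -> bool:
    """Return True if the name satisfies the documented constraints."""
    return NAME_PATTERN.fullmatch(name) is not None
```

A check like this is cheap to run in CI before a skill is merged, which catches names such as `Deploy_Staging` before they fail at discovery time.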

The 250-Character Description Limit

The description field is truncated at 250 characters in the system prompt. Every model-invocable skill contributes its name and truncated description to the context at all times. At roughly 50 tokens per skill, 100 skills add about 5,000 tokens of persistent overhead. This is why description quality matters: front-load the critical use case, keep it under 250 characters, and be specific about when the skill should activate, not just what it does. Vague descriptions like "Helps with development tasks" waste the budget and trigger false activations.
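The budget arithmetic is simple enough to encode. The constants below restate the figures quoted in this section; the per-skill token count is a rough average, not a measured value, and both function names are hypothetical.

```python
DESCRIPTION_LIMIT = 250      # characters kept in the system prompt
IDLE_TOKENS_PER_SKILL = 50   # rough per-skill figure quoted in this section


def truncated_description(description: str) -> str:
    """Return the part of a description that survives truncation."""
    return description[:DESCRIPTION_LIMIT]


def idle_overhead_tokens(skill_count: int) -> int:
    """Estimate the persistent context cost of skill names and descriptions."""
    return skill_count * IDLE_TOKENS_PER_SKILL
```

Printing `truncated_description(...)` for each skill is a quick way to confirm that the trigger conditions appear before the cut-off.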

2.4 Types of Skill Content

Skill content falls into two broad categories based on how the agent uses it. Understanding this distinction helps you design skills that activate correctly and deliver the right kind of value.

Reference Content

Conventions, patterns, domain knowledge, and architectural guidelines. Reference skills run inline in the main conversation context. They enrich the agent's understanding without taking explicit action. The agent absorbs the instructions and applies them to its ongoing work.

Examples:

  • Coding conventions and style guides
  • Architecture decision records
  • API documentation and patterns
  • Domain-specific terminology
  • Security checklists

Typically: user-invocable: false, auto-activates via description

Task Content

Step-by-step instructions for specific actions with defined inputs and outputs. Task skills often use disable-model-invocation: true because they perform consequential operations that should require explicit human intent.

Examples:

  • Deployment workflows
  • Database migration procedures
  • Release management checklists
  • Code generation templates
  • Incident response runbooks

Typically: disable-model-invocation: true, user invokes via /skill-name

Many production skills blend both categories. A deployment skill might start with reference content (environment configuration, naming conventions) and then transition to task content (step-by-step deployment commands). The important thing is that the frontmatter flags match the skill's intent: if it performs destructive operations, set disable-model-invocation: true regardless of how much reference content it contains.

SKILL.md — Reference Skill
---
name: api-conventions
description: >
  REST API conventions for this project. Use when creating endpoints,
  reviewing API code, or discussing API design.
user-invocable: false
---

# API Conventions

## Naming
- Resources are plural nouns: `/users`, `/orders`
- Actions use sub-resources: `/users/123/activate`
- Query parameters use camelCase: `?pageSize=20`

## Response Format
All responses use the envelope pattern:
```json
{
  "data": { ... },
  "meta": { "page": 1, "total": 42 }
}
```

## Error Handling
Errors return appropriate HTTP status codes with a body:
```json
{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Email is required",
    "field": "email"
  }
}
```
SKILL.md — Task Skill
---
name: db-migrate
description: >
  Run database migrations against the specified environment. Creates backup
  first. Use when user asks to migrate, update schema, or run migrations.
argument-hint: [environment]
disable-model-invocation: true
allowed-tools:
  - Bash
  - Read
---

# Database Migration

## Arguments
- `$1` = target environment (staging, production)

## Steps
1. Verify environment: `echo "Migrating: $1"`
2. Create backup: `pg_dump $DB_URL > backup-$(date +%s).sql`
3. Run migrations: `npx prisma migrate deploy`
4. Verify schema: `npx prisma db pull --print`
5. Report results to user

2.5 Invocation Control — Who Can Trigger What

Two frontmatter fields control who can invoke a skill and when its content loads into context. Understanding their interaction is critical for designing skills that activate correctly.

The 2x2 Matrix

| Configuration | User Can Invoke | Claude Can Invoke | When Content Loads |
|---|---|---|---|
| Default (no flags) | Yes | Yes | Description always in context; full content on invoke |
| disable-model-invocation: true | Yes | No | Not in context until user invokes via /command |
| user-invocable: false | No (hidden from menu) | Yes | Description always in context; full content when Claude decides |
| Both flags set | No | No | Never loads (effectively disabled) |
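The matrix reduces to two independent booleans. The sketch below encodes it as a function; `Invocation` and `invocation_rules` are hypothetical names, and the rule that the description is resident only when Claude may auto-activate is read off the table rather than taken from the runtime.

```python
from typing import NamedTuple


class Invocation(NamedTuple):
    user_can_invoke: bool
    claude_can_invoke: bool
    description_in_context: bool


def invocation_rules(disable_model_invocation: bool = False,
                     user_invocable: bool = True) -> Invocation:
    """Encode the invocation matrix for one skill's frontmatter flags."""
    claude_can = not disable_model_invocation
    return Invocation(
        user_can_invoke=user_invocable,
        claude_can_invoke=claude_can,
        # The description stays resident only when Claude may auto-activate.
        description_in_context=claude_can,
    )
```

Because the two flags do not interact, setting both simply yields a skill that nobody can trigger.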

Choosing the Right Configuration

Default: Open Access

The skill appears in the / menu and Claude can auto-activate it. Best for general-purpose skills like code review, test generation, or documentation helpers. The description is always present in context (~50 tokens), and full content loads when either the user or Claude triggers it.

Use when: The skill is safe, non-destructive, and broadly useful.

disable-model-invocation: User-Only

Claude cannot auto-activate the skill. The user must explicitly type the slash command. The skill's description is not loaded into context, saving token budget. Best for destructive operations (deploy, migrate, delete) or expensive workflows.

Use when: The skill performs consequential actions requiring explicit human intent.

user-invocable: false — Background

The skill is hidden from the / autocomplete menu but Claude can still load it automatically based on the description. Best for reference skills that should silently enrich the agent's behavior without cluttering the user-facing command list.

Use when: The skill provides contextual knowledge the agent should apply automatically.

Both Flags: Disabled

Setting both disable-model-invocation: true and user-invocable: false effectively disables the skill. It cannot be triggered by anyone. This is only useful for temporarily deactivating a skill without deleting it.

Use when: Temporarily disabling a skill during debugging or refactoring.

2.6 String Substitutions and Dynamic Context

Skills support string substitutions and shell injection for dynamic content. These mechanisms allow skills to adapt their instructions based on arguments, session state, and live system data rather than containing only static text.

Argument Substitutions

| Variable | Resolves To | Example |
|---|---|---|
| $ARGUMENTS | The full argument string passed after the slash command | /deploy staging --force → "staging --force" |
| $ARGUMENTS[0] | First space-delimited argument (zero-indexed) | /deploy staging → "staging" |
| $ARGUMENTS[1] | Second argument | /deploy staging fast → "fast" |
| $1, $2, ... | Shorthand for $ARGUMENTS[0], $ARGUMENTS[1], etc. | Same as indexed form |
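The substitution rules in the table can be sketched as a single regular-expression pass. This follows the table literally ($ARGUMENTS[N] is zero-indexed, $1 maps to $ARGUMENTS[0]); `substitute_arguments` is a hypothetical helper, and expanding a missing argument to an empty string is an assumption, since the table does not say what happens in that case.

```python
import re

# One pattern, so text inserted by one placeholder is never re-expanded.
_PLACEHOLDER = re.compile(r"\$ARGUMENTS\[(\d+)\]|\$ARGUMENTS|\$(\d+)")


def substitute_arguments(template: str, raw_args: str) -> str:
    """Expand $ARGUMENTS, $ARGUMENTS[N], and $N placeholders in a template."""
    args = raw_args.split()

    def pick(index: int) -> str:
        # Assumption: out-of-range arguments expand to an empty string.
        return args[index] if 0 <= index < len(args) else ""

    def expand(match: re.Match) -> str:
        if match.group(1) is not None:       # $ARGUMENTS[N], zero-indexed
            return pick(int(match.group(1)))
        if match.group(2) is not None:       # $N, where $1 is the first argument
            return pick(int(match.group(2)) - 1)
        return raw_args                      # bare $ARGUMENTS

    return _PLACEHOLDER.sub(expand, template)
```

For example, `substitute_arguments("env=$1 mode=$2", "staging fast")` yields `env=staging mode=fast`.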

Environment Variables

| Variable | Resolves To |
|---|---|
| ${CLAUDE_SESSION_ID} | The unique ID of the current Claude Code session. Useful for logging, state tracking, or creating session-specific artifacts. |
| ${CLAUDE_SKILL_DIR} | Absolute path to the directory containing this SKILL.md file. Enables referencing supporting files portably: ${CLAUDE_SKILL_DIR}/templates/config.yaml. |

Shell Injection

Skills can embed live system data by executing shell commands during preprocessing. This runs before the skill content reaches Claude, so the agent sees the command's output, not the command itself.

SKILL.md — Inline Shell Injection
---
name: review-pr
description: Review the current PR with full diff context.
argument-hint: [pr-number]
---

# PR Review

## PR Summary
!`gh pr view $1 --json title,body,author --jq '.title + "\n" + .body'`

## Full Diff
```!
gh pr diff $1
```

## Review Checklist
- [ ] Are there breaking changes?
- [ ] Are new dependencies justified?
- [ ] Do tests cover the changes?
- [ ] Is the commit history clean?

Review this PR and provide feedback on each checklist item.

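Two shell injection syntaxes are available: an inline form, !`command`, for short single-line output, and a block form, a code fence opened with three backticks followed by !, for longer or multi-line output. The following skill uses both: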

SKILL.md — Dynamic Context with Shell Injection
---
name: session-status
description: Show current project state and recent activity.
---

# Session Status

## Current Branch
!`git branch --show-current`

## Recent Commits
```!
git log --oneline -10
```

## Modified Files
```!
git status --short
```

## Node Version
!`node --version`

Based on this context, summarize the current project state
and suggest what to work on next.

Security Consideration: Shell Injection

Shell injection runs with the user's permissions during skill preprocessing. The shell frontmatter field controls whether bash or powershell is used. Be careful with argument interpolation in shell commands — $ARGUMENTS values come from user input and could contain shell metacharacters. For enterprise environments, consider whether shell injection skills should require review before deployment.
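One way to reduce the metacharacter risk is to validate or quote arguments before they reach a shell. The sketch below uses Python's standard `shlex.quote`, which targets POSIX shells and is not suitable for PowerShell; the helper names are hypothetical, and the `gh pr diff` command is borrowed from the review-pr example above.

```python
import shlex


def validated_pr_number(value: str) -> str:
    """Accept only plain ASCII digits, rejecting anything else outright."""
    if not (value.isascii() and value.isdigit()):
        raise ValueError(f"not a PR number: {value!r}")
    return value


def quoted_pr_command(pr_argument: str) -> str:
    """Build the diff command with the argument quoted as a single shell word."""
    return f"gh pr diff {shlex.quote(pr_argument)}"
```

Strict validation is the stronger of the two options when the expected format is known; quoting is the fallback for free-form arguments.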

2.7 Supporting Files

While SKILL.md is the only required file, production skills often include supporting files that provide templates, examples, scripts, and reference material. The key principle is that SKILL.md should be a focused entrypoint (under 500 lines recommended), and it should reference supporting files so Claude knows they exist and when to load them.

Recommended Structure

Directory Layout
.claude/skills/publish-pipeline/
  SKILL.md                    # Entrypoint: ~200 lines of core instructions
  templates/
    pandoc-html.template      # HTML template for document conversion
    pandoc-pdf.template       # PDF template with XeLaTeX styling
  scripts/
    ftp-sync.sh               # FTP deployment script
    validate-output.sh        # Post-build validation
  examples/
    basic-publish.md          # Minimal example for simple documents
    multi-format.md           # Example with HTML + PDF output

Referencing Supporting Files

Claude does not automatically read every file in a skill directory. The SKILL.md must tell Claude about supporting files and when to use them. Two patterns are common: explicit path references, and bundled scripts that the skill instructs Claude to run.

SKILL.md — Explicit References
---
name: publish-pipeline
description: Build and deploy documents as HTML/PDF with FTP sync.
---

# Publish Pipeline

## Templates
When building HTML output, read and use the template at:
`${CLAUDE_SKILL_DIR}/templates/pandoc-html.template`

When building PDF output, read and use the template at:
`${CLAUDE_SKILL_DIR}/templates/pandoc-pdf.template`

## Deployment
After building, run the FTP sync script:
`bash ${CLAUDE_SKILL_DIR}/scripts/ftp-sync.sh`

## Examples
For reference on usage patterns, see:
- `${CLAUDE_SKILL_DIR}/examples/basic-publish.md`
- `${CLAUDE_SKILL_DIR}/examples/multi-format.md`

The Python Script Pattern

A common pattern for skills that generate visual output (charts, diagrams, formatted reports) is to include a Python script that the skill instructs Claude to run:

SKILL.md — Python Generator Pattern
---
name: coverage-report
description: Generate visual test coverage report with charts.
allowed-tools:
  - Bash
  - Read
  - Write
---

# Coverage Report Generator

1. Run the test suite with coverage: `npm test -- --coverage --json`
2. Parse the JSON coverage output
3. Run the visualization script:
   `python3 ${CLAUDE_SKILL_DIR}/scripts/generate-charts.py coverage.json`
4. The script outputs an HTML report to `coverage-report.html`
5. Open and summarize the key findings for the user

500-Line Guideline

Keep SKILL.md under 500 lines. When a skill's instructions grow beyond this, extract reference material into supporting files. Long skills consume more context when loaded, and large Markdown files are harder to maintain. The SKILL.md should contain the essential decision logic and step sequencing; detailed reference data belongs in supporting files that are loaded on demand via Read.
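The guideline is easy to check mechanically. A minimal sketch, assuming skills live under a single root directory; `oversized_skills` is a hypothetical helper and the 500-line threshold is the recommendation above, not an enforced limit.

```python
from pathlib import Path

MAX_SKILL_LINES = 500  # guideline from this section, not an enforced limit


def oversized_skills(skills_root: Path,
                     limit: int = MAX_SKILL_LINES) -> list[tuple[Path, int]]:
    """List SKILL.md files under skills_root that exceed the line guideline."""
    results = []
    for skill_file in sorted(skills_root.rglob("SKILL.md")):
        line_count = len(skill_file.read_text(encoding="utf-8").splitlines())
        if line_count > limit:
            results.append((skill_file, line_count))
    return results
```

Running this against .claude/skills/ gives a short list of candidates for extracting reference material into supporting files.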

2.8 Running Skills in Subagents

By default, a skill's content is injected into the main conversation context. For expensive or long-running skills, this is wasteful — the skill instructions and intermediate work consume context that the main conversation needs. The context: fork setting solves this by running the skill in an isolated subagent.

How Forked Skills Work

  1. User invokes /skill-name or Claude auto-activates the skill
  2. Instead of injecting the skill content into the main context, the runtime spawns a subagent — a new Claude instance with its own, clean context window
  3. The skill's Markdown body becomes the subagent's task prompt
  4. The subagent executes independently, using tools, reading files, and producing output
  5. When the subagent completes, its output is summarized back to the parent context
  6. The parent sees the result without having consumed context on the intermediate steps
SKILL.md — Forked Execution
---
name: deep-review
description: >
  Comprehensive code review with security, performance, and architecture
  analysis. Runs in isolated context to avoid consuming main session budget.
context: fork
agent: code-reviewer
allowed-tools:
  - Read
  - Grep
  - Glob
---

# Deep Code Review

Perform a comprehensive review of the codebase changes:

1. Run `git diff HEAD~1` to see recent changes
2. For each changed file:
   - Check for security vulnerabilities (injection, auth bypass, secrets)
   - Check for performance issues (N+1 queries, unbounded loops, large allocations)
   - Check for architectural violations (layer crossing, circular dependencies)
3. Compile findings into a structured report with severity levels
4. Suggest specific fixes for each finding

The Agent Field

The agent field selects which subagent type runs the forked skill. This can be a built-in agent type or a custom agent defined in .claude/agents/:

| Agent Type | Purpose | When to Use |
|---|---|---|
| Explore | Read-only exploration and analysis | Skills that only need to read and report, not modify |
| Plan | Planning and analysis with limited tool access | Skills that produce plans, estimates, or recommendations |
| General-purpose (default) | Full agent with standard tool access | Skills that need to read, write, and execute |
| Custom (.claude/agents/*.md) | Specialized agent with custom system prompt | Skills requiring domain-specific personality or constraints |

The Inverse Relationship

There is a dual relationship between skills and subagents. A skill can specify context: fork to run in a subagent, and conversely, a subagent can specify skills in its definition to load specific skills into its context. This creates two patterns:

Skill → Subagent

A skill uses context: fork to run in isolation. The skill content becomes the subagent's instructions. The user sees the skill; the subagent is an implementation detail.

Defined in: SKILL.md frontmatter

Subagent → Skills

A subagent (launched via the Task tool or defined in .claude/agents/) specifies which skills to load. The agent inherits domain knowledge from those skills. The skill content enriches the subagent's capabilities.

Defined in: Agent definition or Task tool call

Context Isolation Tradeoffs

Forked skills save context in the main conversation but introduce a communication boundary. The subagent cannot see the parent's conversation history (it starts with a clean context), and the parent only sees a summary of the subagent's work. This means forked skills work best for self-contained tasks that do not require deep awareness of the ongoing conversation. For skills that need to reference prior discussion or build on earlier context, run them inline (without context: fork).

Effort and Model Overrides in Forked Skills

When a skill runs in a forked context, the model and effort frontmatter fields become particularly powerful. A computationally expensive analysis skill can specify model: claude-opus-4-6-20250401 with effort: max for deep reasoning, while the parent conversation continues on a faster model. Conversely, a simple formatting or validation skill can use model: claude-sonnet-4-20250514 with effort: low to complete quickly and cheaply.

This creates an economic model for skill execution. Each forked skill can independently optimize its cost-performance tradeoff without affecting the rest of the session. In our production system, we use this pattern for the simplify skill: three parallel subagents each analyze different quality dimensions, and since they are forked, their combined intermediate reasoning does not consume the parent's context window. The parent receives only the consolidated findings.

Hooks Scoped to Skills

The hooks frontmatter field enables skill-specific automation that only fires while the skill is active. This is a more surgical alternative to global hooks defined in settings.json. A deployment skill can define a PreToolUse hook that validates every Bash command against a safety checklist before execution, without imposing that overhead on normal development work:

SKILL.md — Skill-Scoped Hooks
---
name: deploy-production
description: Production deployment with safety hooks.
disable-model-invocation: true
hooks:
  PreToolUse:
    - matcher: "Bash"
      hooks:
        - type: command
          command: "python3 ${CLAUDE_SKILL_DIR}/validate-command.py"
---

# Production Deployment
...

Skill-scoped hooks are particularly useful for security-sensitive operations. A database migration skill might define a hook that prevents DROP TABLE commands. A file management skill might hook into Write operations to prevent modifications outside a specific directory. These constraints apply only while the skill is active and disappear when it completes, providing targeted safety rails without global overhead.
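
As a minimal sketch, the validate-command.py script referenced in the frontmatter above might look like the following. It assumes the hook receives the pending tool call as JSON on stdin (with tool_name and tool_input fields) and that exit status 2 blocks the call and returns stderr to the agent. The deny patterns and the names check_command and main are hypothetical, not taken from the production skill.

```python
import json
import re
import sys

# Hypothetical deny-list; a real safety checklist would be project-specific.
DENY_PATTERNS = [
    re.compile(r"\bdrop\s+table\b", re.IGNORECASE),
    re.compile(r"\brm\s+-rf\s+/(\s|$)"),
    re.compile(r"\bgit\s+push\b.*--force\b"),
]


def check_command(payload: dict) -> tuple[int, str]:
    """Return (exit_code, message) for a PreToolUse payload."""
    if payload.get("tool_name") != "Bash":
        return 0, ""  # only Bash commands are inspected
    command = payload.get("tool_input", {}).get("command", "")
    for pattern in DENY_PATTERNS:
        if pattern.search(command):
            # Exit code 2 is assumed to mean "block this tool call".
            return 2, f"Blocked by deploy-production hook: {pattern.pattern}"
    return 0, ""


def main(stdin_text: str) -> int:
    """Parse the hook payload, print any block reason to stderr, return the code."""
    code, message = check_command(json.loads(stdin_text))
    if message:
        print(message, file=sys.stderr)
    return code
```

When used as an actual hook script, the file would end with `sys.exit(main(sys.stdin.read()))` so that the exit status reaches the runtime.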

2.9 Bundled Skills

Claude Code ships with five built-in skills that demonstrate best practices and provide immediately useful capabilities. These skills are available in every session without any configuration.

/batch

Parallel codebase changes with worktree isolation

Breaks a large codebase modification into 5-30 independent units and executes them in parallel using git worktrees for isolation. Each unit runs in its own subagent with its own copy of the codebase, preventing conflicts. Results are merged and optionally submitted as pull requests. Ideal for refactoring tasks that touch many files with similar patterns — renaming a function across 50 files, updating import paths, applying a consistent code style change.

Key features: Worktree isolation prevents conflicts, automatic PR creation, progress tracking across parallel units, merge conflict detection and resolution.

/claude-api

API reference loading for Anthropic SDKs

Loads the official API reference documentation for Anthropic's SDKs in Python, TypeScript, Java, Go, Ruby, C#, and PHP. When you are building an application that uses the Claude API or Anthropic SDK, this skill ensures the agent has current, accurate reference material for function signatures, parameter types, error handling patterns, and authentication setup.

Key features: Multi-language coverage, current documentation, proper type annotations, authentication patterns, streaming and batch API examples.

/debug

Session debug logging and troubleshooting

Enables debug logging for the current Claude Code session, showing internal state, tool calls, skill activations, and context management decisions. Use when Claude is behaving unexpectedly — skills not triggering, tools failing silently, or context budget being exceeded. The debug output helps diagnose issues with skill discovery, hook execution, and permission resolution.

Key features: Skill activation traces, tool call logs, context budget reporting, hook execution logs, permission resolution chains.

/loop

Recurring prompt execution on interval

Runs a prompt or slash command on a recurring interval. Example: /loop 5m /health-check runs the health-check skill every 5 minutes. Defaults to 10-minute intervals if no duration is specified. Useful for monitoring tasks, periodic status checks, or continuous integration workflows during development. The loop continues until explicitly stopped or the session ends.

Key features: Configurable intervals, slash command chaining, automatic stop on session end, persistent monitoring across context resets.

/simplify

Three parallel review agents for code quality

Launches three parallel subagents that independently review changed code for reuse opportunities, quality issues, and efficiency improvements. Each agent focuses on a different dimension: one looks for code that could be extracted into shared utilities, one checks for correctness and edge cases, and one analyzes performance characteristics. Findings are aggregated and any issues found are automatically fixed.

Key features: Parallel analysis (3 agents), automatic fix application, reuse detection, performance analysis, quality scoring.

Bundled Skills as Design Templates

The five bundled skills collectively demonstrate most of the key skill design patterns discussed in this section. /batch demonstrates context: fork with parallel execution and worktree isolation. /claude-api demonstrates reference content that loads on demand without any task execution. /debug demonstrates diagnostic tooling that introspects the agent runtime itself. /loop demonstrates temporal execution patterns with interval-based repetition. /simplify demonstrates the multi-agent review pattern where several forked agents analyze the same codebase independently and aggregate findings.

When building custom skills, these bundled implementations serve as proven templates. The /batch pattern, in particular, is frequently adapted for domain-specific parallel workflows — batch documentation updates, batch test generation, batch migration scripts. The key insight from /batch is that worktree isolation (each subagent gets its own git worktree) eliminates the coordination problem that plagues most parallel file modification approaches. Instead of trying to merge concurrent edits to the same working directory, each agent works in isolation and results are merged at the git level where conflict detection is well-understood.
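
The worktree-per-unit idea can be sketched as a small planner that emits one `git worktree add` command per work unit. This is an illustration of the isolation pattern, not how /batch is implemented; the branch naming scheme, the directory layout, and the function name plan_worktrees are assumptions.

```python
from pathlib import Path


def plan_worktrees(units: list[str], root: Path, base: str = "main") -> list[list[str]]:
    """Build one `git worktree add` command per work unit."""
    commands = []
    for index, unit in enumerate(units, start=1):
        branch = f"batch/{index:02d}-{unit}"   # hypothetical naming scheme
        path = root / f"{index:02d}-{unit}"
        # -b creates a new branch for this unit, starting from `base`.
        commands.append(["git", "worktree", "add", "-b", branch, str(path), base])
    return commands


for cmd in plan_worktrees(["rename-fn", "fix-imports"], Path("../worktrees")):
    print(" ".join(cmd))
```

Each command could then be run with `subprocess.run(cmd, check=True)` from inside the repository, after which every subagent works in its own directory and branch.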

The /simplify pattern is equally instructive: by splitting a single review task into three independent analyses (reuse, quality, efficiency), the skill achieves more thorough coverage than a single-pass review. Each subagent can focus deeply on its dimension without context competition. This multi-perspective pattern generalizes to any task where different evaluation criteria benefit from independent attention.

2.10 Restricting Skill Access

Claude Code provides several mechanisms to control which skills can run, who can invoke them, and how much context budget they consume. These controls are essential for enterprise environments, shared teams, and security-sensitive projects.

Permission-Level Controls

settings.json — Permission Controls
{
  // Deny the Skill tool entirely - disables ALL skills
  "permissions": {
    "deny": ["Skill"]
  }
}

{
  // Allow only specific skills
  "permissions": {
    "deny": ["Skill"],
    "allow": [
      "Skill(commit)",
      "Skill(deploy-staging)"
    ]
  }
}

{
  // Allow a skill invoked with any arguments (prefix match on "gsd ")
  "permissions": {
    "allow": ["Skill(gsd *)"]
  }
}

{
  // Deny specific skills while allowing others
  "permissions": {
    "deny": [
      "Skill(deploy-production)",
      "Skill(db-migrate)"
    ]
  }
}

Frontmatter-Level Controls

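Beyond permissions, the frontmatter provides two skill-level controls: disable-model-invocation: true prevents Claude from activating the skill autonomously, so it runs only when a user invokes it, and user-invocable: false hides the skill from the / menu while leaving it available for automatic activation.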

Description Budget Control

The environment variable SLASH_COMMAND_TOOL_CHAR_BUDGET controls the total character budget for skill descriptions in the system prompt. When the combined descriptions exceed this budget, skills are truncated or dropped to fit. This is a blunt instrument but useful when you have many skills and need to control context overhead:

Bash
# Reduce the description budget to 5000 characters
export SLASH_COMMAND_TOOL_CHAR_BUDGET=5000

# Start Claude Code with the reduced budget
claude
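
Before lowering the budget, it can help to estimate how many characters your descriptions use. The sketch below sums description lengths and lists the skills that fall past a given budget. It assumes skills are counted in listing order; the runtime's actual truncation order is not specified here, and the function name is hypothetical.

```python
def description_budget_report(descriptions: dict[str, str], budget: int) -> dict:
    """Sum description lengths and list skills that overflow the budget."""
    used = 0
    overflow = []
    for name, text in descriptions.items():
        used += len(text)
        if used > budget:
            overflow.append(name)  # this skill starts or continues the overflow
    return {"total_chars": used, "budget": budget, "overflow": overflow}


report = description_budget_report(
    {"commit": "a" * 100, "deploy": "b" * 200, "review": "c" * 50},
    budget=250,
)
print(report)
```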

Layered Security Model

Effective skill access control combines multiple layers: enterprise-managed skills set organizational baselines, permissions in settings.json control tool-level access, disable-model-invocation prevents autonomous activation of dangerous skills, and paths globs scope activation to relevant file types. In our production system, deployment and database skills all use disable-model-invocation: true, while reference skills like coding conventions use user-invocable: false to stay invisible but active.

2.11 Production Skill Patterns — From Our 34+ Skills

Over 10 months of production use, we have built and refined 34+ skills that power an autonomous development workflow. These skills organize into five functional categories: orchestration, communication, pipeline, workflow, and state management. Each category addresses a distinct challenge in multi-agent systems.

The following patterns are drawn from real production skills running daily in our codebase (gsd-skill-creator). Code examples are simplified for clarity but reflect actual implementations.

Orchestration Skills

Orchestration skills coordinate multiple agents, dispatch work, and track progress. They are the "management layer" that decides who does what and monitors completion.

Orchestration

fleet-mission

Parallel agent fleet dispatch with progress tracking and result aggregation. Launches N agents simultaneously, monitors their completion, and merges results. Each agent receives an isolated work assignment and returns structured output.

SKILL.md excerpt
---
name: fleet-mission
description: >
  Parallel agent fleet dispatch with progress tracking and result
  aggregation. Launch N agents, monitor completion, merge results.
context: fork
allowed-tools:
  - Task
  - Read
  - Write
  - Bash
---

# Fleet Mission Protocol

1. Parse mission manifest from $ARGUMENTS
2. For each work unit, launch a subagent via Task tool
3. Monitor agent completion via status files
4. Aggregate results into unified report
5. Handle failures: retry once, then report partial results
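
The protocol above (launch, monitor, aggregate, retry once, report partial results) can be sketched in plain Python. Here ordinary callables stand in for subagents launched via the Task tool, and the name run_fleet and the report shape are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_fleet(units: list[str], worker: Callable[[str], str], max_workers: int = 4) -> dict:
    """Run every unit in parallel, retry failures once, report partial results."""

    def attempt(unit: str):
        error = None
        for _ in range(2):  # first try plus one retry
            try:
                return unit, worker(unit), None
            except Exception as exc:  # one failed agent must not sink the fleet
                error = str(exc)
        return unit, None, error

    report = {"results": {}, "failed": {}}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for unit, result, error in pool.map(attempt, units):
            if error is None:
                report["results"][unit] = result
            else:
                report["failed"][unit] = error
    return report
```

Returning both results and failures keeps step 5 explicit: the caller always receives whatever completed, even when some units fail twice.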

Orchestration

mayor-coordinator

The Gastown convoy model for multi-agent coordination. Creates convoys (groups of related agents), dispatches work via the sling-dispatch skill, monitors progress through nudge-sync, and coordinates cross-convoy dependencies. Named after the "mayor" pattern where a central coordinator manages a neighborhood of workers.

Orchestration

sling-dispatch

Seven-stage work routing pipeline: fetch (pull work item) → allocate (assign agent) → prepare (load context) → hook (pre-execution checks) → execute (run work) → verify (validate output) → complete (retire work item). Each stage has explicit success and failure transitions.
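
The seven stages and their success and failure transitions can be modeled as a simple linear state machine. In this sketch each stage handler is a hypothetical callable that returns True on success; the history field on the work item is also an assumption.

```python
STAGES = ["fetch", "allocate", "prepare", "hook", "execute", "verify", "complete"]


def run_pipeline(item: dict, handlers: dict) -> dict:
    """Advance a work item through STAGES, stopping at the first failing stage."""
    for stage in STAGES:
        if not handlers[stage](item):
            # Failure transition: record where the item stopped.
            return {"status": "failed", "stage": stage}
        # Success transition: remember the stage and move on.
        item.setdefault("history", []).append(stage)
    return {"status": "done", "stage": "complete"}
```

Because later stages never run after a failure, a failed verify leaves the work item unretired and available for another attempt.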

Orchestration

gupp-propulsion

Interrupt controller that converts agent execution from polled to proactive. Instead of agents polling for work, GUPP (Get Up and Push Protocol) detects pending work items and pushes them to idle agents. Configurable thresholds determine when to push versus wait. A dead man's switch detects stalled agents and redistributes their work.
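
A dead man's switch of this kind can be sketched as a heartbeat check: any agent holding a task whose last heartbeat is older than a timeout is treated as stalled, and its task moves to a live idle agent. The agent record shape, the timeout semantics, and the function name are assumptions.

```python
def redistribute_stalled(agents: dict, now: float, timeout: float) -> list[tuple[str, str]]:
    """Move tasks from agents with stale heartbeats to live idle agents.

    agents maps name -> {"heartbeat": float, "task": str | None}.
    Returns the (stalled_agent, new_owner) pairs that were reassigned.
    """
    stalled = [n for n, a in agents.items()
               if a["task"] is not None and now - a["heartbeat"] > timeout]
    idle = [n for n, a in agents.items()
            if a["task"] is None and now - a["heartbeat"] <= timeout]
    moves = []
    for source, target in zip(stalled, idle):
        agents[target]["task"] = agents[source]["task"]
        agents[source]["task"] = None
        moves.append((source, target))
    return moves
```

If there are more stalled agents than idle ones, the remaining tasks stay where they are until another agent frees up.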

Communication Skills

Communication skills solve the inter-agent messaging problem. In a multi-agent system, agents need to signal each other, pass data, and coordinate timing. These skills implement different communication patterns for different durability and timing requirements.

Communication

mail-async

Durable asynchronous messaging channel. Implements write-once, read-many filesystem mail using JSON message files. Messages persist across sessions and agent restarts. Each message has a sender, recipient, subject, body, and read/unread status. Agents check their mailbox at startup and process unread messages.

Message Format
{
  "id": "msg-2026-04-07-001",
  "from": "executor-3",
  "to": "mayor",
  "subject": "Phase 12 complete",
  "body": "All 4 plans executed. 0 failures. Ready for verify.",
  "timestamp": "2026-04-07T14:32:00Z",
  "read": false
}
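
A filesystem mailbox using this message format might be sketched as follows. It assumes one JSON file per message, named after the message id, with the body written once and only the read flag updated afterwards. The function names send and read_unread are hypothetical.

```python
import json
from pathlib import Path


def send(mailbox: Path, message: dict) -> Path:
    """Write one message per file; 'x' mode refuses to overwrite an existing id."""
    path = mailbox / f"{message['id']}.json"
    with path.open("x", encoding="utf-8") as handle:
        json.dump({**message, "read": False}, handle)
    return path


def read_unread(mailbox: Path, recipient: str) -> list[dict]:
    """Return unread messages for recipient in id order and mark them read."""
    unread = []
    for path in sorted(mailbox.glob("*.json")):
        message = json.loads(path.read_text(encoding="utf-8"))
        if message["to"] == recipient and not message["read"]:
            unread.append(message)
            path.write_text(json.dumps({**message, "read": True}), encoding="utf-8")
    return unread
```

An agent would call read_unread at startup with its own name, which matches the "check the mailbox, process unread messages" behavior described above.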

Communication

nudge-sync

Synchronous immediate signaling channel. Implements latest-wins single-file nudge pattern for real-time agent coordination. Unlike mail-async (durable, queued), nudge-sync overwrites previous state with current state. Used for progress reporting, heartbeats, and coordination signals where only the latest value matters.

Communication

hook-persistence

Pull-based work assignment channel implementing GUPP. Manages single-active-work-item hooks — an agent checks the hook file, claims the work item if available, and releases it when done. Prevents multiple agents from claiming the same work. The hook file acts as a lightweight mutex.
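
The "lightweight mutex" behavior can be sketched with an exclusive file create, which the operating system performs atomically. This sketch assumes a separate claim file per work item that stores the holder's name; the real hook file layout is not shown in this section, and the names claim and release are hypothetical.

```python
import os
from pathlib import Path


def claim(hook_file: Path, agent: str) -> bool:
    """Atomically create the hook file; fail if another agent already holds it."""
    try:
        # O_EXCL makes creation fail when the file exists, so only one claimant wins.
        fd = os.open(hook_file, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(agent)
    return True


def release(hook_file: Path, agent: str) -> bool:
    """Remove the hook file, but only if this agent is the recorded holder."""
    if hook_file.exists() and hook_file.read_text() == agent:
        hook_file.unlink()
        return True
    return False
```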

Pipeline Skills

Pipeline skills implement multi-stage processing flows with defined inputs, transformations, and outputs. Each stage has explicit success and failure handling.

Pipeline

done-retirement

Seven-stage completion pipeline: validate (check work meets acceptance criteria) → commit (create git commit) → push (push to remote) → submit (create PR or update tracker) → notify (signal dependent agents) → cleanup (archive working files) → terminate (release agent resources). Each stage is idempotent and can be retried independently.

Pipeline

refinery-merge

Deterministic merge queue pattern (DMA — deterministic merge automation). Processes merge requests sequentially: checks out branch, rebases onto main, runs tests, merges if green, rejects if red. Prevents merge conflicts from accumulating by processing one branch at a time in FIFO order.
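
The sequential, first-in-first-out behavior can be sketched with a queue and two injected callables. Here rebase and run_tests are hypothetical stand-ins for the real git and test commands, each returning True on success.

```python
from collections import deque
from typing import Callable


def process_merge_queue(
    branches: list[str],
    rebase: Callable[[str], bool],
    run_tests: Callable[[str], bool],
) -> dict:
    """Process branches one at a time in FIFO order: merge if green, reject if red."""
    queue = deque(branches)
    merged, rejected = [], []
    while queue:
        branch = queue.popleft()  # oldest request first
        # Tests run only if the rebase onto main succeeded.
        if rebase(branch) and run_tests(branch):
            merged.append(branch)
        else:
            rejected.append(branch)
    return {"merged": merged, "rejected": rejected}
```

Because only one branch is in flight at a time, each rebase sees every merge that came before it, which is what keeps the outcome deterministic.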

Pipeline

publish-pipeline

Markdown to HTML/PDF build with FTP sync to production. Converts documents using Pandoc with custom templates, applies brand styling, validates output, and deploys via FTP. Proven at scale: 190+ research projects, 2,400+ files published through this pipeline.

Workflow Skills

Workflow skills manage the lifecycle of development work — from planning through execution to verification. They encode institutional knowledge about how work moves through the system.

Workflow

gsd-workflow

The central lifecycle router. Routes incoming work requests through the GSD lifecycle: discuss → plan → execute → verify. Determines which phase a work item is in, what needs to happen next, and which agents to involve. This is the "traffic controller" skill that all other workflow skills feed into.

Workflow

session-awareness

Project state recovery on session start. Reads project state files (STATE.md, ROADMAP.md), checks git status, identifies the current phase and milestone, and presents a briefing to the user. Ensures every new session starts with full context of where things were left off, even after a complete context reset.

Workflow

context-handoff

Session continuity documents. When a session is ending or pausing, this skill generates a structured handoff document that captures: current state, in-progress work, decisions made, open questions, and next steps. The next session reads this document to resume exactly where the previous session stopped.

Workflow

beautiful-commits

Conventional commit message generation. Analyzes staged changes, drafts a commit message following the Angular convention (type(scope): subject), and creates the commit. Enforces imperative mood, 72-character subject lines, meaningful body content, and proper semantic structure. Used on every commit in our system.
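
The format rules named above (type(scope): subject, 72-character subject line) lend themselves to a mechanical check. The sketch below is one way to validate a subject line; the list of allowed types is an assumption based on common Angular-style conventions, not the skill's actual configuration.

```python
import re

# Assumed set of commit types; adjust to the project's convention.
TYPES = ("feat", "fix", "docs", "style", "refactor", "perf", "test", "build", "ci", "chore", "revert")
SUBJECT_RE = re.compile(rf"^({'|'.join(TYPES)})(\([a-z0-9-]+\))?: (.+)$")


def check_subject(subject: str) -> list[str]:
    """Return a list of problems with a commit subject line (empty if valid)."""
    problems = []
    if not SUBJECT_RE.match(subject):
        problems.append("expected type(scope): subject")
    if len(subject) > 72:
        problems.append("subject exceeds 72 characters")
    return problems


print(check_subject("feat(skills): add budget report"))
print(check_subject("Added stuff"))
```

Imperative mood and body quality are harder to check with a pattern, which is presumably why the skill drafts the message rather than only linting it.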

State Management Skills

State management skills handle persistence, recovery, and runtime abstraction. They ensure the system survives crashes, restarts, and context resets.

State

beads-state

Git-friendly, crash-recoverable state persistence. Manages agent identities, work items, and progress through JSON state files that are designed to merge cleanly in git. Uses copy-on-write semantics: the current state is always a complete snapshot, not a delta from a previous state. If the system crashes mid-write, the previous valid state is still intact.

State File Structure
{
  "version": 3,
  "timestamp": "2026-04-07T14:30:00Z",
  "agents": {
    "executor-1": { "status": "active", "task": "phase-12-plan-3" },
    "executor-2": { "status": "idle", "task": null }
  },
  "work_queue": ["phase-12-plan-4", "phase-12-plan-5"],
  "completed": ["phase-12-plan-1", "phase-12-plan-2"]
}
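
One common way to get the "previous valid state survives a crash" property is to write each snapshot to a temporary file and then rename it over the old one. The sketch below assumes that approach; the function names are hypothetical, and sorted keys with indentation are used only to keep git diffs stable.

```python
import json
import os
import tempfile
from pathlib import Path


def save_state(path: Path, state: dict) -> None:
    """Write a complete snapshot to a temp file, then swap it into place."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(state, handle, indent=2, sort_keys=True)
        handle.flush()
        os.fsync(handle.fileno())  # make sure the bytes reach the disk first
    # The rename replaces the old snapshot in one step (atomic on POSIX),
    # so a crash before this line leaves the previous state untouched.
    os.replace(tmp_name, path)


def load_state(path: Path) -> dict:
    """Read the current snapshot."""
    return json.loads(path.read_text(encoding="utf-8"))
```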

State

runtime-hal

Runtime HAL (Hardware Abstraction Layer) for multi-runtime agent orchestration. Detects which AI assistant is running (Claude Code, Codex, Gemini, Cursor) and adapts skill behavior accordingly. Each runtime has different tool names, permission models, and context window sizes. The HAL normalizes these differences so skills can be portable across runtimes.

How Skills Compose: A Worked Example

To understand how these patterns work together in practice, consider what happens when a developer types /gsd-execute-phase 12 in our production system. This single command triggers a cascade of skill compositions:

  1. gsd-workflow (workflow) receives the request and determines that phase 12 needs execution. It reads the ROADMAP.md and PLAN files to understand the phase structure and dependencies.
  2. sling-dispatch (orchestration) takes the phase's plan files and enters its 7-stage pipeline: it fetches plan files, allocates agents to each plan, prepares context for each agent, runs pre-execution hooks, then dispatches execution.
  3. fleet-mission (orchestration) launches parallel subagents — one per plan in the phase. Plans in Wave 1 run simultaneously; Wave 2 waits for Wave 1 to complete.
  4. beads-state (state) tracks each agent's status, writing crash-recoverable state files after every significant operation. If the system fails during execution, the state file shows exactly which plans completed and which did not.
  5. nudge-sync (communication) provides real-time progress updates. Each executor agent writes a nudge file with its current status, which the fleet-mission coordinator polls to track overall progress.
  6. Each executor loads beautiful-commits (workflow) to create properly formatted commits for its completed work.
  7. When all executors complete, done-retirement (pipeline) validates the results, creates a summary commit, pushes to the remote, and archives the working files.
  8. Finally, session-awareness (workflow) updates the project state files so the next session knows phase 12 is complete and phase 13 is ready.

This entire sequence is driven by skills. No hardcoded logic exists in the agent's base behavior — the skills provide all the procedural knowledge. This means the workflow can be modified by editing SKILL.md files, without changing any code. The composition is emergent: each skill focuses on one responsibility, and the orchestration skills wire them together through convention (well-known file paths, state file formats, nudge protocols).

Production Metrics

This skill composition pattern has powered the execution of 190+ research projects, 58 software releases, and thousands of individual plan executions. The median phase execution (3-5 plans, single wave) completes in under 8 minutes. Multi-wave phases with parallel executors complete complex 10-plan phases in under 20 minutes. The crash recovery rate — successfully resuming after a context reset or system failure — exceeds 95%, enabled by the beads-state persistence pattern.

2.12 Skill Design Guidelines

Effective skills are the result of deliberate design decisions about description quality, content structure, invocation patterns, and testing. The following guidelines are distilled from building and maintaining 34+ production skills over 10 months.

Description Writing

The description is the most important field in a skill's frontmatter. It serves three purposes simultaneously: it tells Claude when to auto-activate the skill, it tells users what the skill does in the / menu, and it contributes to the persistent context budget. Every word must earn its place.

Good Descriptions

  • "Deploy to staging environment. Use when user asks to deploy, push to staging, or test in staging."
  • "Generate test cases for functions and components. Use when writing tests or user mentions 'test'."
  • "Conventional commit messages following Angular format. Use when committing changes."

Each starts with what it does, then states when to use it.

Bad Descriptions

  • "Helps with development tasks": too vague, triggers on everything
  • "A comprehensive tool for managing the complete deployment lifecycle including staging, production, and preview environments with support for rollbacks, canary releases, and blue-green deployments": too long; it spends nearly 200 characters listing features, contains no trigger phrase, and leaves little room before the 250-character truncation limit
  • "Deployment": too short, insufficient activation signal

Vague descriptions waste budget and cause false activations.

Description Formula

[What it does] + [When to use it], under 250 characters. Front-load the primary use case in the first clause. Include specific trigger words that Claude can match against user requests. If the skill handles a specific file type or framework, name it explicitly.
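
The formula can be turned into a rough lint check. The 250-character limit comes from this section; the other two heuristics (a minimum word count and the presence of a "Use when" clause) are assumptions chosen to mirror the good and bad examples above.

```python
MAX_CHARS = 250  # truncation limit stated in this section


def lint_description(text: str) -> list[str]:
    """Return a list of problems with a skill description (empty if it passes)."""
    problems = []
    if len(text) > MAX_CHARS:
        problems.append(f"too long: {len(text)} chars")
    if len(text.split()) < 4:  # hypothetical threshold
        problems.append("too short to signal when to activate")
    if "use when" not in text.lower():
        problems.append("missing a 'Use when ...' trigger clause")
    return problems


print(lint_description("Deploy to staging environment. Use when user asks to deploy, "
                       "push to staging, or test in staging."))
print(lint_description("Deployment"))
```

A check like this can run in CI alongside other SKILL.md reviews, though it cannot judge whether the trigger words are the right ones.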

Content Structure

Testing Skills

Test both activation modes, plus edge cases, for every skill:

  1. Manual invocation: Type the slash command and verify it executes correctly. Test with different arguments, no arguments, and malformed arguments.
  2. Auto-activation: Describe the problem the skill solves in natural language and verify Claude loads the skill. Test with variations of the trigger phrases. Verify the skill does not activate for unrelated requests.
  3. Edge cases: Test with large inputs, missing prerequisites (files not found, tools not available), and concurrent invocations (if the skill is used in multi-agent contexts).

Troubleshooting

Skill Not Triggering

Causes: Description too vague or missing trigger keywords. disable-model-invocation: true set unintentionally. paths globs not matching current files. Skill directory missing SKILL.md.

Fix: Add specific trigger phrases to description. Check frontmatter flags. Verify directory structure with /debug. Ensure SKILL.md filename is exact (case-sensitive).

Skill Triggering Too Often

Causes: Description too broad ("Helps with development"). No paths constraint. Description keywords match common requests.

Fix: Narrow the description to specific use cases. Add paths globs to limit activation scope. Use more specific trigger words. Consider disable-model-invocation: true if it should be manual-only.

Description Cut Short

Causes: Description exceeds 250-character truncation limit. Critical trigger words appear after the truncation point.

Fix: Rewrite to front-load the primary use case and trigger keywords in the first 250 characters. Move secondary details to the Markdown body.

Forked Skill Losing Context

Causes: The forked subagent starts with a clean context and cannot see the parent conversation. The skill relies on information from prior messages.

Fix: Make the skill self-contained — include all necessary context in the SKILL.md or pass it via arguments. Alternatively, remove context: fork and run the skill inline.

Shell Injection Failing

Causes: Command not found in the skill preprocessing environment. Shell syntax incompatible with the configured shell (bash vs powershell). Argument interpolation producing invalid commands.

Fix: Use absolute paths for commands. Set the shell frontmatter field explicitly. Quote argument substitutions: !`gh pr view "$1"`.

Too Many Skills Consuming Context

Causes: Large number of skills with descriptions all loaded into the system prompt. Each skill adds ~50 tokens of persistent overhead.

Fix: Set SLASH_COMMAND_TOOL_CHAR_BUDGET to limit description budget. Mark infrequently-used skills with disable-model-invocation: true. Consolidate related skills. Trim descriptions to essentials.

Skill Lifecycle Best Practices

1. Start Simple

Begin with a command file in .claude/commands/. When you need auto-activation, subagent isolation, or argument hints, upgrade to a full skill directory with SKILL.md. Do not over-engineer from the start.

2. Iterate on Descriptions

The description determines activation behavior. After deploying a skill, observe when it triggers (and when it does not). Adjust trigger phrases based on real usage patterns. This is empirical tuning, not one-time design.

3. Version Control Skills

Project skills belong in version control. Treat SKILL.md files with the same discipline as code: review changes, test before merging, and include in pull requests. Skill changes can alter agent behavior as significantly as code changes.

4. Monitor Context Budget

Track the total token overhead of your skill descriptions. With 100 skills at 50 tokens each, that is 5,000 tokens permanently consumed. Use /debug to see current budget usage. Prune or consolidate skills that are not earning their context cost.

5. Separate Concerns

One skill, one responsibility. A deploy skill should not also handle monitoring. A code review skill should not also format code. Composability comes from combining focused skills, not from building Swiss Army knife skills.

6. Document Exit Conditions

Every task skill should clearly state when it is "done." What output does it produce? What state does the system end in? How does the user verify success? Without clear exit conditions, the agent may loop indefinitely or stop prematurely.

Skill Architecture at Scale

Our production system demonstrates that skills scale to complex autonomous workflows. The 34+ skills organize into a layered architecture where each layer builds on the one below:

| Layer | Skills | Purpose | Example |
| --- | --- | --- | --- |
| Primitives | beads-state, nudge-sync, mail-async | Basic building blocks: persistence, signaling, messaging | State file read/write, heartbeat signals |
| Protocols | hook-persistence, gupp-propulsion | Coordination patterns built on primitives | Work claiming, push-based execution |
| Pipelines | sling-dispatch, done-retirement, refinery-merge | Multi-stage processing flows | Work routing, completion handling, merge queue |
| Orchestration | fleet-mission, mayor-coordinator | Multi-agent coordination | Parallel execution, convoy management |
| Workflow | gsd-workflow, session-awareness, beautiful-commits | Developer-facing lifecycle management | Phase routing, session recovery, commit generation |

Each layer depends only on layers below it. Primitive skills have no skill dependencies. Protocol skills use primitive skills. Pipelines compose protocols. Orchestration skills dispatch pipelines. Workflow skills tie everything together into a coherent developer experience. This layered approach enables individual skills to be tested, replaced, and evolved independently.
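
The "depends only on layers below" rule can be checked mechanically once skill dependencies are written down. In the sketch below the layer assignments follow the table above, while the dependency edges passed to the function are hypothetical examples, since this section does not list the real ones.

```python
LAYERS = ["primitives", "protocols", "pipelines", "orchestration", "workflow"]

SKILL_LAYER = {
    "beads-state": "primitives", "nudge-sync": "primitives", "mail-async": "primitives",
    "hook-persistence": "protocols", "gupp-propulsion": "protocols",
    "sling-dispatch": "pipelines", "done-retirement": "pipelines", "refinery-merge": "pipelines",
    "fleet-mission": "orchestration", "mayor-coordinator": "orchestration",
    "gsd-workflow": "workflow", "session-awareness": "workflow", "beautiful-commits": "workflow",
}


def layering_violations(dependencies: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Return (skill, dependency) pairs where the dependency is not in a lower layer."""
    rank = {name: index for index, name in enumerate(LAYERS)}
    return [
        (skill, dep)
        for skill, deps in dependencies.items()
        for dep in deps
        if rank[SKILL_LAYER[dep]] >= rank[SKILL_LAYER[skill]]
    ]


# Hypothetical edges: the first respects the layering, the second does not.
print(layering_violations({
    "sling-dispatch": ["hook-persistence"],
    "nudge-sync": ["fleet-mission"],
}))
```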

The analogy to software architecture is intentional and precise. Just as a well-designed software system separates concerns into layers (data access, business logic, presentation), a well-designed skill system separates agent capabilities into composable layers. The primitive layer handles I/O and persistence. The protocol layer handles coordination semantics. The pipeline layer handles multi-step process flows. The orchestration layer handles agent topology. The workflow layer handles human interaction. Each layer has its own testing surface and failure modes, and problems at one layer can be diagnosed and fixed without affecting the others.

This architecture was not designed upfront. It emerged over 10 months of iterative development, driven by real problems encountered in production. The earliest implementation was monolithic: a single 800-line SKILL.md that tried to handle orchestration, state management, and workflow logic. As the system grew to 20+ skills, the monolithic approach became unmaintainable: changes to state persistence broke orchestration, and changes to orchestration broke workflow routing. The refactoring into layered, single-responsibility skills was driven by the same forces that drive the evolution of any production software system: the need for independent changeability and testability.

Key Takeaway: Skills Are Infrastructure

Skills are not just convenience wrappers around prompts. At production scale, they become the infrastructure layer of an autonomous agent system. Like microservices in a distributed system, well-designed skills have clear interfaces, single responsibilities, and explicit dependencies. The investment in skill design pays dividends in system reliability, maintainability, and the ability to evolve agent behavior without rebuilding from scratch. Our 34+ skill system has proven this at scale: 190+ completed research projects, 58 software releases, 21,298 passing tests, and over 600,000 lines of published content — all orchestrated by skills.