A Comprehensive Technical Guide to the Agent Skills Specification
Research current as of: January 2026
Agent Skills represent a paradigm shift in how we extend AI agent capabilities [1: Equipping agents for the real world with Agent Skills]. Launched by Anthropic in October 2025 and released as an open standard in December 2025, Agent Skills provide a lightweight, standardized format for packaging specialized knowledge and workflows that AI agents can discover and load dynamically.
Agent Skills are organized folders of instructions, scripts, and resources that agents can discover and load dynamically to perform better at specific tasks [2: Skills vs Tools for AI Agents: Production Guide]. They are modular capabilities packaged as Markdown files with YAML frontmatter, containing metadata and instructions that tell an agent how to perform a specific task.
October 2025: Anthropic officially unveiled Agent Skills at their product launch event, introducing the concept of modular, discoverable agent capabilities.
December 2025: Agent Skills were released as an open standard at agentskills.io/specification, enabling cross-platform and cross-product reuse.
December 2025: OpenAI adopted the Agent Skills format for Codex CLI and ChatGPT, solidifying it as an industry standard.
December 2025 to January 2026: Microsoft (VS Code, GitHub), Cursor, Goose, Amp, OpenCode, and other leading AI development tools adopted the format.
Agent Skills solve several critical challenges in AI agent development:
Skills use progressive disclosure to manage context, loading only metadata (~50 tokens) at startup and full instructions (2,000-5,000 tokens) only when needed [1].
Skills separate "know-how" (procedural knowledge) from "can-do" (tools/actions), enabling agents that are powerful, reliable, and compliant [4: Tool Learning with Foundation Models].
Build once, use everywhere. Skills are version-controlled, shareable, and work across compatible agents like Cursor, Claude Code, and OpenCode.
The open standard ensures interoperability across different agent platforms, preventing vendor lock-in and enabling ecosystem growth.
Understanding the distinction between Skills, Tools, and Plugins is crucial for effective agent design [5: Toolformer: Language Models Can Teach Themselves to Use Tools]. While these terms are sometimes used interchangeably, they represent fundamentally different approaches to extending AI capabilities.
| Aspect | Skills | Tools | Plugins |
|---|---|---|---|
| Nature | Procedural knowledge and workflows | Executable functions with defined I/O | Vendor-specific extensions |
| Represents | What agents know | What agents can do | Platform-bound capabilities |
| Format | Markdown with YAML frontmatter | Code functions with schemas | Varies by platform |
| Activation | Suggested, agent decides when to load | Explicitly called by agent | Platform-specific invocation |
| Side Effects | None (instructions only) | Can modify external state | Varies |
| Portability | Cross-platform (open standard) | Varies (depends on protocol) | Platform-locked |
| Security Surface | Minimal (prompt-based) | Requires authentication & validation | Platform-dependent |
| Token Cost | Progressive (50-5000 tokens) | Fixed (schema definition) | Varies |
A skill is a suggestion - the agent autonomously decides whether it needs that context and loads it when appropriate. A tool is an action - the agent explicitly calls it to perform an operation with real-world effects.
Skills and tools are complementary, not competitive. The most effective agents use both.
For simple bots, tools might be enough. But if you're building digital employees, you need skills to encode domain expertise and tools to execute actions [2]. The separation ensures agents are not just powerful, but also reliable, compliant, and efficient.
The core innovation of Agent Skills is their three-phase progressive disclosure mechanism [1], which keeps agents fast while giving them access to extensive knowledge on demand.
At startup, the agent scans available skills directories (e.g., .claude/skills/) and parses only the YAML frontmatter from each SKILL.md file. This creates a lightweight index of available capabilities.
# Discovery phase loads only metadata
name: api-design
description: REST API design best practices and conventions
# Total: ~50 tokens loaded into context
When a user's request matches a skill's description, the agent reads the full SKILL.md file into its context. The description acts as a semantic trigger for skill activation.
# When user asks: "Design a REST API for user management"
# Agent activates api-design skill
# Loads full SKILL.md: ~2,000-5,000 tokens
Agent reasoning:
- Task: "Design a REST API"
- Matches: api-design skill description
- Action: Load full skill instructions
The agent follows the loaded instructions and accesses specific resources (scripts, templates, assets) within the skill folder only when the instructions reference them.
This architecture allows agents to maintain hundreds of skills while keeping initial context usage minimal. For example, 100 skills × 50 tokens = 5,000 tokens at startup, versus 100 skills × 3,000 tokens = 300,000 tokens if all instructions were always loaded.
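The arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `startup_tokens` is hypothetical, and the per-skill figures are the approximate values quoted in the text, not measurements.

```python
# Approximate per-skill costs quoted in the text (assumptions, not measurements).
METADATA_TOKENS = 50        # name + description loaded at discovery
INSTRUCTION_TOKENS = 3_000  # assumed average size of a full SKILL.md


def startup_tokens(num_skills: int, eager: bool) -> int:
    """Estimate context used at startup for a given loading strategy."""
    per_skill = INSTRUCTION_TOKENS if eager else METADATA_TOKENS
    return num_skills * per_skill


lazy = startup_tokens(100, eager=False)
eager = startup_tokens(100, eager=True)
print(f"progressive disclosure: {lazy} tokens, eager loading: {eager} tokens")
```

Under these assumptions the gap grows linearly with the number of installed skills, which is why the metadata-only index matters most for large skill libraries.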
A skill is a directory containing at minimum a SKILL.md file, with optional subdirectories for supporting resources:
my-skill/
├── SKILL.md # Required: Instructions + metadata
├── scripts/ # Optional: Executable code
│ ├── setup.py
│ └── process.sh
├── references/ # Optional: Documentation
│ ├── api-spec.json
│ └── examples.md
└── assets/ # Optional: Templates, configs
    ├── template.yaml
    └── config.json
Every SKILL.md file consists of two parts: YAML frontmatter (metadata) and Markdown body (instructions).
| Field | Required | Description | Constraints |
|---|---|---|---|
| name | Yes | Unique identifier for the skill | Max 64 chars, lowercase, numbers, hyphens |
| description | Yes | What the skill does and when to use it | Max 1024 chars, non-empty |
| license | No | License identifier (e.g., Apache-2.0) | SPDX identifier recommended |
| metadata | No | Additional info (author, version, etc.) | Freeform YAML |
| allowed-tools | No | Experimental: pre-approved tools | Space-delimited list |
| compatibility | No | System requirements, network needs | Freeform text |
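The constraints on the two required fields lend themselves to a quick automated check. The following is a minimal sketch that encodes only the limits listed in the table; the function name `validate_frontmatter` and the error strings are illustrative, and it assumes the frontmatter has already been parsed into a dictionary.

```python
import re

# Limits copied from the table; everything else here is illustrative.
NAME_PATTERN = re.compile(r"[a-z0-9-]{1,64}")
MAX_DESCRIPTION_CHARS = 1024


def validate_frontmatter(meta: dict) -> list[str]:
    """Return a list of problems; an empty list means the required fields pass."""
    problems = []
    name = meta.get("name")
    if not isinstance(name, str) or not NAME_PATTERN.fullmatch(name):
        problems.append("name: use 1-64 lowercase letters, digits, or hyphens")
    description = meta.get("description")
    if not isinstance(description, str) or not description.strip():
        problems.append("description: required and must be non-empty")
    elif len(description) > MAX_DESCRIPTION_CHARS:
        problems.append("description: must be at most 1024 characters")
    return problems


print(validate_frontmatter({"name": "API Design", "description": ""}))
```

A check like this could run in CI before a skill is published, so that naming mistakes surface before an agent silently fails to index the skill.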
The description is the most important field - it's what the LLM uses to decide which skill to activate [6: Symbol Tuning Improves In-Context Learning in Language Models]. Be specific and clear about what the skill does AND when to use it. Poor descriptions lead to skills never being activated or being activated incorrectly.
There are two main approaches to integrating skills into agent systems:
The agent operates within a computer environment (bash/unix) where skills are activated when models issue shell commands:
# Agent activates skill via filesystem
cat /path/to/my-skill/SKILL.md
# Access bundled resources
python /path/to/my-skill/scripts/process.py
cat /path/to/my-skill/references/api-spec.json
The agent functions without a dedicated computer environment and instead implements tools allowing models to trigger skills:
// Agent activates skill via tool call
{
"tool": "activate_skill",
"skill_name": "my-skill"
}
// Access bundled resources via tool
{
"tool": "read_skill_asset",
"skill_name": "my-skill",
"asset_path": "scripts/process.py"
}
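On the host side, the two tool calls shown above need a handler that maps them onto the skill folder. The sketch below is one possible shape, not a prescribed API: the tool names mirror the JSON examples, while `handle_tool_call`, its arguments, and the path checks are assumptions added for illustration.

```python
from pathlib import Path


def handle_tool_call(call: dict, skills_root: Path) -> str:
    """Serve activate_skill / read_skill_asset requests from a skills folder."""
    root = skills_root.resolve()
    skill_dir = (root / call["skill_name"]).resolve()
    # Only folders directly under the skills root count as skills.
    if skill_dir.parent != root:
        raise PermissionError("skill_name must name a folder under the skills root")

    if call["tool"] == "activate_skill":
        return (skill_dir / "SKILL.md").read_text(encoding="utf-8")

    if call["tool"] == "read_skill_asset":
        target = (skill_dir / call["asset_path"]).resolve()
        # Refuse paths such as "../secret.txt" that escape the skill folder.
        if skill_dir not in target.parents:
            raise PermissionError("asset_path escapes the skill folder")
        return target.read_text(encoding="utf-8")

    raise ValueError(f"unknown tool: {call['tool']}")
```

Resolving paths before comparing them is a deliberate choice here: since the model supplies `skill_name` and `asset_path`, the handler treats both as untrusted input.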
Here's a minimal SKILL.md file demonstrating the required structure:
---
name: skill-name
description: A clear, concise description of what this skill does and when to use it
---
# Skill Name
Brief overview of the skill's purpose.
## Instructions
Step-by-step instructions for the agent to follow:
1. First action
2. Second action
3. Third action
## Examples
Example scenarios showing the skill in action.
A real-world skill for REST API design best practices:
---
name: api-design
description: REST API design best practices and conventions. Use when designing or reviewing REST APIs.
license: Apache-2.0
metadata:
author: API Standards Team
version: 1.2.0
updated: 2026-01-15
---
# API Design Guidelines
Follow these conventions when designing REST APIs to ensure consistency,
scalability, and developer-friendly interfaces.
## URL Structure
- Use plural nouns for resources: `/users`, `/orders`
- Use kebab-case for multi-word resources: `/order-items`
- Nest related resources: `/users/{id}/orders`
- Keep URLs shallow (max 3 levels deep)
- Avoid verbs in URLs (use HTTP methods instead)
## HTTP Methods
- **GET**: Retrieve resources (safe, idempotent)
- **POST**: Create new resources (not idempotent)
- **PUT**: Replace entire resource (idempotent)
- **PATCH**: Partial update (not necessarily idempotent)
- **DELETE**: Remove resource (idempotent)
## Response Codes
### Success Codes
- `200 OK`: Successful GET, PUT, PATCH, or DELETE
- `201 Created`: Successful POST (include Location header)
- `204 No Content`: Successful DELETE with no response body
### Client Error Codes
- `400 Bad Request`: Invalid request syntax or parameters
- `401 Unauthorized`: Missing or invalid authentication
- `403 Forbidden`: Valid auth but insufficient permissions
- `404 Not Found`: Resource doesn't exist
- `409 Conflict`: Request conflicts with current state
- `422 Unprocessable Entity`: Validation errors
### Server Error Codes
- `500 Internal Server Error`: Unexpected server error
- `503 Service Unavailable`: Temporary unavailability
## Request/Response Format
All requests and responses should use JSON with consistent structure:
```json
{
"data": { ... },
"meta": {
"timestamp": "2026-01-30T10:30:00Z",
"version": "1.0"
},
"errors": []
}
```
## Pagination
For list endpoints, use cursor-based pagination:
```json
{
"data": [...],
"pagination": {
"next_cursor": "abc123",
"prev_cursor": "def456",
"has_more": true
}
}
```
## Versioning
- Use URL versioning: `/v1/users`
- Maintain at least 2 versions concurrently
- Announce deprecation 6 months in advance
## Error Handling
Always return structured error responses:
```json
{
"errors": [
{
"code": "VALIDATION_ERROR",
"message": "Email address is invalid",
"field": "email"
}
]
}
```
## When Designing a New API
1. Identify all resources and their relationships
2. Define URL structure following conventions above
3. Map operations to HTTP methods
4. Design request/response schemas
5. Document all endpoints with examples
6. Review with API Standards Team
A skill that generates interactive HTML tree visualizations of project structure:
---
name: codebase-visualizer
description: Generate an interactive collapsible tree visualization of your codebase. Use when exploring a new repo, understanding project structure, or identifying large files.
allowed-tools: Bash(python *)
compatibility: Requires Python 3.8+
---
# Codebase Visualizer
Generate an interactive HTML tree view that shows your project's file structure
with collapsible directories, file sizes, and syntax highlighting.
## Usage
Run the visualization script from your project root:
```bash
python ~/.claude/skills/codebase-visualizer/scripts/visualize.py .
```
This will generate `codebase-tree.html` in the current directory.
## Options
- `--exclude`: Patterns to exclude (default: node_modules, .git, __pycache__)
- `--max-depth`: Maximum directory depth (default: unlimited)
- `--output`: Output file name (default: codebase-tree.html)
## Example
```bash
python visualize.py . --exclude "*.pyc,dist,build" --max-depth 5
```
## Output
The generated HTML includes:
- Collapsible directory tree
- File size indicators
- File type icons
- Search functionality
- Dark/light theme toggle
A skill for creating consistent releases and changelogs:
---
name: git-release
description: Create consistent Git releases and changelogs. Use when preparing version releases, generating changelogs, or tagging releases.
allowed-tools: Bash(git *)
---
# Git Release Management
Creates consistent releases and changelogs by analyzing merged PRs,
proposing version bumps, and generating release notes.
## Semantic Versioning
Follow semantic versioning (MAJOR.MINOR.PATCH):
- **MAJOR**: Breaking changes (incompatible API changes)
- **MINOR**: New features (backward-compatible)
- **PATCH**: Bug fixes (backward-compatible)
## Release Process
1. **Analyze Changes**
- Review commits since last release
- Identify breaking changes, features, and fixes
- Determine appropriate version bump
2. **Generate Changelog**
- Group changes by type (Breaking, Features, Fixes)
- Extract PR titles and numbers
- Include contributor credits
3. **Create Release**
- Update version in relevant files
- Commit changelog
- Create Git tag
- Push tag to remote
4. **Publish Release Notes**
- Use changelog content
- Include upgrade instructions if breaking
- Link to detailed documentation
## Changelog Format
```markdown
# Changelog
## [2.1.0] - 2026-01-30
### Breaking Changes
- Removed deprecated `oldMethod()` API (#123)
### Features
- Added new authentication flow (#124)
- Improved performance of data processing (#125)
### Bug Fixes
- Fixed memory leak in background worker (#126)
- Corrected timezone handling (#127)
### Contributors
@username1, @username2, @username3
```
## Commands
```bash
# Create a new release
git tag -a v2.1.0 -m "Release v2.1.0"
git push origin v2.1.0
# Generate changelog
git log v2.0.0..HEAD --pretty=format:"%s (%h)" --merges
```
An enterprise skill for handling HR-related queries:
---
name: hr-questions
description: Answers HR-related questions including policies, benefits, leave requests, onboarding, and employee guidelines. Use for any HR or people operations questions.
metadata:
department: Human Resources
sensitivity: confidential
version: 3.0.0
---
# HR Questions Processing
Provides accurate, policy-compliant answers to employee HR questions
across benefits, policies, leave management, and onboarding.
## Question Categories
### Benefits
- Health insurance coverage and enrollment
- 401(k) contribution limits and matching
- PTO accrual and usage policies
- Parental leave policies
- Tuition reimbursement
### Policies
- Code of conduct
- Remote work policies
- Expense reimbursement
- Equipment policies
- Confidentiality agreements
### Leave Management
- Sick leave
- Vacation time
- Personal days
- FMLA eligibility
- Bereavement leave
### Onboarding
- New hire checklist
- First day procedures
- System access requests
- Training requirements
## Response Guidelines
1. **Always cite policy source**: Reference the specific policy document
2. **Include effective dates**: Policies may have changed
3. **Escalate sensitive issues**: Direct to HR for personal situations
4. **Maintain confidentiality**: Never share other employees' information
5. **Stay current**: Refer to references/ directory for latest policies
## Example Interaction
**Question**: "How much PTO do I accrue per year?"
**Answer**:
According to the PTO Policy (effective 2026-01-01):
- 0-2 years: 15 days per year (1.25 days/month)
- 3-5 years: 20 days per year (1.67 days/month)
- 6+ years: 25 days per year (2.08 days/month)
PTO accrues monthly and can be used as soon as it's available.
Maximum carryover is 5 days per year.
For questions about your specific accrual, contact hr@company.com.
## Escalation Scenarios
Immediately direct to HR for:
- Harassment or discrimination concerns
- Performance improvement plans
- Termination questions
- Salary negotiations
- Medical accommodations
- Legal matters
Skill discovery is the foundation of efficient skill management. Modern agent systems implement sophisticated discovery mechanisms to index and surface relevant skills without overwhelming the context window.
Agents scan designated skills directories during initialization:
# Common skill directory locations
~/.claude/skills/ # Claude Code
~/.cursor/skills/ # Cursor
~/.config/opencode/skills/ # OpenCode
./skills/ # Project-specific skills
# Discovery process
1. Scan directories for SKILL.md files
2. Parse YAML frontmatter
3. Extract name + description
4. Build in-memory index
5. Total tokens: ~50 per skill
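The discovery steps listed above can be sketched in Python. This is a simplified illustration rather than any agent's actual implementation: `parse_frontmatter` handles only flat `key: value` pairs (a real system would use a YAML parser), and both function names are hypothetical.

```python
from pathlib import Path


def parse_frontmatter(text: str) -> dict:
    """Read flat `key: value` pairs between the leading '---' lines.

    Nested keys (e.g. under `metadata:`) are skipped in this sketch.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        if line[:1].isspace() or ":" not in line:
            continue  # nested or malformed line
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return meta


def build_index(skill_dirs: list[Path]) -> dict[str, dict]:
    """Scan each directory for */SKILL.md and keep only name + description."""
    index = {}
    for root in skill_dirs:
        if not root.is_dir():
            continue
        for skill_file in sorted(root.glob("*/SKILL.md")):
            meta = parse_frontmatter(skill_file.read_text(encoding="utf-8"))
            if meta.get("name") and meta.get("description"):
                index[meta["name"]] = {
                    "description": meta["description"],
                    "path": skill_file,
                }
    return index
```

Only the name, description, and file path are kept in memory; the Markdown body is never read into the index, which is what keeps the per-skill cost small.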
For agents without filesystem access, discovery happens via dedicated tools:
{
"tool": "list_skills",
"response": [
{
"name": "api-design",
"description": "REST API design best practices..."
},
{
"name": "git-release",
"description": "Create consistent releases..."
}
]
}
The agent uses the skill descriptions for semantic matching against user requests [8: HuggingGPT: Solving AI Tasks with ChatGPT and its Friends]. This approach mirrors the task-routing pattern from multi-model orchestration research.
Craft skill descriptions to maximize semantic matching accuracy. Include key terms, synonyms, and use cases. For example: "REST API design best practices and conventions. Use when designing, reviewing, or documenting RESTful APIs, web services, or HTTP endpoints."
Loading all skill instructions at startup - wasteful and slow:
# DON'T DO THIS (load_full_instructions is a placeholder for the agent's loader)
def startup(all_skills):
    for skill in all_skills:
        load_full_instructions(skill)  # 100 skills × 3,000 tokens = 300k tokens!
Load instructions only when skills are activated:
# RECOMMENDED APPROACH (helper functions are placeholders)
def startup(all_skills):
    for skill in all_skills:
        load_metadata_only(skill)  # 100 skills × 50 tokens = 5k tokens
def on_user_request(request, all_skills):
    matching_skills = semantic_match(request, all_skills)
    for skill in matching_skills:
        load_full_instructions(skill)  # Only 1-3 skills typically
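A self-contained version of the lazy pattern might look like the sketch below. Two substitutions are worth flagging: semantic matching is replaced by plain keyword overlap so the example runs without a model, and skill bodies come from an in-memory dictionary instead of SKILL.md files. The class and helper names are hypothetical.

```python
from dataclasses import dataclass, field

# Words too common to be useful as matching signals (illustrative list).
STOPWORDS = {"a", "an", "and", "the", "for", "of", "to", "or", "when", "use"}


def keywords(text: str) -> set[str]:
    """Lowercase, strip punctuation, and drop stopwords."""
    words = {w.strip(".,:;!?\"'").lower() for w in text.split()}
    return words - STOPWORDS - {""}


@dataclass
class LazySkillStore:
    descriptions: dict[str, str]  # always resident: name -> description
    bodies: dict[str, str]        # stands in for SKILL.md files on disk
    loaded: dict[str, str] = field(default_factory=dict)

    def on_user_request(self, request: str) -> list[str]:
        """Load full instructions only for skills whose description matches."""
        wanted = keywords(request)
        activated = []
        for name, description in self.descriptions.items():
            if wanted & keywords(description):
                self.loaded.setdefault(name, self.bodies[name])
                activated.append(name)
        return activated


store = LazySkillStore(
    descriptions={
        "api-design": "REST API design best practices and conventions",
        "git-release": "Create consistent Git releases and changelogs",
    },
    bodies={"api-design": "# API Design Guidelines", "git-release": "# Git Release"},
)
print(store.on_user_request("Design a REST API for user management"))
```

Keyword overlap is a crude stand-in; the point of the sketch is the control flow, in which nothing beyond descriptions is held until a request actually matches.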
Some systems preload likely-needed skills based on conversation context:
# ADVANCED: Predictive preloading (preload is a placeholder)
def on_conversation_start(project_type):
    if project_type == "web_app":
        preload(["api-design", "security-best-practices", "database-schema"])
When multiple skills match a request, agents use ranking to determine activation order:
Primary ranking factor. Skills with descriptions most similar to the user request score highest.
Recently updated skills may be prioritized to ensure agents use current best practices.
Frequently used skills can be ranked higher based on historical activation patterns.
Some systems allow manual priority settings in metadata for critical organizational skills.
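The four ranking signals described above could be combined into a single score. The sketch below is one hypothetical weighting, not a documented algorithm: the weights, the 30-day recency decay, and the `Candidate` fields are all assumptions chosen so that semantic similarity dominates.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    similarity: float        # 0..1, from whatever matcher the agent uses
    days_since_update: int   # recency signal
    activations: int         # historical usage count
    pinned: bool = False     # manual priority flag from metadata


def score(c: Candidate, max_activations: int) -> float:
    """Weighted sum; the weights are illustrative assumptions."""
    recency = 1 / (1 + c.days_since_update / 30)
    frequency = c.activations / max_activations if max_activations else 0.0
    return 0.7 * c.similarity + 0.1 * recency + 0.1 * frequency + 0.1 * c.pinned


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Order candidates from most to least relevant."""
    max_activations = max((c.activations for c in candidates), default=0)
    return sorted(candidates, key=lambda c: score(c, max_activations), reverse=True)
```

With these weights the secondary signals together contribute at most 0.3, so they mainly act as tie-breakers between skills with similar descriptions.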
Professional skill management requires a structured lifecycle approach [9: Announcing skills on Tessl: the package manager for agent skills]. Emerging platforms now provide comprehensive lifecycle management tools:
Identify knowledge gaps, define scope, write skill specification, determine required tools and resources.
Write SKILL.md with metadata and instructions, create supporting scripts and resources, test with target agent platforms.
Test skill activation accuracy, validate instruction clarity, measure token efficiency, gather user feedback.
Version control with Git, distribute to agent environments, document usage and examples, monitor activation patterns.
Track skill usage metrics, identify activation failures, collect agent performance data, assess outcome quality.
Refine descriptions for better matching, improve instruction clarity, reduce token count, update for new best practices.
Skills require versioning strategies to manage evolution over time:
---
name: api-design
description: REST API design best practices
metadata:
version: 2.1.0
min_agent_version: 1.5.0
updated: 2026-01-30
changelog: Added GraphQL guidelines
---
Maintain multiple versions in separate directories:
skills/
├── api-design-v1/
├── api-design-v2/
└── api-design-v3/
Use Git branches for version management:
main (latest stable)
v2.x (maintenance)
v1.x (deprecated)
Single skill with version metadata, agent selects based on compatibility.
There's currently no built-in versioning system in the Agent Skills specification, so organizations must implement their own versioning strategies. Best practice: use semantic versioning in metadata and keep at least one previous major version available for backward compatibility.
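Because the specification leaves versioning to implementers, any selection logic is a local convention. The sketch below shows one way an agent could pick the newest compatible version using the `version` and `min_agent_version` metadata keys from the example above; the function names are hypothetical and the parser assumes plain `MAJOR.MINOR.PATCH` strings.

```python
from typing import Optional


def parse_version(value: str) -> tuple[int, int, int]:
    """Turn 'MAJOR.MINOR.PATCH' into a comparable tuple."""
    major, minor, patch = (int(part) for part in value.split("."))
    return (major, minor, patch)


def select_version(candidates: list[dict], agent_version: str) -> Optional[dict]:
    """Return the newest skill metadata the given agent version can run."""
    agent = parse_version(agent_version)
    compatible = [
        meta
        for meta in candidates
        if parse_version(meta.get("min_agent_version", "0.0.0")) <= agent
    ]
    return max(compatible, key=lambda meta: parse_version(meta["version"]), default=None)


versions = [
    {"version": "2.1.0", "min_agent_version": "1.5.0"},
    {"version": "1.4.2", "min_agent_version": "1.0.0"},
]
print(select_version(versions, "1.2.0"))
```

Comparing tuples instead of raw strings avoids the classic pitfall where "1.10.0" sorts before "1.9.0".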
Ensuring skill quality requires systematic evaluation:
| Metric | Target | Measurement |
|---|---|---|
| Activation Accuracy | > 90% | Skill activated when appropriate |
| False Positive Rate | < 5% | Skill activated inappropriately |
| Token Efficiency | < 5000 tokens | Full SKILL.md token count |
| Instruction Clarity | > 95% | Agent follows instructions correctly |
| Outcome Quality | > 85% | Task completed successfully |
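The first two metrics in the table can be computed from a labeled set of test prompts. The sketch below assumes one particular reading of the table: activation accuracy is the share of prompts that should trigger the skill and did, and false positive rate is the share of prompts that should not trigger it but did. The function name and trial format are illustrative.

```python
def activation_metrics(trials: list[tuple[bool, bool]]) -> dict[str, float]:
    """Each trial is (should_activate, did_activate)."""
    should = [did for expected, did in trials if expected]
    should_not = [did for expected, did in trials if not expected]
    return {
        "activation_accuracy": sum(should) / len(should) if should else 0.0,
        "false_positive_rate": sum(should_not) / len(should_not) if should_not else 0.0,
    }


# 10 prompts that should activate the skill, 20 that should not.
trials = (
    [(True, True)] * 9 + [(True, False)]
    + [(False, False)] * 19 + [(False, True)]
)
print(activation_metrics(trials))
```

Keeping the two denominators separate matters: a skill that activates on everything scores perfectly on accuracy and is only caught by the false positive rate.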
When skills become outdated or are superseded, follow a structured deprecation process:
---
name: old-api-design
description: DEPRECATED: Use api-design-v2 instead. Legacy REST API guidelines.
metadata:
deprecated: true
deprecation_date: 2026-01-01
replacement: api-design-v2
removal_date: 2026-07-01
---
# DEPRECATED: Old API Design
This skill is deprecated and will be removed on 2026-07-01.
**Use instead**: api-design-v2
This skill is maintained for backward compatibility only.
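An agent or registry could honor these deprecation fields when resolving a skill by name. The following sketch is an assumption about how that might work, since the metadata keys are freeform: it follows `replacement` links, refuses skills past their `removal_date`, and guards against cycles. Dates are assumed to be ISO strings, and `resolve_skill` is a hypothetical name.

```python
from datetime import date


def resolve_skill(name: str, index: dict[str, dict], today: date) -> str:
    """Follow deprecation metadata to the skill that should actually be used."""
    seen = set()
    while name not in seen:
        seen.add(name)
        meta = index[name]
        if not meta.get("deprecated"):
            return name
        replacement = meta.get("replacement")
        if replacement in index:
            name = replacement  # redirect to the successor
            continue
        removal = meta.get("removal_date")
        if removal and date.fromisoformat(removal) <= today:
            raise LookupError(f"{name} has been removed and no replacement is installed")
        return name  # deprecated, but still inside its grace period
    raise LookupError("replacement cycle detected")
```

Redirecting at lookup time means callers that still reference the old name keep working until the removal date, which is the purpose of the grace period.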
Several platforms have emerged to manage the full skill lifecycle:
In January 2026, Tessl announced a developer-grade package manager for skills [9], providing tools to evaluate quality, a registry of evaluated skills, and a platform to manage the full lifecycle: build, evaluate, distribute, and optimize.
The official repository maintained by Anthropic serves as the reference implementation.
OpenAI adopted the Agent Skills standard in December 2025.
Launched in December 2025, SkillsMP is the first comprehensive marketplace for agent skills.
github.com/muratcankoylan/Agent-Skills-for-Context-Engineering
Comprehensive collection for context engineering, multi-agent architectures, and production systems.
github.com/skillmatic-ai/awesome-agent-skills
Curated list of high-quality agent skills across domains, with quality ratings and usage examples.
Partners include Atlassian (Jira/Confluence), Figma, Canva, Stripe, and Zapier.
Production-grade skills for enterprise integrations.
The Agent Skills standard has been adopted across major AI development platforms:
| Platform | Adoption Date | Integration Level | Skill Directory |
|---|---|---|---|
| Claude Code | Oct 2025 | Native | ~/.claude/skills/ |
| Cursor | Dec 2025 | Native | ~/.cursor/skills/ |
| VS Code Copilot | Dec 2025 | Native (Microsoft) | .vscode/skills/ |
| GitHub Copilot | Dec 2025 | Native (Microsoft) | .github/skills/ |
| OpenCode | Dec 2025 | Native | ~/.config/opencode/skills/ |
| Goose | Dec 2025 | Native | ~/.goose/skills/ |
| Amp | Jan 2026 | Native | ~/.amp/skills/ |
| Letta | Jan 2026 | Native | ~/.letta/skills/ |
The rapid adoption across platforms demonstrates the industry's recognition of Agent Skills as a unifying standard [15: OpenAI Function Calling Guide]. This cross-platform compatibility enables organizations to invest in skill development once and deploy everywhere.
The skill description is the most critical component - it determines when and how often your skill activates.
description: "API design"
Too vague, won't match user requests effectively.
description: "REST API design best practices and conventions. Use when designing, reviewing, or documenting RESTful APIs."
Specific, includes key terms and use cases.
Skills that generate standardized outputs from templates:
# Use cases: Report generation, boilerplate code, documentation
1. Gather required information from user
2. Select appropriate template from assets/
3. Populate template with user data
4. Validate output format
5. Present generated content
Multi-pass processes with increasing depth:
# Use cases: Code review, security audits, quality analysis
1. Broad initial scan - identify areas of interest
2. Medium-depth analysis - examine flagged areas
3. Deep dive - detailed investigation of issues
4. Synthesis - compile findings into report
Linear, deterministic workflows where each step depends on the previous:
# Use cases: Data processing, CI/CD, deployment
1. Validation - ensure prerequisites met
2. Processing - execute core operations
3. Verification - confirm success
4. Cleanup - finalize and document
Branching logic based on conditions:
# Use cases: Troubleshooting, configuration, routing
1. Assess initial conditions
2. Branch based on criteria:
- If condition A: follow path 1
- If condition B: follow path 2
- Otherwise: follow default path
3. Execute path-specific instructions
4. Converge at output step
For extensive documentation, use references/ directory:
my-skill/
├── SKILL.md # Keep concise (~2000 tokens)
└── references/
    ├── detailed-guide.md # Full documentation
    └── examples.md # Extensive examples
# In SKILL.md:
For detailed examples, see references/examples.md
Each skill should have one clear purpose [11: LIMA: Less Is More for Alignment]. Research shows that carefully curated skill content outperforms verbose instructions.
Skills can reference other skills for complex workflows [12: AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation]. Multi-agent frameworks demonstrate that skill distribution across agents enables emergent problem-solving capabilities:
---
name: full-api-review
description: Complete API review covering design, security, and documentation
---
# Full API Review
This skill orchestrates multiple specialized skills for comprehensive review:
1. Activate api-design skill
- Review URL structure, HTTP methods, response codes
2. Activate api-security skill
- Review authentication, authorization, data validation
3. Activate api-documentation skill
- Review OpenAPI spec, examples, error documentation
4. Compile findings into unified report
It's strongly recommended to use Skills only from trusted sources. Skills provide Claude with new capabilities through instructions and code, which makes them powerful but also means a malicious Skill can direct Claude to invoke tools or execute code in ways that don't match the Skill's stated purpose.
---
name: safe-file-processor
description: Process files safely
allowed-tools: Bash(python scripts/process.py) Read Write
---
# This skill can only:
# - Run specific Python script
# - Read files
# - Write files
# Cannot: Execute arbitrary commands, access network, etc.
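Since `allowed-tools` is a space-delimited list whose entries may themselves contain spaces inside parentheses, a naive `split()` would break `Bash(python scripts/process.py)` apart. The sketch below shows one way to tokenize the field; it is an assumption about reasonable parsing behavior, not the parser any particular agent ships, and `parse_allowed_tools` is a hypothetical name.

```python
def parse_allowed_tools(value: str) -> list[str]:
    """Split on whitespace, but keep text inside parentheses together."""
    tools, current, depth = [], [], 0
    for ch in value:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)
        if ch.isspace() and depth == 0:
            # Whitespace outside parentheses ends the current entry.
            if current:
                tools.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tools.append("".join(current))
    return tools


print(parse_allowed_tools("Bash(python scripts/process.py) Read Write"))
```

Each resulting entry can then be compared against the tool the agent is about to invoke before any pre-approval is granted.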
Before deploying a skill, test these scenarios [9]:
□ Name is unique and follows naming convention (lowercase, hyphens)
□ Description is clear, specific, and includes use cases
□ Instructions are concise and actionable
□ Code examples are syntactically correct
□ Referenced files exist in correct directories
□ Token count is under 5,000 for full SKILL.md
□ Tested on target agent platforms
□ Security review completed for scripts/
□ Documentation includes examples
□ Metadata includes version and author
Skills can support multiple languages for global teams:
multilingual-skill/
├── SKILL.md # English (default)
├── SKILL.es.md # Spanish
├── SKILL.fr.md # French
└── SKILL.ja.md # Japanese
# Agent selects based on user's language preference
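Selecting a localized file with a fallback to the English default can be sketched as follows. The `SKILL.<lang>.md` naming follows the layout shown above; the function name and the choice to match only on the primary language subtag (so `es-MX` resolves to `SKILL.es.md`) are assumptions for illustration.

```python
from pathlib import Path


def select_skill_file(skill_dir: Path, language: str) -> Path:
    """Prefer SKILL.<lang>.md for the user's language, else the default SKILL.md."""
    primary = language.split("-")[0].lower()  # "es-MX" -> "es"
    localized = skill_dir / f"SKILL.{primary}.md"
    return localized if localized.is_file() else skill_dir / "SKILL.md"
```

Falling back silently keeps the skill usable for languages without a translation, at the cost of the agent reading English instructions in those cases.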
Some advanced systems generate skills dynamically based on runtime context [13: CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning]. Research demonstrates LLMs can create their own tools and skills as reusable knowledge artifacts:
# Generate project-specific skill from codebase analysis
# (extract_patterns and generate_skill are placeholders)
def analyze_codebase(codebase):
    conventions = extract_patterns(codebase)
    return generate_skill(
        name="project-conventions",
        description="Project-specific coding conventions",
        instructions=conventions,
    )
Track skill performance metrics for optimization:
| Metric | Purpose | Action Threshold |
|---|---|---|
| Activation Rate | How often skill is used | < 1/month → Consider deprecation |
| Success Rate | Task completion percentage | < 80% → Refine instructions |
| False Positive Rate | Inappropriate activations | > 10% → Improve description |
| Average Token Usage | Context efficiency | > 5000 → Optimize content |
| User Satisfaction | Quality perception | < 4/5 → Review and update |
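The action thresholds in the table translate directly into a rule check. The sketch below is illustrative: the threshold values come from the table, while the metric key names and the function `recommend_actions` are hypothetical.

```python
def recommend_actions(metrics: dict[str, float]) -> list[str]:
    """Apply the table's thresholds to one skill's monitoring data."""
    actions = []
    if metrics["activations_per_month"] < 1:
        actions.append("Consider deprecation")
    if metrics["success_rate"] < 0.80:
        actions.append("Refine instructions")
    if metrics["false_positive_rate"] > 0.10:
        actions.append("Improve description")
    if metrics["avg_tokens"] > 5000:
        actions.append("Optimize content")
    if metrics["satisfaction"] < 4:
        actions.append("Review and update")
    return actions


print(recommend_actions({
    "activations_per_month": 12,
    "success_rate": 0.72,
    "false_positive_rate": 0.15,
    "avg_tokens": 3200,
    "satisfaction": 4.4,
}))
```

Running such a check on a schedule turns the table into a recurring review queue instead of a one-time audit.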
Organizations deploying skills at scale need governance frameworks to ensure quality, security, and compliance.
Enhanced metadata standards for better discovery, automated quality scoring, cross-platform skill marketplaces.
Skill composition frameworks, versioning APIs, automated testing tools, performance benchmarking suites.
AI-assisted skill generation, dynamic skill optimization, advanced analytics dashboards, enterprise governance platforms.
Industry-specific skill libraries, federated skill registries, standardized evaluation frameworks, certification programs.
Agent Skills represent a fundamental advancement in how we extend AI agent capabilities [14: LangChain Tools Documentation]. By providing a standardized, efficient, and portable format for packaging procedural knowledge, they enable:
Invest time in crafting precise descriptions that maximize semantic matching accuracy.
Keep skills focused on one clear purpose for better composability and maintainability.
Optimize content to stay under 5,000 tokens while maintaining clarity.
Implement versioning, monitoring, and governance for production deployments.
With 71,000+ community-created skills, adoption by major platforms (Microsoft, OpenAI, Anthropic, Cursor), and emerging lifecycle management platforms, Agent Skills are positioned to become the standard way organizations package and share AI agent knowledge. The separation of "know-how" from "can-do" enables building agents that are not just powerful, but reliable, compliant, and efficient - essential characteristics for enterprise deployment.
Practical Claude Code patterns for implementing the skill concepts from this section. These examples demonstrate custom agent definitions, skill composition, and tool restriction patterns based on the in-context learning research [3: An Explanation of In-Context Learning as Implicit Bayesian Inference].
Define specialized agents with focused prompts and tool sets. This implements the skill separation pattern described in Section 2, where "know-how" (the prompt) is separated from "can-do" (the tools) [10: Skill-it: A Data-Driven Skills Framework].
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, AgentDefinition
# Define a specialized code review agent
code_reviewer = AgentDefinition(
    description="Expert code reviewer for security and quality analysis.",
    prompt="""You are a senior security engineer conducting code review.
Focus on:
- Security vulnerabilities (injection, XSS, auth issues)
- Performance bottlenecks
- Code quality and maintainability
Provide actionable, specific recommendations.""",
    tools=["Read", "Glob", "Grep"],  # Read-only access
)
# `async for` is only valid inside a coroutine, so wrap the call in main()
async def main():
    async for message in query(
        prompt="Use the code-reviewer agent to analyze src/auth/",
        options=ClaudeAgentOptions(
            allowed_tools=["Read", "Glob", "Grep", "Task"],
            agents={"code-reviewer": code_reviewer},
        ),
    ):
        # Only the final result message carries a `result` attribute
        if hasattr(message, "result"):
            print(message.result)
asyncio.run(main())
import { query, AgentDefinition } from "@anthropic-ai/claude-agent-sdk";
// Define a specialized code review agent
const codeReviewer: AgentDefinition = {
description: "Expert code reviewer for security and quality analysis.",
prompt: `You are a senior security engineer conducting code review.
Focus on: security vulnerabilities, performance, code quality.
Provide actionable, specific recommendations.`,
tools: ["Read", "Glob", "Grep"]
};
for await (const message of query({
prompt: "Use the code-reviewer agent to analyze src/auth/",
options: {
allowedTools: ["Read", "Glob", "Grep", "Task"],
agents: { "code-reviewer": codeReviewer }
}
})) {
if ("result" in message) console.log(message.result);
}
Compose multiple skills for complex workflows, demonstrating the multi-agent coordination research findings [12].
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions, AgentDefinition
# Skill 1: Documentation expert
doc_writer = AgentDefinition(
    description="Technical documentation specialist.",
    prompt="Write clear, comprehensive documentation for code and APIs.",
    tools=["Read", "Write", "Glob"],
)
# Skill 2: Test writer
test_writer = AgentDefinition(
    description="Unit test and integration test specialist.",
    prompt="Write comprehensive tests with edge cases and mocking.",
    tools=["Read", "Write", "Bash"],
)
# Compose skills: orchestrator delegates to specialists
async def main():
    async for message in query(
        prompt="""For the auth module:
1. Use doc-writer to document the public API
2. Use test-writer to add missing unit tests""",
        options=ClaudeAgentOptions(
            allowed_tools=["Read", "Write", "Glob", "Bash", "Task"],
            agents={
                "doc-writer": doc_writer,
                "test-writer": test_writer,
            },
        ),
    ):
        pass  # Messages are drained here; inspect them as needed
asyncio.run(main())
Limit tools to create focused, safe agents. This pattern follows the principle of least privilege discussed in security-conscious skill design.
import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions
# Read-only analysis agent (safe for production code)
async def analyze_schema():
    async for message in query(
        prompt="Analyze the database schema and identify optimization opportunities",
        options=ClaudeAgentOptions(
            allowed_tools=["Read", "Glob", "Grep"],  # No Write/Edit/Bash
            permission_mode="default",  # Still prompts for approval
        ),
    ):
        pass
# Write-enabled agent with approval workflow
async def refactor_utils():
    async for message in query(
        prompt="Refactor utils.py to improve error handling",
        options=ClaudeAgentOptions(
            allowed_tools=["Read", "Edit", "Glob"],  # Edit but no Write (safer)
            permission_mode="acceptEdits",  # Auto-approve edits
        ),
    ):
        pass
# Run the two agents one after the other
async def main():
    await analyze_schema()
    await refactor_utils()
asyncio.run(main())
The GSD workflow system implements skills through structured markdown files, following the composable workflow patterns from academic research [8].
# GSD skills are defined in .claude/get-shit-done/
# Each workflow is a skill with specific triggers and outputs
# Trigger the planning skill
claude "/gsd:initialize"
# Execute a specific plan (uses plan execution skill)
claude "/gsd:execute-phase .planning/phases/01-setup/01-01-PLAN.md"
# Skills can be composed: planning -> execution -> summary
GSD implements skill composition patterns from academic research through its agent definition system. Each agent specifies identity via name/description metadata (~50 tokens loaded at discovery), capability boundaries via tools list, and procedural knowledge via Markdown body (loaded when activated).
| GSD Agent | Purpose | Key Pattern | Research Mapping |
|---|---|---|---|
| gsd-executor | Execute plans with atomic commits | Deviation rules, checkpoints | ReAct bounded autonomy |
| gsd-verifier | Goal-backward verification | Must-have checking | Outcome-focused planning [12] |
| gsd-planner | Create executable plans | Task breakdown, dependency analysis | Tool learning decomposition [4] |
| gsd-phase-researcher | Domain research | Context7 first, verify before asserting | Retrieval-augmented generation |
This comprehensive guide was also compiled from the following sources: