AI Agent Skills: Definition and Creation

1. Introduction to Agent Skills

Agent Skills represent a paradigm shift in how we extend AI agent capabilities [1: "Equipping agents for the real world with Agent Skills," Anthropic, October 2025]. Launched by Anthropic in October 2025 and released as an open standard in December 2025, Agent Skills provide a lightweight, standardized format for packaging specialized knowledge and workflows that AI agents can discover and load dynamically.

What Are Agent Skills?

Agent Skills are organized folders of instructions, scripts, and resources that agents can discover and load dynamically to perform better at specific tasks [2: "Skills vs Tools for AI Agents: Production Guide," Arcade AI, January 2026]. Each skill is a modular capability built around a SKILL.md file: Markdown with YAML frontmatter, containing metadata and instructions that tell an agent how to perform a specific task.

71,000+ community skills available
20,000+ GitHub stars on the skills repo
~50 tokens per skill for metadata only
Dec 2025: open standard released

1.1 The Skills Timeline

October 16, 2025

Anthropic officially unveiled Agent Skills at their product launch event, introducing the concept of modular, discoverable agent capabilities.

December 18, 2025

Agent Skills released as an open standard at agentskills.io/specification, enabling cross-platform and cross-product reuse.

December 2025

OpenAI adopted the Agent Skills format for Codex CLI and ChatGPT, solidifying it as an industry standard.

January 2026

Major adoption by Microsoft (VS Code, GitHub), Cursor, Goose, Amp, OpenCode, and other leading AI development tools.

1.2 Why Skills Matter

Agent Skills solve several critical challenges in AI agent development:

Context Efficiency

Skills use progressive disclosure to manage context, loading only metadata (~50 tokens per skill) at startup and full instructions (2,000-5,000 tokens) only when needed [1: Anthropic, October 2025].

Knowledge Separation

Skills separate "know-how" (procedural knowledge) from "can-do" (tools/actions), enabling agents that are powerful, reliable, and compliant [4: "Tool Learning with Foundation Models," Qin et al., ACL 2024].

Reusability

Build once, use everywhere. Skills are version-controlled, shareable, and work across compatible agents like Cursor, Claude Code, and OpenCode.

Standardization

The open standard ensures interoperability across different agent platforms, preventing vendor lock-in and enabling ecosystem growth.

2. Skills vs Tools vs Plugins: A Technical Comparison

Understanding the distinction between Skills, Tools, and Plugins is crucial for effective agent design [5: "Toolformer: Language Models Can Teach Themselves to Use Tools," Schick et al., NeurIPS 2024]. While these terms are sometimes used interchangeably, they represent fundamentally different approaches to extending AI capabilities.

2.1 Core Definitions

Aspect	Skills	Tools	Plugins
Nature	Procedural knowledge and workflows	Executable functions with defined I/O	Vendor-specific extensions
Represents	What agents know	What agents can do	Platform-bound capabilities
Format	Markdown with YAML frontmatter	Code functions with schemas	Varies by platform
Activation	Suggested, agent decides when to load	Explicitly called by agent	Platform-specific invocation
Side Effects	None on their own (bundled scripts run through the agent's tools)	Can modify external state	Varies
Portability	Cross-platform (open standard)	Varies (depends on protocol)	Platform-locked
Security Surface	Prompt-level (bundled scripts still need review)	Requires authentication & validation	Platform-dependent
Token Cost	Progressive (50-5000 tokens)	Fixed (schema definition)	Varies

2.2 The Behavioral Difference

Key Insight: Skills are Suggestions, Tools are Actions

A skill is a suggestion - the agent autonomously decides whether it needs that context and loads it when appropriate. A tool is an action - the agent explicitly calls it to perform an operation with real-world effects.

2.3 The Relationship Between Skills and Tools

Skills and tools are complementary, not competitive. The most effective agents use both:

Skills provide the knowledge of how to approach complex, multi-step tasks
Tools provide the capabilities to execute specific operations
Skills orchestrate tools by instructing the agent which tools to use and when
Tools enable skills by providing the concrete actions referenced in skill instructions

Best Practice: Combine Skills and Tools

For simple bots, tools might be enough. But if you're building digital employees, you need skills to encode domain expertise and tools to execute actions [2: Arcade AI, January 2026]. The separation ensures agents are not just powerful, but also reliable, compliant, and efficient.

2.4 When to Use Each

Use Skills For:

Domain-specific workflows
Multi-step procedures
Best practices and conventions
Template generation
Quality analysis patterns
Fast, repeatable workflows

Use Tools For:

Database queries
API calls
File system operations
Code execution
External service integration
State-modifying operations

Use Plugins For:

Quick platform-specific prototypes
Platform-bound capabilities
When portability isn't required
Legacy system integration

3. Agent Skills Architecture and Specification

3.1 Progressive Disclosure Architecture

The core innovation of Agent Skills is their three-phase progressive disclosure mechanism [1: Anthropic, October 2025], which keeps agents fast while giving them access to extensive knowledge on demand.

Phase 1: Discovery (~50 tokens per skill) → Phase 2: Activation (2,000-5,000 tokens) → Phase 3: Execution (variable tokens)

The agent loads only what it needs, when it needs it.

Phase 1: Discovery (Startup)

At startup, the agent scans available skills directories (e.g., .claude/skills/) and parses only the YAML frontmatter from each SKILL.md file. This creates a lightweight index of available capabilities.

# Discovery phase loads only metadata
name: api-design
description: REST API design best practices and conventions

# Total: ~50 tokens loaded into context

Phase 2: Activation (On Match)

When a user's request matches a skill's description, the agent reads the full SKILL.md file into its context. The description acts as a semantic trigger for skill activation.

# When user asks: "Design a REST API for user management"
# Agent activates api-design skill
# Loads full SKILL.md: ~2,000-5,000 tokens

Agent reasoning:
  - Task: "Design a REST API"
  - Matches: api-design skill description
  - Action: Load full skill instructions

Phase 3: Execution (As Needed)

The agent follows the loaded instructions and accesses specific resources (scripts, templates, assets) within the skill folder only when the instructions reference them.

Performance Impact

This architecture allows agents to maintain hundreds of skills while keeping initial context usage minimal. For example, 100 skills × 50 tokens = 5,000 tokens at startup, versus 100 skills × 3,000 tokens = 300,000 tokens if all instructions were always loaded.
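The arithmetic above can be checked with a few lines of Python. This is only a sketch: the per-skill figures are the approximate ones quoted in this section, and the function name is hypothetical.

```python
def startup_tokens(num_skills: int, tokens_per_skill: int) -> int:
    """Tokens consumed at startup when every skill contributes tokens_per_skill."""
    return num_skills * tokens_per_skill


METADATA_TOKENS = 50        # approximate per-skill metadata cost quoted above
FULL_SKILL_TOKENS = 3_000   # illustrative size of one full SKILL.md

lazy = startup_tokens(100, METADATA_TOKENS)     # metadata only
eager = startup_tokens(100, FULL_SKILL_TOKENS)  # everything loaded up front
print(lazy, eager, eager // lazy)
```

With these assumed figures, loading everything up front costs 60 times more context at startup than loading metadata alone.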

3.2 Directory Structure Specification

A skill is a directory containing at minimum a SKILL.md file, with optional subdirectories for supporting resources:

my-skill/
├── SKILL.md              # Required: Instructions + metadata
├── scripts/              # Optional: Executable code
│   ├── setup.py
│   └── process.sh
├── references/           # Optional: Documentation
│   ├── api-spec.json
│   └── examples.md
└── assets/               # Optional: Templates, configs
    ├── template.yaml
    └── config.json

3.3 SKILL.md File Structure

Every SKILL.md file consists of two parts: YAML frontmatter (metadata) and Markdown body (instructions).
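As a minimal sketch of that two-part layout, the function below splits a SKILL.md string into its frontmatter and body. It assumes the file begins with a line containing only `---` and that the frontmatter ends at the next such line; the function name is hypothetical, and a real loader would hand the frontmatter to a YAML parser.

```python
from typing import Tuple


def split_skill_md(text: str) -> Tuple[str, str]:
    """Split SKILL.md text into (frontmatter, body)."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' frontmatter delimiter")
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            frontmatter = "\n".join(lines[1:i])
            body = "\n".join(lines[i + 1:]).strip()
            return frontmatter, body
    raise ValueError("frontmatter is missing its closing '---' delimiter")


example = """---
name: skill-name
description: What this skill does and when to use it
---

# Skill Name

Brief overview.
"""

frontmatter, body = split_skill_md(example)
print(frontmatter)
print(body)
```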

Frontmatter Fields

Field	Required	Description	Constraints
`name`	Yes	Unique identifier for the skill	Max 64 chars, lowercase, numbers, hyphens
`description`	Yes	What the skill does and when to use it	Max 1024 chars, non-empty
`license`	No	License identifier (e.g., Apache-2.0)	SPDX identifier recommended
`metadata`	No	Additional info (author, version, etc.)	Freeform YAML
`allowed-tools`	No	Experimental: pre-approved tools	Space-delimited list
`compatibility`	No	System requirements, network needs	Freeform text
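The constraints on the two required fields lend themselves to a small validator. The sketch below encodes only the limits listed in the table; the function name and error messages are hypothetical, and it assumes the frontmatter has already been parsed into a dictionary.

```python
import re

# Lowercase letters, digits, and hyphens only, per the constraints table.
NAME_PATTERN = re.compile(r"^[a-z0-9-]+$")


def validate_frontmatter(fields: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    name = fields.get("name")
    if not isinstance(name, str) or not name:
        problems.append("name is required")
    else:
        if len(name) > 64:
            problems.append("name exceeds 64 characters")
        if not NAME_PATTERN.match(name):
            problems.append("name must use lowercase letters, numbers, and hyphens")
    description = fields.get("description")
    if not isinstance(description, str) or not description.strip():
        problems.append("description is required and must be non-empty")
    elif len(description) > 1024:
        problems.append("description exceeds 1024 characters")
    return problems


ok = validate_frontmatter({"name": "api-design", "description": "REST API design best practices."})
bad = validate_frontmatter({"name": "API Design", "description": ""})
print(ok)
print(bad)
```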

Critical: Description Quality

The description is the most important field - it's what the LLM uses to decide which skill to activate [6: "Symbol Tuning Improves In-Context Learning in Language Models," Wei et al., NeurIPS 2024]. Be specific and clear about what the skill does AND when to use it. Poor descriptions lead to skills never being activated or being activated incorrectly.

3.4 Integration Approaches

There are two main approaches to integrating skills into agent systems:

Filesystem-Based Integration

The agent operates within a computer environment (bash/unix) where skills are activated when models issue shell commands:

# Agent activates skill via filesystem
cat /path/to/my-skill/SKILL.md

# Access bundled resources
python /path/to/my-skill/scripts/process.py
cat /path/to/my-skill/references/api-spec.json

Tool-Based Integration

The agent functions without a dedicated computer environment and instead implements tools allowing models to trigger skills:

// Agent activates skill via tool call
{
  "tool": "activate_skill",
  "skill_name": "my-skill"
}

// Access bundled resources via tool
{
  "tool": "read_skill_asset",
  "skill_name": "my-skill",
  "asset_path": "scripts/process.py"
}
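On the host side, the two tool calls above might be backed by handlers like the following. This is a sketch under stated assumptions: `activate_skill` and `read_skill_asset` mirror the tool names in the example, but their signatures and the path checks are illustrative rather than part of any specification.

```python
from pathlib import Path
import tempfile


def read_skill_asset(skills_root: Path, skill_name: str, asset_path: str) -> str:
    """Return a bundled file, refusing paths that escape the skill folder."""
    root = skills_root.resolve()
    skill_dir = (root / skill_name).resolve()
    if root not in skill_dir.parents:
        raise PermissionError(f"{skill_name!r} is outside the skills root")
    target = (skill_dir / asset_path).resolve()
    if skill_dir not in target.parents:
        raise PermissionError(f"{asset_path!r} is outside the skill directory")
    return target.read_text(encoding="utf-8")


def activate_skill(skills_root: Path, skill_name: str) -> str:
    """Return the full SKILL.md text for the named skill."""
    return read_skill_asset(skills_root, skill_name, "SKILL.md")


# Demo against a throwaway skills directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "my-skill" / "scripts").mkdir(parents=True)
    (root / "my-skill" / "SKILL.md").write_text("---\nname: my-skill\n---\n", encoding="utf-8")
    (root / "my-skill" / "scripts" / "process.py").write_text("print('ok')\n", encoding="utf-8")

    skill_text = activate_skill(root, "my-skill")
    script_text = read_skill_asset(root, "my-skill", "scripts/process.py")
    try:
        read_skill_asset(root, "my-skill", "../../etc/passwd")
        escaped = True
    except PermissionError:
        escaped = False

print(escaped)
```

Resolving both paths before comparing them is one way to keep a model-supplied `asset_path` from reading files outside the skill folder.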

4. Creating Agent Skills: Complete Examples

4.1 Basic Skill Template

Here's a minimal SKILL.md file demonstrating the required structure:

---
name: skill-name
description: A clear, concise description of what this skill does and when to use it
---

# Skill Name

Brief overview of the skill's purpose.

## Instructions

Step-by-step instructions for the agent to follow:

1. First action
2. Second action
3. Third action

## Examples

Example scenarios showing the skill in action.

4.2 Production Example: API Design Skill

A real-world skill for REST API design best practices:

---
name: api-design
description: REST API design best practices and conventions. Use when designing or reviewing REST APIs.
license: Apache-2.0
metadata:
  author: API Standards Team
  version: 1.2.0
  updated: 2026-01-15
---

# API Design Guidelines

Follow these conventions when designing REST APIs to ensure consistency,
scalability, and developer-friendly interfaces.

## URL Structure

- Use plural nouns for resources: `/users`, `/orders`
- Use kebab-case for multi-word resources: `/order-items`
- Nest related resources: `/users/{id}/orders`
- Keep URLs shallow (max 3 levels deep)
- Avoid verbs in URLs (use HTTP methods instead)

## HTTP Methods

- **GET**: Retrieve resources (safe, idempotent)
- **POST**: Create new resources (not idempotent)
- **PUT**: Replace entire resource (idempotent)
- **PATCH**: Partial update (not necessarily idempotent)
- **DELETE**: Remove resource (idempotent)

## Response Codes

### Success Codes
- `200 OK`: Successful GET, PUT, PATCH, or DELETE
- `201 Created`: Successful POST (include Location header)
- `204 No Content`: Successful DELETE with no response body

### Client Error Codes
- `400 Bad Request`: Invalid request syntax or parameters
- `401 Unauthorized`: Missing or invalid authentication
- `403 Forbidden`: Valid auth but insufficient permissions
- `404 Not Found`: Resource doesn't exist
- `409 Conflict`: Request conflicts with current state
- `422 Unprocessable Entity`: Validation errors

### Server Error Codes
- `500 Internal Server Error`: Unexpected server error
- `503 Service Unavailable`: Temporary unavailability

## Request/Response Format

All requests and responses should use JSON with consistent structure:

```json
{
  "data": { ... },
  "meta": {
    "timestamp": "2026-01-30T10:30:00Z",
    "version": "1.0"
  },
  "errors": []
}
```

## Pagination

For list endpoints, use cursor-based pagination:

```json
{
  "data": [...],
  "pagination": {
    "next_cursor": "abc123",
    "prev_cursor": "def456",
    "has_more": true
  }
}
```

## Versioning

- Use URL versioning: `/v1/users`
- Maintain at least 2 versions concurrently
- Announce deprecation 6 months in advance

## Error Handling

Always return structured error responses:

```json
{
  "errors": [
    {
      "code": "VALIDATION_ERROR",
      "message": "Email address is invalid",
      "field": "email"
    }
  ]
}
```

## When Designing a New API

1. Identify all resources and their relationships
2. Define URL structure following conventions above
3. Map operations to HTTP methods
4. Design request/response schemas
5. Document all endpoints with examples
6. Review with API Standards Team

4.3 Production Example: Codebase Visualizer

A skill that generates interactive HTML tree visualizations of project structure:

---
name: codebase-visualizer
description: Generate an interactive collapsible tree visualization of your codebase. Use when exploring a new repo, understanding project structure, or identifying large files.
allowed-tools: Bash(python *)
compatibility: Requires Python 3.8+
---

# Codebase Visualizer

Generate an interactive HTML tree view that shows your project's file structure
with collapsible directories, file sizes, and syntax highlighting.

## Usage

Run the visualization script from your project root:

```bash
python ~/.claude/skills/codebase-visualizer/scripts/visualize.py .
```

This will generate `codebase-tree.html` in the current directory.

## Options

- `--exclude`: Patterns to exclude (default: node_modules, .git, __pycache__)
- `--max-depth`: Maximum directory depth (default: unlimited)
- `--output`: Output file name (default: codebase-tree.html)

## Example

```bash
python ~/.claude/skills/codebase-visualizer/scripts/visualize.py . --exclude "*.pyc,dist,build" --max-depth 5
```

## Output

The generated HTML includes:
- Collapsible directory tree
- File size indicators
- File type icons
- Search functionality
- Dark/light theme toggle

4.4 Production Example: Git Release Management

A skill for creating consistent releases and changelogs:

---
name: git-release
description: Create consistent Git releases and changelogs. Use when preparing version releases, generating changelogs, or tagging releases.
allowed-tools: Bash(git *)
---

# Git Release Management

Creates consistent releases and changelogs by analyzing merged PRs,
proposing version bumps, and generating release notes.

## Semantic Versioning

Follow semantic versioning (MAJOR.MINOR.PATCH):

- **MAJOR**: Breaking changes (incompatible API changes)
- **MINOR**: New features (backward-compatible)
- **PATCH**: Bug fixes (backward-compatible)

## Release Process

1. **Analyze Changes**
   - Review commits since last release
   - Identify breaking changes, features, and fixes
   - Determine appropriate version bump

2. **Generate Changelog**
   - Group changes by type (Breaking, Features, Fixes)
   - Extract PR titles and numbers
   - Include contributor credits

3. **Create Release**
   - Update version in relevant files
   - Commit changelog
   - Create Git tag
   - Push tag to remote

4. **Publish Release Notes**
   - Use changelog content
   - Include upgrade instructions if breaking
   - Link to detailed documentation

## Changelog Format

```markdown
# Changelog

## [3.0.0] - 2026-01-30

### Breaking Changes
- Removed deprecated `oldMethod()` API (#123)

### Features
- Added new authentication flow (#124)
- Improved performance of data processing (#125)

### Bug Fixes
- Fixed memory leak in background worker (#126)
- Corrected timezone handling (#127)

### Contributors
@username1, @username2, @username3
```

## Commands

```bash
# Create a new release
git tag -a v3.0.0 -m "Release v3.0.0"
git push origin v3.0.0

# Generate changelog
git log v2.0.0..HEAD --pretty=format:"%s (%h)" --merges
```
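The version-bump rules in the skill above can be sketched as a small function. The change-type labels (`breaking`, `feature`, `fix`) and the function name are hypothetical; the sketch assumes plain MAJOR.MINOR.PATCH strings without pre-release suffixes.

```python
def bump_version(current: str, change_types: set) -> str:
    """Return the next MAJOR.MINOR.PATCH version for a set of change types."""
    major, minor, patch = (int(part) for part in current.split("."))
    if "breaking" in change_types:
        return f"{major + 1}.0.0"          # incompatible change: bump MAJOR
    if "feature" in change_types:
        return f"{major}.{minor + 1}.0"    # backward-compatible feature: bump MINOR
    return f"{major}.{minor}.{patch + 1}"  # fixes only: bump PATCH


print(bump_version("2.0.0", {"breaking", "feature", "fix"}))
print(bump_version("2.0.0", {"feature", "fix"}))
print(bump_version("2.0.0", {"fix"}))
```

A release containing any breaking change takes the MAJOR bump regardless of what else it includes.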

4.5 Production Example: HR Questions Processing

An enterprise skill for handling HR-related queries:

---
name: hr-questions
description: Answers HR-related questions including policies, benefits, leave requests, onboarding, and employee guidelines. Use for any HR or people operations questions.
metadata:
  department: Human Resources
  sensitivity: confidential
  version: 3.0.0
---

# HR Questions Processing

Provides accurate, policy-compliant answers to employee HR questions
across benefits, policies, leave management, and onboarding.

## Question Categories

### Benefits
- Health insurance coverage and enrollment
- 401(k) contribution limits and matching
- PTO accrual and usage policies
- Parental leave policies
- Tuition reimbursement

### Policies
- Code of conduct
- Remote work policies
- Expense reimbursement
- Equipment policies
- Confidentiality agreements

### Leave Management
- Sick leave
- Vacation time
- Personal days
- FMLA eligibility
- Bereavement leave

### Onboarding
- New hire checklist
- First day procedures
- System access requests
- Training requirements

## Response Guidelines

1. **Always cite policy source**: Reference the specific policy document
2. **Include effective dates**: Policies may have changed
3. **Escalate sensitive issues**: Direct to HR for personal situations
4. **Maintain confidentiality**: Never share other employees' information
5. **Stay current**: Refer to references/ directory for latest policies

## Example Interaction

**Question**: "How much PTO do I accrue per year?"

**Answer**:
According to the PTO Policy (effective 2026-01-01):
- 0-2 years: 15 days per year (1.25 days/month)
- 3-5 years: 20 days per year (1.67 days/month)
- 6+ years: 25 days per year (2.08 days/month)

PTO accrues monthly and can be used as soon as it's available.
Maximum carryover is 5 days per year.

For questions about your specific accrual, contact hr@company.com.

## Escalation Scenarios

Immediately direct to HR for:
- Harassment or discrimination concerns
- Performance improvement plans
- Termination questions
- Salary negotiations
- Medical accommodations
- Legal matters

5. Skill Discovery and Loading Mechanisms

5.1 Discovery Phase Architecture

Skill discovery is the foundation of efficient skill management. Modern agent systems implement sophisticated discovery mechanisms to index and surface relevant skills without overwhelming the context window.

Filesystem-Based Discovery

Agents scan designated skills directories during initialization:

# Common skill directory locations
~/.claude/skills/          # Claude Code
~/.cursor/skills/          # Cursor
~/.config/opencode/skills/ # OpenCode
./skills/                  # Project-specific skills

# Discovery process (~50 tokens per skill end up in context)
1. Scan directories for SKILL.md files
2. Parse YAML frontmatter
3. Extract name + description
4. Build in-memory index
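The discovery steps above can be sketched with the standard library alone. This is an illustration, not a reference implementation: the function names are hypothetical, and the frontmatter reader handles only simple top-level `key: value` lines, where a real agent would use a YAML parser.

```python
from pathlib import Path
import tempfile


def read_metadata(skill_md: Path) -> dict:
    """Parse only the top-level `key: value` lines of the frontmatter."""
    metadata = {}
    lines = skill_md.read_text(encoding="utf-8").splitlines()
    if not lines or lines[0].strip() != "---":
        return metadata
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter; the body is never read into the index
        if line.startswith((" ", "\t")) or ":" not in line:
            continue  # skip nested or non key/value lines
        key, _, value = line.partition(":")
        metadata[key.strip()] = value.strip()
    return metadata


def discover_skills(directories: list) -> dict:
    """Build an in-memory index of name -> description for each SKILL.md found."""
    index = {}
    for directory in directories:
        for skill_md in sorted(Path(directory).expanduser().glob("*/SKILL.md")):
            meta = read_metadata(skill_md)
            if "name" in meta and "description" in meta:
                index[meta["name"]] = meta["description"]
    return index


# Demo against a throwaway skills directory.
with tempfile.TemporaryDirectory() as tmp:
    skill_dir = Path(tmp) / "api-design"
    skill_dir.mkdir()
    (skill_dir / "SKILL.md").write_text(
        "---\nname: api-design\n"
        "description: REST API design best practices and conventions\n"
        "---\n\n# API Design Guidelines\n",
        encoding="utf-8",
    )
    index = discover_skills([tmp])

print(index)
```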

Tool-Based Discovery

For agents without filesystem access, discovery happens via dedicated tools:

{
  "tool": "list_skills",
  "response": [
    {
      "name": "api-design",
      "description": "REST API design best practices..."
    },
    {
      "name": "git-release",
      "description": "Create consistent releases..."
    }
  ]
}

5.2 Semantic Matching

The agent uses the skill descriptions for semantic matching against user requests [8: "HuggingGPT: Solving AI Tasks with ChatGPT and its Friends," Shen et al., NeurIPS 2024]. This approach mirrors the task-routing pattern from multi-model orchestration research:

User request ("Design a REST API") → Semantic analysis (embedding similarity) → Match skills (api-design: 0.92) → Activate (load full SKILL.md)
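The matching flow can be illustrated with a toy scorer. The sketch below swaps real embeddings for bag-of-words cosine similarity so that it runs with the standard library only; the function names are hypothetical and the scores it produces are not comparable to embedding scores such as the 0.92 shown above.

```python
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercased word counts; a crude stand-in for a real embedding vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_skills(request: str, index: dict) -> list:
    """Return (name, score) pairs sorted from best to worst match."""
    query = bag_of_words(request)
    scored = [(name, cosine_similarity(query, bag_of_words(desc)))
              for name, desc in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


index = {
    "api-design": "REST API design best practices and conventions",
    "git-release": "Create consistent Git releases and changelogs",
}
ranking = rank_skills("Design a REST API", index)
print(ranking[0][0])
```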

Optimization Tip: Description Engineering

Craft skill descriptions to maximize semantic matching accuracy. Include key terms, synonyms, and use cases. For example: "REST API design best practices and conventions. Use when designing, reviewing, or documenting RESTful APIs, web services, or HTTP endpoints."

5.3 Loading Strategies

Eager Loading (Anti-Pattern)

Loading all skill instructions at startup - wasteful and slow:

# DON'T DO THIS
# load_full_instructions is a placeholder for an agent-specific function
def startup(all_skills):
    for skill in all_skills:
        load_full_instructions(skill)  # 100 skills × 3,000 tokens = 300k tokens!

Lazy Loading (Recommended)

Load instructions only when skills are activated:

# RECOMMENDED APPROACH
# load_metadata_only, semantic_match, and load_full_instructions are
# placeholders for agent-specific functions
def startup(all_skills):
    for skill in all_skills:
        load_metadata_only(skill)  # 100 skills × 50 tokens = 5k tokens

def on_user_request(request, all_skills):
    matching_skills = semantic_match(request, all_skills)
    for skill in matching_skills:
        load_full_instructions(skill)  # typically only 1-3 skills

Predictive Preloading (Advanced)

Some systems preload likely-needed skills based on conversation context:

# ADVANCED: Predictive preloading
# preload is a placeholder for an agent-specific function
def on_conversation_start(project_type):
    if project_type == "web_app":
        preload(["api-design", "security-best-practices", "database-schema"])

5.4 Skill Priority and Ranking

When multiple skills match a request, agents use ranking to determine activation order:

Semantic Similarity

Primary ranking factor. Skills with descriptions most similar to the user request score highest.

Recency

Recently updated skills may be prioritized to ensure agents use current best practices.

Usage Frequency

Frequently used skills can be ranked higher based on historical activation patterns.

Explicit Prioritization

Some systems allow manual priority settings in metadata for critical organizational skills.
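One way to combine these four factors is a weighted score. The sketch below is purely illustrative: the weights, the 30-day recency scale, and all names are assumptions made here, not values taken from any agent implementation.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    similarity: float        # 0..1, from semantic matching
    days_since_update: int   # recency signal
    activations: int         # historical activation count
    priority: int = 0        # optional manual boost from metadata (0 or 1)


def score(c: Candidate, max_activations: int) -> float:
    """Weighted blend of the four factors; the weights are illustrative."""
    recency = 1.0 / (1.0 + c.days_since_update / 30.0)
    usage = c.activations / max_activations if max_activations else 0.0
    return 0.7 * c.similarity + 0.1 * recency + 0.1 * usage + 0.1 * min(c.priority, 1)


def rank(candidates: list) -> list:
    """Sort candidates from highest to lowest combined score."""
    max_activations = max((c.activations for c in candidates), default=0)
    return sorted(candidates, key=lambda c: score(c, max_activations), reverse=True)


candidates = [
    Candidate("api-security", similarity=0.55, days_since_update=2, activations=80, priority=1),
    Candidate("api-design", similarity=0.92, days_since_update=15, activations=40),
]
ordered = [c.name for c in rank(candidates)]
print(ordered)
```

Giving similarity most of the weight keeps it the primary factor, so a fresher, more frequently used, manually boosted skill still ranks below a much closer semantic match.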

6. Skill Lifecycle Management

6.1 The Skill Development Lifecycle

Professional skill management requires a structured lifecycle approach [9: "Announcing skills on Tessl: the package manager for agent skills," Tessl, January 2026]. Emerging platforms now provide comprehensive lifecycle management tools:

1. Design

Identify knowledge gaps, define scope, write skill specification, determine required tools and resources.

2. Implementation

Write SKILL.md with metadata and instructions, create supporting scripts and resources, test with target agent platforms.

3. Evaluation

Test skill activation accuracy, validate instruction clarity, measure token efficiency, gather user feedback.

4. Deployment

Version control with Git, distribute to agent environments, document usage and examples, monitor activation patterns.

5. Monitoring

Track skill usage metrics, identify activation failures, collect agent performance data, assess outcome quality.

6. Optimization

Refine descriptions for better matching, improve instruction clarity, reduce token count, update for new best practices.

6.2 Versioning and Compatibility

Skills require versioning strategies to manage evolution over time:

Version Metadata

---
name: api-design
description: REST API design best practices
metadata:
  version: 2.1.0
  min_agent_version: 1.5.0
  updated: 2026-01-30
  changelog: Added GraphQL guidelines
---

Versioning Strategies

Directory Versioning

Maintain multiple versions in separate directories:

skills/
├── api-design-v1/
├── api-design-v2/
└── api-design-v3/

Git Branch Versioning

Use Git branches for version management:

main (latest stable)
v2.x (maintenance)
v1.x (deprecated)

Metadata Versioning

Single skill with version metadata, agent selects based on compatibility.
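A compatibility check against version metadata might look like the sketch below. It reuses the `min_agent_version` key from the metadata example earlier in this section, which is a convention of this document rather than a specification field, and it assumes plain MAJOR.MINOR.PATCH strings; the function names are hypothetical.

```python
def parse_version(version: str) -> tuple:
    """Turn 'MAJOR.MINOR.PATCH' into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split("."))


def is_compatible(agent_version: str, skill_metadata: dict) -> bool:
    """True when the agent meets the skill's min_agent_version, if one is set."""
    minimum = skill_metadata.get("min_agent_version")
    if minimum is None:
        return True
    return parse_version(agent_version) >= parse_version(minimum)


metadata = {"version": "2.1.0", "min_agent_version": "1.5.0"}
print(is_compatible("1.10.0", metadata))
print(is_compatible("1.4.9", metadata))
```

Comparing integer tuples rather than raw strings matters here: as strings, "1.10.0" would sort before "1.5.0".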

Backward Compatibility Challenge

There's currently no built-in versioning system in the Agent Skills specification. Organizations must implement their own versioning strategies. Best practice: Use semantic versioning in metadata and keep at least the previous major version available for backward compatibility.

6.3 Quality Assurance

Ensuring skill quality requires systematic evaluation:

Key Quality Metrics

Metric	Target	Measurement
Activation Accuracy	> 90%	Skill activated when appropriate
False Positive Rate	< 5%	Skill activated inappropriately
Token Efficiency	< 5000 tokens	Full SKILL.md token count
Instruction Clarity	> 95%	Agent follows instructions correctly
Outcome Quality	> 85%	Task completed successfully
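The first two metrics can be computed from a labeled set of test queries. The sketch below assumes one particular reading of the table: activation accuracy is the share of relevant queries that triggered the skill, and false positive rate is the share of irrelevant queries that triggered it. The function name and data shape are hypothetical.

```python
def activation_metrics(cases: list) -> dict:
    """Compute activation accuracy and false positive rate.

    Each case is a (should_activate, did_activate) pair of booleans.
    """
    relevant = [did for should, did in cases if should]
    irrelevant = [did for should, did in cases if not should]
    accuracy = sum(relevant) / len(relevant) if relevant else 0.0
    false_positive_rate = sum(irrelevant) / len(irrelevant) if irrelevant else 0.0
    return {"activation_accuracy": accuracy, "false_positive_rate": false_positive_rate}


cases = [
    (True, True), (True, True), (True, True), (True, False),        # relevant queries
    (False, False), (False, False), (False, False), (False, True),  # irrelevant queries
]
metrics = activation_metrics(cases)
print(metrics)
```

In this toy data set the skill would miss both targets in the table: 75% activation accuracy and a 25% false positive rate.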

6.4 Skill Deprecation

When skills become outdated or are superseded, follow a structured deprecation process:

---
name: old-api-design
description: DEPRECATED: Use api-design-v2 instead. Legacy REST API guidelines.
metadata:
  deprecated: true
  deprecation_date: 2026-01-01
  replacement: api-design-v2
  removal_date: 2026-07-01
---

# DEPRECATED: Old API Design

This skill is deprecated and will be removed on 2026-07-01.

**Use instead**: api-design-v2

This skill is maintained for backward compatibility only.
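An agent or registry could act on this metadata with a check like the following sketch. The keys come from the example above, which is this document's convention rather than a specification requirement; the function name and the three status labels are hypothetical.

```python
from datetime import date


def deprecation_status(metadata: dict, today: date) -> str:
    """Classify a skill as 'active', 'deprecated', or 'removed'."""
    if not metadata.get("deprecated"):
        return "active"
    removal = metadata.get("removal_date")
    if removal and today >= date.fromisoformat(removal):
        return "removed"
    return "deprecated"


metadata = {
    "deprecated": True,
    "deprecation_date": "2026-01-01",
    "replacement": "api-design-v2",
    "removal_date": "2026-07-01",
}
print(deprecation_status(metadata, date(2026, 3, 1)))
print(deprecation_status(metadata, date(2026, 7, 1)))
```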

6.5 Enterprise Lifecycle Management Platforms

Several platforms have emerged to manage the full skill lifecycle:

Tessl: Skills Package Manager

In January 2026, Tessl announced a developer-grade package manager for skills [9: Tessl, January 2026], providing tools to evaluate quality, a registry of evaluated skills, and a platform to manage the full lifecycle: build, evaluate, distribute, and optimize.

GSD Integration

The get-shit-done-cc workflow system at ~/.claude/get-shit-done/ demonstrates advanced skill patterns directly applicable to Agent Skills architecture. Each workflow file functions as a specialized skill with clear purpose, step-by-step instructions, and references to supporting artifacts. The system's progressive disclosure (workflows reference templates via @ syntax) mirrors Agent Skills' lazy loading pattern. Future enhancement opportunity: add YAML frontmatter to workflow files for standardized discovery and metadata-based activation [10: "Skill-it: A Data-Driven Skills Framework for Understanding and Training Language Models," Chen et al., NeurIPS 2024].

Key Challenges Solved

Quality visibility
Knowledge staleness detection
Performance degradation tracking
Version management

Platform Features

Automated skill evaluation
Registry with quality scores
Distribution management
Performance monitoring

7. Skill Repositories and Marketplaces

7.1 Official Repositories

Anthropic Skills Repository

The official repository maintained by Anthropic serves as the reference implementation:

URL: github.com/anthropics/skills
Stars: 20,000+ (as of January 2026)
Content: Reference skills, specification documentation, contribution guidelines
License: Open for community contributions

OpenAI Codex Skills

OpenAI adopted the Agent Skills standard in December 2025:

URL: developers.openai.com/codex/skills/
Integration: Built into Codex CLI and ChatGPT
Focus: Coding and development skills

7.2 Community Marketplaces

SkillsMP - The Agent Skills Marketplace

Launched in December 2025, SkillsMP is the first comprehensive marketplace for agent skills:

71,000+ skills available
GitHub source integration
Smart category filtering
Quality score indicators

Features

Search: Intelligent search across 71,000+ GitHub repositories
Filtering: By category, author, popularity, language
Quality Indicators: Stars, forks, last update, documentation quality
Installation: One-click installation for compatible agents
Collections: Curated skill bundles for common use cases

7.3 Notable Community Collections

Context Engineering Skills

github.com/muratcankoylan/Agent-Skills-for-Context-Engineering

Comprehensive collection for context engineering, multi-agent architectures, and production systems.

Awesome Agent Skills

github.com/skillmatic-ai/awesome-agent-skills

Curated list of high-quality agent skills across domains, with quality ratings and usage examples.

Enterprise Skills Collection

Partners include Atlassian (Jira/Confluence), Figma, Canva, Stripe, and Zapier.

Production-grade skills for enterprise integrations.

7.4 Platform Adoption

The Agent Skills standard has been adopted across major AI development platforms:

Platform	Adoption Date	Integration Level	Skill Directory
Claude Code	Oct 2025	Native	~/.claude/skills/
Cursor	Dec 2025	Native	~/.cursor/skills/
VS Code Copilot	Dec 2025	Native (Microsoft)	.vscode/skills/
GitHub Copilot	Dec 2025	Native (Microsoft)	.github/skills/
OpenCode	Dec 2025	Native	~/.config/opencode/skills/
Goose	Dec 2025	Native	~/.goose/skills/
Amp	Jan 2026	Native	~/.amp/skills/
Letta	Jan 2026	Native	~/.letta/skills/

Industry Impact

The rapid adoption across platforms demonstrates the industry's recognition of Agent Skills as a unifying standard [15: "OpenAI Function Calling Guide," OpenAI, 2024-2025]. This cross-platform compatibility enables organizations to invest in skill development once and deploy everywhere.

8. Best Practices and Design Patterns

8.1 Description Engineering

The skill description is the most critical component - it determines when and how often your skill activates.

Bad Description

description: "API design"

Too vague, won't match user requests effectively.

Good Description

description: "REST API design best practices and conventions. Use when designing, reviewing, or documenting RESTful APIs."

Specific, includes key terms and use cases.

Description Best Practices

Start with what the skill does (noun phrase)
Include specific domain terminology and synonyms
List primary use cases with "Use when..."
Keep under 200 characters when possible (balances clarity with token efficiency)
Test with actual user queries to validate matching

8.2 Common Design Patterns

Template Generation Pattern

Skills that generate standardized outputs from templates:

# Use cases: Report generation, boilerplate code, documentation

1. Gather required information from user
2. Select appropriate template from assets/
3. Populate template with user data
4. Validate output format
5. Present generated content

Iterative Analysis Pattern

Multi-pass processes with increasing depth:

# Use cases: Code review, security audits, quality analysis

1. Broad initial scan - identify areas of interest
2. Medium-depth analysis - examine flagged areas
3. Deep dive - detailed investigation of issues
4. Synthesis - compile findings into report

Sequential Pipeline Pattern

Linear, deterministic workflows where each step depends on the previous:

# Use cases: Data processing, CI/CD, deployment

1. Validation - ensure prerequisites met
2. Processing - execute core operations
3. Verification - confirm success
4. Cleanup - finalize and document

Decision Tree Pattern

Branching logic based on conditions:

# Use cases: Troubleshooting, configuration, routing

1. Assess initial conditions
2. Branch based on criteria:
   - If condition A: follow path 1
   - If condition B: follow path 2
   - Otherwise: follow default path
3. Execute path-specific instructions
4. Converge at output step

8.3 Token Optimization

Keep Instructions Concise

Target Limits

Metadata: ~50 tokens
Full SKILL.md: < 5,000 tokens
Recommended: < 500 lines

Optimization Techniques

Use bullet points over paragraphs
Remove redundant words
Reference external files for details
Use code examples sparingly

External Resources

For extensive documentation, use references/ directory:

my-skill/
├── SKILL.md              # Keep concise (~2000 tokens)
└── references/
    ├── detailed-guide.md # Full documentation
    └── examples.md       # Extensive examples

# In SKILL.md:
For detailed examples, see references/examples.md
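A rough check against the target limits above can be scripted. This sketch assumes about four characters per token, which is only a rule of thumb; a real tokenizer will give different counts. The function names are hypothetical.

```python
def estimate_tokens(text: str) -> int:
    """Rough estimate assuming ~4 characters per token (rule of thumb only)."""
    return (len(text) + 3) // 4


def check_budget(skill_md_text: str, max_tokens: int = 5000, max_lines: int = 500) -> dict:
    """Report estimated size and whether it fits the target limits."""
    tokens = estimate_tokens(skill_md_text)
    lines = len(skill_md_text.splitlines())
    return {
        "tokens": tokens,
        "lines": lines,
        "within_budget": tokens < max_tokens and lines < max_lines,
    }


sample = (
    "---\nname: api-design\n"
    "description: REST API design best practices\n"
    "---\n\n# API Design\n"
)
report = check_budget(sample)
print(report)
```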

8.4 Modularity and Composability

Single Responsibility

Each skill should have one clear purpose [11: "LIMA: Less Is More for Alignment," Zhou et al., NeurIPS 2024]. Research on curated training data suggests that focused, carefully curated content can outperform verbose instructions:

Good: api-design, api-security, api-documentation (three skills)
Bad: api-everything (one mega-skill)

Skill Composition

Skills can reference other skills for complex workflows [12: "AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation," Wu et al., 2024]. Multi-agent frameworks demonstrate that skill distribution across agents enables emergent problem-solving capabilities:

---
name: full-api-review
description: Complete API review covering design, security, and documentation
---

# Full API Review

This skill orchestrates multiple specialized skills for comprehensive review:

1. Activate api-design skill
   - Review URL structure, HTTP methods, response codes

2. Activate api-security skill
   - Review authentication, authorization, data validation

3. Activate api-documentation skill
   - Review OpenAPI spec, examples, error documentation

4. Compile findings into unified report

8.5 Security Considerations

Trust and Verification

It's strongly recommended to use Skills only from trusted sources. Skills provide Claude with new capabilities through instructions and code, which makes them powerful but also means a malicious Skill can direct Claude to invoke tools or execute code in ways that don't match the Skill's stated purpose.

Security Best Practices

Source verification: Only use skills from trusted repositories
Code review: Review all scripts/ before deployment
Least privilege: Use allowed-tools to restrict tool access
Audit logging: Track skill activation and tool usage
Regular updates: Keep skills current with security patches
Sandboxing: Run skill scripts in isolated environments when possible

Using allowed-tools

---
name: safe-file-processor
description: Process files safely
allowed-tools: Bash(python scripts/process.py) Read Write
---

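# When the agent enforces allowed-tools, this skill can only:
# - Run the one listed Python script
# - Read files
# - Write files
# It cannot call other tools or run arbitrary shell commands.
# Note: the restriction covers the agent's tool calls, not what
# process.py itself does once it runs, so the script still needs review.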
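
The allowed-tools value is a space-delimited list, but entries such as `Bash(python scripts/process.py)` contain spaces inside parentheses, so a plain `split()` would break them apart. The sketch below shows one way to parse the field. It is an assumption about how a host might read the value, not the parser any agent actually uses, and `is_allowed` does exact matching only.

```python
def parse_allowed_tools(value: str) -> list[str]:
    """Split on whitespace, keeping text inside parentheses together."""
    tools, current, depth = [], [], 0
    for ch in value:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)
        if ch.isspace() and depth == 0:
            # A top-level space ends the current entry
            if current:
                tools.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tools.append("".join(current))
    return tools


def is_allowed(tool_call: str, allowed: list[str]) -> bool:
    """Exact-match check; real agents may also support wildcard patterns."""
    return tool_call in allowed


allowed = parse_allowed_tools("Bash(python scripts/process.py) Read Write")
print(allowed)
```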

8.6 Testing and Validation

Test Scenarios

Before deploying a skill, test these scenarios:^{9: Tessl, "Announcing skills on Tessl: the package manager for agent skills," January 2026}

Activation Test: Does the skill activate for relevant queries?
False Positive Test: Does it stay inactive for irrelevant queries?
Instruction Following: Does the agent follow instructions correctly?
Error Handling: How does it handle edge cases and errors?
Tool Integration: Do referenced tools work as expected?
Cross-Platform: Does it work on all target agent platforms?
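
The activation and false positive tests can be scored with two labelled query lists. In the sketch below, `activation_report` is a hypothetical helper and `keyword_stub` is a stand-in for a real agent run: in practice the `activates` callable would send the query to the target platform and report whether the skill loaded. Both lists must be non-empty.

```python
from typing import Callable


def activation_report(
    activates: Callable[[str], bool],
    relevant: list[str],
    irrelevant: list[str],
) -> dict[str, float]:
    """Score how often the skill fires on relevant and on irrelevant queries."""
    hits = sum(activates(query) for query in relevant)
    false_hits = sum(activates(query) for query in irrelevant)
    return {
        "activation_rate": hits / len(relevant),
        "false_positive_rate": false_hits / len(irrelevant),
    }


def keyword_stub(query: str) -> bool:
    """Stand-in for a real agent run: 'activates' when the query mentions an API."""
    return "api" in query.lower()


report = activation_report(
    keyword_stub,
    relevant=["Review this REST API design", "Is my API versioning sane?"],
    irrelevant=["Write a haiku about autumn", "Summarize this meeting"],
)
print(report)
```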

Validation Checklist

□ Name is unique and follows naming convention (lowercase, hyphens)
□ Description is clear, specific, and includes use cases
□ Instructions are concise and actionable
□ Code examples are syntactically correct
□ Referenced files exist in correct directories
□ Token count is under 5,000 for full SKILL.md
□ Tested on target agent platforms
□ Security review completed for scripts/
□ Documentation includes examples
□ Metadata includes version and author
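
The first two checklist items lend themselves to automation. The sketch below validates the frontmatter of a SKILL.md under stated assumptions: `parse_frontmatter` handles only flat `key: value` lines (a real validator would use a YAML parser), and `NAME_PATTERN` is one reading of "lowercase, hyphens" that also permits digits. All names here are hypothetical.

```python
import re

# Lowercase letters and digits, separated by single hyphens
NAME_PATTERN = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def parse_frontmatter(skill_md: str) -> dict[str, str]:
    """Parse flat 'key: value' pairs between the leading '---' lines."""
    lines = skill_md.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return {}  # closing '---' never found


def validate(skill_md: str) -> list[str]:
    """Return a list of problems with the name and description fields."""
    fields = parse_frontmatter(skill_md)
    problems = []
    if not NAME_PATTERN.match(fields.get("name", "")):
        problems.append("name must be lowercase letters, digits, and single hyphens")
    if not fields.get("description"):
        problems.append("description is missing or empty")
    return problems


print(validate("---\nname: full-api-review\ndescription: Complete API review\n---\n# Body"))
```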

9. Advanced Topics

9.1 Multi-Language Skills

Skills can support multiple languages for global teams:

multilingual-skill/
├── SKILL.md              # English (default)
├── SKILL.es.md          # Spanish
├── SKILL.fr.md          # French
└── SKILL.ja.md          # Japanese

# Convention, not automatic: SKILL.md should tell the agent to read the
# variant that matches the user's language preference

9.2 Dynamic Skills

Some advanced systems generate skills dynamically based on runtime context.^{13: Qian et al., "CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models," 2023} The CREATOR work shows that LLMs can create their own tools; generating skills as reusable knowledge artifacts extends the same idea:

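# Generate a project-specific skill from codebase analysis (Python sketch).
# extract_patterns is a placeholder; a real system would analyze the code in depth.
from pathlib import Path

def extract_patterns(codebase: Path) -> str:
    suffixes = sorted({p.suffix for p in codebase.rglob("*") if p.is_file() and p.suffix})
    return "File types in use: " + ", ".join(suffixes)

def generate_skill(codebase: Path, skills_dir: Path) -> Path:
    conventions = extract_patterns(codebase)
    skill_dir = skills_dir / "project-conventions"
    skill_dir.mkdir(parents=True, exist_ok=True)
    skill_md = skill_dir / "SKILL.md"
    skill_md.write_text(
        "---\nname: project-conventions\n"
        "description: Project-specific coding conventions\n---\n\n" + conventions + "\n"
    )
    return skill_md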

9.3 Skill Analytics

Track skill performance metrics for optimization:

Metric	Purpose	Action Threshold
Activation Rate	How often skill is used	< 1/month → Consider deprecation
Success Rate	Task completion percentage	< 80% → Refine instructions
False Positive Rate	Inappropriate activations	> 10% → Improve description
Average Token Usage	Context efficiency	> 5000 → Optimize content
User Satisfaction	Quality perception	< 4/5 → Review and update
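
The thresholds in the table translate directly into a rule check. The sketch below is illustrative: the metric keys are hypothetical names, rates are expressed as fractions between 0 and 1, and satisfaction is on the 5-point scale used in the table.

```python
def recommend_actions(metrics: dict[str, float]) -> list[str]:
    """Apply the action thresholds from the table to one skill's metrics."""
    actions = []
    if metrics["activations_per_month"] < 1:
        actions.append("Consider deprecation")
    if metrics["success_rate"] < 0.80:
        actions.append("Refine instructions")
    if metrics["false_positive_rate"] > 0.10:
        actions.append("Improve description")
    if metrics["avg_tokens"] > 5000:
        actions.append("Optimize content")
    if metrics["satisfaction"] < 4.0:
        actions.append("Review and update")
    return actions


print(recommend_actions({
    "activations_per_month": 12,
    "success_rate": 0.72,
    "false_positive_rate": 0.04,
    "avg_tokens": 6100,
    "satisfaction": 4.3,
}))
```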

9.4 Enterprise Skill Governance

Governance Framework

Organizations deploying skills at scale need governance frameworks to ensure quality, security, and compliance.

Governance Components

Skill Registry

Centralized catalog
Approval workflows
Access controls
Compliance tracking

Quality Gates

Automated testing
Security scanning
Performance benchmarks
Peer review

Lifecycle Management

Version control
Deprecation policies
Update notifications
Rollback procedures

Monitoring & Audit

Usage analytics
Performance metrics
Compliance logs
Security audits

9.5 Future Developments

Q1 2026

Enhanced metadata standards for better discovery, automated quality scoring, cross-platform skill marketplaces.

Q2 2026

Skill composition frameworks, versioning APIs, automated testing tools, performance benchmarking suites.

Q3 2026

AI-assisted skill generation, dynamic skill optimization, advanced analytics dashboards, enterprise governance platforms.

Q4 2026

Industry-specific skill libraries, federated skill registries, standardized evaluation frameworks, certification programs.

10. Conclusion and Key Takeaways

10.1 Summary

Agent Skills represent a fundamental advancement in how we extend AI agent capabilities.^{14: LangChain, "Tools Documentation," 2024-2025} By providing a standardized, efficient, and portable format for packaging procedural knowledge, they enable:

Context efficiency through progressive disclosure (about 50 tokens of metadata at startup, up to 5,000 tokens of instructions on demand)^{1: Anthropic, "Equipping agents for the real world with Agent Skills," October 2025}
Knowledge separation between "know-how" (skills) and "can-do" (tools)^{4: Qin et al., "Tool Learning with Foundation Models," 2023}
Cross-platform portability via open standard adoption
Organizational knowledge capture in reusable, version-controlled formats
Ecosystem growth through marketplaces and community contributions

10.2 Critical Success Factors

1. Description Quality

Invest time in crafting precise descriptions that maximize semantic matching accuracy.

2. Single Responsibility

Keep skills focused on one clear purpose for better composability and maintainability.

3. Token Efficiency

Optimize content to stay under 5,000 tokens while maintaining clarity.

4. Lifecycle Management

Implement versioning, monitoring, and governance for production deployments.

10.3 Getting Started

Quick Start Checklist

Identify a knowledge gap or repeated workflow in your organization
Create a skill directory with SKILL.md using the basic template
Write a clear, specific description with use cases
Document step-by-step instructions in the markdown body
Test activation with relevant queries on your target platform
Refine based on activation accuracy and outcome quality
Deploy to your team and monitor usage metrics
Iterate based on feedback and performance data

10.4 Resources

Official Specification: agentskills.io/specification
Reference Repository: github.com/anthropics/skills
Marketplace: skillsmp.com
Documentation: platform.claude.com/docs/en/agents-and-tools/agent-skills
Community Collections: github.com/skillmatic-ai/awesome-agent-skills

The Future of Agent Skills

With 71,000+ community-created skills, adoption by major platforms (Microsoft, OpenAI, Anthropic, Cursor), and emerging lifecycle management platforms, Agent Skills are positioned to become the standard way organizations package and share AI agent knowledge. The separation of "know-how" from "can-do" enables building agents that are not just powerful, but reliable, compliant, and efficient - essential characteristics for enterprise deployment.

Implementation Examples

The following Claude Code and Claude Agent SDK patterns put the skill concepts from this guide into practice. The examples cover custom agent definitions, skill composition, and tool restriction, and draw on in-context learning research.^{3: Xie et al., "An Explanation of In-Context Learning as Implicit Bayesian Inference," ICLR 2022}

Custom Agent Definition

Define specialized agents with focused prompts and tool sets. This implements the skill separation pattern described in Section 2, where "know-how" (the prompt) is separated from "can-do" (the tools).^{10: Chen et al., "Skill-it: A Data-Driven Skills Framework," NeurIPS 2023}

Python

import asyncio

from claude_agent_sdk import query, ClaudeAgentOptions, AgentDefinition

# Define a specialized code review agent
code_reviewer = AgentDefinition(
    description="Expert code reviewer for security and quality analysis.",
    prompt="""You are a senior security engineer conducting code review.
Focus on:
- Security vulnerabilities (injection, XSS, auth issues)
- Performance bottlenecks
- Code quality and maintainability
Provide actionable, specific recommendations.""",
    tools=["Read", "Glob", "Grep"]  # Read-only access
)

async def main():
    # "async for" must run inside a coroutine, so wrap the query in main()
    async for message in query(
        prompt="Use the code-reviewer agent to analyze src/auth/",
        options=ClaudeAgentOptions(
            # "Task" lets the main agent delegate to the subagent
            allowed_tools=["Read", "Glob", "Grep", "Task"],
            agents={"code-reviewer": code_reviewer}
        )
    ):
        # Print only messages that carry a final result
        if hasattr(message, "result"):
            print(message.result)

asyncio.run(main())

TypeScript

import { query, AgentDefinition } from "@anthropic-ai/claude-agent-sdk";

// Define a specialized code review agent
const codeReviewer: AgentDefinition = {
  description: "Expert code reviewer for security and quality analysis.",
  prompt: `You are a senior security engineer conducting code review.
Focus on: security vulnerabilities, performance, code quality.
Provide actionable, specific recommendations.`,
  tools: ["Read", "Glob", "Grep"]
};

for await (const message of query({
  prompt: "Use the code-reviewer agent to analyze src/auth/",
  options: {
    allowedTools: ["Read", "Glob", "Grep", "Task"],
    agents: { "code-reviewer": codeReviewer }
  }
})) {
  if ("result" in message) console.log(message.result);
}

Skill Composition: Multiple Specialized Agents

Compose multiple specialized agents for complex workflows, following the multi-agent coordination pattern.^{12: Wu et al., "AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation," 2023}

Python

import asyncio

from claude_agent_sdk import query, ClaudeAgentOptions, AgentDefinition

# Specialist 1: documentation expert
doc_writer = AgentDefinition(
    description="Technical documentation specialist.",
    prompt="Write clear, comprehensive documentation for code and APIs.",
    tools=["Read", "Write", "Glob"]
)

# Specialist 2: test writer
test_writer = AgentDefinition(
    description="Unit test and integration test specialist.",
    prompt="Write comprehensive tests with edge cases and mocking.",
    tools=["Read", "Write", "Bash"]
)

async def main():
    # Compose: the orchestrating agent delegates to the specialists
    async for message in query(
        prompt="""For the auth module:
1. Use doc-writer to document the public API
2. Use test-writer to add missing unit tests""",
        options=ClaudeAgentOptions(
            allowed_tools=["Read", "Write", "Glob", "Bash", "Task"],
            agents={
                "doc-writer": doc_writer,
                "test-writer": test_writer
            }
        )
    ):
        # Print the final result instead of discarding every message
        if hasattr(message, "result"):
            print(message.result)

asyncio.run(main())

Tool Restriction Patterns

Limit tools to create focused, safe agents. This pattern follows the principle of least privilege discussed in security-conscious skill design.

Python

import asyncio

from claude_agent_sdk import query, ClaudeAgentOptions

async def analyze_schema():
    # Read-only analysis agent (safe for production code)
    async for message in query(
        prompt="Analyze the database schema and identify optimization opportunities",
        options=ClaudeAgentOptions(
            allowed_tools=["Read", "Glob", "Grep"],  # No Write/Edit/Bash
            permission_mode="default"  # Still prompts for approval
        )
    ):
        if hasattr(message, "result"):
            print(message.result)

async def refactor_utils():
    # Edit-enabled agent; file edits are approved automatically
    async for message in query(
        prompt="Refactor utils.py to improve error handling",
        options=ClaudeAgentOptions(
            allowed_tools=["Read", "Edit", "Glob"],  # Edit but no Write (safer)
            permission_mode="acceptEdits"  # Auto-approve edits
        )
    ):
        if hasattr(message, "result"):
            print(message.result)

async def main():
    await analyze_schema()
    await refactor_utils()

asyncio.run(main())

GSD Skill Integration

The GSD workflow system implements skills through structured markdown files, following the composable workflow patterns from academic research.^{8: Shen et al., "HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face," NeurIPS 2023}

Bash

# GSD skills are defined in .claude/get-shit-done/
# Each workflow is a skill with specific triggers and outputs

# Trigger the planning skill
claude "/gsd:initialize"

# Execute a specific plan (uses plan execution skill)
claude "/gsd:execute-phase .planning/phases/01-setup/01-01-PLAN.md"

# Skills can be composed: planning -> execution -> summary

GSD Agent Inventory

GSD implements skill composition patterns from academic research through its agent definition system. Each agent specifies identity via name/description metadata (~50 tokens loaded at discovery), capability boundaries via tools list, and procedural knowledge via Markdown body (loaded when activated).

GSD Agent	Purpose	Key Pattern	Research Mapping
`gsd-executor`	Execute plans with atomic commits	Deviation rules, checkpoints	ReAct bounded autonomy
`gsd-verifier`	Goal-backward verification	Must-have checking	Outcome-focused planning¹²
`gsd-planner`	Create executable plans	Task breakdown, dependency analysis	Tool learning decomposition⁴
`gsd-phase-researcher`	Domain research	Context7 first, verify before asserting	Retrieval-augmented generation

Enhancement Ideas

Research-informed skill discovery: Auto-suggest agents based on task analysis (match "needs database migration" to gsd-executor with schema tools)
Dynamic tool composition: Agents request additional tools at runtime based on task requirements (Toolformer pattern⁵)
Skill versioning with rollback: Track agent definition changes, enable rollback to previous behavior on regression

References

Research current as of: January 2026

Academic Papers

Xie, S. M., Raghunathan, A., Liang, P., & Ma, T. (2022). "An Explanation of In-Context Learning as Implicit Bayesian Inference." ICLR 2022.
Qin, Y., Hu, S., Lin, Y., et al. (2023). "Tool Learning with Foundation Models." arXiv preprint.
Schick, T., Dwivedi-Yu, J., Dessì, R., et al. (2023). "Toolformer: Language Models Can Teach Themselves to Use Tools." NeurIPS 2023.
Wei, J., Hou, L., Lampinen, A., et al. (2023). "Symbol Tuning Improves In-Context Learning in Language Models." EMNLP 2023.
Dziri, N., Lu, X., Sclar, M., et al. (2023). "Faith and Fate: Limits of Transformers on Compositionality." NeurIPS 2023.
Shen, Y., Song, K., Tan, X., et al. (2023). "HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face." NeurIPS 2023.
Chen, M. F., Roberts, N., Bhatia, K., et al. (2023). "Skill-it: A Data-Driven Skills Framework for Understanding and Training Language Models." NeurIPS 2023.
Zhou, C., Liu, P., Xu, P., et al. (2023). "LIMA: Less Is More for Alignment." NeurIPS 2023.
Wu, Q., Bansal, G., Zhang, J., et al. (2023). "AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation." arXiv preprint.
Qian, C., Han, C., Fung, Y. R., et al. (2023). "CREATOR: Tool Creation for Disentangling Abstract and Concrete Reasoning of Large Language Models." Findings of EMNLP 2023.
Hong, S., Zhuge, M., Chen, J., et al. (2024). "MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework." ICLR 2024.

Industry Sources

Anthropic. "Equipping agents for the real world with Agent Skills." Engineering Blog, October 2025.
Arcade AI. "Skills vs Tools for AI Agents: Production Guide." January 2026.
Tessl. "Announcing skills on Tessl: the package manager for agent skills." January 2026.
LangChain. "Tools Documentation." Framework Documentation, 2024-2025.
OpenAI. "Function Calling Guide." API Documentation, 2024-2025.

Additional Sources

This comprehensive guide was also compiled from the following sources: