A complete technical guide for systems engineers and the deeply curious. What a chipset actually is, how they work across every major platform from the 1985 Amiga to the 2025 iPhone, and how Skill Creator uses the same architecture to manage Claude's context window like a tiny pool of shared system RAM in a high-performance multi-agent timesharing environment.
Before we can talk about Skill Creator's chipset, you need to understand what a chipset is in the physical world, because the word gets thrown around loosely and most people have never thought about what it actually means.
A chipset is a set of integrated circuits designed to work together as a system. The CPU is not part of the chipset — the chipset is everything around the CPU. It's the support infrastructure: memory controllers, I/O controllers, bus arbiters, interrupt handlers, DMA engines, display controllers, and glue logic. The chipset is what turns a CPU from a calculator sitting on a desk into a computer that can talk to RAM, draw pixels, read files, and play sound.
Think of it this way: the CPU is the brain. The chipset is the nervous system, the circulatory system, the hands and eyes. A brain by itself just sits in a jar. The chipset makes it useful.
This matters for Skill Creator because Claude is the CPU. Claude can think, reason, write code, and make decisions — but Claude doesn't have a nervous system. Claude doesn't have memory management, I/O scheduling, inter-process communication, or resource budgeting. It has a context window and a prompt. The chipset is what Skill Creator builds around Claude to give it those capabilities.
Let's walk through every major chipset family, from 1985 to today, so you can see the pattern that repeats across all of them. We'll move from the oldest and simplest to the newest and most complex.
The Amiga shipped with a Motorola 68000 CPU (7.16 MHz, roughly as powerful as a good pocket calculator by modern standards) and four custom coprocessor chips:
| Chip | Job | What It Actually Did |
|---|---|---|
| Agnus | Memory & DMA | Arbitrated access to chip RAM between the CPU, blitter, copper, and display hardware. Decided who gets the bus on each clock cycle. |
| Denise | Display | Converted playfield and sprite data from memory into video output. Handled color palettes, HAM mode, sprite multiplexing. |
| Paula | Audio & I/O | 4-channel 8-bit PCM audio via DMA, plus serial, parallel, and floppy disk controllers. |
| Gary | Glue logic | Address decoding and bus control — decided which chip or memory bank a given address referred to. |
The key insight: the CPU was deliberately kept out of the loop for bulk work. Agnus's blitter could copy and combine bitmaps without the CPU. The Copper executed display lists without the CPU. Paula played audio without the CPU. The CPU's job was to coordinate — to set up operations, then let the coprocessors run. This is why a 7 MHz Amiga could do things that a 33 MHz 386 PC couldn't.
The Intel/AMD PC world used a two-chip architecture for about 15 years:
| Chip | Job | Connected To |
|---|---|---|
| Northbridge | High-speed bus arbitration | CPU ↔ RAM ↔ GPU (the fast stuff) |
| Southbridge | I/O control | USB, SATA, audio, PCI, Ethernet (the slower stuff) |
The northbridge sat between the CPU and everything that needed to be fast. The southbridge hung off the northbridge and handled everything else. This is the same pattern as the Amiga — the CPU talks to a bus controller (Gary/Northbridge), which routes to specialized processors (Agnus-Denise-Paula / memory-controller, GPU, I/O-controller). Over time, Intel absorbed the northbridge into the CPU itself (the "uncore"), leaving just a Platform Controller Hub (PCH) — essentially a modern southbridge.
Today's chipset is a single chip — the PCH (Intel) or FCH (AMD) — because the CPU has absorbed the memory controller and PCIe root complex. The external chipset now handles: USB ports, SATA/NVMe routing, audio codecs, Ethernet MAC, GPIO, SPI flash (BIOS), low-speed PCIe lanes, and the management engine.
The CPU is no longer just a CPU — it's a System-on-Chip (SoC) that contains what used to be the northbridge. The external chipset is everything else.
Apple took the integration further. The M-series chips are true SoCs with no external chipset at all. Everything is on one die: CPU cores, GPU cores, Neural Engine, memory controller, I/O controllers (Thunderbolt, USB, PCIe), media encode/decode engines, the Secure Enclave, and unified memory. The "chipset" is internal to the chip.
But the architectural pattern is identical: specialized processing blocks connected by an internal bus fabric, with a shared memory pool arbitrated by a controller. The CPU cores coordinate. The coprocessors (GPU, Neural Engine, media engine) do bulk work.
Same architecture as the Mac, just smaller. The A-series SoC contains: high-performance CPU cores, efficiency CPU cores, GPU, Neural Engine, image signal processor (ISP), secure enclave, memory controller, and I/O. The chipset is the SoC — there's nothing external. The iPhone literally has one chip (plus a modem). Everything that the Amiga needed 4 chips for, Apple does in one package.
Same story. A Snapdragon 8 Gen 3 contains: Kryo CPU cores, Adreno GPU, Hexagon DSP/NPU, Spectra ISP, X75 modem, memory controller, and I/O. The "chipset" is the entire SoC. The pattern is always the same: one coordination engine (CPU) and multiple specialized processors sharing a bus and a memory pool.
Sun's SPARC workstations built their chipsets around the SBus and later UPA (Ultra Port Architecture) interconnects. The pattern was the same as the PC northbridge/southbridge but with Sun's own bus protocols. The SPARCstation used an IOMMU (I/O Memory Management Unit) that translated device addresses to physical memory, giving each I/O device its own virtual address space. This prevented a misbehaving disk controller from overwriting kernel memory. Sun also pioneered putting the memory controller on a separate MBus module, allowing the chipset to support different memory configurations without redesigning the whole board.
What matters here for our purposes: Sun's architecture enforced isolation between I/O agents via the IOMMU. Each device had a memory budget it couldn't exceed. Each device had a dedicated address space. The system administrator configured these budgets in /etc/system and the kernel's device tree.
SGI's workstations (Indigo, Octane, Onyx) used custom chipsets optimized for graphics. The Indigo² connected its graphics subsystem over the GIO64 bus, with DMA hardware that could render directly to the framebuffer while the CPU computed the next frame. The Origin 2000 used a distributed shared memory architecture called S2MP, where each node had its own memory and chipset (the Hub ASIC), connected by the CrayLink interconnect. The Hub ASIC arbitrated local memory, remote memory, I/O, and directory-based cache coherence, all from a single chipset chip per node.
SGI's lesson: in a multi-node system, each node needs its own local chipset that manages local resources efficiently, and a communication fabric that connects nodes without bottlenecking. This is exactly how Skill Creator's multi-agent architecture works — each agent team has its own chipset configuration, and the bus connects them.
Every chipset across 40 years of computing, from a $500 Amiga to a $250,000 SGI Origin, shares the same five components:
| Component | What It Does | Amiga | Modern SoC | Skill Creator |
|---|---|---|---|---|
| Memory Controller | Arbitrates who gets to read/write shared memory, when, and how much | Agnus | On-die LPDDR controller | Budget system (Agnus) |
| Output Processor | Generates visible output (display, audio, documents) from memory contents | Denise | GPU / Display engine | Skill output generation (Denise) |
| I/O Controller | Handles all external interactions — files, network, peripherals, sensors | Paula | USB/PCIe/SATA controllers | File triggers, offload engine (Paula) |
| Bus Controller | Routes messages between all other components, resolves addresses | Gary | NoC fabric / PCH | Message routing, agent registry (Gary) |
| Scheduler | Decides which processor runs when, manages priorities and preemption | Exec kernel | OS kernel | Exec kernel (round-robin) |
The implementation changes. The pattern never does. Skill Creator takes this pattern and implements it in TypeScript for coordinating AI agents instead of silicon processors. The "memory" is Claude's context window. The "processors" are specialized agents. The "bus" is a filesystem-based message system. The "scheduler" is a prioritized round-robin with token budgets.
Claude's context window is to Skill Creator what system RAM is to a chipset. It's a finite, shared resource that every agent needs, and if you don't manage it carefully, one greedy process eats it all and the whole system thrashes or crashes. The chipset's job — in silicon and in software — is to prevent that.
Skill Creator explicitly models its architecture on the Amiga. Not as a metaphor — as the actual coordination pattern. Understanding the original helps you understand every design decision in the code. Let's go deeper on how the Amiga actually worked, because the details matter.
The Amiga had two kinds of memory: Chip RAM (accessible by all custom chips) and Fast RAM (CPU-only). Chip RAM was the bottleneck. Every clock cycle, Agnus decided which chip got bus access. If Denise needed to read a bitplane for the display, Agnus gave it a slot. If the blitter needed to copy a bitmap, Agnus gave it a slot. If the CPU needed to read an instruction, Agnus gave it a slot — but only if no coprocessor needed the bus that cycle.
This is the same constraint Skill Creator faces: the context window is "Chip RAM." Every agent needs it. Every skill loaded into context competes for it. The budget system (Agnus) arbitrates access.
The Copper was a coprocessor inside Agnus that executed "copper lists" — tiny programs with three instructions: WAIT (block until scanline N), MOVE (write a value to a hardware register), and SKIP (skip next instruction if condition met). Copper lists were pre-compiled during the vertical blank interval and executed automatically during display — synchronized to the video beam position.
Skill Creator's Pipeline Lists are the exact same concept: WAIT (block until a GSD lifecycle event), MOVE (activate a skill or script), SKIP (conditional bypass). Pipeline Lists are pre-compiled during planning and execute automatically during phase transitions — synchronized to the build lifecycle.
The blitter was a DMA engine that could copy, combine, and transform rectangular regions of memory without CPU involvement. Game developers set up a blit operation (source, destination, masks, logic operation), told the blitter to go, and the CPU moved on to game logic while the blitter churned through pixels.
Skill Creator's Offload Engine is the same concept. Deterministic operations — running tests, generating boilerplate, formatting — are "promoted" from skill metadata to standalone scripts and executed as child processes. The context window (CPU) is freed for reasoning work while the offloaded operations run in parallel.
AmigaOS had a microkernel called Exec that weighed about 8KB and provided: preemptive multitasking with 128 priority levels, message passing via ports (FIFO queues with reply-based ownership), signal-based wake/sleep (32-bit signal masks), and memory allocation with pool-based management. It ran all of this on a 7 MHz CPU with 256KB–1MB of RAM.
This is the environment Skill Creator emulates. Claude's context window is a few hundred thousand tokens — roughly equivalent to a few hundred KB of usable "RAM" once you account for system prompt, conversation history, and overhead. Managing this tiny shared resource efficiently is the entire challenge, just as it was on the Amiga.
Now let's map the Amiga's hardware architecture directly to Skill Creator's source code. Every chip has a corresponding set of TypeScript modules:
┌──────────────────────────────────────────────────────────────────────┐
│ SKILL CREATOR CHIPSET │
│ │
│ ┌───────────────────┐ ┌───────────────────────────────────────┐ │
│ │ AGNUS │ │ GARY │ │
│ │ Context / Budget │ │ Routing / Registry / Bus │ │
│ │ │ │ │ │
│ │ budget-profiles.ts│◄──┤ agent-registry.ts (14 agents, 5 teams)│ │
│ │ budget-stage.ts │ │ message-envelope.ts (EventEnvelope) │ │
│ │ skill-session.ts │ │ routing-table (11 event types) │ │
│ │ token-counter.ts │ │ chipset.yaml (per-pack config) │ │
│ └────────┬───────────┘ └──────────┬────────────────────────────┘ │
│ │ │ │
│ ▼ ▼ │
│ ┌───────────────────┐ ┌───────────────────────────────────────┐ │
│ │ DENISE │ │ PAULA │ │
│ │ Output Generation │ │ I/O, Triggers, Offload │ │
│ │ │ │ │ │
│ │ skill-pipeline.ts │ │ Bus loops (filesystem) │ │
│ │ skill-applicator │ │ File-change triggers (*.pid, *.calc) │ │
│ │ activation-scorer │ │ Offload engine (child processes) │ │
│ │ llm-analyzer.ts │ │ MCP server integration │ │
│ └────────────────────┘ └───────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────────┐ │
│ │ EXEC KERNEL │ │
│ │ Prioritized round-robin · 4 priority tiers · burst mode │ │
│ │ Pipeline List execution · Phase transition coordination │ │
│ └──────────────────────────────────────────────────────────────┘ │
└──────────────────────────────────────────────────────────────────────┘
This is the section where the hardware parallels become engineering constraints. If you've worked with embedded systems, resource-constrained environments, or early Unix timesharing, this will feel familiar.
Claude's context window is 200,000 tokens. That sounds like a lot, but by the time you subtract the system prompt, conversation history, tool definitions, and user files, the usable space for skills is a small fraction. The default budget in src/types/application.ts tells the story:
export const DEFAULT_CONFIG: ApplicationConfig = {
contextWindowSize: 200_000, // Total context window
budgetPercent: 0.03, // Skills get 3% = 6,000 tokens
relevanceThreshold: 0.1, // Minimum relevance score to load
maxSkillsPerSession: 5, // Hard cap on simultaneous skills
};
6,000 tokens for skills. That's your working memory. It's roughly equivalent to what the Amiga had in Chip RAM — enough to do extraordinary things if you manage it well, and enough to crash immediately if you don't.
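The arithmetic is worth making concrete. A minimal sketch using the values above (the 0.10 hard ceiling is an assumed burst limit, in line with the BudgetProfile convention, not part of DEFAULT_CONFIG):

```typescript
// Derive the working budgets from the default configuration.
const contextWindowSize = 200_000;
const budgetPercent = 0.03;       // from DEFAULT_CONFIG
const hardCeilingPercent = 0.10;  // assumed burst ceiling for illustration

const standardBudget = Math.floor(contextWindowSize * budgetPercent);   // standard skill budget
const hardCeiling = Math.floor(contextWindowSize * hardCeilingPercent); // absolute max incl. burst

console.log({ standardBudget, hardCeiling });
```

Three percent of 200K is 6,000 tokens of standard budget, with the hard ceiling leaving headroom only for critical-tier skills and bursts.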
In the 1970s and 80s, a university mainframe might have 1–4 MB of RAM shared between 30–100 simultaneous users. The operating system used virtual memory, paging, and process scheduling to make each user feel like they had the whole machine. If one user's process started consuming too much memory, the system paged other processes to disk, degrading everyone's performance — a condition called thrashing.
Skill Creator faces the identical problem. If one agent loads too many skills, it consumes context that other agents need. The system "thrashes" — Claude loses track of instructions, starts hallucinating, or drops important context. The budget system exists to prevent this, just like Unix's memory manager exists to prevent thrashing.
Sun's IOMMU gave each I/O device a virtual address space with hard limits. A misbehaving SCSI controller couldn't overwrite kernel memory because the IOMMU would block the invalid access. Skill Creator's BudgetProfile works the same way — each agent gets a budgetPercent and a hardCeilingPercent:
// From budget-profiles.ts
'gsd-executor': {
budgetPercent: 0.06, // Standard allocation: 6% of context
hardCeilingPercent: 0.10, // Absolute max including burst: 10%
tiers: {
critical: [], // Always load, up to hard ceiling
standard: [], // Load within standard budget
optional: [], // First to be shed under pressure
},
thresholds: { warn50: true, warn80: true, warn100: true },
}
Just as the IOMMU prevents a device from exceeding its allocated address range, the hardCeilingPercent prevents an agent from exceeding its allocated token range — even in burst mode.
This is the Agnus chip in code. It's the most critical component because if it fails, everything fails — just like a broken memory controller takes down the whole machine.
The budget enforcement lives in src/application/stages/budget-stage.ts. It's a pipeline stage — one step in a sequential processing chain (exactly like a CPU pipeline: fetch → decode → execute). Here's the algorithm:
1. Calculate budgets. The stage takes the context window size (200K tokens) and multiplies by the agent's budgetPercent for the standard budget and hardCeilingPercent for the absolute ceiling.
2. Partition skills by tier. Every skill in the resolved set is classified as critical, standard, or optional using the agent's tier configuration. This is like the Unix nice value — critical processes run first, optional processes get whatever's left.
3. Load tiers in priority order. Critical skills load first, and they're allowed to use the full hard ceiling. Standard skills load next, capped to the standard budget. Optional skills load last, only if budget remains. The moment a skill doesn't fit, it goes to budgetSkipped with a reason.
4. Fire threshold warnings. As the budget fills, warnings fire at 50%, 80%, and 100% — configurable per-agent. This is your monitoring. If you see 80% warnings in your logs, you need to optimize skill sizes or reclassify tiers.
// The core loop from budget-stage.ts (simplified; accumulators shown explicitly)
let totalUsed = 0;      // running total across all tiers
let standardUsed = 0;   // running total for standard + optional tiers
for (const skill of skills) {
  const tokens = await this.tokenCounter.count(content);
  let fits: boolean;
  if (tier === 'critical') {
    fits = totalUsed + tokens <= hardCeiling;       // critical: use hard ceiling
  } else {
    fits = standardUsed + tokens <= standardBudget; // others: standard budget
  }
  if (fits) {
    keptSkills.push(skill);
    totalUsed += tokens;                            // charge the budget
    if (tier !== 'critical') standardUsed += tokens;
    context.contentCache.set(skill.name, content);  // cache to avoid double reads
  } else {
    context.budgetSkipped.push({ name, tier, reason, estimatedTokens });
  }
}
src/application/skill-session.ts is the live memory map — it tracks what's loaded, how much it costs, and whether it's worth keeping. Think of it as the equivalent of running top or htop on a Unix box:
The session tracks: active skills (with their token cost and load time), total tokens used, budget limit, remaining budget, and flagged skills — skills that cost more tokens than they save. A flagged skill is like a process with a memory leak: it's consuming resources without providing proportional value. The getFlaggedSkills() method compares each skill's contentTokens against its estimatedSavings and returns those that are net-negative.
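The flag test itself is simple. A hypothetical sketch of the comparison getFlaggedSkills() performs (the TrackedSkill shape is an illustrative assumption, not the real type):

```typescript
// A skill is "flagged" when the tokens it occupies exceed the tokens
// it is estimated to save — a net-negative resident, like a leaky process.
interface TrackedSkill {
  name: string;
  contentTokens: number;     // context cost while loaded
  estimatedSavings: number;  // tokens the skill is estimated to save
}

function getFlaggedSkills(skills: TrackedSkill[]): string[] {
  return skills
    .filter((s) => s.contentTokens > s.estimatedSavings) // net-negative only
    .map((s) => s.name);
}

// Example: one leaky skill, one healthy one.
const flagged = getFlaggedSkills([
  { name: 'big-style-guide', contentTokens: 1800, estimatedSavings: 400 },
  { name: 'test-runner', contentTokens: 300, estimatedSavings: 900 },
]);
// flagged → ['big-style-guide']
```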
This is the Gary chip — the address decoder and bus controller. In hardware, the bus is the set of wires that carry data between components. In Skill Creator, the bus is a filesystem-based message system and an in-memory routing table.
Every message on the AMIGA bus conforms to a single envelope schema defined in src/amiga/message-envelope.ts. Nine mandatory fields, validated by Zod at runtime:
const EventEnvelopeSchema = z.object({
id: z.string(), // UUID — unique event ID
timestamp: z.string(), // ISO 8601 UTC
source: z.string(), // Sender: 'ME-2', 'broadcast', 'any'
destination: z.string(), // Receiver: 'MC-1', 'OPS', 'broadcast'
type: z.string(), // Event type: 'TELEMETRY_UPDATE'
priority: PrioritySchema, // 'low' | 'normal' | 'high' | 'urgent'
payload: z.record(z.unknown()),// Arbitrary key-value data
correlation: z.string().nullable(),// Request/response pairing
requires_ack: z.boolean(), // Sender expects acknowledgement?
});
This is analogous to an Ethernet frame or a PCI Express transaction layer packet — a fixed header with addressing and priority, followed by an arbitrary payload. Every component speaks the same protocol.
src/amiga/agent-registry.ts defines 14 agents across 5 teams and a routing table mapping 11 event types to sender/receiver pairs. This is the "phone book" of the system:
// From agent-registry.ts — routing table entries
['TELEMETRY_UPDATE', { sender: 'ME-1', receiver: 'MC-1', requiresAck: false }],
['GATE_SIGNAL', { sender: 'ME-1', receiver: 'MC-1', requiresAck: true }],
['COMMAND_DISPATCH', { sender: 'MC-1', receiver: 'ME-1', requiresAck: true }],
['RESOURCE_LOCK_REQ', { sender: 'any', receiver: 'OPS', requiresAck: true }],
['SKILL_REQUEST', { sender: 'any', receiver: 'ME-1', requiresAck: true }],
The requiresAck field is exactly like a hardware bus protocol's acknowledgment bit. When a gate signal fires (a critical decision point), the sender blocks until the receiver acknowledges — just as a PCI Express posted write doesn't require acknowledgment but a non-posted write does. Telemetry updates are fire-and-forget (no ack). Command dispatches and resource locks require ack.
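A minimal sketch of how a sender might consult the routing table to decide between fire-and-forget and blocking delivery (entries mirror the listing above; the helper itself is illustrative, not the real bus code):

```typescript
// Routing-table lookup: does this event type require a blocking ack?
type Route = { sender: string; receiver: string; requiresAck: boolean };

const routingTable = new Map<string, Route>([
  ['TELEMETRY_UPDATE', { sender: 'ME-1', receiver: 'MC-1', requiresAck: false }],
  ['GATE_SIGNAL', { sender: 'ME-1', receiver: 'MC-1', requiresAck: true }],
  ['COMMAND_DISPATCH', { sender: 'MC-1', receiver: 'ME-1', requiresAck: true }],
]);

function mustBlockForAck(eventType: string): boolean {
  const route = routingTable.get(eventType);
  if (!route) throw new Error(`Unroutable event type: ${eventType}`);
  return route.requiresAck; // posted vs. non-posted write, in bus terms
}
```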
The chipset configurations (like the brainstorm chipset) define named bus loops with priorities, stored as files on disk. The brainstorm chipset has four loops at priorities 0–4:
bus:
type: filesystem
loops:
user: { priority: 0 } # Human ↔ CAPCOM (highest priority)
session: { priority: 1 } # Facilitator ↔ all agents
capture: { priority: 2 } # All agents → Scribe
energy: { priority: 4 } # Energy signals → Facilitator
filename_strategy: monotonic_counter
session_scoped: true
base_path: .brainstorm/sessions/{session_id}/bus/{loop}/
Each loop is a directory. Messages are files named with monotonic counters (like a sequence number). Priority determines processing order — the user loop is always drained first, just as interrupts are serviced by priority in hardware. Session scoping means messages are isolated between sessions — no cross-talk, like separate virtual address spaces.
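The drain logic can be sketched in a few lines. This is an illustrative model, not the real bus implementation; directory layout and filenames follow the conventions described above:

```typescript
import * as fs from 'fs';
import * as path from 'path';

// Loops from the brainstorm chipset; lower number = higher priority.
const loops = [
  { name: 'user', priority: 0 },
  { name: 'session', priority: 1 },
  { name: 'capture', priority: 2 },
  { name: 'energy', priority: 4 },
];

function drainOnce(basePath: string): string[] {
  const processed: string[] = [];
  for (const loop of [...loops].sort((a, b) => a.priority - b.priority)) {
    const dir = path.join(basePath, loop.name);
    if (!fs.existsSync(dir)) continue;
    // Zero-padded monotonic counters sort lexicographically into arrival order.
    for (const file of fs.readdirSync(dir).sort()) {
      processed.push(`${loop.name}/${file}`);
    }
  }
  return processed;
}
```

The user loop is always drained first, then session, capture, and energy, matching interrupt-priority servicing in hardware.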
This is a direct parallel to how Sun's S2MP interconnect on the Origin 2000 worked: separate channels for different traffic types (memory, I/O, coherence) with priority routing, all scoped to individual nodes.
This is Denise in action. The skill pipeline (src/application/skill-pipeline.ts) determines which skills get loaded into the context window and in what order. It's a sequential pipeline with pluggable stages — the same pattern as a CPU instruction pipeline (fetch → decode → execute → writeback).
| Stage | What It Does | Hardware Parallel |
|---|---|---|
| ScoreStage | Reads intent/file/context, matches against skill index, scores each skill by relevance (0–1) | Address decode — which memory bank has the data we need? |
| ResolveStage | Detects conflicts between scored skills (overlapping domains), resolves by priority or merge | Cache coherence — two processors want the same cache line |
| BudgetStage | Enforces token budgets, partitions by tier, sheds optional skills first | DMA arbitration — who gets the bus this cycle? |
| LoadStage | Reads skill content from disk, injects into context, records token tracking | DMA transfer — move data from disk to RAM |
The pipeline supports insertBefore() and insertAfter() for injecting new stages without modifying existing code — this is how the budget stage and future model-aware stages plug in. It's the same extensibility pattern as a Unix pipe: score | resolve | budget | load.
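The pluggable-stage pattern can be sketched as follows. The Stage shape and context type are assumptions for illustration, not the real skill-pipeline.ts API:

```typescript
// A sequential pipeline with injectable stages.
interface Stage {
  name: string;
  run(ctx: Record<string, unknown>): void;
}

class Pipeline {
  private stages: Stage[] = [];

  add(stage: Stage): this {
    this.stages.push(stage);
    return this;
  }

  // Inject a new stage without modifying existing ones — the extensibility hook.
  insertBefore(name: string, stage: Stage): this {
    const i = this.stages.findIndex((s) => s.name === name);
    if (i === -1) throw new Error(`Unknown stage: ${name}`);
    this.stages.splice(i, 0, stage);
    return this;
  }

  run(ctx: Record<string, unknown>): void {
    for (const stage of this.stages) stage.run(ctx); // strictly sequential
  }
}

// score | resolve | load, with the budget stage plugged in before load.
const order: string[] = [];
const stage = (name: string): Stage => ({ name, run: () => void order.push(name) });

const pipeline = new Pipeline()
  .add(stage('score'))
  .add(stage('resolve'))
  .add(stage('load'))
  .insertBefore('load', stage('budget'));

pipeline.run({});
// order is now ['score', 'resolve', 'budget', 'load']
```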
The ActivationScorer in src/activation/activation-scorer.ts is a fast heuristic engine that predicts how reliably a skill description will trigger Claude's auto-activation. It computes five weighted factors:
specificityWeight: 0.35 // Domain-specific terms vs generic words
activationPatternWeight: 0.25 // Explicit "use when..." trigger phrases
lengthWeight: 0.20 // Bell curve: optimal 50-150 chars
imperativeVerbWeight: 0.10 // Starts with action verb: "Generate...", "Run..."
genericPenaltyWeight: 0.10 // Penalize "help", "stuff", "things"
The scorer filters stop words, checks for generic terms (a set of 80+ words like "help", "code", "stuff"), looks for imperative verbs (50+ words like "generate", "create", "validate"), and matches activation patterns (regex like /\buse\s+when\b/i). Output is a score from 0–100 with a label: Reliable (90+), Likely (70+), Uncertain (50+), or Unlikely (<50).
Think of this as a branch predictor in a CPU — it makes fast predictions about which code paths will be taken, so the system can speculatively load the right skills before they're needed.
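A toy version of the weighted scoring makes the mechanics concrete. The weights come from the listing above; the factor extraction here is deliberately crude, and the word lists are tiny stand-ins for the real 80+/50+ sets:

```typescript
const WEIGHTS = {
  specificity: 0.35,
  activationPattern: 0.25,
  length: 0.2,
  imperativeVerb: 0.1,
  genericPenalty: 0.1,
};

const GENERIC = new Set(['help', 'stuff', 'things', 'code']);   // tiny stand-in set
const IMPERATIVE = new Set(['generate', 'create', 'validate', 'run']);

function scoreDescription(desc: string): number {
  const words = desc.toLowerCase().split(/\W+/).filter(Boolean);
  const n = Math.max(words.length, 1);
  const specificity = words.filter((w) => w.length > 7).length / n; // crude proxy for domain terms
  const activationPattern = /\buse\s+when\b/i.test(desc) ? 1 : 0;   // explicit trigger phrase
  const length = desc.length >= 50 && desc.length <= 150 ? 1 : 0.5; // flattened bell curve
  const imperativeVerb = IMPERATIVE.has(words[0]) ? 1 : 0;          // starts with action verb?
  const genericShare = words.filter((w) => GENERIC.has(w)).length / n;

  const raw =
    WEIGHTS.specificity * specificity +
    WEIGHTS.activationPattern * activationPattern +
    WEIGHTS.length * length +
    WEIGHTS.imperativeVerb * imperativeVerb +
    WEIGHTS.genericPenalty * (1 - genericShare);
  return Math.round(raw * 100); // 0–100, higher = more reliable activation
}
```

A description like "Generate TypeScript interfaces from OpenAPI schemas. Use when converting API specs." scores far above "help with stuff" under any reasonable weighting.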
The Amiga's Exec kernel was a prioritized round-robin scheduler in 8KB. Skill Creator's exec kernel coordinates agents the same way.
The kernel allocates context budget across four priority tiers, mirroring how a Unix scheduler allocates CPU time:
| Tier | Budget Share | Agents | Unix Equivalent |
|---|---|---|---|
| Phase-critical | 60% | Active phase executor, active verifier | Real-time priority (SCHED_FIFO) |
| Workflow | 15% | Planner, researcher, debugger | Normal priority (nice 0) |
| Background | 10% | Documentation, formatting | Low priority (nice 19) |
| Pattern detection | 10% | Skill creator observation hooks | Idle priority (SCHED_IDLE) |
The remaining 5% is reserved for burst mode — temporary overallocation when a phase-critical agent needs to exceed its budget to finish. This is like Linux's CFS burst feature (cpu.max.burst): a cgroup can briefly exceed its quota by drawing on bandwidth it left unused in earlier periods.
Prioritized round-robin ensures that even the lowest-priority tier (pattern detection at 10%) always gets some allocation. This prevents starvation — a problem that pure priority scheduling causes where high-priority tasks monopolize the resource and low-priority tasks never run. The Amiga's Exec solved this with round-robin at each priority level; Skill Creator does the same with guaranteed minimum budgets per tier.
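The tier allocation can be sketched directly from the table. The shares are from the text; the allocate helper is illustrative:

```typescript
// Tier shares from the scheduler table; the unallocated remainder (5%)
// becomes the burst reserve.
const TIER_SHARES: Record<string, number> = {
  'phase-critical': 0.6,
  workflow: 0.15,
  background: 0.1,
  'pattern-detection': 0.1,
};

function allocate(totalBudget: number): Record<string, number> {
  const out: Record<string, number> = {};
  let allocated = 0;
  for (const [tier, share] of Object.entries(TIER_SHARES)) {
    out[tier] = Math.floor(totalBudget * share); // guaranteed minimum per tier
    allocated += out[tier];
  }
  out.burst = totalBudget - allocated; // ~5% reserved for overallocation
  return out;
}

const shares = allocate(6_000); // applied to a 6,000-token skill budget
```

Every tier gets a nonzero floor, which is exactly the anti-starvation guarantee described above.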
Pipeline Lists are declarative workflow programs that execute automatically during GSD phase transitions. They're pre-compiled during planning — the AI doesn't decide at runtime which skills to load. The Pipeline List has already determined the optimal sequence.
| Instruction | Amiga Copper | Skill Creator Pipeline List |
|---|---|---|
| WAIT | Block until video beam reaches scanline N | Block until GSD lifecycle event fires (e.g., phase-planned, tests-passing) |
| MOVE | Write value to hardware register (change palette color, move sprite) | Activate a skill or script — mode: sprite (~200 tokens in context) or mode: offload (child process, 0 tokens) |
| SKIP | Skip next instruction if beam position matches condition | Skip next instruction if filesystem condition is met (e.g., !exists:.planning/phases/*/SUMMARY.md) |
In a naive multi-agent system, the AI decides at runtime which skills to load. This means the AI spends tokens reasoning about skill selection — tokens that should be spent on actual work. It's like the CPU computing display list addresses instead of game logic.
Pipeline Lists eliminate this overhead. During the planning phase, the system analyzes the project, determines which skills will be needed at each phase transition, and compiles a Pipeline List. During execution, the list runs automatically — no AI reasoning needed for skill selection. The Amiga's Copper worked the same way: the programmer compiled the copper list during vertical blank, and it executed during the active display without any CPU involvement.
The MOVE instruction's mode parameter determines whether the activation happens in the context window or outside it:
Sprite mode loads the skill into context (~200 tokens). Named after Amiga hardware sprites — small, fast, overlaid on the playfield without consuming bitplane memory. Use for skills that need AI reasoning (decision-making, code review, architecture).
Offload mode executes the operation as a child process (0 context tokens). Named after the Amiga's blitter — bulk operations handled by dedicated hardware. Use for deterministic operations (test suites, formatting, file generation, metrics computation).
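A toy interpreter shows how the three instructions compose. Event names, condition strings, and the state shape are assumptions for illustration, not the real Pipeline List format:

```typescript
interface State {
  firedEvents: Set<string>; // GSD lifecycle events that have fired
  conditions: Set<string>;  // filesystem conditions currently true
}

type Instruction =
  | { op: 'WAIT'; event: string }
  | { op: 'MOVE'; skill: string; mode: 'sprite' | 'offload' }
  | { op: 'SKIP'; ifCondition: string };

// Executes as far as fired events allow; returns the activations performed.
function runList(list: Instruction[], state: State): string[] {
  const activated: string[] = [];
  for (let pc = 0; pc < list.length; pc++) {
    const ins = list[pc];
    if (ins.op === 'WAIT') {
      if (!state.firedEvents.has(ins.event)) break; // block: event not yet fired
    } else if (ins.op === 'MOVE') {
      activated.push(`${ins.skill}:${ins.mode}`);   // sprite = in-context, offload = child process
    } else if (state.conditions.has(ins.ifCondition)) {
      pc++; // SKIP: bypass the next instruction
    }
  }
  return activated;
}

const list: Instruction[] = [
  { op: 'WAIT', event: 'phase-planned' },
  { op: 'SKIP', ifCondition: 'summary-exists' },
  { op: 'MOVE', skill: 'summary-writer', mode: 'sprite' },
  { op: 'MOVE', skill: 'test-runner', mode: 'offload' },
];
```

With 'summary-exists' true, the SKIP bypasses the summary-writer activation and only the offloaded test-runner fires, all without any runtime reasoning about skill selection.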
This is where Skill Creator goes beyond the hardware analogy into something that has no direct parallel in silicon: the system can analyze a project's data and dynamically derive the optimal agent model for it.
src/activation/llm-activation-analyzer.ts uses Claude itself to evaluate skill activation quality. When an API key is available, it sends a skill description to Claude with a prompt asking it to simulate its own activation reasoning — essentially asking the CPU to benchmark its own instruction cache:
The analyzer returns a structured result: score (0–100), confidence (high/medium/low), reasoning, strengths, weaknesses, and suggestions for improvement. When no API key is available, the system falls back to the heuristic ActivationScorer — the fast local predictor.
Skills can carry model guidance metadata in src/types/application.ts:
interface ModelGuidance {
preferred: ModelTier[]; // ['opus', 'sonnet'] — which models this skill targets
minimumCapability?: number; // opus=3, sonnet=2, haiku=1
}
When a model profile is active (quality / balanced / budget), the pipeline can filter skills that don't match the current model tier. A skill designed for Opus-level reasoning won't load if the agent is running on Haiku — it would waste tokens on instructions the model can't follow effectively. This is like how an ARM CPU won't execute x86 instructions — the chipset routes to the right instruction set.
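A sketch of the tier filter this implies. The capability mapping comes from the comment above; skillFitsModel is a hypothetical helper, not the real pipeline code:

```typescript
type ModelTier = 'opus' | 'sonnet' | 'haiku';
const CAPABILITY: Record<ModelTier, number> = { opus: 3, sonnet: 2, haiku: 1 };

interface ModelGuidance {
  preferred: ModelTier[];     // which models this skill targets
  minimumCapability?: number; // opus=3, sonnet=2, haiku=1
}

function skillFitsModel(guidance: ModelGuidance | undefined, active: ModelTier): boolean {
  if (!guidance) return true; // no guidance: load on any model
  if (
    guidance.minimumCapability !== undefined &&
    CAPABILITY[active] < guidance.minimumCapability
  ) {
    return false; // e.g. an Opus-targeted skill won't load on Haiku
  }
  return guidance.preferred.length === 0 || guidance.preferred.includes(active);
}
```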
The chipset configurations across the codebase show how agent assignments are derived from project analysis. The VTM (Vision-to-Mission) pipeline chipset assigns agents based on the domain complexity of the vision document. The physical infrastructure chipset routes based on the safety class detected in the request. The brainstorm chipset selects an activation profile (solo_quick / guided_exploration / full_workshop / analysis_sprint) based on the session type.
The pattern: analyze the project data → classify the complexity/domain/risk → select the appropriate chipset configuration → wire the agents → compile the Pipeline List → execute. The model isn't static — it's derived from what the project actually needs.
If you're the kind of person who tunes /etc/sysctl.conf and edits kernel parameters, this is where you do it in Skill Creator. Every skill pack has a chipset.yaml that defines its entire agent coordination system. Let's break down the configuration surface.
agents:
architect-agent:
model: claude-opus-4-5 # Opus for complex reasoning
token_budget: 32000 # Per-invocation budget
calculator-agent:
model: claude-sonnet-4-5 # Sonnet for domain-specific execution
token_budget: 16000
Tuning tip: Opus agents should handle orchestration, safety-critical decisions, and cross-domain constraint satisfaction. Sonnet agents handle domain-specific execution — calculations, code generation, rendering. Don't put Opus where Sonnet will do. It's like using a mainframe CPU for I/O when a DMA controller would be faster and cheaper.
Four topology types, each for a different coordination pattern:
| Topology | When to Use | Unix Parallel |
|---|---|---|
| pipeline | Sequential stages where each must complete before next | cmd1 \| cmd2 \| cmd3 |
| router | Single entry point dispatches to specialists | nginx reverse proxy |
| leader-worker | One coordinator, parallel workers | fork() + worker pool |
| map-reduce | Distribute work, merge results | Hadoop / MapReduce |
triggers:
- pattern: "*.pid" # When a P&ID file changes...
agent: draftsman-agent # ...route to the draftsman
action: review_and_update_pid
This is Paula — the I/O controller. Triggers watch the filesystem for specific patterns and dispatch to the appropriate agent. It's the same concept as inotify on Linux or kqueue on BSD — filesystem event monitoring with handler routing.
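Trigger dispatch can be sketched as a pattern match over changed paths. The *.calc trigger and the glob-to-regex translation are illustrative assumptions:

```typescript
// Paula-style trigger dispatch: match a changed file against simple
// '*.ext' glob patterns and route to the owning agent.
interface Trigger {
  pattern: string;
  agent: string;
  action: string;
}

const triggers: Trigger[] = [
  { pattern: '*.pid', agent: 'draftsman-agent', action: 'review_and_update_pid' },
  { pattern: '*.calc', agent: 'calculator-agent', action: 'recompute' }, // assumed entry
];

function dispatch(changedFile: string): Trigger | undefined {
  return triggers.find((t) => {
    // Translate the simple '*.ext' glob into an anchored RegExp.
    const re = new RegExp('^' + t.pattern.replace('.', '\\.').replace('*', '.*') + '$');
    return re.test(changedFile);
  });
}
```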
safety:
disclaimer_required: true
warden_required_in_all_teams: true
warden_removal_raises_error: true
downgrade_safety_class: forbidden
These are not suggestions. They're enforced constraints — analogous to hardware write-protect pins or memory protection bits in an MMU. You can't flip them by editing the team definition. The chipset rejects invalid configurations at parse time.
Now you understand the system. Here's how you operate it.
SkillSession.getReport() gives you a complete runtime snapshot:
interface SessionReport {
activeSkills: ActiveSkill[]; // What's loaded, with token counts and load times
totalTokens: number; // Total context consumed by skills
budgetLimit: number; // Maximum allowed
budgetUsedPercent: number; // Current utilization
remainingBudget: number; // Headroom
tokenTracking: TokenTracking[]; // Per-skill cost vs. estimated savings
flaggedSkills: string[]; // Skills costing more than they save
}
This is your top command. Check budgetUsedPercent — if it's consistently over 80%, you need to either increase the budget, optimize skill content, or shed optional skills. Check flaggedSkills — these are your memory leaks. A flagged skill's contentTokens exceeds its estimatedSavings, meaning it's consuming more context than the value it provides.
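The health check this paragraph describes can be sketched against a trimmed-down report shape (the 80% threshold follows the text; diagnose is a hypothetical helper):

```typescript
interface SessionReportLite {
  budgetUsedPercent: number; // current utilization
  flaggedSkills: string[];   // skills costing more than they save
}

function diagnose(report: SessionReportLite): string[] {
  const issues: string[] = [];
  if (report.budgetUsedPercent > 80) {
    issues.push('budget pressure: optimize skill content or shed optional skills');
  }
  for (const name of report.flaggedSkills) {
    issues.push(`net-negative skill (memory leak): ${name}`);
  }
  return issues;
}
```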
| Parameter | Where | Default | What It Does |
|---|---|---|---|
| budgetPercent | ApplicationConfig | 0.03 (3%) | Fraction of context allocated to skills. Increase if skills are being shed; decrease if context is running low for conversation. |
| hardCeilingPercent | BudgetProfile | 0.08–0.10 | Absolute max including burst. The IOMMU limit. Don't exceed 0.15 or you risk starving the conversation. |
| maxSkillsPerSession | ApplicationConfig | 5 | Hard cap on simultaneous skills. More skills = more context pressure. Lower for complex tasks; raise for broad ones. |
| relevanceThreshold | ApplicationConfig | 0.1 | Minimum activation score to load. Raise to filter marginal skills; lower to cast a wider net. |
| Tier assignments | BudgetProfile.tiers | (empty) | Classify skills as critical/standard/optional. Critical skills get hard-ceiling access. Optional skills are shed first. |
| Token budgets | chipset.yaml | 16K–32K | Per-invocation budget for agents. Opus gets 32K for complex reasoning. Sonnet gets 16K for focused execution. |
Context is exhausted. Check budgetUsedPercent. If over 80%, shed optional skills or move operations to offload mode. This is analogous to thrashing — the system is paging and losing state.
A skill isn't loading when you expect it to. Check its activation score: run ActivationScorer.score() on the skill description. If it scores under 50, the description needs better trigger phrases. Also check budgetSkipped for budget-related drops.
Agent nesting is running too deep. You're spawning Tasks where you should use Skills. PR #826 fixed this: phase transitions should use Skill() invocations (same process, level 0) instead of Task() spawns (new process, level+1). Max safe nesting is 1.
A team is running without its safety warden. Check required_members in the team definition. If safety-warden is missing, the chipset should reject the config. If it doesn't, the chipset parser has a bug: the safety constraint should be enforced at parse time, not runtime.
If you've administered Unix systems, you already know how to think about this. The context window is RAM. Skills are loaded libraries. Agents are processes. The budget system is the memory manager. The bus is IPC. The routing table is the service registry. Pipeline Lists are cron jobs synchronized to lifecycle events instead of wall-clock time. The Exec kernel is the scheduler.
The difference: in Unix, you have virtual memory and swap. When RAM fills up, the kernel pages to disk. Performance degrades gracefully. In Claude's context window, there is no swap. When the context fills, information is permanently lost from the model's attention. There's no page-in. This is why the budget system is so conservative — 3% default allocation, hard ceiling at 8–10%, aggressive tier-based shedding. The chipset manages Claude's context window the way an embedded systems engineer manages a microcontroller's 64KB of SRAM: every byte is precious, every allocation is deliberate, and overcommitting means failure.
The chipset is the same architecture that has worked for 40 years of computing: specialized processors sharing a managed bus, coordinated by a lightweight kernel, with strict resource budgets enforced by the memory controller. The Amiga proved it in 1985. Sun proved it in the datacenter. Apple proved it in your pocket. Skill Creator proves it for AI agents. Specialize, constrain, compose. The system that manages its resources wins.