The Context Pipeline

SPEED uses a three-layer architecture to solve the problem of agent orientation. Instead of asking agents to explore a codebase using tools, the pipeline pre-computes the necessary context and delivers it as a static, high-fidelity prompt.

For the implementation-level reference (assembly functions, budget allocation tables, scoping process), see the Architecture: Context Pipeline page.

The pipeline transforms raw source code into stage-specific instructions through three distinct transformations.

Layer 1 is the “Never Silently Wrong” foundation. It indexes the entire codebase into a queryable Codebase Semantic Graph (CSG); for details on how these symbols are extracted, see the Language Support Reference. The layer produces three artifacts:

  1. Project Map: A registry of every file, its language, and its categorical importance.
  2. File Skeletons: Compressed versions of source files containing only signatures and interfaces.
  3. Spec-Codebase Alignment: A mapping that confirms whether spec requirements match existing code.
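A minimal sketch of what these three artifacts might look like as data structures. All class and field names here are hypothetical illustrations, not the actual SPEED schema:

```python
from dataclasses import dataclass, field

@dataclass
class FileEntry:
    """One Project Map row: a file, its language, and its importance."""
    path: str
    language: str
    importance: str  # e.g. "core", "peripheral", "generated" (invented labels)

@dataclass
class FileSkeleton:
    """A compressed file: signatures and interfaces only, bodies stripped."""
    path: str
    signatures: list[str]

@dataclass
class AlignmentEntry:
    """One Spec-Codebase Alignment row: does a requirement match code?"""
    requirement_id: str
    matched_symbols: list[str]
    status: str  # e.g. "covered", "partial", "missing" (invented labels)

@dataclass
class Layer1Index:
    """The queryable Layer 1 output, grouping all three artifacts."""
    project_map: list[FileEntry] = field(default_factory=list)
    skeletons: dict[str, FileSkeleton] = field(default_factory=dict)
    alignment: list[AlignmentEntry] = field(default_factory=list)
```

Grouping the artifacts in one index object mirrors the layer's role as a single queryable foundation for the later layers.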

Layer 2 narrows the global index for a specific task. It applies a “Graph Distance” algorithm to the CSG to select only relevant information.

| Distance | Delivery Strategy | Rationale |
| --- | --- | --- |
| 0 hops | Full content | The files the task will modify. |
| 1 hop | Full content | Direct callers and callees that the agent needs to understand. |
| 2 hops | Skeleton | Interface awareness of neighbors without spending the token budget. |
| 3+ hops | Dropped | Distant code that is unlikely to be relevant to the immediate task. |
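The hop-based selection above can be sketched as a bounded breadth-first search over the CSG. This is a minimal sketch assuming the graph is available as an adjacency map; `delivery_plan` and its strategy labels are illustrative, not SPEED's API:

```python
from collections import deque

def delivery_plan(graph, seed_files):
    """Assign a delivery strategy to each file by its BFS distance from
    the task's seed files. `graph` maps a file to its neighbors (an
    undirected view of the CSG's caller/callee edges)."""
    distance = {f: 0 for f in seed_files}
    queue = deque(seed_files)
    while queue:
        current = queue.popleft()
        if distance[current] >= 2:
            continue  # 2-hop nodes don't expand, so 3+ hops never enter
        for neighbor in graph.get(current, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[current] + 1
                queue.append(neighbor)

    strategy = {}
    for f, d in distance.items():
        if d <= 1:
            strategy[f] = "full"      # 0-1 hops: full content
        else:
            strategy[f] = "skeleton"  # 2 hops: signatures only
    # Files absent from `strategy` are the dropped 3+ hop tier.
    return strategy
```

Stopping the expansion at two hops means distant files are never visited at all, which keeps the scoping pass cheap even on large graphs.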

Layer 2 also calculates a Budget Plan. If the selected files exceed the agent’s context window, it systematically downgrades content (starting with the most distant hops) until the budget is met. Every cut is recorded in budget.json for later debugging.

Layer 3 is the final assembly stage. It takes the scoped data from Layer 2 and formats it into a prompt optimized for a specific agent role.

The assembly follows the core principle of Intent before Detail:

  • Context first: Who is the agent, what is the mission, and what is the rationale for this task?
  • Constraints second: What are the “never” rules (gate constraints, model limits)?
  • Content last: The raw source code and technical documentation.
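The ordering above can be sketched as a simple template assembly. The section markers and parameter names are assumptions for illustration, not SPEED's actual prompt format:

```python
def assemble_prompt(role, mission, rationale, constraints, sources):
    """Assemble a stage prompt in 'Intent before Detail' order:
    context, then constraints, then raw content."""
    parts = [
        "# Context",
        f"Role: {role}",
        f"Mission: {mission}",
        f"Rationale: {rationale}",
        "# Constraints",
        *[f"- NEVER: {c}" for c in constraints],
        "# Content",
        *sources,
    ]
    return "\n".join(parts)
```

Because the ordering is fixed in one place, every agent role reads its orientation before any source code, regardless of how much content Layer 2 selected.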

Each agent stage receives a different “projection” of the same codebase data.

| Stage | High-Priority Data | Filtered Data |
| --- | --- | --- |
| Architect | CSG domain clusters and symbol counts. | Raw function implementations. |
| Developer | Full 1-hop source and task rationale. | Cross-task analysis and global schemas. |
| Reviewer | Blast radius and spec verification matrix. | Internal task-level assumptions. |
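One way to picture a projection is as an allow-list of context fields per stage. This is a hedged sketch: the stage keys and field names are invented for illustration, not SPEED's schema:

```python
# Hypothetical projection table: which context fields each stage sees.
STAGE_PROJECTIONS = {
    "architect": {"csg_clusters", "symbol_counts"},
    "developer": {"one_hop_source", "task_rationale"},
    "reviewer": {"blast_radius", "spec_matrix"},
}

def project(stage, context):
    """Return only the fields this stage should see; everything else
    is filtered out of its prompt."""
    allowed = STAGE_PROJECTIONS[stage]
    return {key: value for key, value in context.items() if key in allowed}
```

An allow-list (rather than a block-list) fails closed: a new context field stays invisible to every stage until someone deliberately grants it.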

When content exceeds the configured token budget, the pipeline makes evidence-based cuts:

  1. Skeleton Tier: 2-hop neighbor files are dropped first.
  2. Size Tier: Large 1-hop files are downgraded from full content to skeletons.
  3. Last Resort: The most distant skeletons are truncated with a pointer for the agent to use a “Read” tool if necessary.
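The first two cut tiers can be sketched as an ordered downgrade loop that records every cut, matching the budget.json behavior described above. This is a minimal illustration assuming hypothetical file records and a caller-supplied token estimator; the third, last-resort truncation tier is omitted for brevity:

```python
import json

def fit_budget(files, budget, estimate_tokens, log_path="budget.json"):
    """Apply cut tiers in order until the total token estimate fits the
    budget. `files` is a list of dicts with 'path', 'hops', and 'tier'
    ('full' or 'skeleton'); the record shape is illustrative."""
    cuts = []

    def total():
        return sum(estimate_tokens(f) for f in files)

    # Tier 1: drop 2-hop skeleton files first, largest first.
    for f in sorted((f for f in files if f["hops"] == 2),
                    key=estimate_tokens, reverse=True):
        if total() <= budget:
            break
        files.remove(f)
        cuts.append({"path": f["path"], "action": "dropped", "tier": 1})

    # Tier 2: downgrade large 1-hop files from full content to skeletons.
    for f in sorted((f for f in files if f["hops"] == 1 and f["tier"] == "full"),
                    key=estimate_tokens, reverse=True):
        if total() <= budget:
            break
        f["tier"] = "skeleton"
        cuts.append({"path": f["path"], "action": "downgraded", "tier": 2})

    # Every cut is persisted for later debugging.
    with open(log_path, "w") as fh:
        json.dump(cuts, fh, indent=2)
    return files, cuts
```

Cutting largest-first within each tier minimizes the number of files affected, so the recorded cut list stays short and readable when debugging a budget overrun.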

This structured approach ensures that developer agents never start a task in the dark. Infrastructure carries the weight of orientation, allowing the LLM to focus entirely on implementation.