# Architecture Overview

This document describes the architecture of the Agent Orchestrator system.

## System Overview

The Agent Orchestrator is a multi-agent coordination system designed to manage AI coding agents across different interfaces (CLI and API). It provides:

- **Unified Agent Interface**: Abstracts differences between CLI agents (Claude Code, Gemini CLI, Codex CLI) and API agents (Claude SDK, OpenAI Agents)
- **Risk-Based Autonomy**: Four-tier classification system (LOW/MEDIUM/HIGH/CRITICAL) for safe agent operations
- **Resource Management**: Budget controls, rate limiting, and session tracking
- **Memory System**: Three-tier memory architecture for context management

## High-Level Architecture

```
┌─────────────────────────────────────────────────────────────────────┐
│                        Orchestration Layer                          │
│  ┌─────────────────────────────────────────────────────────────┐   │
│  │              OrchestrationBrain                              │   │
│  │  ┌─────────────┐  ┌──────────────┐  ┌─────────────────┐    │   │
│  │  │  Commands   │  │   Spawner    │  │   Dashboard     │    │   │
│  │  └─────────────┘  └──────────────┘  └─────────────────┘    │   │
│  └─────────────────────────────────────────────────────────────┘   │
├─────────────────────────────────────────────────────────────────────┤
│                         Control Layer                               │
│  ┌─────────────────┐  ┌─────────────────┐  ┌──────────────────┐   │
│  │  Control Loop   │  │   Risk Policy   │  │  Autonomy Gate   │   │
│  └─────────────────┘  └─────────────────┘  └──────────────────┘   │
├─────────────────────────────────────────────────────────────────────┤
│                         Agent Layer                                 │
│  ┌────────────────────────────────────────────────────────────┐    │
│  │                    Agent Adapters                           │    │
│  │  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────┐   │    │
│  │  │ Claude   │  │ Gemini   │  │  Codex   │  │  SDK     │   │    │
│  │  │  Code    │  │   CLI    │  │   CLI    │  │ Adapters │   │    │
│  │  └──────────┘  └──────────┘  └──────────┘  └──────────┘   │    │
│  └────────────────────────────────────────────────────────────┘    │
├─────────────────────────────────────────────────────────────────────┤
│                       Memory Layer                                  │
│  ┌─────────────────┐  ┌─────────────────┐  ┌──────────────────┐   │
│  │  Operational    │  │   Knowledge     │  │    Working       │   │
│  │    Memory       │  │    Memory       │  │    Memory        │   │
│  └─────────────────┘  └─────────────────┘  └──────────────────┘   │
├─────────────────────────────────────────────────────────────────────┤
│                      Support Services                               │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌──────────────────┐   │
│  │  Budget  │  │ Interrupt│  │  Merge   │  │  Observability   │   │
│  │ Controls │  │ Handlers │  │   Gate   │  │    Stack         │   │
│  └──────────┘  └──────────┘  └──────────┘  └──────────────────┘   │
├─────────────────────────────────────────────────────────────────────┤
│                      Persistence Layer                              │
│  ┌─────────────────────────────────────────────────────────────┐   │
│  │                    SQLite Database                           │   │
│  └─────────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────┘
```

## Module Structure

```
src/agent_orchestrator/
├── __init__.py           # Package exports, exceptions
├── main.py               # Entry point, Orchestrator class
├── config.py             # Configuration management
├── exceptions.py         # Custom exception hierarchy
│
├── orchestrator/         # Central coordination
│   ├── brain.py          # OrchestrationBrain - main intelligence
│   ├── commands.py       # Command handler (/assign, /status, etc.)
│   ├── spawner.py        # Dynamic agent spawning
│   └── dashboard.py      # Status display formatting
│
├── adapters/             # Agent interfaces
│   ├── base.py           # BaseAdapter, LLMAdapter, CLIAgentAdapter
│   ├── claude_code_cli.py
│   ├── gemini_cli.py
│   ├── codex_cli.py
│   ├── claude_sdk.py
│   └── openai_agents.py
│
├── control/              # Orchestration control
│   ├── loop.py           # AgentControlLoop - main event loop
│   ├── health.py         # Health checking, stuck detection
│   ├── actions.py        # Control actions (pause, resume, etc.)
│   ├── autonomy_gate.py  # Four-tier decision engine
│   └── risk_policy.py    # Re-exports from risk module
│
├── risk/                 # Risk classification
│   ├── policy.py         # RiskPolicy, RiskClassification
│   ├── blocklist.py      # Critical blocklist patterns
│   └── autonomy_gate.py  # GateDecision, AutonomyGate
│
├── memory/               # Memory management
│   ├── operational.py    # Project state, decisions
│   ├── knowledge.py      # ADRs, runbooks, patterns
│   ├── working.py        # Active session context
│   ├── write_gate.py     # Memory write validation
│   └── models.py         # Memory data models
│
├── tracking/             # Usage tracking
│   └── cli_usage.py      # Session-based usage for CLI agents
│
├── budget/               # Cost controls
│   ├── agent_budget.py   # Per-agent budgets
│   ├── mcp_budget.py     # MCP tool budgets
│   ├── registry.py       # Budget registry
│   └── usage_monitor.py  # Usage monitoring
│
├── interrupt/            # Human approval
│   ├── approval_queue.py # Approval request queue
│   ├── cli_handler.py    # Terminal-based approval
│   └── async_handler.py  # Async/webhook approval
│
├── merge/                # Branch protection
│   ├── gate.py           # Merge locking
│   └── readiness.py      # Merge readiness checks
│
├── persistence/          # Data persistence
│   ├── database.py       # OrchestratorDB
│   └── models.py         # SQLAlchemy models
│
├── journal/              # Project journal
│   ├── project_journal.py
│   └── status_packet.py
│
├── observability/        # Monitoring
│   ├── alerts.py         # Alert management
│   ├── audit.py          # Audit trail
│   ├── token_audit.py    # Token usage tracking
│   └── ai_observer.py    # AI observability
│
├── reliability/          # Production hardening
│   ├── rate_limiter.py   # Rate limiting
│   ├── retry.py          # Retry logic
│   └── shutdown.py       # Graceful shutdown
│
├── cli/                  # CLI interface
│   ├── colors.py         # Terminal colors/theming
│   └── menu.py           # Command menu display
│
├── secrets/              # Security
│   ├── redactor.py       # Secret redaction
│   └── guard.py          # Secret protection
│
├── workspace/            # Workspace management
│   └── manager.py        # tmux + worktree management
│
└── prompts/              # System prompts
    └── orchestrator_system.py  # Orchestrator prompt builder
```

## Key Design Patterns

### 1. Adapter Pattern

All agents implement a common interface through the adapter pattern:

```python
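from abc import ABC, abstractmethod

# Task and AgentResponse are the project's own types (imports omitted here),
# so they are referenced as string annotations.


class BaseAdapter(ABC):
    """Common interface implemented by every agent adapter."""

    @abstractmethod
    async def execute(self, task: "Task", context: dict) -> "AgentResponse": ...

    @abstractmethod
    async def check_authentication(self) -> bool: ...


class CLIAgentAdapter(BaseAdapter):
    """Adds CLI-specific methods: inject_context, parse_output, etc."""


class LLMAdapter(BaseAdapter):
    """Adds LLM-specific methods: build_messages, estimate_cost, etc."""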
```

### 2. Four-Tier Risk Classification

Actions are classified into four risk levels:

| Level | Auto-Behavior | Examples |
|-------|---------------|----------|
| LOW | Auto-allowed | Read files, run tests, lint |
| MEDIUM | Allowed with monitoring | Source code edits, git commit |
| HIGH | Requires approval | Push to main, deploy, migrations |
| CRITICAL | Auto-rejected | rm -rf /, force push, DROP DATABASE |
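
The four tiers can be modeled as an ordered enum with a default behavior per tier. The sketch below is illustrative only: `RiskLevel`, `DEFAULT_BEHAVIOR`, and the behavior labels are assumed names, not the definitions in `risk/policy.py`, and the MEDIUM label follows the "allow with monitoring" wording used in the Risk Classification Flow section.

```python
from enum import IntEnum


class RiskLevel(IntEnum):
    """Risk tiers ordered from least to most severe (hypothetical name)."""

    LOW = 0
    MEDIUM = 1
    HIGH = 2
    CRITICAL = 3


# Default behavior per tier. The string labels are assumptions for this sketch.
DEFAULT_BEHAVIOR = {
    RiskLevel.LOW: "auto_allow",
    RiskLevel.MEDIUM: "allow_with_monitoring",
    RiskLevel.HIGH: "require_approval",
    RiskLevel.CRITICAL: "auto_reject",
}


def behavior_for(level: RiskLevel) -> str:
    """Look up the default behavior for a risk tier."""
    return DEFAULT_BEHAVIOR[level]


def needs_human(level: RiskLevel) -> bool:
    """Only HIGH waits for a human; CRITICAL is rejected without asking."""
    return level is RiskLevel.HIGH
```

Using an `IntEnum` keeps the tiers comparable, so a check such as `level >= RiskLevel.HIGH` reads naturally when deciding whether an action may run unattended.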

### 3. Three-Tier Memory Architecture

```
┌─────────────────────────────────────────────────┐
│           Working Memory (Session)              │
│  - Current task context                         │
│  - In-flight operations                         │
│  - Temporary state                              │
├─────────────────────────────────────────────────┤
│           Operational Memory (State)            │
│  - Project state                                │
│  - Task history                                 │
│  - Active constraints                           │
│  - Recent decisions                             │
├─────────────────────────────────────────────────┤
│           Knowledge Memory (Persistent)         │
│  - ADRs (Architecture Decision Records)         │
│  - Runbooks                                     │
│  - Patterns and fixes                           │
│  - Indexed project knowledge                    │
└─────────────────────────────────────────────────┘
```

### 4. Control Loop

The main orchestration loop:

```
                    ┌──────────────┐
                    │    Start     │
                    └──────┬───────┘
                           │
                           ▼
              ┌────────────────────────┐
              │  Read Operational      │
              │  Memory                │
              └───────────┬────────────┘
                          │
                          ▼
              ┌────────────────────────┐
              │  Check Agent Health    │
              │  (Stuck Detection)     │
              └───────────┬────────────┘
                          │
                          ▼
              ┌────────────────────────┐
              │  Get Pending Tasks     │
              └───────────┬────────────┘
                          │
                          ▼
              ┌────────────────────────┐
              │  Autonomy Gate Check   │◄────┐
              └───────────┬────────────┘     │
                          │                  │
                    ┌─────┴─────┐            │
                    ▼           ▼            │
              ┌─────────┐  ┌─────────┐       │
              │ Allowed │  │ Blocked │       │
              └────┬────┘  └────┬────┘       │
                   │            │            │
                   ▼            ▼            │
              ┌─────────┐  ┌─────────┐       │
              │ Execute │  │ Request │───────┘
              │  Task   │  │ Approval│
              └────┬────┘  └─────────┘
                   │
                   ▼
              ┌────────────────────────┐
              │  Update Memory &       │
              │  Record Results        │
              └───────────┬────────────┘
                          │
                          ▼
                    ┌──────────┐
                    │  Sleep   │
                    └──────────┘
```
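
The gate, execute, record, and sleep steps of the diagram can be sketched with `asyncio`. Everything here is a simplified stand-in: `LoopState`, `gate_allows`, `execute`, and `control_loop` are hypothetical names, and the real `AgentControlLoop` also reads operational memory and checks agent health before this point.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class LoopState:
    """In-memory stand-in for the loop's task queues."""

    pending: list[str] = field(default_factory=list)
    completed: list[str] = field(default_factory=list)
    awaiting_approval: list[str] = field(default_factory=list)


def gate_allows(task: str) -> bool:
    # Hypothetical stand-in for the autonomy gate: block anything that deploys.
    return "deploy" not in task


async def execute(task: str) -> str:
    # Stand-in for Adapter.execute(); yields control once, then reports success.
    await asyncio.sleep(0)
    return f"done: {task}"


async def run_iteration(state: LoopState) -> None:
    """One pass: drain pending tasks, gate each one, execute or queue it."""
    tasks, state.pending = state.pending, []
    for task in tasks:
        if gate_allows(task):
            state.completed.append(await execute(task))
        else:
            state.awaiting_approval.append(task)


async def control_loop(state: LoopState, iterations: int, interval: float = 0.0) -> None:
    """Run a fixed number of iterations, sleeping between them."""
    for _ in range(iterations):
        await run_iteration(state)
        await asyncio.sleep(interval)
```

For example, `asyncio.run(control_loop(LoopState(pending=["run tests", "deploy to prod"]), iterations=1))` executes the first task and leaves the second waiting for approval.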

## Agent Types

### CLI Agents (Control Plane A)

- **Claude Code**: Primary coding agent with full IDE capabilities
- **Gemini CLI**: Large context analysis, documentation
- **Codex CLI**: Quick edits with auto-apply

Characteristics:
- Session-based limits (not token-based billing)
- Interactive authentication via web flows
- Run in tmux sessions for isolation

### API Agents (Control Plane B)

- **Claude SDK**: Structured outputs, batch processing
- **OpenAI Agents SDK**: Function calling, alternative provider

Characteristics:
- Token-based billing
- API key authentication
- Direct SDK calls

## Data Flow

### Task Assignment Flow

```
User Input
    │
    ▼
OrchestrationBrain.suggest_agent_for_task()
    │
    ├── Check agent availability (CLIUsageTracker)
    ├── Match task to capabilities
    └── Consider workload balance
    │
    ▼
OrchestratorCommands._cmd_assign()
    │
    ├── Validate agent exists
    ├── Check availability
    └── Create Task in database
    │
    ▼
AgentControlLoop.assign_task()
    │
    ├── Build context (OperationalMemory)
    ├── Autonomy gate check
    └── Execute via adapter
    │
    ▼
Adapter.execute()
    │
    └── Return AgentResponse
```
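
The selection step at the top of this flow (availability, capability match, workload balance) might look roughly like the following. This is a sketch under assumptions: `AgentInfo` and `suggest_agent` are hypothetical, and the actual logic in `OrchestrationBrain.suggest_agent_for_task()` may weigh these factors differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentInfo:
    """Hypothetical snapshot of an agent's state at selection time."""

    agent_id: str
    capabilities: set[str]
    available: bool
    active_tasks: int


def suggest_agent(required: set[str], agents: list[AgentInfo]) -> Optional[str]:
    """Pick the least-loaded available agent that covers all required capabilities."""
    candidates = [
        a for a in agents if a.available and required <= a.capabilities
    ]
    if not candidates:
        return None
    # Break workload ties by agent_id so the choice is deterministic.
    best = min(candidates, key=lambda a: (a.active_tasks, a.agent_id))
    return best.agent_id
```

Returning `None` rather than raising leaves it to the caller to decide whether to queue the task or report that no agent is suitable.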

### Risk Classification Flow

```
Action Request
    │
    ▼
RiskPolicy.classify_command() / classify_file()
    │
    ├── Match against CRITICAL patterns → Auto-reject
    ├── Match against HIGH patterns → Require approval
    ├── Match against MEDIUM patterns → Allow with monitoring
    └── Match against LOW patterns → Auto-allow
    │
    ▼
AutonomyGate.check()
    │
    ├── Apply agent-specific tier constraints
    └── Return GateDecision
    │
    ▼
ApprovalQueue (if needed)
    │
    └── Wait for human approval
```
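
The pattern-matching step can be sketched as an ordered list of regular expressions checked from most to least severe. The patterns, the `classify_command` function below, and the choice to treat unmatched commands as HIGH are all assumptions made for illustration; the real patterns live in `risk/policy.py` and `risk/blocklist.py`.

```python
import re

# Illustrative patterns only, ordered from most to least severe so that the
# first match wins.
PATTERNS = [
    ("CRITICAL", re.compile(r"\brm\s+-rf\s+/(\s|$)")),
    ("CRITICAL", re.compile(r"\bgit\s+push\b.*--force")),
    ("HIGH", re.compile(r"\bgit\s+push\b.*\bmain\b")),
    ("MEDIUM", re.compile(r"\bgit\s+commit\b")),
    ("LOW", re.compile(r"^(pytest|ls|cat)\b")),
]


def classify_command(command: str, default: str = "HIGH") -> str:
    """Return the tier of the first matching pattern.

    Unmatched commands fall back to `default`. Defaulting to HIGH is a
    conservative assumption of this sketch, not documented project behavior.
    """
    for level, pattern in PATTERNS:
        if pattern.search(command):
            return level
    return default
```

Checking the most severe patterns first means a command such as `git push --force origin main` is rejected as CRITICAL before the HIGH rule for pushes to `main` is ever consulted.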

## Configuration

### Environment Variables

```bash
# Core
ORCHESTRATOR_DB_PATH=data/orchestrator.db
ORCHESTRATOR_LOG_LEVEL=INFO

# API Keys (for SDK agents)
ANTHROPIC_API_KEY=sk-ant-...
OPENAI_API_KEY=sk-...
GOOGLE_API_KEY=...

# Budget Controls
DEFAULT_DAILY_BUDGET_USD=10.00
BUDGET_ALERT_THRESHOLD=0.8

# Rate Limiting
RATE_LIMIT_REQUESTS_PER_MINUTE=60
```
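
A minimal sketch of reading these variables into a typed settings object follows. `Settings`, `load_settings`, and `budget_alert_due` are hypothetical names, and using the values shown above as fallbacks is an assumption; `config.py` may handle loading and defaults differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Typed view of the non-secret environment variables (hypothetical)."""

    db_path: str
    log_level: str
    daily_budget_usd: float
    budget_alert_threshold: float
    requests_per_minute: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping, defaulting to the process environment."""
    source = os.environ if env is None else env
    return Settings(
        db_path=source.get("ORCHESTRATOR_DB_PATH", "data/orchestrator.db"),
        log_level=source.get("ORCHESTRATOR_LOG_LEVEL", "INFO").upper(),
        daily_budget_usd=float(source.get("DEFAULT_DAILY_BUDGET_USD", "10.00")),
        budget_alert_threshold=float(source.get("BUDGET_ALERT_THRESHOLD", "0.8")),
        requests_per_minute=int(source.get("RATE_LIMIT_REQUESTS_PER_MINUTE", "60")),
    )


def budget_alert_due(spent_usd: float, settings: Settings) -> bool:
    """True once spending reaches the alert fraction of the daily budget."""
    return spent_usd >= settings.daily_budget_usd * settings.budget_alert_threshold
```

Passing a plain dictionary instead of relying on `os.environ` makes the loader easy to exercise in tests without touching the real environment.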

### Agent Limits

CLI agents have session-based limits configured in `tracking/cli_usage.py`:

```python
DEFAULT_CLI_LIMITS = {
    "claude-code": CLIAgentLimits(
        session_request_limit=50,
        weekly_request_limit=500,
        requests_per_minute=10,
    ),
    "gemini-cli": CLIAgentLimits(
        session_request_limit=100,
        weekly_request_limit=1000,
        requests_per_minute=20,
    ),
}
```
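
To illustrate how such limits could gate availability, here is a small sketch. `LimitsSketch`, `UsageSketch`, and `is_available` are stand-ins invented for this example; they mirror two of the field names above but are not the `CLIAgentLimits` or `CLIUsageTracker` implementations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LimitsSketch:
    """Stand-in for CLIAgentLimits, carrying only the request caps."""

    session_request_limit: int
    weekly_request_limit: int


@dataclass
class UsageSketch:
    """Hypothetical running counters for one agent."""

    session_requests: int = 0
    weekly_requests: int = 0

    def record_request(self) -> None:
        self.session_requests += 1
        self.weekly_requests += 1


def is_available(usage: UsageSketch, limits: LimitsSketch) -> bool:
    """An agent is available while it is under both its session and weekly caps."""
    return (
        usage.session_requests < limits.session_request_limit
        and usage.weekly_requests < limits.weekly_request_limit
    )
```

With the `claude-code` values above, an agent that has made 50 requests in the current session would report as unavailable even if its weekly count is far below 500.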

## Extending the System

### Adding a New Agent

1. Create adapter in `adapters/`:
   ```python
   class NewAgentAdapter(CLIAgentAdapter):  # or LLMAdapter
       def __init__(self, agent_id: str, ...):
           ...

       async def execute(self, task, context):
           ...
   ```

2. Register in `main.py`:
   ```python
   adapter = NewAgentAdapter(...)
   orchestrator.register_adapter("new-agent", adapter)
   ```

3. Add limits in `tracking/cli_usage.py` (for CLI agents)

### Adding New Risk Patterns

Edit `risk/policy.py`:
```python
CRITICAL_COMMAND_PATTERNS.append(
    (r"your_pattern", "Description"),
)
```

### Creating Specialized Agents

Use the spawner or `/spawn` command:
```python
agent = await spawner.spawn(
    agent_type="custom-specialist",
    custom_prompt="Your specialized prompt...",
)
```

## Error Handling

The system uses a custom exception hierarchy (`exceptions.py`):

```
OrchestratorError
├── AgentError (agent operations)
├── TaskError (task operations)
├── RiskError (risk/approval)
├── BudgetError (budget/rate limits)
├── MemoryError (memory operations)
└── ConfigurationError (config issues)
```
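
In Python, a hierarchy like this is typically a set of empty subclasses of one base exception, which lets callers catch either a specific failure or any orchestrator error. The sketch below shows a subset and a hypothetical `describe` helper; it is not the contents of `exceptions.py`. `MemoryError` is left out of the sketch because defining a class with that name in a standalone module would shadow Python's built-in `MemoryError`.

```python
class OrchestratorError(Exception):
    """Base class for all orchestrator errors."""


class AgentError(OrchestratorError):
    """Agent operations failed."""


class TaskError(OrchestratorError):
    """Task operations failed."""


class RiskError(OrchestratorError):
    """An action was rejected or needs approval."""


class BudgetError(OrchestratorError):
    """A budget or rate limit was exceeded."""


class ConfigurationError(OrchestratorError):
    """Configuration is missing or invalid."""


def describe(error: Exception) -> str:
    """Hypothetical helper: map an exception to a short log label."""
    if isinstance(error, BudgetError):
        return "budget"
    if isinstance(error, OrchestratorError):
        return "orchestrator"
    return "unexpected"
```

Catching `OrchestratorError` at the top of the control loop would handle every subclass, while narrower handlers such as `except BudgetError` can react to a specific condition first.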

## Observability

- **Audit Trail**: All actions logged with severity levels
- **Token Audit**: Track token usage per agent/task
- **Alerts**: Configurable alerts for budget, errors, stuck agents
- **Health Checks**: Stuck detection with configurable thresholds

## Security Considerations

- **Secret Redaction**: Automatic redaction in logs
- **Risk Blocklist**: Auto-reject dangerous operations
- **Approval Gates**: Human-in-the-loop for high-risk actions
- **Protected Branches**: Merge gate controls
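
As an illustration of secret redaction in logs, the sketch below masks API-key-like tokens and `*API_KEY=value` assignments before a line is written. The two patterns and the `redact` function are assumptions for this example and do not reflect the rules in `secrets/redactor.py`.

```python
import re

# Tokens shaped like the "sk-..." keys shown in the Configuration section.
TOKEN = re.compile(r"\bsk-[A-Za-z0-9_-]{8,}")
# Environment-style assignments such as ANTHROPIC_API_KEY=<value>.
ASSIGNMENT = re.compile(r"\b([A-Z0-9_]*API_KEY)=(\S+)")


def redact(text: str) -> str:
    """Replace secret-looking values with a fixed placeholder."""
    # Mask assignments first so the variable name survives in the log line.
    text = ASSIGNMENT.sub(r"\1=[REDACTED]", text)
    return TOKEN.sub("[REDACTED]", text)
```

A pattern list like this only catches secrets with a recognizable shape, so it complements rather than replaces keeping secrets out of agent context in the first place.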
