Agent Architecture

How Vibe's AI agent processes tasks and makes decisions.

Execution Modes

Vibe supports two execution patterns:

Plan-Execute-Reflect (Default)

Best for complex, multi-step tasks.

┌──────────┐     ┌──────────┐     ┌───────────┐
│   PLAN   │────>│ EXECUTE  │────>│  REFLECT  │
└──────────┘     └──────────┘     └───────────┘
      ▲                                 │
      │                                 │
      └──────────── feedback ───────────┘
  1. Plan: Analyze task, identify steps
  2. Execute: Run tools, interact with browser
  3. Reflect: Verify result, retry if needed
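The three steps above can be sketched as a single control loop. This is an illustrative sketch, not Vibe's actual API; the function names and the retry budget are assumptions:

```typescript
// Minimal Plan-Execute-Reflect loop. All names here are illustrative.
type Reflection = { verdict: "COMPLETE" | "RETRY"; feedback?: string };

async function runTask(
  task: string,
  plan: (task: string, feedback?: string) => Promise<string[]>,
  execute: (steps: string[]) => Promise<string>,
  reflect: (task: string, result: string) => Promise<Reflection>,
  maxAttempts = 3,
): Promise<string> {
  let feedback: string | undefined;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const steps = await plan(task, feedback);    // 1. Plan: analyze task, identify steps
    const result = await execute(steps);         // 2. Execute: run tools in the browser
    const check = await reflect(task, result);   // 3. Reflect: verify the result
    if (check.verdict === "COMPLETE") return result;
    feedback = check.feedback;                   // feed reflection output back into planning
  }
  throw new Error("Task not completed within retry budget");
}
```

The key design point is that reflection feedback flows back into the next planning pass, rather than blindly re-running the same steps.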

Plan-Execute (Simple Mode)

Lower overhead for straightforward tasks.

┌──────────┐     ┌──────────┐
│   PLAN   │────>│ EXECUTE  │────> Done
└──────────┘     └──────────┘

LangGraph State Machine

The agent uses LangGraph to manage state:

┌─────────────┐
│    START    │
└──────┬──────┘
       │
┌──────▼──────┐
│  ASSISTANT  │ (LLM decides action)
└──────┬──────┘
       │
   ┌───┴────────────────┐
   │                    │
┌──▼──────────┐  ┌──────▼──────┐
│    TOOLS    │  │    DONE     │
└──────┬──────┘  └─────────────┘
       │
┌──────▼──────┐
│ REFLECTION  │ (verify success)
└──────┬──────┘
       │
       └────────────> back to ASSISTANT
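The branching decision in the graph above comes down to a routing function over the agent state. The sketch below shows that routing logic in plain TypeScript; it is not the real LangGraph API, and the state shape is an assumption:

```typescript
// Illustrative routing function for the state machine above.
// In LangGraph this role is played by a conditional edge; the
// AgentState shape here is assumed for the sketch.
type Node = "ASSISTANT" | "TOOLS" | "REFLECTION" | "DONE";

interface AgentState {
  messages: string[];
  pendingToolCall?: string; // set when the LLM requested a tool
  verified: boolean;        // set once reflection confirms success
}

function route(state: AgentState): Node {
  if (state.pendingToolCall) return "TOOLS";  // LLM decided to act
  if (!state.verified) return "REFLECTION";   // no tool call: verify before finishing
  return "DONE";
}
```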

Tool Invocation

When the LLM calls a tool:

  1. Parse tool call from LLM response
  2. Validate arguments against schema
  3. Execute tool in browser context
  4. Return result to LLM for next decision
// Example tool call from LLM
{
  "tool_calls": [{
    "name": "click_by_index",
    "args": { "index": 5 }
  }]
}

// Tool execution
const result = await clickByIndex({ index: 5 });
// Returns: "Clicked element [5] 'Add to Cart' button"
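Step 2, validating arguments against the tool's schema, might look like the following sketch. The schema shape and the `validateArgs` helper are assumptions for illustration, not Vibe's actual validation code:

```typescript
// Hypothetical schema validation for a tool call (step 2 above).
interface ToolSchema {
  name: string;
  required: Record<string, "number" | "string">;
}

// Assumed schema for the click_by_index example shown earlier.
const clickByIndexSchema: ToolSchema = {
  name: "click_by_index",
  required: { index: "number" },
};

// Returns a list of validation errors; empty means the call is safe to execute.
function validateArgs(schema: ToolSchema, args: Record<string, unknown>): string[] {
  const errors: string[] = [];
  for (const [key, type] of Object.entries(schema.required)) {
    if (!(key in args)) errors.push(`missing argument: ${key}`);
    else if (typeof args[key] !== type) errors.push(`${key} must be a ${type}`);
  }
  return errors;
}
```

Rejecting a malformed call before it reaches the browser lets the error be returned to the LLM as feedback instead of crashing the tool.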

Reflection System

After task completion, reflection validates the result:

// Reflection prompt
"Did the agent successfully complete the task?
Task: Find the price of iPhone 15
Last action: Extracted text '$999.00'

Respond with:
- COMPLETE if task is done
- RETRY with feedback if more work needed"

If reflection returns RETRY, the agent receives feedback and continues.
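Parsing the reflection verdict out of the LLM's free-text reply can be sketched as below. The parser and its fallback behavior are assumptions; only the COMPLETE/RETRY format comes from the prompt above:

```typescript
// Illustrative parser for the COMPLETE / RETRY reflection format.
type ReflectionResult =
  | { status: "COMPLETE" }
  | { status: "RETRY"; feedback: string };

function parseReflection(response: string): ReflectionResult {
  const trimmed = response.trim();
  if (trimmed.toUpperCase().startsWith("COMPLETE")) return { status: "COMPLETE" };
  if (trimmed.toUpperCase().startsWith("RETRY")) {
    // Everything after the RETRY keyword is treated as feedback.
    return { status: "RETRY", feedback: trimmed.slice("RETRY".length).trim() };
  }
  // Assumed fallback: an unrecognized reply becomes a retry with the raw text as feedback.
  return { status: "RETRY", feedback: trimmed };
}
```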

Subagent Architecture

For parallel tasks, the main agent spawns subagents:

┌─────────────────────────────────────────────────┐
│                   MAIN AGENT                    │
│                                                 │
│  Task: "Compare prices on Amazon and Best Buy"  │
│                        │                        │
│          ┌─────────────┴─────────────┐          │
│          │                           │          │
│    ┌─────▼─────┐               ┌─────▼─────┐    │
│    │ SUBAGENT 1│               │ SUBAGENT 2│    │
│    │ (Amazon)  │               │ (Best Buy)│    │
│    └─────┬─────┘               └─────┬─────┘    │
│          │                           │          │
│          └─────────────┬─────────────┘          │
│                        │                        │
│                ┌───────▼───────┐                │
│                │ MERGE RESULTS │                │
│                └───────────────┘                │
└─────────────────────────────────────────────────┘

Subagent Properties

Resource   Sharing
LLM        Shared (same API)
Tools      Shared reference
State      Isolated (own context)
Tabs       Can create own tabs

When to Use Subagents

  • Parallel price comparisons
  • Multi-site data extraction
  • Background operations
  • Context isolation
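The fan-out/merge pattern from the diagram can be sketched with `Promise.all`. The `runSubagent` callback and the price-comparison merge step are illustrative assumptions:

```typescript
// Illustrative fan-out to parallel subagents followed by a merge step.
interface SubagentResult {
  site: string;
  price: number;
}

async function comparePrices(
  // Each subagent runs with its own isolated state and tabs (see table above).
  runSubagent: (site: string, task: string) => Promise<SubagentResult>,
  sites: string[],
): Promise<SubagentResult> {
  // Fan out: subagents execute in parallel.
  const results = await Promise.all(
    sites.map((site) => runSubagent(site, "Find the price of iPhone 15")),
  );
  // Merge: here, simply pick the cheapest result.
  return results.reduce((best, r) => (r.price < best.price ? r : best));
}
```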

Token Management

The agent tracks token usage to prevent context overflow:

  • Max context: Model-dependent (128K for GPT-4)
  • Page content: Truncated if too large
  • Conversation history: Summarized when approaching limit
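Page-content truncation from the list above can be sketched as a simple character budget. The 4-characters-per-token heuristic is an assumption for illustration; real tokenizers vary by model:

```typescript
// Rough token budgeting sketch. The chars-per-token ratio is an
// assumed heuristic, not an exact tokenizer.
const CHARS_PER_TOKEN = 4;

function truncateToBudget(pageText: string, maxTokens: number): string {
  const maxChars = maxTokens * CHARS_PER_TOKEN;
  if (pageText.length <= maxChars) return pageText;
  // Keep the head of the page and mark the cut so the LLM knows content is missing.
  return pageText.slice(0, maxChars) + "\n[...content truncated...]";
}
```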

Error Handling

Tool Error → Retry (up to 3 times) → Escalate to Reflection
                                                │
                                       ┌────────▼────────┐
                                       │ Adjust approach │
                                       └─────────────────┘

Common error patterns:

  • Element not found → Wait and retry
  • Page not loaded → Increase wait time
  • API timeout → Reduce request complexity
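The retry-then-escalate flow can be sketched as below. The helper name, the backoff delays, and the escalation callback are illustrative assumptions:

```typescript
// Illustrative retry wrapper: try the tool up to maxRetries times,
// waiting a bit longer each attempt, then hand off to reflection.
async function executeWithRetry<T>(
  tool: () => Promise<T>,
  escalate: (lastError: Error) => Promise<T>, // e.g. reflection adjusting the approach
  maxRetries = 3,
): Promise<T> {
  let lastError: Error = new Error("not attempted");
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      return await tool();
    } catch (err) {
      lastError = err as Error; // e.g. "element not found"
      // Wait and retry, increasing the wait each time (assumed backoff).
      await new Promise((resolve) => setTimeout(resolve, attempt * 100));
    }
  }
  return escalate(lastError);
}
```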