System Architecture

Deep system design — request lifecycle, session persistence, tool registry, fallback chains

Amadev Architecture

System Overview

Amadev is a self-hosted AI agent runtime engine. It exposes an OpenAI-compatible API that routes requests through a multi-model LLM proxy (Composio), with built-in tool calling, session persistence, reasoning techniques, and admin monitoring.

┌─────────────────────────────────────────────────────────────┐
│                      HTTP Clients                            │
│  (Mobile App / CLI / SDK / curl / Postman)                  │
└─────────────────────────┬───────────────────────────────────┘
                          │
                          ▼
┌──────────────────────────────────────────────────────────────┐
│  Next.js 16 — App Router — TypeScript                        │
│                                                              │
│  ┌──────────────┐  ┌────────────────┐  ┌────────────────┐   │
│  │ /v1/chat/    │  │ /v1/           │  │ /admin/        │   │
│  │ completions  │  │ conversations   │  │ usage/apikey   │   │
│  │              │  │ sessions        │  │ /llm/settings  │   │
│  │   runWith    │  │ agents          │  └────────────────┘   │
│  │   Tools()    │  └────────────────┘                        │
│  └──────┬───────┘                                           │
└─────────┼────────────────────────────────────────────────────┘
          │
          ▼
┌────────────────────────┐     ┌──────────────────────────┐
│  Prisma (PostgreSQL)   │     │  Composio API            │
│                        │     │  ────────────────────    │
│  ┌──────────────────┐  │     │  backend.composio.dev    │
│  │ Conversation     │  │     │  POST /api/v3.1/tools/   │
│  │ ChatMessage      │  │     │  execute/...             │
│  │ Key (API Keys)   │  │     │                          │
│  │ LLMUsageLog      │  │     │  ┌──────────────────┐    │
│  │ User/Session     │  │     │  │ Groq / Llama     │    │
│  └──────────────────┘  │     │  │ OpenAI / Vision  │    │
└────────────────────────┘     │  │ Custom models    │    │
                               │  └──────────────────┘    │
                               └──────────────────────────┘

Request Lifecycle

1. Authentication Layer

Request → validateAmaLLMApiKey()
          ├── X-API-Key header → Prisma Key lookup
          ├── Bearer token → Prisma Key lookup  
          └── No API key → Fallback to NextAuth session
               └── Check ADMIN_EMAIL or admin role
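
The lookup order above can be sketched as a small resolver. This is a minimal illustration, not the actual validateAmaLLMApiKey() implementation: resolveAuthSource and AuthSource are hypothetical names, headers are modeled as a plain lowercase-keyed record, and checking X-API-Key before the Bearer token is an assumption based on the order in the diagram. The Prisma Key lookup and the NextAuth admin check are left to the caller.

```typescript
// Hypothetical result type: either a raw API key to look up, or a session fallback.
type AuthSource =
  | { kind: "api-key"; key: string }
  | { kind: "session" };

// Headers are a plain record with lowercase names to keep the sketch self-contained.
function resolveAuthSource(headers: Record<string, string | undefined>): AuthSource {
  // 1. X-API-Key header (assumed to take precedence).
  const apiKey = headers["x-api-key"]?.trim();
  if (apiKey) return { kind: "api-key", key: apiKey };

  // 2. "Authorization: Bearer <key>".
  const auth = headers["authorization"]?.trim() ?? "";
  const token = /^Bearer\s+(\S+)$/i.exec(auth)?.[1];
  if (token) return { kind: "api-key", key: token };

  // 3. No key present: the caller falls back to the NextAuth session check.
  return { kind: "session" };
}
```

A caller would pass the "api-key" result to the Prisma Key lookup and treat "session" as the cue to check ADMIN_EMAIL or the admin role.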

2. Chat Completion Flow

POST /v1/chat/completions
  │
  ├── Parse body: model, messages, tools, stream, session_id, reasoning
  │
  ├── Reasoning Engine Path (if technique specified)
  │   └── ToT / SC / CoT → multi-call orchestration
  │
  ├── Tool Calling Path (if tools provided)
  │   └── AgentRuntimeService.runWithTools()
  │       ├── decideToolCalls()
  │       │   ├── callComposioForToolDecision() → primary model
  │       │   ├── if empty/fail → repair prompt → retry
  │       │   └── if still fail → llama-3.1-8b-instant fallback
  │       │
  │       ├── executeTool() for each tool call
  │       │   ├── name lookup in registry
  │       │   ├── args JSON.parse
  │       │   └── result (capped at 50KB)
  │       │
  │       └── provider.call() → final synthesis
  │           ├── primary model with tool results
  │           └── if empty → fallback model → last-resort summary
  │
  └── Passthrough Path (no tools, simple CoT)
      └── Direct Composio API call
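
The three-way branch above might look roughly like the following. It is a sketch under stated assumptions: the ChatRequestBody shape (in particular reasoning.technique) and the selectPath name are hypothetical, and the real handler may test these conditions differently.

```typescript
// Hypothetical request shape; only the field names listed in the flow are from the docs.
interface ChatRequestBody {
  model: string;
  messages: { role: string; content: string }[];
  tools?: unknown[];
  stream?: boolean;
  session_id?: string;
  reasoning?: { technique?: string }; // assumed shape
}

type ExecutionPath = "reasoning" | "tools" | "passthrough";

function selectPath(body: ChatRequestBody): ExecutionPath {
  // A named reasoning technique routes to the multi-call reasoning engine.
  if (body.reasoning?.technique) return "reasoning";
  // A non-empty tools array routes to AgentRuntimeService.runWithTools().
  if (body.tools && body.tools.length > 0) return "tools";
  // Everything else is a direct Composio call.
  return "passthrough";
}
```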

3. Session Persistence

After response (non-streaming only):
  ├── ensureConversation(session_id, userId)
  │   ├── lookup existing OR create new
  │   └── returns conversation.id
  │
  ├── saveChatMessage(convId, "user", lastMessage)
  │   ├── Prisma ChatMessage.create()
  │   └── if msgCount ≤ 2: auto-title from content
  │
  └── saveChatMessage(convId, "assistant", response)
      └── Prisma ChatMessage.create()
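
To make the lookup-or-create and auto-title steps concrete, here is an in-memory sketch. It stands in for the Prisma calls and is not the real code: InMemoryConversations, the 60-character title cut-off, counting messages after the insert, and titling only from user messages are all assumptions.

```typescript
type Role = "user" | "assistant";

interface Conversation {
  id: string;
  userId: string;
  title: string | null;
  messageCount: number;
}

// In-memory stand-in for the Conversation / ChatMessage tables.
class InMemoryConversations {
  private readonly byId = new Map<string, Conversation>();
  private nextId = 1;

  // Look up an existing conversation by session id, or create a new one.
  ensureConversation(sessionId: string | undefined, userId: string): Conversation {
    const existing = sessionId ? this.byId.get(sessionId) : undefined;
    if (existing) return existing;
    const id = sessionId ?? `conv_${this.nextId++}`;
    const created: Conversation = { id, userId, title: null, messageCount: 0 };
    this.byId.set(id, created);
    return created;
  }

  saveChatMessage(conversationId: string, role: Role, content: string): void {
    const conv = this.byId.get(conversationId);
    if (!conv) throw new Error(`Unknown conversation: ${conversationId}`);
    conv.messageCount += 1;
    // Auto-title while the conversation is still at most two messages long.
    if (role === "user" && conv.messageCount <= 2 && conv.title === null) {
      conv.title = content.slice(0, 60); // assumed title length
    }
  }
}
```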

4. Usage Logging

Each request → LLMUsageLog.create({
  model, keyId, success, latencyMs, error, ...
})
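
One way to produce such a log entry is to wrap the request handler and time it. The sketch below is illustrative only: withUsageLog and the writeLog callback are hypothetical stand-ins for the LLMUsageLog.create() call, and swallowing logging errors is an assumption rather than documented behavior.

```typescript
interface UsageLogEntry {
  model: string;
  keyId: string | null;
  success: boolean;
  latencyMs: number;
  error: string | null;
}

async function withUsageLog<T>(
  meta: { model: string; keyId: string | null },
  writeLog: (entry: UsageLogEntry) => Promise<void>, // stand-in for LLMUsageLog.create()
  handler: () => Promise<T>,
): Promise<T> {
  const started = Date.now();
  let success = false;
  let error: string | null = null;
  try {
    const result = await handler();
    success = true;
    return result;
  } catch (err) {
    error = err instanceof Error ? err.message : String(err);
    throw err;
  } finally {
    // Logging failures are swallowed so they never mask the handler's outcome.
    await writeLog({ ...meta, success, latencyMs: Date.now() - started, error }).catch(() => {});
  }
}
```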

Agent Runtime (agent-runtime.ts)

The AgentRuntimeService is the heart of tool calling. It has two main methods:

runWithTools(input)

1. if no tools → call LLM directly (simple chat)
2. Build inspection preflight messages (project analysis)
3. decideToolCalls(model, messages, tools) → { toolCalls, assistantText }
4. Normalize tool calls (filter to available tools)
5. Synthesize inspection tool calls if project analysis detected
6. Execute each tool sequentially
7. Call provider for final synthesis
8. If empty → fallback synthesis
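
Step 6, together with the executeTool() details from the request lifecycle (registry lookup, JSON.parse of the arguments, capped result), can be sketched as follows. The names ToolHandler, ToolCall and executeAll are hypothetical, the 50KB cap is approximated as a character count, and returning a JSON error string for unknown tools or bad arguments is an assumption.

```typescript
type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;
interface ToolCall { name: string; arguments: string } // arguments is a JSON string

const MAX_RESULT_CHARS = 50 * 1024; // assumption: 50KB treated as 51,200 characters

async function executeTool(registry: Map<string, ToolHandler>, call: ToolCall): Promise<string> {
  // Name lookup in the registry.
  const handler = registry.get(call.name);
  if (!handler) return JSON.stringify({ error: `Unknown tool: ${call.name}` });

  // Arguments arrive as a JSON string and must parse to an object.
  let args: Record<string, unknown>;
  try {
    const parsed: unknown = JSON.parse(call.arguments || "{}");
    if (typeof parsed !== "object" || parsed === null) throw new Error("not an object");
    args = parsed as Record<string, unknown>;
  } catch {
    return JSON.stringify({ error: `Invalid JSON arguments for ${call.name}` });
  }

  try {
    const result = await handler(args);
    const text = typeof result === "string" ? result : JSON.stringify(result ?? null);
    // Cap the result so one tool cannot flood the synthesis prompt.
    return text.length > MAX_RESULT_CHARS ? text.slice(0, MAX_RESULT_CHARS) : text;
  } catch (err) {
    return JSON.stringify({ error: err instanceof Error ? err.message : String(err) });
  }
}

// Tools run one after another, in the order the model requested them.
async function executeAll(registry: Map<string, ToolHandler>, calls: ToolCall[]): Promise<string[]> {
  const results: string[] = [];
  for (const call of calls) results.push(await executeTool(registry, call));
  return results;
}
```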

decideToolCalls(model, messages, tools)

This is the critical function that gets the model to output tool calls as JSON:

Attempt 1: Primary model
  Prompt: toolCallSystemPrompt + messages + tools
  → parseToolCallsOrNull()

Attempt 2: Repair prompt
  "Your previous response did not include valid tool_calls..."
  → parseToolCallsOrNull()

Attempt 3: Fallback model (llama-3.1-8b-instant)
  Primary model output + tools
  → parseToolCallsOrNull()

If all fail → return { toolCalls: null, assistantText }
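
The three attempts can be expressed as a short chain with the model call and the parser injected. This is a sketch, not the code in agent-runtime.ts: the CallModel and ParseToolCalls signatures, the way prompts are concatenated, and returning the first primary output as assistantText are assumptions. The repair prompt text is reproduced exactly as truncated above.

```typescript
interface ToolCall { name: string; arguments: string }
interface Decision { toolCalls: ToolCall[] | null; assistantText: string }

// Hypothetical injected dependencies.
type CallModel = (model: string, prompt: string) => Promise<string>;
type ParseToolCalls = (text: string) => ToolCall[] | null;

const FALLBACK_MODEL = "llama-3.1-8b-instant";
const REPAIR_PROMPT = "Your previous response did not include valid tool_calls...";

// A failed call is treated like an empty response so the chain keeps going.
async function safeCall(callModel: CallModel, model: string, prompt: string): Promise<string> {
  try {
    return await callModel(model, prompt);
  } catch {
    return "";
  }
}

async function decideToolCalls(
  model: string,
  basePrompt: string, // toolCallSystemPrompt + messages + tools, already rendered
  callModel: CallModel,
  parse: ParseToolCalls,
): Promise<Decision> {
  // Attempt 1: primary model.
  const first = await safeCall(callModel, model, basePrompt);
  const parsedFirst = parse(first);
  if (parsedFirst) return { toolCalls: parsedFirst, assistantText: first };

  // Attempt 2: same model, with the repair prompt appended.
  const second = await safeCall(callModel, model, `${basePrompt}\n\n${REPAIR_PROMPT}`);
  const parsedSecond = parse(second);
  if (parsedSecond) return { toolCalls: parsedSecond, assistantText: second };

  // Attempt 3: fallback model sees the primary output plus the tools.
  const third = await safeCall(callModel, FALLBACK_MODEL, `${basePrompt}\n\nPrimary model output:\n${first}`);
  const parsedThird = parse(third);
  if (parsedThird) return { toolCalls: parsedThird, assistantText: third };

  // All attempts failed: no tool calls, keep the primary model's text.
  return { toolCalls: null, assistantText: first };
}
```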

Fallback Chain Diagram

decideToolCalls(model, messages, tools)
│
├── Step 1: Primary model (any model)
│   ├── ✅ Valid JSON → return toolCalls
│   └── ❌ Invalid → go to Step 2
│
├── Step 2: Repair prompt (same model)
│   ├── ✅ Valid JSON → return toolCalls
│   └── ❌ Still invalid → go to Step 3
│
├── Step 3: Fallback (llama-3.1-8b-instant)
│   ├── ✅ Valid JSON → return toolCalls
│   └── ❌ Failed → return null
│
└── parseToolCallsOrNull() tries 4 regex strategies:
    1. Target JSON with "tool_calls" key
    2. Array extraction after "tool_calls":
    3. Markdown code blocks
    4. Greedy scan all JSON objects
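
A layered parser along these lines could implement the four strategies. This is a hedged sketch rather than the real parseToolCallsOrNull(): the ToolCall shape is assumed, the regular expressions are illustrative, and strategy 4 uses a brace-depth scan instead of a regex because nested objects are awkward to match with a single pattern.

```typescript
// Assumed tool-call shape; the real type is not shown in this document.
interface ToolCall { name: string; arguments: string }

const FENCE = "`".repeat(3); // Markdown code fence, built at runtime

function tryParse(text: string): unknown {
  try { return JSON.parse(text); } catch { return undefined; }
}

// Accepts { tool_calls: [...] } or a bare array; keeps entries with a string name.
function toToolCalls(value: unknown): ToolCall[] | null {
  const list = Array.isArray(value)
    ? value
    : (value as { tool_calls?: unknown } | null)?.tool_calls;
  if (!Array.isArray(list)) return null;
  const calls: ToolCall[] = [];
  for (const item of list) {
    if (typeof item !== "object" || item === null) continue;
    const { name, arguments: args } = item as { name?: unknown; arguments?: unknown };
    if (typeof name !== "string") continue;
    calls.push({ name, arguments: typeof args === "string" ? args : JSON.stringify(args ?? {}) });
  }
  return calls.length > 0 ? calls : null;
}

// Strategy 4 helper: every balanced {...} span, found by counting brace depth.
function scanObjects(text: string): string[] {
  const found: string[] = [];
  for (let start = text.indexOf("{"); start !== -1; start = text.indexOf("{", start + 1)) {
    let depth = 0;
    for (let i = start; i < text.length; i++) {
      if (text[i] === "{") depth++;
      else if (text[i] === "}" && --depth === 0) {
        found.push(text.slice(start, i + 1));
        break;
      }
    }
  }
  return found;
}

function parseToolCallsOrNull(text: string): ToolCall[] | null {
  const candidates: string[] = [];

  // 1. One JSON object containing a "tool_calls" key.
  const whole = /\{[\s\S]*"tool_calls"[\s\S]*\}/.exec(text)?.[0];
  if (whole) candidates.push(whole);

  // 2. The array that follows "tool_calls":
  const arr = /"tool_calls"\s*:\s*(\[[\s\S]*\])/.exec(text)?.[1];
  if (arr) candidates.push(arr);

  // 3. Contents of Markdown code blocks.
  const fenceRe = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE, "g");
  for (let m = fenceRe.exec(text); m !== null; m = fenceRe.exec(text)) {
    if (m[1]) candidates.push(m[1].trim());
  }

  // 4. Every balanced JSON object in the text.
  candidates.push(...scanObjects(text));

  for (const candidate of candidates) {
    const calls = toToolCalls(tryParse(candidate));
    if (calls) return calls;
  }
  return null;
}
```

Ordering the candidates from most to least specific means clean output is parsed by the first strategy, while noisy output still has a chance through the later ones.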

Tool Registry

Two tool systems exist in parallel:

New System (src/lib/tools/)

src/lib/tools/
├── packs/
│   ├── general.ts       → amadev.web.search, webfetch, websearch
│   ├── filesystem.ts    → file read/write/edit operations
│   ├── academic.ts      → paper search
│   ├── finance.ts       → financial data
│   ├── gemini.ts        → Gemini integration
│   ├── worldbank.ts     → World Bank data
│   └── dropbox.ts       → Dropbox integration
│
├── registry.ts          → Central tool registration
├── core/                → Tool execution middleware
├── types.ts             → TypeScript types
└── normalize.ts         → Tool normalization

Legacy System (src/lib/tool-registry.ts)

The legacy system is a singleton registry that agent-runtime.ts uses for tool lookup. Tools are registered via registry.register().
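
A singleton registry of this kind might be shaped like the sketch below. Only register() is named in this document; the RegisteredTool interface and the getInstance(), get() and list() methods are hypothetical, and the example tool is a stub.

```typescript
// Hypothetical tool shape for illustration.
interface RegisteredTool {
  name: string;
  description: string;
  execute: (args: Record<string, unknown>) => Promise<unknown>;
}

class ToolRegistry {
  private static instance: ToolRegistry | null = null;
  private readonly tools = new Map<string, RegisteredTool>();

  // Private constructor: the only way to obtain a registry is getInstance().
  private constructor() {}

  static getInstance(): ToolRegistry {
    if (ToolRegistry.instance === null) ToolRegistry.instance = new ToolRegistry();
    return ToolRegistry.instance;
  }

  // Later registrations under the same name replace earlier ones.
  register(tool: RegisteredTool): void {
    this.tools.set(tool.name, tool);
  }

  get(toolName: string): RegisteredTool | undefined {
    return this.tools.get(toolName);
  }

  list(): string[] {
    return Array.from(this.tools.keys());
  }
}

const registry = ToolRegistry.getInstance();
registry.register({
  name: "websearch",
  description: "Stub search tool used only in this example",
  execute: async (args) => ({ query: args["q"] ?? null, results: [] }),
});
```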


Database Schema

Key Tables

| Table | Purpose | Key Fields |
|---|---|---|
| Conversation | Chat sessions | id, title, userId |
| ChatMessage | Individual messages | id, conversationId, role, parts (JSON) |
| amallm_api_keys | API auth keys | id, key (unique), isActive, rateLimit |
| LLMUsageLog | Usage tracking | id, model, success, latencyMs |
| User | Auth users | id, email, role |
| AgentSession | Agent sessions | id, agentId, userId |
| AgentMessage | Agent messages | id, sessionId, role, content |

All tables use CUIDs for IDs and carry createdAt/updatedAt timestamps.


Streaming Architecture

Two streaming approaches:

  1. SSE Streaming (stream: true)

    • Non-tool paths: streamDeferredFakeSSE(contentPromise, model)
    • Tool paths: a Promise resolves the final content, which is then streamed as SSE
    • Format: data: {"choices":[{"delta":{"content":"..."}}]}\n\n

  2. Tool Call Streaming

    • Tool calls are emitted as chunks with finish_reason: "tool_calls"
    • Content is streamed per tool via delta.tool_calls
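
The deferred "fake" SSE idea (wait for the complete answer, then emit it as a series of delta chunks) can be illustrated as follows. This is a sketch and not streamDeferredFakeSSE() itself: fakeSseFrames, the chunkSize parameter, and returning an array instead of writing to a response stream are assumptions. The closing finish_reason: "stop" chunk and the data: [DONE] terminator follow the OpenAI streaming convention, which an OpenAI-compatible endpoint would be expected to mirror.

```typescript
// Serialize one SSE event in the documented "data: {...}\n\n" format.
function sseFrame(payload: unknown): string {
  return `data: ${JSON.stringify(payload)}\n\n`;
}

async function fakeSseFrames(
  contentPromise: Promise<string>,
  model: string,
  chunkSize = 20, // hypothetical chunk length in characters
): Promise<string[]> {
  // The full content is awaited first; only the delivery is chunked.
  const content = await contentPromise;
  const frames: string[] = [];
  for (let i = 0; i < content.length; i += chunkSize) {
    frames.push(sseFrame({ model, choices: [{ delta: { content: content.slice(i, i + chunkSize) } }] }));
  }
  // Final chunk with a finish reason, then the stream terminator.
  frames.push(sseFrame({ model, choices: [{ delta: {}, finish_reason: "stop" }] }));
  frames.push("data: [DONE]\n\n");
  return frames;
}
```

In a route handler, each frame would be written to the response body in order; clients see a normal SSE stream even though no tokens were streamed from the upstream model.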

Key Design Decisions

1. Universal Tool Call Fallback

Every model, not just "weak" ones, gets the llama-3.1-8b-instant fallback for JSON tool decisions, because most models behind the Composio proxy can't reliably output tool-call JSON.

2. Prompt-Based Tool Calling

Native OpenAI tool_choice is not used because the Composio proxy doesn't pass the tools parameter through natively. Instead, tool definitions are injected into the system prompt and the model must output raw JSON.
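
Injecting tool definitions into a system prompt could look something like this. The wording of the prompt, the ToolDefinition shape, and the buildToolCallSystemPrompt name are all hypothetical; the actual toolCallSystemPrompt is not reproduced in this document.

```typescript
// Hypothetical tool definition, loosely modeled on OpenAI function schemas.
interface ToolDefinition {
  name: string;
  description: string;
  parameters: Record<string, unknown>;
}

function buildToolCallSystemPrompt(tools: ToolDefinition[]): string {
  // One catalog entry per tool, with its parameter schema serialized inline.
  const catalog = tools
    .map((t) => `- ${t.name}: ${t.description}\n  parameters: ${JSON.stringify(t.parameters)}`)
    .join("\n");
  return [
    "You can call the tools listed below.",
    'Respond with raw JSON only, in the form {"tool_calls":[{"name":"<tool>","arguments":{}}]}.',
    "Do not wrap the JSON in prose or Markdown.",
    "Available tools:",
    catalog,
  ].join("\n");
}
```

Because the model's reply is plain text rather than a structured tool_calls field, the output still has to go through a tolerant parser before any tool is executed.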

3. Separate Tool Decision & Synthesis

Tool decision and final synthesis are separate LLM calls. The tool decision can be served by the fallback model (cheap, reliable JSON), while final answer generation uses the primary model (higher quality).

4. Session Save Order

User messages are saved first with await to ensure correct chronological ordering. Assistant messages are saved with void as fire-and-forget.
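
The await-then-void pattern can be sketched like this. persistTurn and the SaveMessage signature are hypothetical stand-ins for saveChatMessage(), and the .catch() handler is an added assumption so that a failed background save does not surface as an unhandled rejection.

```typescript
type Role = "user" | "assistant";
type SaveMessage = (conversationId: string, role: Role, content: string) => Promise<void>;

async function persistTurn(
  save: SaveMessage,
  conversationId: string,
  userText: string,
  assistantText: string,
): Promise<void> {
  // Awaited: the user row is fully written before the assistant save is issued,
  // so the two rows get their timestamps in chronological order.
  await save(conversationId, "user", userText);

  // Fire-and-forget: `void` discards the promise so the caller is not delayed.
  void save(conversationId, "assistant", assistantText).catch((err) => {
    console.error("assistant message save failed", err);
  });
}
```

The trade-off is that a failed assistant save is only logged, never reported to the client.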

5. Prisma Accelerate

The database is accessed through Prisma Accelerate rather than a direct PostgreSQL connection, so prisma migrate dev is not used. Schema changes require raw SQL ALTER TABLE statements.


Config Files

| File | Purpose |
|---|---|
| config/models.json | Active model registry + defaults |
| .env / .env.local | Environment variables |
| next.config.ts | Next.js configuration |
| vercel.json | Vercel deployment config |