Amadev Architecture

System Overview

Amadev is a self-hosted AI agent runtime engine. It exposes an OpenAI-compatible API that routes requests through a multi-model LLM proxy (Composio), with built-in tool calling, session persistence, reasoning techniques, and admin monitoring.

┌─────────────────────────────────────────────────────────────┐
│                      HTTP Clients                            │
│  (Mobile App / CLI / SDK / curl / Postman)                  │
└─────────────────────────┬───────────────────────────────────┘
                          │
                          ▼
┌──────────────────────────────────────────────────────────────┐
│  Next.js 16 — App Router — TypeScript                        │
│                                                              │
│  ┌──────────────┐  ┌────────────────┐  ┌────────────────┐   │
│  │ /v1/chat/    │  │ /v1/           │  │ /admin/        │   │
│  │ completions  │  │ conversations   │  │ usage/apikey   │   │
│  │              │  │ sessions        │  │ /llm/settings  │   │
│  │   runWith    │  │ agents          │  └────────────────┘   │
│  │   Tools()    │  └────────────────┘                        │
│  └──────┬───────┘                                           │
└─────────┼────────────────────────────────────────────────────┘
          │
          ▼
┌────────────────────────┐     ┌──────────────────────────┐
│  Prisma (PostgreSQL)   │     │  Composio API            │
│                        │     │  ────────────────────    │
│  ┌──────────────────┐  │     │  backend.composio.dev    │
│  │ Conversation     │  │     │  POST /api/v3.1/tools/   │
│  │ ChatMessage      │  │     │  execute/...             │
│  │ Key (API Keys)   │  │     │                          │
│  │ LLMUsageLog      │  │     │  ┌──────────────────┐    │
│  │ User/Session     │  │     │  │ Groq / Llama     │    │
│  └──────────────────┘  │     │  │ OpenAI / Vision  │    │
└────────────────────────┘     │  │ Custom models    │    │
                               │  └──────────────────┘    │
                               └──────────────────────────┘

Request Lifecycle

1. Authentication Layer

Request → validateAmaLLMApiKey()
          ├── X-API-Key header → Prisma Key lookup
          ├── Bearer token → Prisma Key lookup  
          └── No API key → Fallback to NextAuth session
               └── Check ADMIN_EMAIL or admin role
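The lookup order above can be sketched as a small helper that pulls a candidate key out of the request headers. This is a minimal sketch, not the body of `validateAmaLLMApiKey()`: the name `extractApiKey`, the plain header record, and the rule that `X-API-Key` wins over a Bearer token when both are present are assumptions. The Prisma lookup and the NextAuth fallback are left out.

```typescript
// Hypothetical helper (not from the Amadev source).
type HeaderMap = Record<string, string | undefined>;

// Returns the API key candidate, or null when the caller should
// fall back to the NextAuth session check.
function extractApiKey(headers: HeaderMap): string | null {
  // HTTP header names are case-insensitive, so normalize them first.
  const lower: HeaderMap = {};
  for (const [name, value] of Object.entries(headers)) {
    lower[name.toLowerCase()] = value;
  }

  // Assumption: X-API-Key takes precedence over Authorization.
  const direct = lower["x-api-key"]?.trim();
  if (direct) return direct;

  const auth = lower["authorization"]?.trim() ?? "";
  const match = /^Bearer\s+(\S+)$/i.exec(auth);
  return match?.[1] ?? null;
}
```

A `null` result corresponds to the "No API key" branch, where the session's `ADMIN_EMAIL` or admin role is checked instead.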

2. Chat Completion Flow

POST /v1/chat/completions
  │
  ├── Parse body: model, messages, tools, stream, session_id, reasoning
  │
  ├── Reasoning Engine Path (if technique specified)
  │   └── ToT / SC / CoT → multi-call orchestration
  │
  ├── Tool Calling Path (if tools provided)
  │   └── AgentRuntimeService.runWithTools()
  │       ├── decideToolCalls()
  │       │   ├── callComposioForToolDecision() → primary model
  │       │   ├── if empty/fail → repair prompt → retry
  │       │   └── if still fail → llama-3.1-8b-instant fallback
  │       │
  │       ├── executeTool() for each tool call
  │       │   ├── name lookup in registry
  │       │   ├── args JSON.parse
  │       │   └── result (capped at 50KB)
  │       │
  │       └── provider.call() → final synthesis
  │           ├── primary model with tool results
  │           └── if empty → fallback model → last-resort summary
  │
  └── Passthrough Path (no tools, simple CoT)
      └── Direct Composio API call
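The `executeTool()` step in the flow above (registry lookup, `JSON.parse` of the arguments, capped result) might look roughly like the sketch below. The type shapes and error messages are assumptions. Handlers are synchronous here to keep the example short, and the cap is applied to characters rather than bytes for simplicity.

```typescript
// Assumed shapes; the real registry and tool-call types may differ.
type ToolHandler = (args: Record<string, unknown>) => string;
interface ToolCall { name: string; arguments: string }

// "capped at 50KB": approximated here as 50 * 1024 characters.
const MAX_RESULT_CHARS = 50 * 1024;

function executeTool(registry: Map<string, ToolHandler>, call: ToolCall): string {
  // 1. Name lookup in the registry.
  const handler = registry.get(call.name);
  if (!handler) return JSON.stringify({ error: `Unknown tool: ${call.name}` });

  // 2. Arguments arrive as a JSON string and must be parsed.
  let args: Record<string, unknown>;
  try {
    args = JSON.parse(call.arguments || "{}");
  } catch {
    return JSON.stringify({ error: "Tool arguments are not valid JSON" });
  }

  // 3. Run the tool and truncate oversized results.
  const result = handler(args);
  return result.length > MAX_RESULT_CHARS ? result.slice(0, MAX_RESULT_CHARS) : result;
}
```

Returning errors as JSON strings, rather than throwing, lets the synthesis step see the failure as an ordinary tool result.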

3. Session Persistence

After response (non-streaming only):
  ├── ensureConversation(session_id, userId)
  │   ├── lookup existing OR create new
  │   └── returns conversation.id
  │
  ├── saveChatMessage(convId, "user", lastMessage)
  │   ├── Prisma ChatMessage.create()
  │   └── if msgCount ≤ 2: auto-title from content
  │
  └── saveChatMessage(convId, "assistant", response)
      └── Prisma ChatMessage.create()
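The persistence steps above can be illustrated with an in-memory stand-in for the Prisma calls. This is a sketch under stated assumptions: `InMemoryChatStore` and `titleOf` are hypothetical, the `session_id` doubles as the conversation id, only user messages set the title, and the 60-character title length is invented for illustration.

```typescript
type Role = "user" | "assistant";
interface Conversation { id: string; userId: string; title: string | null }
interface ChatMessage { conversationId: string; role: Role; content: string }

// Hypothetical in-memory replacement for the Conversation/ChatMessage tables.
class InMemoryChatStore {
  private conversations = new Map<string, Conversation>();
  private messages: ChatMessage[] = [];

  // Look up the conversation for this session, or create it.
  ensureConversation(sessionId: string, userId: string): string {
    let conv = this.conversations.get(sessionId);
    if (!conv) {
      conv = { id: sessionId, userId, title: null }; // assumption: id = session_id
      this.conversations.set(sessionId, conv);
    }
    return conv.id;
  }

  saveChatMessage(convId: string, role: Role, content: string): void {
    this.messages.push({ conversationId: convId, role, content });
    const conv = this.conversations.get(convId);
    const msgCount = this.messages.filter((m) => m.conversationId === convId).length;
    // Auto-title while the conversation is still new (msgCount <= 2).
    if (conv && role === "user" && msgCount <= 2 && conv.title === null) {
      conv.title = content.slice(0, 60); // assumption: 60-char titles
    }
  }

  titleOf(convId: string): string | null {
    return this.conversations.get(convId)?.title ?? null;
  }
}
```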

4. Usage Logging

Each request → LLMUsageLog.create({
  model, keyId, success, latencyMs, error, ...
})
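A usage record like the one above could be assembled by a small pure function before being handed to `LLMUsageLog.create()`. The helper name `buildUsageLogEntry` and the exact field types are assumptions. Only the fields named in the source are included, since the rest are elided there.

```typescript
interface UsageLogEntry {
  model: string;
  keyId: string | null;
  success: boolean;
  latencyMs: number;
  error: string | null;
}

// Hypothetical helper: derives success, latency, and error text for one request.
function buildUsageLogEntry(
  model: string,
  keyId: string | null,
  startedAtMs: number,
  finishedAtMs: number,
  error?: unknown,
): UsageLogEntry {
  return {
    model,
    keyId,
    success: error === undefined,
    // Guard against clock adjustments producing a negative duration.
    latencyMs: Math.max(0, finishedAtMs - startedAtMs),
    error: error === undefined ? null : error instanceof Error ? error.message : String(error),
  };
}
```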

Agent Runtime (`agent-runtime.ts`)

AgentRuntimeService is the core of tool calling. It has two main methods:

`runWithTools(input)`

1. if no tools → call LLM directly (simple chat)
2. Build inspection preflight messages (project analysis)
3. decideToolCalls(model, messages, tools) → { toolCalls, assistantText }
4. Normalize tool calls (filter to available tools)
5. Synthesize inspection tool calls if project analysis detected
6. Execute each tool sequentially
7. Call provider for final synthesis
8. If empty → fallback synthesis

`decideToolCalls(model, messages, tools)`

The critical function that makes the model output JSON tool calls:

Attempt 1: Primary model
  Prompt: toolCallSystemPrompt + messages + tools
  → parseToolCallsOrNull()

Attempt 2: Repair prompt
  "Your previous response did not include valid tool_calls..."
  → parseToolCallsOrNull()

Attempt 3: Fallback model (llama-3.1-8b-instant)
  Primary model output + tools
  → parseToolCallsOrNull()

If all fail → return { toolCalls: null, assistantText }
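The three attempts above can be sketched as a single async function with the model call and the parser injected. This is an illustration only. The name `decideToolCallsSketch`, its signature, and the way the fallback prompt is assembled are assumptions, and the repair prompt is quoted only as far as the source shows it.

```typescript
interface ToolCall { name: string; arguments: string }
interface Decision { toolCalls: ToolCall[] | null; assistantText: string }

// Hypothetical dependency signatures, injected so the chain can be tested.
type CallModel = (model: string, prompt: string) => Promise<string>;
type Parse = (text: string) => ToolCall[] | null;

const FALLBACK_MODEL = "llama-3.1-8b-instant";
// Truncated in the source; the full wording is not known here.
const REPAIR_PROMPT = "Your previous response did not include valid tool_calls...";

async function decideToolCallsSketch(
  model: string,
  prompt: string,
  callModel: CallModel,
  parse: Parse,
): Promise<Decision> {
  // Attempt 1: primary model. A failed call is treated like empty output.
  const first = await callModel(model, prompt).catch(() => "");
  let toolCalls = parse(first);
  if (toolCalls) return { toolCalls, assistantText: first };

  // Attempt 2: same model, with the repair prompt appended.
  const repaired = await callModel(model, `${prompt}\n\n${REPAIR_PROMPT}`).catch(() => "");
  toolCalls = parse(repaired);
  if (toolCalls) return { toolCalls, assistantText: repaired };

  // Attempt 3: fallback model sees the primary model's output.
  const fallback = await callModel(
    FALLBACK_MODEL,
    `${prompt}\n\nPrimary model output:\n${first}`,
  ).catch(() => "");
  // toolCalls stays null if this attempt also fails.
  return { toolCalls: parse(fallback), assistantText: first };
}
```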

Fallback Chain Diagram

decideToolCalls(model, messages, tools)
│
├── Step 1: Primary model (any model)
│   ├── ✅ Valid JSON → return toolCalls
│   └── ❌ Invalid → go to Step 2
│
├── Step 2: Repair prompt (same model)
│   ├── ✅ Valid JSON → return toolCalls
│   └── ❌ Still invalid → go to Step 3
│
├── Step 3: Fallback (llama-3.1-8b-instant)
│   ├── ✅ Valid JSON → return toolCalls
│   └── ❌ Failed → return null
│
└── parseToolCallsOrNull() tries 4 regex strategies:
    1. Target JSON with "tool_calls" key
    2. Array extraction after "tool_calls":
    3. Markdown code blocks
    4. Greedy scan all JSON objects
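The four parsing strategies listed above can be approximated as follows. This is not the actual `parseToolCallsOrNull()` implementation: it mixes regular expressions with a bracket scanner instead of using regexes throughout, and the accepted tool-call shapes (`{ name, arguments }` or an OpenAI-style `{ function: { name, arguments } }`) are assumptions.

```typescript
interface ToolCall { name: string; arguments: string }
const FENCE = "`".repeat(3); // markdown code-fence marker

function toToolCall(raw: unknown): ToolCall | null {
  if (typeof raw !== "object" || raw === null) return null;
  const obj = raw as Record<string, unknown>;
  const inner = obj["function"];
  const fn = (typeof inner === "object" && inner !== null ? inner : obj) as Record<string, unknown>;
  const name = fn["name"];
  if (typeof name !== "string" || name.length === 0) return null;
  const args = fn["arguments"];
  return { name, arguments: typeof args === "string" ? args : JSON.stringify(args ?? {}) };
}

function tryParse(text: string): unknown {
  try { return JSON.parse(text); } catch { return undefined; }
}

// Accepts a {"tool_calls": [...]} wrapper, a bare array, or a single call object.
function fromParsed(value: unknown): ToolCall[] | null {
  if (typeof value !== "object" || value === null) return null;
  const wrapped = (value as Record<string, unknown>)["tool_calls"];
  const list = Array.isArray(value) ? value : Array.isArray(wrapped) ? wrapped : [value];
  const calls = list.map(toToolCall).filter((c): c is ToolCall => c !== null);
  return calls.length > 0 ? calls : null;
}

// Returns the balanced {...} or [...] span starting at `start`, skipping string literals.
function balancedSpan(text: string, start: number): string | null {
  const open = text[start];
  const close = open === "{" ? "}" : "]";
  let depth = 0;
  let inString = false;
  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (ch === "\\") i++; // skip the escaped character
      else if (ch === '"') inString = false;
    } else if (ch === '"') inString = true;
    else if (ch === open) depth++;
    else if (ch === close) {
      depth--;
      if (depth === 0) return text.slice(start, i + 1);
    }
  }
  return null;
}

function parseToolCallsSketch(text: string): ToolCall[] | null {
  // 1. The whole response is JSON carrying a "tool_calls" key.
  const whole = fromParsed(tryParse(text.trim()));
  if (whole) return whole;

  // 2. The array that follows a "tool_calls": marker.
  const marker = /"tool_calls"\s*:\s*\[/.exec(text);
  if (marker) {
    const span = balancedSpan(text, marker.index + marker[0].length - 1);
    const calls = span ? fromParsed(tryParse(span)) : null;
    if (calls) return calls;
  }

  // 3. Markdown code blocks.
  const fenceRe = new RegExp(FENCE + "(?:json)?\\s*([\\s\\S]*?)" + FENCE, "g");
  let m: RegExpExecArray | null;
  while ((m = fenceRe.exec(text)) !== null) {
    const calls = fromParsed(tryParse((m[1] ?? "").trim()));
    if (calls) return calls;
  }

  // 4. Scan every balanced {...} object in the text.
  for (let i = text.indexOf("{"); i !== -1; i = text.indexOf("{", i + 1)) {
    const span = balancedSpan(text, i);
    const calls = span ? fromParsed(tryParse(span)) : null;
    if (calls) return calls;
  }
  return null;
}
```

Ordering the strategies from strictest to loosest means a well-formed response is never reinterpreted by the greedy scan.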

Tool Registry

Two tool systems exist in parallel:

New System (`src/lib/tools/`)

src/lib/tools/
├── packs/
│   ├── general.ts       → amadev.web.search, webfetch, websearch
│   ├── filesystem.ts    → file read/write/edit operations
│   ├── academic.ts      → paper search
│   ├── finance.ts       → financial data
│   ├── gemini.ts        → Gemini integration
│   ├── worldbank.ts     → World Bank data
│   └── dropbox.ts       → Dropbox integration
│
├── registry.ts          → Central tool registration
├── core/                → Tool execution middleware
├── types.ts             → TypeScript types
└── normalize.ts         → Tool normalization

Legacy System (`src/lib/tool-registry.ts`)

A singleton registry that agent-runtime.ts uses for tool lookup. Tools are registered via registry.register().
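A singleton registry of this kind is commonly written as a class with a private constructor and a shared instance. The sketch below shows only that general pattern. Apart from `register()`, every name (`ToolRegistry`, `getInstance`, `get`, `list`, the `RegisteredTool` shape) is an assumption and may not match `src/lib/tool-registry.ts`.

```typescript
type ToolHandler = (args: Record<string, unknown>) => Promise<string>;
interface RegisteredTool { name: string; description: string; handler: ToolHandler }

class ToolRegistry {
  private static instance: ToolRegistry | null = null;
  private tools = new Map<string, RegisteredTool>();

  // Private constructor: callers must go through getInstance().
  private constructor() {}

  static getInstance(): ToolRegistry {
    if (!ToolRegistry.instance) ToolRegistry.instance = new ToolRegistry();
    return ToolRegistry.instance;
  }

  // Registering the same name twice replaces the earlier entry.
  register(tool: RegisteredTool): void {
    this.tools.set(tool.name, tool);
  }

  get(name: string): RegisteredTool | undefined {
    return this.tools.get(name);
  }

  list(): string[] {
    return [...this.tools.keys()];
  }
}

const registry = ToolRegistry.getInstance();
registry.register({
  name: "websearch",
  description: "Search the web (placeholder handler)",
  handler: async (args) => `results for ${String(args["query"])}`,
});
```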

Database Schema

Key Tables

| Table | Purpose | Key Fields |
|---|---|---|
| `Conversation` | Chat sessions | id, title, userId |
| `ChatMessage` | Individual messages | id, conversationId, role, parts (JSON) |
| `amallm_api_keys` | API auth keys | id, key (unique), isActive, rateLimit |
| `LLMUsageLog` | Usage tracking | id, model, success, latencyMs |
| `User` | Auth users | id, email, role |
| `AgentSession` | Agent sessions | id, agentId, userId |
| `AgentMessage` | Agent messages | id, sessionId, role, content |

All tables use CUIDs for IDs and createdAt/updatedAt timestamps.

Streaming Architecture

Two streaming approaches:

SSE Streaming (stream: true)
- Non-tool paths: streamDeferredFakeSSE(contentPromise, model)
- Tool paths: a Promise resolves to the final content, which is then streamed as SSE
- Format: data: {"choices":[{"delta":{"content":"..."}}]}\n\n

Tool Call Streaming
- Tool calls are emitted as chunks with finish_reason: "tool_calls"
- Content is streamed per tool via delta.tool_calls
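Because the content is already complete before streaming starts, the SSE path amounts to slicing a finished string into `data:` events. The sketch below illustrates that idea and is not the body of `streamDeferredFakeSSE`. The helper name `toSSEEvents`, the chunk size, the extra `object`, `index`, and `finish_reason` fields, and the closing `data: [DONE]` event (an OpenAI streaming convention) are assumptions.

```typescript
// Hypothetical helper: turns already-complete content into OpenAI-style SSE events.
function toSSEEvents(content: string, model: string, chunkSize = 20): string[] {
  // Each event is "data: <json>" followed by a blank line.
  const frame = (delta: Record<string, unknown>, finishReason: string | null): string =>
    `data: ${JSON.stringify({
      object: "chat.completion.chunk",
      model,
      choices: [{ index: 0, delta, finish_reason: finishReason }],
    })}\n\n`;

  const events: string[] = [];
  for (let i = 0; i < content.length; i += chunkSize) {
    events.push(frame({ content: content.slice(i, i + chunkSize) }, null));
  }
  events.push(frame({}, "stop")); // final chunk carries the finish reason
  events.push("data: [DONE]\n\n"); // assumed terminator, as in the OpenAI API
  return events;
}
```

A client that concatenates every `delta.content` value recovers the original text.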

Key Design Decisions

1. Universal Tool Call Fallback

Every model, not only the "weak" ones, falls back to llama-3.1-8b-instant for JSON tool decisions, because most models reached through the Composio proxy cannot reliably emit tool-call JSON.

2. Prompt-Based Tool Calling

Amadev does not use native OpenAI tool calling (tools / tool_choice) because the Composio proxy does not forward the tools parameter. Instead, tool definitions are injected into the system prompt and the model must output raw JSON.
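Injecting tool definitions into the system prompt could look like the sketch below. The function name `buildToolCallSystemPrompt`, the `ToolDefinition` shape, and all of the prompt wording are invented for illustration. The real `toolCallSystemPrompt` is not shown in this document.

```typescript
interface ToolDefinition {
  name: string;
  description: string;
  parameters: Record<string, unknown>; // JSON Schema for the arguments
}

// Hypothetical prompt builder: lists the tools and asks for raw JSON output.
function buildToolCallSystemPrompt(tools: ToolDefinition[]): string {
  const catalog = tools
    .map((t) => `- ${t.name}: ${t.description}\n  parameters: ${JSON.stringify(t.parameters)}`)
    .join("\n");

  return [
    "You can call the following tools:",
    catalog,
    "",
    "Respond with raw JSON only, in the form:",
    '{"tool_calls":[{"name":"<tool name>","arguments":{}}]}',
    'If no tool is needed, respond with {"tool_calls":[]}.',
  ].join("\n");
}
```

Since the model only sees text, the response still has to pass through a tolerant parser before any tool is executed.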

3. Separate Tool Decision & Synthesis

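Tool decision and final synthesis are separate LLM calls. The tool decision can fall back to llama-3.1-8b-instant (cheap, reliable JSON), while final answer generation stays on the primary model (higher quality) unless it returns an empty response.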

4. Session Save Order

User messages are saved first with await to ensure correct chronological ordering. Assistant messages are saved with void as fire-and-forget.
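The await/void split can be sketched as follows. The helper `persistTurn` and the `SaveFn` signature are hypothetical. The `.catch` on the un-awaited call is an added assumption so that a failed assistant save does not surface as an unhandled rejection.

```typescript
type Role = "user" | "assistant";
type SaveFn = (convId: string, role: Role, content: string) => Promise<void>;

// Hypothetical helper showing the save order described above.
async function persistTurn(
  save: SaveFn,
  convId: string,
  userText: string,
  assistantText: string,
): Promise<void> {
  // Awaited: the user message is stored before the assistant save begins.
  await save(convId, "user", userText);

  // Fire-and-forget: started but not awaited, so the response is not delayed.
  void save(convId, "assistant", assistantText).catch((err: unknown) => {
    console.error("assistant message save failed", err);
  });
}
```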

5. Prisma Accelerate

Amadev uses Prisma Accelerate rather than a direct PostgreSQL connection, so prisma migrate dev is not used. Schema changes require raw SQL ALTER TABLE statements.

Config Files

| File | Purpose |
|---|---|
| `config/models.json` | Active model registry + defaults |
| `.env` / `.env.local` | Environment variables |
| `next.config.ts` | Next.js configuration |
| `vercel.json` | Vercel deployment config |
