Amadev Architecture
System Overview
Amadev is a self-hosted AI agent runtime engine. It exposes an OpenAI-compatible API that routes requests through a multi-model LLM proxy (Composio), with built-in tool calling, session persistence, reasoning techniques, and admin monitoring.
┌─────────────────────────────────────────────────────────────┐
│                        HTTP Clients                         │
│          (Mobile App / CLI / SDK / curl / Postman)          │
└─────────────────────────┬───────────────────────────────────┘
                          │
                          ▼
┌──────────────────────────────────────────────────────────────┐
│             Next.js 16 / App Router / TypeScript             │
│                                                              │
│  ┌──────────────┐  ┌────────────────┐  ┌────────────────┐    │
│  │ /v1/chat/    │  │ /v1/           │  │ /admin/        │    │
│  │ completions  │  │ conversations  │  │ usage/apikey   │    │
│  │              │  │ sessions       │  │ /llm/settings  │    │
│  │ runWith      │  │ agents         │  └────────────────┘    │
│  │ Tools()      │  └────────────────┘                        │
│  └──────┬───────┘                                            │
└─────────┼────────────────────────────────────────────────────┘
          │
          ▼
┌────────────────────────┐    ┌──────────────────────────┐
│  Prisma (PostgreSQL)   │    │  Composio API            │
│                        │    │  ────────────────────    │
│  ┌──────────────────┐  │    │  backend.composio.dev    │
│  │ Conversation     │  │    │  POST /api/v3.1/tools/   │
│  │ ChatMessage      │  │    │  execute/...             │
│  │ Key (API Keys)   │  │    │                          │
│  │ LLMUsageLog      │  │    │  ┌──────────────────┐    │
│  │ User/Session     │  │    │  │ Groq / Llama     │    │
│  └──────────────────┘  │    │  │ OpenAI / Vision  │    │
└────────────────────────┘    │  │ Custom models    │    │
                              │  └──────────────────┘    │
                              └──────────────────────────┘
Request Lifecycle
1. Authentication Layer
Request → validateAmaLLMApiKey()
  ├── X-API-Key header → Prisma Key lookup
  ├── Bearer token → Prisma Key lookup
  └── No API key → Fallback to NextAuth session
        └── Check ADMIN_EMAIL or admin role
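The lookup order above can be sketched as a small pure function. This is only an illustration: `extractCredential`, `HeaderMap`, and `Credential` are hypothetical names, and the real `validateAmaLLMApiKey()` also performs the Prisma lookup and the NextAuth fallback, which are left out here.

```typescript
// Hypothetical helper; the source only names validateAmaLLMApiKey().
type HeaderMap = Record<string, string | undefined>;

type Credential =
  | { kind: "api-key"; key: string }
  | { kind: "session" };

function extractCredential(headers: HeaderMap): Credential {
  // HTTP header names are case-insensitive, so normalise before lookup.
  const lower: HeaderMap = {};
  for (const name of Object.keys(headers)) {
    lower[name.toLowerCase()] = headers[name];
  }

  // 1. X-API-Key header takes precedence.
  const direct = (lower["x-api-key"] ?? "").trim();
  if (direct) return { kind: "api-key", key: direct };

  // 2. Otherwise accept "Authorization: Bearer <key>".
  const match = /^Bearer\s+(\S+)$/i.exec((lower["authorization"] ?? "").trim());
  const token = match?.[1];
  if (token) return { kind: "api-key", key: token };

  // 3. No key: the caller would fall back to the NextAuth session check.
  return { kind: "session" };
}
```

An `api-key` result would then be checked against the Key table, while a `session` result would go through the ADMIN_EMAIL or admin-role check.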
2. Chat Completion Flow
POST /v1/chat/completions
  │
  ├── Parse body: model, messages, tools, stream, session_id, reasoning
  │
  ├── Reasoning Engine Path (if technique specified)
  │     └── ToT / SC / CoT → multi-call orchestration
  │
  ├── Tool Calling Path (if tools provided)
  │     └── AgentRuntimeService.runWithTools()
  │           ├── decideToolCalls()
  │           │     ├── callComposioForToolDecision() → primary model
  │           │     ├── if empty/fail → repair prompt → retry
  │           │     └── if still fail → llama-3.1-8b-instant fallback
  │           │
  │           ├── executeTool() for each tool call
  │           │     ├── name lookup in registry
  │           │     ├── args JSON.parse
  │           │     └── result (capped at 50 KB)
  │           │
  │           └── provider.call() → final synthesis
  │                 ├── primary model with tool results
  │                 └── if empty → fallback model → last-resort summary
  │
  └── Passthrough Path (no tools, simple CoT)
        └── Direct Composio API call
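The three-way branch above could be expressed roughly as follows. The body shape is an assumption based on the parsed fields listed in the flow; in particular, the `reasoning.technique` field and the precedence (reasoning before tools) are guesses, and `selectPath` is a hypothetical name.

```typescript
// Assumed request shape, based on the fields listed in the flow above.
interface ChatRequestBody {
  model: string;
  messages: { role: string; content: string }[];
  tools?: unknown[];
  stream?: boolean;
  session_id?: string;
  reasoning?: { technique?: string }; // field layout is an assumption
}

type Path = "reasoning" | "tools" | "passthrough";

function selectPath(body: ChatRequestBody): Path {
  // A specified reasoning technique routes to the reasoning engine.
  if (body.reasoning?.technique) return "reasoning";
  // A non-empty tools array routes to AgentRuntimeService.runWithTools().
  if (body.tools && body.tools.length > 0) return "tools";
  // Everything else goes straight to the Composio API.
  return "passthrough";
}
```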
3. Session Persistence
After response (non-streaming only):
  ├── ensureConversation(session_id, userId)
  │     ├── lookup existing OR create new
  │     └── returns conversation.id
  │
  ├── saveChatMessage(convId, "user", lastMessage)
  │     ├── Prisma ChatMessage.create()
  │     └── if msgCount ≤ 2: auto-title from content
  │
  └── saveChatMessage(convId, "assistant", response)
        └── Prisma ChatMessage.create()
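To make the persistence steps concrete, here is a minimal in-memory sketch. It stands in for the Prisma calls and makes several assumptions that are not in the source: the session id doubles as the conversation id, titles are cut to 60 characters, and an existing title is never overwritten.

```typescript
interface Conversation { id: string; userId: string; title: string | null }
interface ChatMessage { conversationId: string; role: "user" | "assistant"; content: string }

// In-memory stand-in for the Conversation and ChatMessage tables.
class InMemorySessionStore {
  private conversations = new Map<string, Conversation>();
  private messages: ChatMessage[] = [];

  // Look up an existing conversation or create a new one; returns its id.
  ensureConversation(sessionId: string, userId: string): string {
    let conv = this.conversations.get(sessionId);
    if (!conv) {
      conv = { id: sessionId, userId, title: null };
      this.conversations.set(sessionId, conv);
    }
    return conv.id;
  }

  saveChatMessage(convId: string, role: ChatMessage["role"], content: string): void {
    this.messages.push({ conversationId: convId, role, content });
    const conv = this.conversations.get(convId);
    const msgCount = this.messages.filter((m) => m.conversationId === convId).length;
    // Auto-title early in the conversation (the msgCount ≤ 2 rule above).
    if (conv && conv.title === null && msgCount <= 2) {
      conv.title = content.slice(0, 60); // 60 is an arbitrary illustrative limit
    }
  }

  titleOf(convId: string): string | null {
    return this.conversations.get(convId)?.title ?? null;
  }
}
```

Calling `ensureConversation()` twice with the same session id returns the same conversation, so repeated requests in one session append to one history.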
4. Usage Logging
Each request → LLMUsageLog.create({
model, keyId, success, latencyMs, error, ...
})
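One way to capture these fields is to wrap the request handler and record the outcome either way. The sketch below is synchronous for brevity (a real handler would be async), and `withUsageLog`, the `sink` callback, and the injectable clock are hypothetical; the actual code writes through `LLMUsageLog.create()`.

```typescript
interface UsageLogEntry {
  model: string;
  keyId: string | null;
  success: boolean;
  latencyMs: number;
  error: string | null;
}

// Runs `work`, then reports success or failure plus elapsed time to `sink`.
function withUsageLog<T>(
  meta: { model: string; keyId: string | null },
  work: () => T,
  sink: (entry: UsageLogEntry) => void,
  now: () => number = Date.now, // injectable clock, useful in tests
): T {
  const startedAt = now();
  try {
    const result = work();
    sink({ ...meta, success: true, latencyMs: now() - startedAt, error: null });
    return result;
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    sink({ ...meta, success: false, latencyMs: now() - startedAt, error: message });
    throw err; // logging must not swallow the original failure
  }
}
```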
Agent Runtime (agent-runtime.ts)
AgentRuntimeService is the heart of tool calling. It has two main methods:
runWithTools(input)
1. if no tools → call LLM directly (simple chat)
2. Build inspection preflight messages (project analysis)
3. decideToolCalls(model, messages, tools) → { toolCalls, assistantText }
4. Normalize tool calls (filter to available tools)
5. Synthesize inspection tool calls if project analysis detected
6. Execute each tool sequentially
7. Call provider for final synthesis
8. If empty → fallback synthesis
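Steps 4 and 6 (normalizing and executing tool calls) might look roughly like this. The types, the error strings, and the decision to count the 50 KB cap in characters are all assumptions made for the sketch; handlers are synchronous here only to keep it short.

```typescript
interface ToolCall { name: string; arguments: string } // arguments is a JSON string
type ToolHandler = (args: Record<string, unknown>) => string;

const MAX_RESULT_CHARS = 50 * 1024; // 50 KB cap, counted in characters for simplicity

// Step 4: keep only calls that name a tool offered in this request.
function normalizeToolCalls(calls: ToolCall[], available: Set<string>): ToolCall[] {
  return calls.filter((call) => available.has(call.name));
}

// Parse the JSON argument string; null means "not a JSON object".
function parseArgs(raw: string): Record<string, unknown> | null {
  try {
    const parsed: unknown = JSON.parse(raw || "{}");
    const isObject = typeof parsed === "object" && parsed !== null && !Array.isArray(parsed);
    return isObject ? (parsed as Record<string, unknown>) : null;
  } catch {
    return null;
  }
}

// Registry lookup, argument parsing, execution, then the size cap.
function executeTool(call: ToolCall, registry: Map<string, ToolHandler>): string {
  const handler = registry.get(call.name);
  if (!handler) return `Error: unknown tool "${call.name}"`;
  const args = parseArgs(call.arguments);
  if (args === null) return "Error: tool arguments must be a JSON object";
  const result = handler(args);
  return result.length > MAX_RESULT_CHARS ? result.slice(0, MAX_RESULT_CHARS) : result;
}

// Step 6: tools run one after another, in the order the model requested them.
function executeAll(calls: ToolCall[], registry: Map<string, ToolHandler>): string[] {
  return calls.map((call) => executeTool(call, registry));
}
```

Returning an error string instead of throwing lets the synthesis step see what went wrong; whether the real runtime does the same is not stated in the source.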
decideToolCalls(model, messages, tools)
This is the critical function that gets the model to output JSON tool calls:
Attempt 1: Primary model
  Prompt: toolCallSystemPrompt + messages + tools
  → parseToolCallsOrNull()
Attempt 2: Repair prompt
  "Your previous response did not include valid tool_calls..."
  → parseToolCallsOrNull()
Attempt 3: Fallback model (llama-3.1-8b-instant)
  Primary model output + tools
  → parseToolCallsOrNull()
If all fail → return { toolCalls: null, assistantText }
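The three attempts can be sketched with the model call and the parser injected as parameters. This is a simplified illustration, not the actual signature: messages and tools are collapsed into a single `prompt` string, `ModelCall` and `Parser` are hypothetical types, and the repair prompt is kept exactly as elided in the description above.

```typescript
interface ToolCall { name: string; arguments: string }
interface Decision { toolCalls: ToolCall[] | null; assistantText: string }

type ModelCall = (model: string, prompt: string) => Promise<string>;
type Parser = (text: string) => ToolCall[] | null;

const FALLBACK_MODEL = "llama-3.1-8b-instant";
// Wording is elided in the source, so it is left elided here too.
const REPAIR_PROMPT = "Your previous response did not include valid tool_calls...";

async function decideToolCalls(
  model: string,
  prompt: string,
  callModel: ModelCall,
  parse: Parser,
): Promise<Decision> {
  // Attempt 1: primary model. A thrown error is treated as an empty response.
  const assistantText = await callModel(model, prompt).catch(() => "");
  let toolCalls = parse(assistantText);
  if (toolCalls) return { toolCalls, assistantText };

  // Attempt 2: same model, with the repair prompt appended.
  const repaired = await callModel(model, `${prompt}\n\n${REPAIR_PROMPT}`).catch(() => "");
  toolCalls = parse(repaired);
  if (toolCalls) return { toolCalls, assistantText: repaired };

  // Attempt 3: the fallback model sees the primary model's output.
  const fallback = await callModel(FALLBACK_MODEL, `${prompt}\n\n${assistantText}`).catch(() => "");
  toolCalls = parse(fallback);

  // If every attempt failed, toolCalls is null and the primary text is kept.
  return { toolCalls, assistantText };
}
```

Injecting `callModel` and `parse` keeps the chain testable without network access.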
Fallback Chain Diagram
decideToolCalls(model, messages, tools)
  │
  ├── Step 1: Primary model (any model)
  │     ├── ✅ Valid JSON → return toolCalls
  │     └── ❌ Invalid → go to Step 2
  │
  ├── Step 2: Repair prompt (same model)
  │     ├── ✅ Valid JSON → return toolCalls
  │     └── ❌ Still invalid → go to Step 3
  │
  ├── Step 3: Fallback (llama-3.1-8b-instant)
  │     ├── ✅ Valid JSON → return toolCalls
  │     └── ❌ Failed → return null
  │
  └── parseToolCallsOrNull() tries 4 regex strategies:
        1. Target JSON with "tool_calls" key
        2. Array extraction after "tool_calls":
        3. Markdown code blocks
        4. Greedy scan all JSON objects
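A simplified take on the four parsing strategies is sketched below. It is not the project's parser: the accepted call shapes (`{name, arguments}` or `{function: {name, arguments}}`) are assumptions, and strategies 2 and 4 use bracket matching rather than pure regular expressions so that nested JSON is handled.

```typescript
interface ToolCall { name: string; arguments: string }

// Accepts {name, arguments} or {function: {name, arguments}} (assumed shapes).
function toToolCall(value: unknown): ToolCall | null {
  if (typeof value !== "object" || value === null) return null;
  const obj = value as Record<string, unknown>;
  const inner = obj["function"];
  const fn = typeof inner === "object" && inner !== null ? (inner as Record<string, unknown>) : obj;
  const name = fn["name"];
  if (typeof name !== "string") return null;
  const rawArgs = fn["arguments"];
  return { name, arguments: typeof rawArgs === "string" ? rawArgs : JSON.stringify(rawArgs ?? {}) };
}

// Turns an array of calls, or an object with a "tool_calls" array, into ToolCall[].
function fromValue(value: unknown): ToolCall[] | null {
  const list = Array.isArray(value)
    ? value
    : typeof value === "object" && value !== null
      ? (value as Record<string, unknown>)["tool_calls"]
      : null;
  if (!Array.isArray(list)) return null;
  const calls = list.map((item) => toToolCall(item)).filter((c): c is ToolCall => c !== null);
  return calls.length > 0 ? calls : null;
}

function tryParse(text: string): unknown {
  try { return JSON.parse(text); } catch { return undefined; }
}

// Returns the balanced {...} or [...] span starting at `start`, skipping string contents.
function balancedSpan(text: string, start: number): string | null {
  const open = text[start];
  const close = open === "{" ? "}" : open === "[" ? "]" : null;
  if (close === null) return null;
  let depth = 0;
  let inString = false;
  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (ch === "\\") i++; // skip the escaped character
      else if (ch === '"') inString = false;
    } else if (ch === '"') inString = true;
    else if (ch === open) depth++;
    else if (ch === close && --depth === 0) return text.slice(start, i + 1);
  }
  return null;
}

function parseToolCallsOrNull(text: string): ToolCall[] | null {
  // 1. The whole response is JSON carrying tool calls.
  const whole = fromValue(tryParse(text.trim()));
  if (whole) return whole;

  // 2. Extract the array that follows "tool_calls":
  const key = /"tool_calls"\s*:\s*\[/.exec(text);
  if (key) {
    const span = balancedSpan(text, key.index + key[0].length - 1);
    const calls = span ? fromValue(tryParse(span)) : null;
    if (calls) return calls;
  }

  // 3. Look inside Markdown code fences.
  const fence = /`{3}(?:json)?\s*([\s\S]*?)`{3}/g;
  for (let m = fence.exec(text); m !== null; m = fence.exec(text)) {
    const calls = fromValue(tryParse((m[1] ?? "").trim()));
    if (calls) return calls;
  }

  // 4. Scan every balanced {...} object; a bare call object counts as one call.
  for (let i = text.indexOf("{"); i !== -1; i = text.indexOf("{", i + 1)) {
    const span = balancedSpan(text, i);
    const value = span ? tryParse(span) : undefined;
    const calls = fromValue(value) ?? fromValue([value]);
    if (calls) return calls;
  }
  return null;
}
```

The strategies run from strictest to loosest, so well-formed output is accepted early and the broad scan only runs when nothing else matched.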
Tool Registry
Two parallel systems:
New System (src/lib/tools/)
src/lib/tools/
├── packs/
│   ├── general.ts     → amadev.web.search, webfetch, websearch
│   ├── filesystem.ts  → file read/write/edit operations
│   ├── academic.ts    → paper search
│   ├── finance.ts     → financial data
│   ├── gemini.ts      → Gemini integration
│   ├── worldbank.ts   → World Bank data
│   └── dropbox.ts     → Dropbox integration
├── registry.ts        → Central tool registration
├── core/              → Tool execution middleware
├── types.ts           → TypeScript types
└── normalize.ts       → Tool normalization
Legacy System (src/lib/tool-registry.ts)
Singleton registry used by agent-runtime.ts for tool lookup. Tools registered via registry.register().
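A singleton registry of this kind could be sketched as follows. Only `registry.register()` is named in the source; the `ToolDefinition` shape, the `lookup()` and `names()` methods, and the duplicate-name check are assumptions for illustration.

```typescript
interface ToolDefinition {
  name: string;
  description: string;
  execute: (args: Record<string, unknown>) => string;
}

class ToolRegistry {
  private static instance: ToolRegistry | null = null;
  private readonly tools = new Map<string, ToolDefinition>();

  // Private constructor: callers must go through ToolRegistry.get().
  private constructor() {}

  static get(): ToolRegistry {
    if (ToolRegistry.instance === null) ToolRegistry.instance = new ToolRegistry();
    return ToolRegistry.instance;
  }

  register(tool: ToolDefinition): void {
    // Rejecting duplicates is a choice made for this sketch.
    if (this.tools.has(tool.name)) throw new Error(`Tool already registered: ${tool.name}`);
    this.tools.set(tool.name, tool);
  }

  lookup(name: string): ToolDefinition | undefined {
    return this.tools.get(name);
  }

  names(): string[] {
    return Array.from(this.tools.keys());
  }
}

const registry = ToolRegistry.get();
registry.register({
  name: "websearch",
  description: "Placeholder search tool used only in this example",
  execute: (args) => `results for ${String(args["query"])}`,
});
```

Because every caller receives the same instance, the runtime can resolve a tool by name regardless of which module registered it.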
Database Schema
Key Tables
| Table | Purpose | Key Fields |
|---|---|---|
| Conversation | Chat sessions | id, title, userId |
| ChatMessage | Individual messages | id, conversationId, role, parts (JSON) |
| amallm_api_keys | API auth keys | id, key (unique), isActive, rateLimit |
| LLMUsageLog | Usage tracking | id, model, success, latencyMs |
| User | Auth users | id, email, role |
| AgentSession | Agent sessions | id, agentId, userId |
| AgentMessage | Agent messages | id, sessionId, role, content |
All tables use CUIDs for IDs and createdAt/updatedAt timestamps.
Streaming Architecture
Two streaming approaches:
- SSE Streaming (stream: true)
  - Non-tool paths: streamDeferredFakeSSE(contentPromise, model)
  - Tool paths: Promise resolves final content → streamed as SSE
  - Format: data: {"choices":[{"delta":{"content":"..."}}]}\n\n
- Tool Call Streaming
  - Tools result in finish_reason: "tool_calls" chunks
  - Content streamed per-tool with delta.tool_calls
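The SSE frame format above can be produced with a small generator. This sketch only illustrates the idea of "fake" streaming over already-computed content; the chunk size, the closing `finish_reason: "stop"` frame, and the `data: [DONE]` terminator follow the usual OpenAI streaming convention and are assumptions about this codebase.

```typescript
interface StreamChunk {
  choices: { delta: { content?: string }; finish_reason: string | null }[];
}

// One SSE frame: "data: <json>" followed by a blank line.
function toSSE(chunk: StreamChunk): string {
  return `data: ${JSON.stringify(chunk)}\n\n`;
}

// Splits finished content into SSE frames, then emits the closing frames.
function* fakeStream(content: string, chunkSize = 20): Generator<string> {
  for (let i = 0; i < content.length; i += chunkSize) {
    const piece = content.slice(i, i + chunkSize);
    yield toSSE({ choices: [{ delta: { content: piece }, finish_reason: null }] });
  }
  yield toSSE({ choices: [{ delta: {}, finish_reason: "stop" }] });
  yield "data: [DONE]\n\n"; // conventional end-of-stream marker (assumed)
}
```

In a route handler these frames would typically be written to a response with the `text/event-stream` content type.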
Key Design Decisions
1. Universal Tool Call Fallback
Every model, not just the "weak" ones, falls back to llama-3.1-8b-instant for JSON tool decisions, because most models served through the Composio proxy cannot reliably emit tool-call JSON.
2. Prompt-Based Tool Calling
Native OpenAI tool_choice is not used, because the Composio proxy does not pass the tools parameter through. Instead, tool definitions are injected into the system prompt and the model must output raw JSON.
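Injecting tool definitions into the system prompt might look like the sketch below. The wording of the prompt and the `ToolSpec` shape are invented for illustration; the source names a `toolCallSystemPrompt` but does not show its text.

```typescript
interface ToolSpec {
  name: string;
  description: string;
  parameters: Record<string, unknown>; // JSON Schema for the arguments
}

// Builds a system prompt that lists the tools and the expected JSON reply shape.
function buildToolCallSystemPrompt(tools: ToolSpec[]): string {
  const catalog = tools
    .map((t) => `- ${t.name}: ${t.description}\n  parameters: ${JSON.stringify(t.parameters)}`)
    .join("\n");
  return [
    "You can call the following tools:",
    catalog,
    "",
    "To call tools, reply with JSON only, in this shape:",
    '{"tool_calls":[{"name":"<tool name>","arguments":{}}]}',
    "If no tool is needed, answer in plain text.",
  ].join("\n");
}
```

Since the model's reply is free text rather than a structured tool_calls field, it still has to be parsed and validated before anything is executed.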
3. Separate Tool Decision & Synthesis
The tool decision can fall back to a cheap model that emits reliable JSON, while final answer generation stays on the primary model for higher quality.
4. Session Save Order
User messages are saved first with await to guarantee correct chronological ordering. Assistant messages are then saved with void as fire-and-forget writes.
5. Prisma Accelerate
The database is accessed through Prisma Accelerate rather than a direct PostgreSQL connection, so prisma migrate dev is not used. Schema changes require raw SQL ALTER TABLE statements.
Config Files
| File | Purpose |
|---|---|
| config/models.json | Active model registry + defaults |
| .env / .env.local | Environment variables |
| next.config.ts | Next.js configuration |
| vercel.json | Vercel deployment config |
