Introduction
relay is a local CLI agent that acts as a privilege-isolated conversational execution layer, mediating all communication between a high-privilege internal agent and real humans via WhatsApp, Telegram, and phone calls.
The Problem
Your core agent has filesystem access, code execution, internal APIs, and secrets. When it needs to talk to real humans — to confirm a delivery, validate information, or collect feedback — you don't want that same powerful agent exposed to external messaging channels.
You want communication without capability leakage.
The Solution
relay is a client-daemon pair. A persistent background daemon manages the WhatsApp, Telegram, and phone connections along with conversation state, and lightweight CLI commands provide full observability and control. Your main agent orchestrates human conversations programmatically through simple shell commands while maintaining strict privilege isolation.
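As a rough illustration of driving relay through shell commands, the Python sketch below wraps the CLI in a subprocess call. The subcommand names (`create`, `send`, `call`) appear in the architecture diagram. Everything else is an assumption made for illustration: the helper names, the environment allowlist, and whatever arguments a caller passes, since this introduction does not specify the CLI's argument syntax.

```python
import os
import subprocess
from typing import Mapping, Sequence

# Hypothetical allowlist: only these variables are forwarded to the relay CLI,
# so secrets held in the main agent's environment are not inherited by it.
SAFE_ENV_KEYS = ("PATH", "HOME", "LANG")


def scrubbed_env(env: Mapping[str, str]) -> dict[str, str]:
    """Return a copy of env containing only allowlisted keys."""
    return {k: v for k, v in env.items() if k in SAFE_ENV_KEYS}


def relay_argv(subcommand: str, args: Sequence[str] = ()) -> list[str]:
    """Build the argument vector for a relay subcommand (no shell involved)."""
    if not subcommand or subcommand.startswith("-"):
        raise ValueError(f"invalid relay subcommand: {subcommand!r}")
    return ["relay", subcommand, *args]


def run_relay(subcommand: str, args: Sequence[str] = (), timeout: float = 30.0) -> str:
    """Run a relay CLI command and return its stdout as text."""
    result = subprocess.run(
        relay_argv(subcommand, args),
        env=scrubbed_env(os.environ),
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

Passing an argument list instead of a shell string avoids quoting problems when message text contains spaces or shell metacharacters. Scrubbing the environment is one possible way to keep the main agent's secrets out of the child process. It is not a documented relay requirement.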
relay can:
- Send and receive WhatsApp messages
- Send and receive Telegram messages via a bot
- Place outbound phone calls via ElevenLabs Conversational AI
- Run structured conversations with objectives and todo lists
- Automatically follow up with unresponsive contacts (heartbeat system)
- Track conversation state through a defined state machine
- Persist all state across daemon restarts
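To make the state-machine idea concrete, here is a minimal Python sketch of conversation state tracking with an explicit transition table. The state names and allowed transitions are hypothetical, since this introduction does not list relay's actual states. Only the pattern is the point.

```python
from enum import Enum


class State(Enum):
    # Hypothetical state names; relay's actual states are not listed here.
    CREATED = "created"
    ACTIVE = "active"
    WAITING = "waiting"  # awaiting a reply from the contact
    COMPLETED = "completed"
    ABANDONED = "abandoned"


# Allowed transitions: any move not listed here is rejected.
TRANSITIONS: dict[State, frozenset[State]] = {
    State.CREATED: frozenset({State.ACTIVE}),
    State.ACTIVE: frozenset({State.WAITING, State.COMPLETED}),
    State.WAITING: frozenset({State.ACTIVE, State.ABANDONED}),
    State.COMPLETED: frozenset(),
    State.ABANDONED: frozenset(),
}


def transition(current: State, target: State) -> State:
    """Return target if the move is allowed, otherwise raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Because each state is a plain string value, a table like this can be written to disk and reloaded, which is one way state could survive a daemon restart.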
relay cannot:
- Execute arbitrary code
- Access your filesystem
- Use developer tools
- Call internal APIs
- Escalate privileges
Architecture
Main Agent (full privileges)
    │
    │  CLI commands (relay create, relay send, relay call, ...)
    ▼
relay daemon (localhost:3214)
    │  conversation-scoped tools only
    │
    ├── WhatsApp (Baileys v7)
    ├── Telegram (grammY bot)
    ├── Phone (ElevenLabs + Twilio)
    ▼
Human contacts
Product Principles
- Capability Isolation: Main Agent (full access) + Relay Agent (conversation-only). No shared execution context.
- API-First: Everything is programmable via CLI. No UI dependency. No hidden flows.
- Conversation as Execution Boundary: relay collects information, follows an internal checklist, and produces structured output. The main agent decides what to do with that output.
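The execution-boundary principle can be sketched in Python as a checklist that serializes to structured output, plus a main-agent function that only reads that output. The JSON shape, the class names, and the `next_action` helper are all assumptions made for illustration. relay's real output format is not described in this introduction.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class TodoItem:
    question: str
    answer: Optional[str] = None  # filled in from the contact's reply


@dataclass
class ConversationResult:
    objective: str
    todos: list[TodoItem] = field(default_factory=list)

    @property
    def complete(self) -> bool:
        # Complete only when there is at least one item and all are answered.
        return bool(self.todos) and all(t.answer is not None for t in self.todos)

    def to_json(self) -> str:
        """Serialize the checklist to the (hypothetical) structured output."""
        payload = asdict(self)
        payload["complete"] = self.complete
        return json.dumps(payload)


def next_action(raw: str) -> str:
    """Main-agent side: decide what to do based only on relay's output."""
    data = json.loads(raw)
    return "proceed" if data["complete"] else "wait"
```

The relay side only ever produces data, and the decision logic lives entirely in the main agent. That separation is what the principle describes.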