A CodeB virtual number is an AI agent. The model itself runs in the AI provider's cloud, but every other piece — the system prompt, the transcript, the tool whitelist, the audio path, the SIP credentials — lives on your own Windows host under your own admin control.
Diagram: caller, bridge, AI agent, prompt file, transcript file, tool calls.
Inbound
01 / Both call paths
Browser or phone — same agent
A visitor on room.html?dial=2000 and a PSTN caller dialling the configured DID land in the same AI agent. The bridge picks one of three answer paths (trunk, Webphone loopback, or public listener) and creates a per-call player for the session.
Bridge
02 / Live audio loop
Per-call WebSocket to the AI model
The bridge opens a fresh WebSocket to the AI model endpoint for each call. It streams the caller’s audio frames upstream (PCM 16 kHz), receives synthesised speech downstream (PCM 24 kHz), and resamples each direction to and from the call’s native rate (Opus at 48 kHz on the WebRTC half, µ-law at 8 kHz on the SIP half).
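The rate conversion in this loop can be illustrated with a minimal sketch. It assumes mono, signed 16-bit PCM that has already been decoded from µ-law or Opus, and it uses plain linear interpolation. The function name `resample_linear` is hypothetical and the bridge's actual resampler is not documented here; a production resampler would normally also low-pass filter before downsampling.

```python
from array import array


def resample_linear(samples: array, src_rate: int, dst_rate: int) -> array:
    """Resample mono signed 16-bit PCM by linear interpolation.

    Illustrative only: there is no anti-aliasing filter, so downsampling
    (for example 24 kHz -> 8 kHz) can alias high frequencies.
    """
    if src_rate <= 0 or dst_rate <= 0:
        raise ValueError("sample rates must be positive")
    if len(samples) == 0 or src_rate == dst_rate:
        return array("h", samples)

    out_len = len(samples) * dst_rate // src_rate
    last = len(samples) - 1
    out = array("h")
    for i in range(out_len):
        # Position of output sample i on the input timeline.
        pos = i * src_rate / dst_rate
        left = int(pos)
        if left >= last:
            # Past the final input sample: hold the last value.
            out.append(samples[last])
            continue
        frac = pos - left
        value = samples[left] + (samples[left + 1] - samples[left]) * frac
        out.append(int(round(value)))
    return out


# SIP half upstream: 8 kHz caller audio -> 16 kHz for the model.
upstream = resample_linear(array("h", [0, 100, 200, 300]), 8000, 16000)
# SIP half downstream: 24 kHz synthesised speech -> 8 kHz for the call.
downstream = resample_linear(array("h", [0, 10, 20, 30, 40, 50]), 24000, 8000)
```

Under these assumptions the upstream call doubles the sample count and the downstream call keeps every third sample. The WebRTC half would use the same function with 48000 as the call-side rate.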
Prompt
03 / Fresh on every call
prompts/<vnum>.txt, re-read live
The agent’s personality and rules live in App_Data/<tenant>/prompts/<vnum>.txt. The bridge never caches the file in memory; it reads it from disk on every call, so an edit takes effect on the next ring. Edit, save, dial: that is the whole test loop.
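A minimal sketch of the read-on-every-call behaviour follows. The function name `load_prompt`, the `app_data` parameter, and the UTF-8 encoding are assumptions for illustration; only the directory layout comes from the text above.

```python
from pathlib import Path


def load_prompt(app_data: Path, tenant: str, vnum: str) -> str:
    """Read the prompt for one virtual number straight from disk.

    Deliberately uncached: calling this at the start of each call means an
    edit to the file is picked up on the next ring.
    """
    path = app_data / tenant / "prompts" / f"{vnum}.txt"
    # Encoding is an assumption; the source does not state one.
    return path.read_text(encoding="utf-8")
```

The design trade is one small file read per call in exchange for having no cache to invalidate and no service to restart.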
Tool calls: how the agent does things
The AI model can call “tools” — named functions the model invokes mid-call by emitting a JSON envelope on the WebSocket. CodeB exposes a small, deliberately conservative set:
transfer_to_user(target) — ring a registered CodeB Webphone or dial a whitelisted PSTN number. The target is matched against the per-vnum allow-list before any signalling happens; nothing else is dial-able from an AI agent session.
hangup(reason) — end the call cleanly. The reason is written to the CDR.
Tool calls travel over the same WebSocket as the audio frames; the bridge intercepts them, validates them against the per-vnum config, and either executes (and reports the result back into the model) or refuses and tells the model why.
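The intercept, validate, then execute-or-refuse flow can be sketched as below. The envelope shape (`name` plus `arguments`), the result dictionaries, and the function name `handle_tool_call` are assumptions made for illustration; the real wire format depends on the AI provider and is not specified here.

```python
import json

# The two tools named in the text; anything else is refused.
ALLOWED_TOOLS = {"transfer_to_user", "hangup"}


def handle_tool_call(raw: str, allow_list: set[str]) -> dict:
    """Validate one tool-call envelope and return the result for the model.

    `allow_list` stands in for the per-vnum transfer allow-list.
    """
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return {"ok": False, "error": "malformed envelope"}
    if not isinstance(call, dict):
        return {"ok": False, "error": "malformed envelope"}

    name = call.get("name")
    args = call.get("arguments")
    if not isinstance(args, dict):
        args = {}

    if not isinstance(name, str) or name not in ALLOWED_TOOLS:
        return {"ok": False, "error": f"unknown tool: {name}"}

    if name == "transfer_to_user":
        target = args.get("target")
        # Check the allow-list before any signalling would happen.
        if not isinstance(target, str) or target not in allow_list:
            return {"ok": False, "error": "target not on allow-list"}
        # A real bridge would start the transfer here.
        return {"ok": True, "action": "transfer", "target": target}

    # Remaining case: hangup. The reason would be written to the CDR.
    return {"ok": True, "action": "hangup", "reason": args.get("reason", "")}
```

In this sketch a refusal is returned to the model as data rather than raised as an exception, which matches the described behaviour of telling the model why a call was refused so it can carry on the conversation.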
Transcripts
If SaveTranscripts is set on the vnum rule, the bridge accumulates each conversational turn (user transcript and assistant transcript) in memory during the call. On hangup it writes a .txt and a .json file to App_Data/<tenant>/transcripts/. If email recipients are configured, it also drops an SMTP pickup file containing the formatted transcript plus the caller’s real IP address and, where available, its reverse-DNS name.
Raw audio is never persisted — only the model’s text view of the conversation.
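The accumulate-in-memory, write-on-hangup pattern might look like the sketch below. The class name `TranscriptBuffer`, the `call_id` file naming, and the "Caller:" and "Agent:" line labels are hypothetical; the source specifies only that a .txt and a .json file are written when the opt-in is set. The email pickup step is omitted.

```python
import json
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class TranscriptBuffer:
    """Holds the text turns of one call in memory until hangup."""

    save_transcripts: bool  # mirrors the per-vnum SaveTranscripts opt-in
    turns: list = field(default_factory=list)

    def add_turn(self, user: str, assistant: str) -> None:
        # Text only; audio never enters this buffer.
        if self.save_transcripts:
            self.turns.append({"user": user, "assistant": assistant})

    def flush(self, out_dir: Path, call_id: str) -> list:
        """Write the .txt and .json files on hangup; return their paths."""
        if not self.save_transcripts:
            return []
        out_dir.mkdir(parents=True, exist_ok=True)
        txt_path = out_dir / f"{call_id}.txt"
        json_path = out_dir / f"{call_id}.json"

        lines = []
        for turn in self.turns:
            lines.append(f"Caller: {turn['user']}")
            lines.append(f"Agent: {turn['assistant']}")
        txt_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
        json_path.write_text(json.dumps(self.turns, indent=2), encoding="utf-8")
        return [txt_path, json_path]
```

With the opt-in off, `add_turn` stores nothing and `flush` writes nothing, so a call leaves no transcript on disk.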
What stays on the host
System prompt — on disk under App_Data/<tenant>/prompts/. Edited via the admin UI or by hand; auto-migrated from inline JSON the first time the bridge reads a vnum row.
Transcripts — App_Data/<tenant>/transcripts/, gated by an explicit per-vnum opt-in.
CDR entries — one row per call, with the AI-agent mode flag, the tool calls invoked, and the hangup reason.
API key — the AI-provider API key lives in the per-tenant appsettings.json, never in client-visible config.
What leaves the host
Audio frames in each direction, over TLS, between the bridge and the AI model WebSocket endpoint, authenticated with the per-tenant API key.
Tool-call results (e.g. “transfer succeeded”) returned to the same WebSocket so the model can continue the conversation accurately.
Nothing else. No analytics, no telemetry, no AI-provider account sign-in, no third-party session cookies.