Does my caller's audio leave my server when CodeB Voice AI answers a call?

It depends on the backend the operator configured. With a cloud AI Voice Engine, caller audio is streamed in real time to the model API and processed there under that vendor's terms. With the on-premise AI Voice Engine backend, no audio leaves the machine. There is no default cloud connection — if no model is configured, AI calls do not run.

Where do AI-call transcripts live?

Tenant-local, under App_Data/<tenant>/ai-transcripts/, as .txt and .json files. The bridge can email a copy to the operator address attached to the vnum or campaign. Nothing is uploaded anywhere else by default.

Who is the data controller for AI calls?

You are. If your deployment runs AI calls using a cloud model, you are the controller for that processing under GDPR Art. 4(7) and equivalent law in your jurisdiction. You choose the model backend, the retention policy, and the caller disclosure. Aloaha Limited and the model vendor are processors.

Do I have to tell callers they're speaking to an AI?

Yes, in most jurisdictions. The EU AI Act (Art. 50) requires disclosure of AI interaction; some US states require it at the start of the call; sectoral law (e.g. healthcare) may impose stricter rules. Disclosure is typically delivered in the AI's first sentence or via a hold-message before the model picks up.

Can I run CodeB Voice AI fully on-premise with no cloud model?

Yes. Configure the bridge with the on-premise AI Voice Engine text-to-speech backend, or wire it to an on-premise inference server running speech-to-text plus an open-weights model. No caller audio, transcripts or prompts leave the machine. The trade-off is response latency and natural-language quality: this setup is practical for scripted intake, but does not yet match the cloud AI Voice Engine for free-flow conversation.

What about call recording and consent?

CodeB does not record AI calls as audio files. It writes the transcript (text only) returned by the model. If you need full audio recording, that is a separate feature governed by your own retention policy and consent flow.

AI·PRIVACY·v1

AI calls have a different privacy posture than meetings.

CodeB Sovereign Communications and CodeB Phone keep human-to-human media inside your deployment. CodeB Voice AI doesn’t, by design — not unless you pick the local backend. This page explains exactly what changes when an AI call runs, who becomes responsible for what, and what choices you have.

Reading this page because you’re on a procurement, DPO or security-architecture path? Jump to the data-egress matrix, operator checklist, and the regulatory anchors. For the technical packet flow, see vnum-dataflow.html.

01 / Backend choice is yours

The model backend is pluggable per deployment.

CodeB Voice AI doesn’t bake any one model vendor into the product. The bridge can run against any of the following, configured by the operator in App_Data/<tenant>/appsettings.json. There is no default cloud connection: with no backend configured, AI calls don’t run.

CLOUD

a cloud AI Voice Engine

Real-time bidirectional voice over WebSocket. Caller audio streamed to the configured cloud-engine vendor’s API and processed there; transcripts returned through the same connection. Google’s data-processing terms apply.

CLOUD

another cloud AI Voice Engine

Same architectural pattern as the cloud AI Voice Engine. Caller audio streamed to the configured cloud-engine vendor; the cloud-engine vendor’s terms apply. Mix-and-match by vnum is supported.

ON-PREMISE

On-premise AI Voice Engine / inference server

On-premise AI Voice Engine text-to-speech for static scripts, or wire to an on-prem speech-to-text + an open-weights model. No caller audio leaves the machine. Recommended for air-gap, healthcare and public-sector deployments.

Two specifics worth knowing. API keys are per-tenant, configured in the tenant’s appsettings file — we don’t share or proxy keys between deployments. Per-vnum backend override is supported: you can route one DID to a cloud AI Voice Engine and another DID on the same install to an on-premise AI Voice Engine, if some lines need the cloud model and others must stay on-prem.
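The per-vnum override described above can be sketched as a small lookup. This is only an illustration: the key names `AiBackend` and `VnumOverrides`, the backend labels and the phone numbers are assumptions made for the example, not the bridge's actual appsettings.json schema.

```python
from typing import Optional

# Hypothetical settings shape. The real appsettings.json key names are not
# documented on this page, so everything here is illustrative.
SETTINGS = {
    "AiBackend": None,  # deployment-wide backend; None means nothing is configured
    "VnumOverrides": {
        "+15555550101": "cloud",
        "+15555550102": "on-premise",
    },
}


def resolve_backend(settings: dict, vnum: str) -> Optional[str]:
    """Return the backend for a vnum, or None if AI calls must not run."""
    override = settings.get("VnumOverrides", {}).get(vnum)
    if override:
        return override
    # No fallback to any cloud vendor: an unset backend stays unset.
    return settings.get("AiBackend") or None
```

With these settings, one DID resolves to the cloud backend, another to the on-premise backend, and any other number resolves to `None`, which mirrors the "no default cloud connection" rule.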

02 / What leaves vs. what stays

The egress matrix.

The single most important question for a procurement review is “does this data leave my network?” Here is the complete answer per backend.

Data item	Cloud AI Voice Engine	Local backend (on-premise AI Voice Engine / on-prem inference server)
Caller audio (real-time PCM)	Streamed to vendor API	Stays on the machine
AI-spoken audio	Generated by vendor, streamed back	Generated locally
Persona system prompt	Sent to vendor on every call	Stays on the machine
Live transcript text	Returned by vendor; mirrored to disk	Generated locally; written to disk
Tool-call args (transfer_to_user, hangup)	Round-trips through vendor	Stays on the machine
Final transcript file (.txt + .json)	Stays on disk; emailed if configured	Stays on disk; emailed if configured
Whitelist + FraudGuard counters	Stays on the machine	Stays on the machine
SIP trunk credentials	Stays on the machine	Stays on the machine
REST API keys, OIDC signing keys	Stays on the machine	Stays on the machine
Webhook delivery list	Stays on the machine	Stays on the machine
Audio recording (caller wav)	Not recorded	Not recorded

See the full AI-call data-flow diagram
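For a security review it can help to keep the matrix in machine-readable form. The sketch below transcribes the table rows into a dictionary; the structure, the `cloud` / `local` labels and the function name are assumptions made for illustration, not something the bridge ships.

```python
# True means the item is sent to, or produced by, the model vendor.
# Rows are transcribed from the egress matrix; "Audio recording" is omitted
# because it is not recorded under either backend.
VENDOR_VISIBLE = {
    "Caller audio (real-time PCM)": {"cloud": True, "local": False},
    "AI-spoken audio": {"cloud": True, "local": False},
    "Persona system prompt": {"cloud": True, "local": False},
    "Live transcript text": {"cloud": True, "local": False},
    "Tool-call args (transfer_to_user, hangup)": {"cloud": True, "local": False},
    # Stays on disk; the optional email copy goes to the operator, not the vendor.
    "Final transcript file (.txt + .json)": {"cloud": False, "local": False},
    "Whitelist + FraudGuard counters": {"cloud": False, "local": False},
    "SIP trunk credentials": {"cloud": False, "local": False},
    "REST API keys, OIDC signing keys": {"cloud": False, "local": False},
    "Webhook delivery list": {"cloud": False, "local": False},
}


def vendor_visible_items(backend: str) -> list:
    """List the data items the model vendor handles for the given backend."""
    return [item for item, row in VENDOR_VISIBLE.items() if row[backend]]
```

Calling `vendor_visible_items("cloud")` returns the five rows that touch the vendor; `vendor_visible_items("local")` returns an empty list.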

03 / Operator obligations

What the operator is responsible for.

If your deployment runs AI calls using a cloud model, you are the data controller under GDPR Art. 4(7) and under equivalent law in most other jurisdictions. The model vendor is a processor; Aloaha Limited is a processor for the bridge software itself but does not see your call data. Three concrete obligations follow.

Caller disclosure

Tell callers they’re talking to an AI, typically in the AI’s first sentence or via a brief hold-message before pickup. The EU AI Act requires this under Art. 50 (the Act entered into force on 2024-08-01; the Art. 50 transparency obligations apply from 2026-08-02); California and several US states have similar rules; some sectoral regulators go further.

Backend selection & data-residency match

Pick a backend whose data-processing terms match your regulatory environment. Concrete defaults we recommend:

EU healthcare, legal or public-sector deployments → on-premise AI Voice Engine, or a vendor with a documented EU data-residency offering and a signed DPA.
Generic SME EU deployments → cloud backend with a standard DPA; document the choice in your processor register.
Air-gap / classified environments → on-premise AI Voice Engine only.

Transcript retention policy

The bridge keeps transcripts indefinitely on disk by default. Set a rotation policy that matches your minimisation duty: a scheduled task that deletes files older than N days under App_Data/<tenant>/ai-transcripts/. We deliberately don’t do this for you — the right N depends on your sector.
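A rotation task of this kind might look like the following minimal sketch. It assumes transcripts are the .txt and .json files directly inside the transcript directory and that file modification time is an acceptable proxy for call date; the function name and parameters are hypothetical.

```python
import time
from pathlib import Path
from typing import List, Optional


def prune_transcripts(transcript_dir: Path, max_age_days: int,
                      now: Optional[float] = None) -> List[Path]:
    """Delete .txt/.json transcripts older than max_age_days; return removed paths."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # age limit expressed in seconds
    removed = []
    for path in sorted(transcript_dir.iterdir()):
        if not path.is_file() or path.suffix not in {".txt", ".json"}:
            continue  # leave anything that is not a transcript file alone
        if path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Run it from a scheduled task with the tenant's transcript directory and your chosen N, for example `prune_transcripts(Path("App_Data") / tenant / "ai-transcripts", 30)` where `tenant` is your tenant folder name.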

04 / Regulatory anchors

How this maps to specific laws.

Not legal advice — just the obvious anchors a DPO will look at.

EU GDPR & AI Act

Art. 4(7) GDPR: you, the deploying organisation, are the controller. Aloaha Limited and the chosen model vendor are processors. Get a DPA signed with each.
Art. 28 GDPR: a cloud backend makes the model vendor a processor, so you need a processing contract with it and an entry for it in your records of processing (Art. 30). Local-only deployments have no external processor.
Art. 32 GDPR: technical and organisational measures — the egress matrix above is the document your security team can attach.
EU AI Act Art. 50: caller-disclosure obligation when an AI system interacts with a natural person.

UK

UK GDPR mirrors EU GDPR for controller / processor obligations.
ICO direct-marketing guidance applies to outbound AI campaigns — we recommend reading it before running an outbound vnum to UK consumer numbers.

US

HIPAA (healthcare): the cloud backend is unlikely to be HIPAA-compliant under the standard DPA. Use the local backend for any deployment that touches PHI.
State AI-disclosure laws (California, Colorado and others): disclose AI interaction at call start.

Healthcare-sector specifics

For any deployment that handles patient identifiers or clinical content, default to the local backend.
If a cloud backend is unavoidable, scope it to non-clinical workflows (appointment confirmation, billing FAQ, opening hours) and never to clinical advice.

05 / Fully on-premise AI

If you can’t use a cloud model at all.

Air-gap and high-assurance deployments are supported. Two patterns work today:

On-premise AI Voice Engine scripts: the simplest on-prem option. The persona’s replies are short text-to-speech clips tied to caller intents, with no LLM in the loop. Fine for opening-hours announcements, password resets, scripted intake forms. No external network access required.
On-premise speech-to-text + an open-weights model on the same Windows host: supported via a local inference server. Sub-second latency is achievable on a modest GPU. Speech quality is below cloud-vendor TTS, but acceptable for many internal-facing applications.

We do not ship the inference server itself — you bring the hardware and model weights. The bridge exposes a configurable WebSocket endpoint URL where your local inference server lives. Get the local-inference setup notes →
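Before going live with an air-gap deployment, it may be worth checking that the configured endpoint really points at a local address. The check below is a sketch under assumptions: the function name is hypothetical, the example URLs are made up, and DNS names other than `localhost` are conservatively treated as non-local because they are not resolved here.

```python
import ipaddress
from urllib.parse import urlparse


def is_local_endpoint(url: str) -> bool:
    """True if url is ws:// or wss:// and its host is localhost or a loopback/private IP."""
    parsed = urlparse(url)
    if parsed.scheme not in ("ws", "wss") or not parsed.hostname:
        return False
    if parsed.hostname == "localhost":
        return True
    try:
        addr = ipaddress.ip_address(parsed.hostname)
    except ValueError:
        return False  # a DNS name: not resolved here, so treated as non-local
    return addr.is_loopback or addr.is_private
```

For example, `is_local_endpoint("ws://127.0.0.1:8765/voice")` passes, while a public hostname such as `wss://api.example.com/voice` does not.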

06 / Common questions

Things buyers usually ask.

Are calls between humans handled the same way?

No. Human-to-human calls on CodeB Phone go from SIP phone ↔ bridge ↔ your SIP trunk. No model API is involved, ever. See the main privacy manifesto and SIP data flow for that path.

Do you train models on my data?

Aloaha Limited does not train anything on your call data, because we don’t see it. Whether the model vendor uses your data for training depends on the DPA you sign with them. The cloud AI Voice Engine vendor and enterprise tiers from major cloud vendors contractually exclude training; the consumer tiers may not. Pick accordingly.

Where does the bridge live when an AI call runs?

The bridge is a Windows service on your machine. It maintains two connections: SIP toward your trunk, and a WebSocket toward the configured model API (or your local inference server). It is never proxied through any Aloaha-owned infrastructure.

What about call metadata — numbers dialled, durations, FraudGuard counters?

All of it stays on the machine. The CDR is local. FraudGuard counters reset daily and live in the tenant directory. The webhook dispatcher sends events to URLs you configure, signed with HMAC.
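On the receiving side, a webhook consumer would normally verify that signature before trusting an event. This page does not state the hash algorithm, the encoding or the header that carries the signature, so the sketch below assumes HMAC-SHA256 over the raw request body with a hex-encoded digest; check the bridge's webhook documentation for the actual scheme.

```python
import hashlib
import hmac


def sign(secret: bytes, body: bytes) -> str:
    """Hex HMAC-SHA256 of the raw body (assumed scheme, see the note above)."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(secret: bytes, body: bytes, received_signature: str) -> bool:
    """Constant-time comparison of the expected and the received signature."""
    expected = sign(secret, body)
    return hmac.compare_digest(expected.encode(), received_signature.encode())
```

The comparison uses `hmac.compare_digest` rather than `==` so that the check does not leak timing information about how many characters matched.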

Can a caller request transcript deletion?

Yes — transcripts are flat files under App_Data/<tenant>/ai-transcripts/. A DSAR-fulfilment workflow is one shell-script away. We ship a small admin page (transcripts.html) that lets you locate and delete a specific transcript by call ID.
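A scripted version of that deletion could look like the sketch below. The file-naming convention is an assumption: it supposes each transcript is stored as `<call-id>.txt` and `<call-id>.json`, which this page does not confirm, and the function name is hypothetical.

```python
from pathlib import Path
from typing import List


def delete_transcripts_for_call(transcript_dir: Path, call_id: str) -> List[str]:
    """Delete <call_id>.txt and <call_id>.json if present; return the removed file names."""
    # Reject empty IDs and anything that could act as a path or wildcard.
    if not call_id or any(ch in call_id for ch in "/\\*?[]"):
        raise ValueError("call_id must be a plain identifier")
    removed = []
    for path in sorted(transcript_dir.iterdir()):
        # Exact stem match, so call "12" never deletes the files of call "123".
        if path.is_file() and path.suffix in {".txt", ".json"} and path.stem == call_id:
            path.unlink()
            removed.append(path.name)
    return removed
```

Returning the removed file names gives you a record to attach to the DSAR response.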

What gets logged centrally?

Nothing. There is no central log shipper, no telemetry SDK, no Sentry, no Datadog. Bridge logs are written to App_Data/<tenant>/logs/ on the same machine and only viewable via the admin web logs page (operator-only).

07 / Where to next