Posh AI · Cloudflare

Posh ships AI built for banking.
Cloudflare is the runtime under every conversation.

The pitch on posh.tech is clear: "bank-savvy AI" for credit unions and community banks. Every Karina call chains several model invocations (speech-to-text, intent classification, RAG against the policy corpus of the financial institution (FI), response generation), all over PII-regulated data, and each FI has its own cost profile.

125+ · financial institutions on the platform
Karina · conversational + voice agent
Per-FI · cost attribution is the table-stakes question

Each FI you onboard is a new tenant with its own policy corpus, brand voice, call volume curve, and compliance posture. Multi-tenant AI at scale isn't a feature — it's the foundation. That foundation is what Cloudflare's developer platform is built for.

GA · June 2026
Your AI bill is out of control. Cloudflare can fix it now.
AI Gateway ships dollar-denominated spend limits per FI, identity-driven budgets, PII redaction before the model provider sees the transcript, and automatic fallback routing, all recorded in one unified log across every model provider. It is the runtime piece that makes "one bill per credit union" answerable without code changes. Read the announcement.

Three places the developer platform maps directly to how Posh delivers Karina:

AI Gateway — under every Karina call: per-FI spend caps, PII redaction before the LLM sees account data, one unified log across OpenAI, Anthropic, and any self-hosted model. The answer to "which credit union spent what last quarter?"
Vectorize — for the per-FI knowledge retrieval. 125+ separate policy corpora, each searched at conversation speed.
Workers + Durable Objects — for the multi-tenant orchestration: per-FI session state, per-conversation context, regional residency where the FI's compliance posture requires it.
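
To make the "which credit union spent what?" question concrete, here is a minimal sketch of per-FI attribution over a unified invocation log. It is illustrative only: the record shape (`fiId`, `provider`, `costUsd`), the function names, and the sample figures are assumptions made for this example, not the AI Gateway log schema or API.

```typescript
// Hypothetical shape of one log record; the field names are assumptions.
interface InvocationLog {
  fiId: string;     // tenant (financial institution) identifier
  provider: string; // e.g. "openai", "anthropic", "self-hosted"
  costUsd: number;  // dollar cost of this single model invocation
}

interface FiSpend {
  totalUsd: number;
  byProvider: Record<string, number>;
}

// Roll a flat log up into per-FI totals, broken down by provider.
function attributeSpend(logs: InvocationLog[]): Map<string, FiSpend> {
  const spend = new Map<string, FiSpend>();
  for (const log of logs) {
    const entry: FiSpend = spend.get(log.fiId) ?? { totalUsd: 0, byProvider: {} };
    entry.totalUsd += log.costUsd;
    entry.byProvider[log.provider] =
      (entry.byProvider[log.provider] ?? 0) + log.costUsd;
    spend.set(log.fiId, entry);
  }
  return spend;
}

// List the FIs whose spend has reached their dollar cap.
// FIs with no configured cap are skipped.
function fisOverCap(
  spend: Map<string, FiSpend>,
  capsUsd: Record<string, number>
): string[] {
  const over: string[] = [];
  spend.forEach((s, fiId) => {
    const cap = capsUsd[fiId];
    if (cap !== undefined && s.totalUsd >= cap) over.push(fiId);
  });
  return over;
}

// Made-up sample data: two tenants, three invocations.
const sampleLogs: InvocationLog[] = [
  { fiId: "cu-a", provider: "openai", costUsd: 0.5 },
  { fiId: "cu-a", provider: "anthropic", costUsd: 0.25 },
  { fiId: "cu-b", provider: "openai", costUsd: 0.125 },
];

const sampleSpend = attributeSpend(sampleLogs);
const sampleOverCap = fisOverCap(sampleSpend, { "cu-a": 0.5, "cu-b": 1 });
```

In this sketch the tenant identifier travels with every invocation, so the quarterly bill and the cap check both reduce to a group-by on one field. That only works if each call is tagged with its FI at request time.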

One question for the platform team:

Which is the bigger near-term pain: cost attribution (answering each FI's "what's my AI spend this quarter?" question with confidence) or multi-tenant scale (onboarding the next 100 FIs without the inference math falling apart)? It takes 20 minutes to find the right starting point.

Deeper Dive

The full architecture, ready when you are

The expanded version below has the detailed primitive-by-primitive mapping: the eight things Cloudflare changes for Posh AI, the request-flow diagram for a Karina call on Cloudflare, the AI Gateway cache math for 125+ FIs (with an interactive calculator), and the path to 500 FIs.

Read the expanded version →
Grab 20 minutes →
Matt Holscher
Cloudflare · Developer Platform