All case studies
AI ToolFeatured

Meeting Co-Pilot

Bobi Labs · productized

Role

PM-builder · productized internal tool

Duration

Built Q1 2026 · Live in production use

Status

Live · used daily on contract calls

Meeting Co-Pilot

The problem

Real-time meeting tools (Otter, Fireflies, Granola) summarize calls after the fact. None of them help during the call, at the moment when the operator is being asked a hard question and a grounded answer would change the outcome. Generic LLM-on-call tools fabricate; you can't trust them to read off an answer verbatim.

We wanted a tool that could surface a concrete, source-grounded tip mid-call (the answer to a question the operator could read aloud word-for-word) without ever inventing facts. And we wanted it to be cheap enough to run on every call, not just rehearsed ones.

The approach

Two-stage LLM. Cheap fast model (Haiku 4.5) gates every transcript chunk every ~12 seconds: is this a moment a tip would help? If yes, expensive smart model (Opus 4.7) generates the tip with full project context cached. The gate cuts cost by ~95% vs. running Opus on every chunk.

Per-project knowledge folder loaded into Opus's cached system prompt at session start. Bio, services, prior work, target clients, anxiety/autism context. All loaded once, cached for the session, recalled by Opus on demand. Cache TTL is 1h which matches typical call length.

Hard-coded NEVER-FABRICATE rules in the system prompt. If the answer isn't in the knowledge folder or the live transcript, the tip says 'I don't know.' Better to surface nothing than to surface a hallucinated fact the operator reads aloud.

HUD overlay sits on the operator's monitor, transparent over the call window. Tips render inline on cards; follow-up questions ('Dig Deeper', 'AI Answer') extend the parent card.

What shipped

  • Voicemeeter audio routing → Deepgram STT pipeline (mic + system audio split → unified transcript stream).
  • Two-stage LLM (Haiku gate → Opus tip) with prompt caching for sub-cost-per-call economics.
  • Per-project knowledge folder loader (CLAUDE-style hot-reload at session start).
  • HUD overlay with tip cards, Dig Deeper extensions, and post-call review.
  • Session log JSONL persisted with tip extension lineage (parentTipId, extensionKind).

Stack & decisions

TypeScript + Node.js

Electron-ish HUD shell, Node back end. Cross-platform but currently Windows-tested only (Voicemeeter dependency).

Deepgram STT

Real-time WebSocket streaming. Mic + system audio split + unified transcript.

Anthropic API (Haiku 4.5 + Opus 4.7)

Two-stage gate-then-generate with prompt caching (1h TTL). Per-project knowledge folder loaded into the cached system prompt.

Voicemeeter

Routes mic + system audio to separate Deepgram channels so the transcript can attribute speakers cleanly.

Outcome

  • Used daily by the operator on contract / interview calls.
  • Per-call cost ~$0.02–0.05 with caching. Without the Haiku gate it would be 20× that.
  • Anxiety/autism support: the operator can read tips aloud verbatim during stutters/freezes. A pre-composed answer beats improvisation under pressure.

Have something we should ship?

60-minute scoping call, then a clear plan for how we'd approach it.