Haris Ahmed
2026

HardTalk — AI rehearsal platform for high-stakes conversations

Real-time voice rehearsal for investor pitches, board briefings, sales calls, and media interviews. Up to three AI personas interrupt, push back, and escalate frustration over a single Gemini Live WebSocket. Post-session analysis then scores you across six skill axes and prescribes targeted drills.

¶ Overview
HardTalk is a SaaS rehearsal platform for founders, sales leaders, and execs who have to walk into rooms where the questions are hostile and the stakes are real. You pick a room preset (VC pitch, board, enterprise sale, media, partnership), choose or compose AI personas with deep behavioral tuning, and run a live voice session where the personas listen, interrupt, and pressure-test you. After the session, an LLM judges the transcript on clarity, structure, objection handling, conviction, presence, and commercial sharpness, surfaces habit-level weaknesses (hedging, vague numbers, filler, deflecting), and routes you to AI-driven drills designed for the gap. Built as a pnpm monorepo: React/TS web app, separate admin app, Supabase Postgres + Edge Functions backend, Gemini Live for voice, OpenRouter for analysis, Grok/X.AI and People Data Labs for target research, Stripe for billing.
¶ Important info
Production app on Railway + Supabase with 23 edge functions and a 50+ role persona library. Dual-mode AI: dev runs against Google AI Studio (API-key tokens, ~3 concurrent sessions); prod runs against Vertex AI (OAuth2 service-account tokens, 1,000+ concurrent sessions), and the frontend is mode-agnostic because the token endpoint returns a pre-built WebSocket URL. The standout detail is the multi-persona session model: three distinct voices, accents, and personalities, all served from a single Gemini Live connection, with handoffs driven by `[PersonaName]` bracket prefixes in the AI stream and a turn-routing cap of three consecutive turns per persona. Supporting systems include real-time hedging/filler/vague-claim detection that fires coaching hints mid-session, per-persona frustration tracking (-100 to +100) that escalates on weak answers, an AudioWorklet fallback so the mic works on iOS Safari, Stripe subscriptions with plan-based feature gates, GDPR account deletion and data export, and VirusTotal scanning on uploaded prep documents.
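The mid-session hint detection and frustration tracking described above can be pictured with a minimal sketch. This is not HardTalk's actual code: the phrase lists, the function names (`detectHints`, `updateFrustration`), and the scoring weights are all assumptions made for illustration. Only the three hint categories and the -100 to +100 range come from the write-up.

```typescript
// Hypothetical phrase lists; the real detector's vocabulary is not published.
const HEDGES = ["i think", "maybe", "sort of", "kind of", "probably"];
const FILLERS = ["um", "uh", "you know", "basically"];
const VAGUE = ["a lot of", "significant", "many customers", "soon"];

type HintKind = "hedging" | "filler" | "vague";

interface Hint {
  kind: HintKind;
  phrase: string;
}

// Escape regex metacharacters so phrases are matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Return one hint per listed phrase that appears as a whole word or phrase.
function detectHints(utterance: string): Hint[] {
  const groups: Array<[HintKind, string[]]> = [
    ["hedging", HEDGES],
    ["filler", FILLERS],
    ["vague", VAGUE],
  ];
  const hints: Hint[] = [];
  for (const [kind, phrases] of groups) {
    for (const phrase of phrases) {
      const re = new RegExp(`\\b${escapeRegExp(phrase)}\\b`, "i");
      if (re.test(utterance)) hints.push({ kind, phrase });
    }
  }
  return hints;
}

function clamp(n: number, lo: number, hi: number): number {
  return Math.min(hi, Math.max(lo, n));
}

// Assumed weights: each hint raises frustration by 10, a clean answer cools it by 5.
// The result is always kept inside the -100..+100 range.
function updateFrustration(current: number, hints: Hint[]): number {
  const delta = hints.length > 0 ? hints.length * 10 : -5;
  return clamp(current + delta, -100, 100);
}
```

In a session loop, each finalized user utterance would pass through `detectHints`, the hints would be shown as coaching prompts, and the active persona's score would be updated with `updateFrustration`. A production detector would likely need more than keyword matching, since a word like "significant" is not always vague.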
¶ Problem faced
Gemini Live only gives you one voice per WebSocket. The product needs three personas in the same room, with different voices, accents, and temperaments, talking to each other and to the user, with the ability to interrupt and be interrupted. Opening three parallel sessions wastes quota, fragments the conversation, and forces the client to fan audio in and out of three contexts. You also can't just close-and-reconnect on every handoff without dropping the conversational state the next persona needs to push back coherently. On top of that, the platform has to run against two cleanly separated AI backends (Google AI Studio in dev, Vertex AI in prod) with completely different auth (API key vs. OAuth2 from a service account), without leaking that distinction into the browser.
¶ How it was solved
Single WebSocket, voice-switched via Gemini's session-resumption tokens. The orchestrator (`GeminiVoiceSwitcher`) closes the socket with custom code 4010, reconnects with the new voice plus the resumption token, and the new persona resumes the same conversational context in ~1–3s. Turn routing caps consecutive turns per persona at three and uses a `[PersonaName]` regex on the model stream to detect handoffs. If resumption ever fails, the fallback is a fresh connection with transcript replay so context isn't lost. The dual-mode auth split is hidden behind a single `getAIConfig()` helper and a `gemini-token` edge function that returns `{ accessToken, wsEndpoint, modelPath }`, so the browser never knows whether it's talking to AI Studio or Vertex. Trade-offs: voice switches still cost a perceptible reconnect (~1–3s), so the engine biases toward longer turns; and the bracket-prefix handoff is a prompt-engineering contract, not a structured field, so it has to be defended with a regex guard plus a fallback to round-robin if the model forgets to emit it.
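The turn-routing rules above (bracket-prefix detection, a cap of three consecutive turns, round-robin fallback) can be sketched roughly as follows. This is an illustration under assumptions, not the project's router: `RouterState`, `parseSpeaker`, `nextRoundRobin`, and `routeTurn` are hypothetical names, and treating an untagged or unknown-name turn as an immediate round-robin step is one possible reading of the fallback.

```typescript
interface RouterState {
  current: string; // persona currently holding the floor
  consecutive: number; // how many turns in a row that persona has taken
}

// Cap stated in the write-up: at most three consecutive turns per persona.
const MAX_CONSECUTIVE_TURNS = 3;

// Matches a leading "[Name]" tag at the start of a model turn.
const HANDOFF_TAG = /^\s*\[([^\[\]]+)\]/;

// Regex guard: accept the tag only if it names a known persona.
function parseSpeaker(turnText: string, personas: readonly string[]): string | null {
  const name = HANDOFF_TAG.exec(turnText)?.[1]?.trim();
  return name !== undefined && personas.includes(name) ? name : null;
}

// Next persona in list order, wrapping around at the end.
function nextRoundRobin(personas: readonly string[], current: string): string {
  const i = personas.indexOf(current);
  return personas[(i + 1) % personas.length] ?? current;
}

function routeTurn(
  state: RouterState,
  turnText: string,
  personas: readonly string[],
): RouterState {
  // Assumption: a missing or unknown tag falls back to round-robin.
  let speaker = parseSpeaker(turnText, personas) ?? nextRoundRobin(personas, state.current);

  // Enforce the cap: a persona that has already used its turns must yield.
  if (speaker === state.current && state.consecutive >= MAX_CONSECUTIVE_TURNS) {
    speaker = nextRoundRobin(personas, state.current);
  }

  const consecutive = speaker === state.current ? state.consecutive + 1 : 1;
  return { current: speaker, consecutive };
}
```

A caller would compare the returned `current` with the previous one and trigger the voice switch (close, then reconnect with the resumption token) only when it changes. Since each switch costs a reconnect, a real router might keep the current speaker on an untagged turn instead of rotating.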
¶ Stack
  • TypeScript
  • Supabase
  • Postgres
  • Deno Edge Functions
  • Gemini Live
  • Vertex AI
  • Grok / X.AI
  • Stripe
  • Vitest
  • React Query

AI engineer building intelligent systems that survive production. Available for roles & contract work.

© 2026 Haris Ahmed · All rights reserved
AI systems that actually scale.