VOICE AI PLATFORM

Silicon Smackdown

Built for the Google Gemini Developer Competition

What Makes It Special

Full-Duplex Voice AI

Real-time, low-latency voice conversations using Gemini 2.5 Flash with native audio streaming. Achieves <100ms audio latency using AudioWorklet for high-performance capture.

No text-to-speech intermediaries—pure voice-to-voice AI with live waveform visualization.
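As a rough illustration of where the latency budget goes, the sketch below computes the delay contributed by a single capture buffer. The 48 kHz sample rate and the 4096-frame ScriptProcessor buffer are assumptions chosen for the example, not measurements from this project; the 128-frame figure is the default render quantum an AudioWorklet processor receives. End-to-end latency also includes device I/O and network time, which this arithmetic does not cover.

```typescript
// Latency contributed by one audio buffer, in milliseconds.
function bufferLatencyMs(frames: number, sampleRate: number): number {
  if (frames <= 0 || sampleRate <= 0) {
    throw new RangeError("frames and sampleRate must be positive");
  }
  return (frames / sampleRate) * 1000;
}

// AudioWorklet processors are called with 128-frame render quanta by default.
const WORKLET_QUANTUM_FRAMES = 128;

// Assumed values for illustration only.
const ASSUMED_SAMPLE_RATE = 48000;
const ASSUMED_SCRIPT_PROCESSOR_FRAMES = 4096;

// About 2.67 ms per buffer.
const workletMs = bufferLatencyMs(WORKLET_QUANTUM_FRAMES, ASSUMED_SAMPLE_RATE);

// About 85.33 ms per buffer.
const scriptProcessorMs = bufferLatencyMs(
  ASSUMED_SCRIPT_PROCESSOR_FRAMES,
  ASSUMED_SAMPLE_RATE
);

console.log(workletMs.toFixed(2), scriptProcessorMs.toFixed(2));
```

Smaller buffers leave more of the budget for everything else in the chain, which is the main reason a worklet-based capture path tends to feel more responsive.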

20+ AI Personalities

Curated character pairs from Einstein vs. Bohr to Tony Stark vs. Peter Parker. Each character has a unique voice, personality, and debate style, and is illustrated with a contextual DiceBear avatar.

Choose from rivalries like Logic vs. Hype, Detective & Mastermind, or The Relativist & The Quantum.

Multi-Agent Orchestration

Sophisticated state machine managing dual AI sessions with automatic turn-taking and context-aware prompting. Built with custom React hooks for modular state management.

A typed reducer built on useReducer keeps the conversation flow predictable and guards against state bugs such as both guests speaking at once.
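A minimal sketch of what a typed turn-taking reducer could look like. The state shape, action names, and guest identifiers here are hypothetical and chosen for illustration; the project's actual reducer is not shown in this page.

```typescript
// Hypothetical identifiers for the two AI guests.
type GuestId = "guestA" | "guestB";

// Discriminated union: each phase carries only the data valid for it.
type ConversationState =
  | { phase: "idle" }
  | { phase: "speaking"; speaker: GuestId; turn: number }
  | { phase: "waiting"; next: GuestId; turn: number };

type ConversationAction =
  | { type: "START"; first: GuestId }
  | { type: "SPEECH_ENDED" }
  | { type: "NEXT_TURN" }
  | { type: "STOP" };

const otherGuest = (g: GuestId): GuestId =>
  g === "guestA" ? "guestB" : "guestA";

// Actions that make no sense in the current phase return the same state,
// so a late or duplicated event cannot put two guests on air at once.
function conversationReducer(
  state: ConversationState,
  action: ConversationAction
): ConversationState {
  switch (action.type) {
    case "START":
      return state.phase === "idle"
        ? { phase: "speaking", speaker: action.first, turn: 1 }
        : state;
    case "SPEECH_ENDED":
      return state.phase === "speaking"
        ? { phase: "waiting", next: otherGuest(state.speaker), turn: state.turn }
        : state;
    case "NEXT_TURN":
      return state.phase === "waiting"
        ? { phase: "speaking", speaker: state.next, turn: state.turn + 1 }
        : state;
    case "STOP":
      return { phase: "idle" };
  }
}
```

In a component, a reducer of this shape would be passed to React's useReducer with { phase: "idle" } as the initial state.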

Production Audio Pipeline

Web Audio API + AudioWorklet architecture with ScriptProcessor fallback for browser compatibility. Real-time waveform analysis, audience effects, and quality indicators.
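One common way to drive a waveform meter or quality indicator is to compute the RMS level of each captured buffer. The sketch below shows that calculation on raw Float32 samples; the function names and the -100 dB floor are assumptions for illustration, and the project may well rely on an AnalyserNode instead.

```typescript
// Root-mean-square level of a buffer of samples in the range [-1, 1].
function rmsLevel(samples: Float32Array): number {
  if (samples.length === 0) return 0;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i++) {
    sumSquares += samples[i] * samples[i];
  }
  return Math.sqrt(sumSquares / samples.length);
}

// Convert a linear level to decibels relative to full scale,
// clamped to a floor so silence does not produce -Infinity.
function toDecibels(level: number, floorDb: number = -100): number {
  if (level <= 0) return floorDb;
  return Math.max(floorDb, 20 * Math.log10(level));
}

// Example: a square wave at half amplitude sits at roughly -6 dBFS.
const exampleBuffer = new Float32Array([0.5, -0.5, 0.5, -0.5]);
console.log(toDecibels(rmsLevel(exampleBuffer)).toFixed(2));
```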

Dual-channel audio routing for guest separation with automatic reconnection logic.
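The reconnection strategy is not described in detail here. A minimal sketch, assuming capped exponential backoff (a common choice, not necessarily the one this project uses), might look like the following. The function names, the 500 ms base delay, the 8 s cap, and the five-attempt limit are all hypothetical.

```typescript
// Delay before retry number `attempt` (0-based): doubles each time, capped at maxMs.
function reconnectDelayMs(
  attempt: number,
  baseMs: number = 500,
  maxMs: number = 8000
): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

// Calls `connect` until it succeeds or maxAttempts is reached,
// waiting between failures. Rethrows the last error if every attempt fails.
async function connectWithRetry<T>(
  connect: () => Promise<T>,
  maxAttempts: number = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms))
): Promise<T> {
  let lastError: unknown = new Error("maxAttempts must be at least 1");
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await connect();
    } catch (err) {
      lastError = err;
      // No point in sleeping after the final failed attempt.
      if (attempt < maxAttempts - 1) {
        await sleep(reconnectDelayMs(attempt));
      }
    }
  }
  throw lastError;
}
```

Passing `sleep` as a parameter keeps the retry loop testable without real timers.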

Technical Architecture

Audio Latency: <100ms
AI Response Time: 1-3s
Memory Footprint: 50-100MB

Custom Hook Architecture

Modular state management with focused, testable hooks that separate concerns:

useConversationState: Typed reducer for conversation flow
useGeminiSessions: Multi-session AI management
useAudioPipeline: Audio capture and playback
useTranscription: Streaming transcription updates
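Keeping hook logic in pure functions is one way to make hooks like these testable. As an illustration, the sketch below shows the kind of pure update a transcription hook could wrap: streamed chunks are merged into the current speaker's open entry, and a new entry starts when the speaker changes or the previous entry was finalized. The TranscriptEntry shape and the applyTranscriptChunk name are hypothetical.

```typescript
interface TranscriptEntry {
  speaker: string;
  text: string;
  final: boolean; // true once the speaker's turn is complete
}

// Returns a new array; the input is never mutated, which suits React state.
function applyTranscriptChunk(
  entries: readonly TranscriptEntry[],
  speaker: string,
  chunk: string,
  final: boolean = false
): TranscriptEntry[] {
  const last = entries[entries.length - 1];
  if (last !== undefined && last.speaker === speaker && !last.final) {
    // Same speaker, still mid-turn: extend the open entry.
    return [
      ...entries.slice(0, -1),
      { speaker, text: last.text + chunk, final },
    ];
  }
  // New speaker, or the previous entry was already closed.
  return [...entries, { speaker, text: chunk, final }];
}
```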

Conversation Flow

State Machine: Typed reducer manages guest turns, speaking states, and prompts
Auto Turn-Taking: Guests automatically respond to each other with configurable delays
Context Preservation: Conversation history maintained across turns
Smart Prompting: Dynamic prompts based on conversation state
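To make the context-preservation and prompting steps above concrete, here is a minimal sketch of building the next guest's prompt from recent history. The Turn shape, the buildTurnPrompt name, the prompt wording, and the six-turn window are assumptions for illustration only.

```typescript
interface Turn {
  speaker: string;
  text: string;
}

// Builds a prompt for the next speaker from the most recent turns.
// With no history, the speaker is asked to open the debate instead.
function buildTurnPrompt(
  history: readonly Turn[],
  nextSpeaker: string,
  maxTurns: number = 6
): string {
  if (history.length === 0) {
    return `${nextSpeaker}, open the debate with your strongest claim.`;
  }
  // Keep only a recent window so the prompt stays bounded in size.
  const recent = history.slice(-maxTurns);
  const transcript = recent
    .map((turn) => `${turn.speaker}: ${turn.text}`)
    .join("\n");
  const last = recent[recent.length - 1];
  return `${transcript}\n\n${nextSpeaker}, respond to ${last.speaker} in character.`;
}
```

Trimming to a fixed window trades some long-range context for predictable prompt size; a summary of older turns would be another option.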

Featured Rivalries

Logic vs. Hype: Dr. Orion vs. Luna Nova (Philosophy vs. Futurism)
Detective & Mastermind: Sherlock vs. Moriarty (Genius vs. Criminal Mind)
The Genius & The Spider: Tony Stark vs. Peter Parker (Mentor vs. Protégé)
Jedi Master & Apprentice: Master Yoda vs. Luke Skywalker (Wisdom vs. Youth)
The Relativist & The Quantum: Einstein vs. Niels Bohr (Physics Debate)
The Teacher & The Student: Walter White vs. Jesse Pinkman (Breaking Bad Dynamics)

Tech Stack

React 19: Concurrent features
TypeScript: Type safety
Gemini 2.5: Live API
Web Audio: Real-time audio
Tailwind CSS: Styling
Vite: Build tool
i18next: i18n (EN/EL)
DiceBear: Avatars

Key Learnings

What Worked

✓ Custom Hook Architecture: Separating concerns made the system maintainable and testable
✓ AudioWorklet: Reduced latency from ~200ms to <100ms and eliminated glitches
✓ Typed State Machine: Prevented state bugs and made flow predictable
✓ Fallback Mechanisms: The ScriptProcessor fallback and auto-reconnection kept sessions reliable across browsers

Challenges Overcome

→ Turn-Taking: State machine with explicit turn management solved guests talking over each other
→ Context Loss: Maintaining conversation history preserved context between turns
→ Audio Echo: Headphone detection and audio routing isolation prevented feedback
→ Memory Leaks: Proper cleanup in useEffect hooks prevented memory growth
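On the memory-leak point, one pattern that can help is collecting every teardown step as resources are created, then running them all from the effect's cleanup function. The sketch below is a hypothetical helper, not code from this project; the createCleanupStack name and its behavior on errors are assumptions.

```typescript
type Cleanup = () => void;

// Collects teardown callbacks and runs them in reverse order of registration,
// so resources are released in the opposite order to how they were acquired.
function createCleanupStack() {
  const cleanups: Cleanup[] = [];
  return {
    add(fn: Cleanup): void {
      cleanups.push(fn);
    },
    dispose(): void {
      while (cleanups.length > 0) {
        const fn = cleanups.pop();
        if (fn === undefined) continue;
        try {
          fn();
        } catch {
          // Swallow the error so the remaining cleanups still run.
        }
      }
    },
  };
}

// Intended use inside a React effect (shown as a comment to keep this
// block free of React and browser dependencies):
//
//   useEffect(() => {
//     const stack = createCleanupStack();
//     const ctx = new AudioContext();
//     stack.add(() => { void ctx.close(); });
//     // ...register stream tracks, worklet ports, timers...
//     return () => stack.dispose();
//   }, []);
```

Because dispose empties the stack as it goes, calling it twice is harmless.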

Explore the Project

Dive into the code, read the full documentation, or learn more about production voice AI architecture.
