VOICE AI PLATFORM
Silicon Smackdown

Password: 1999

Built for Google Gemini Developer Competition

What Makes It Special

Full-Duplex Voice AI

Real-time, low-latency voice conversations using Gemini 2.5 Flash with native audio streaming. Achieves <100ms audio latency using AudioWorklet for high-performance capture.

No text-to-speech intermediaries—pure voice-to-voice AI with live waveform visualization.
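As a rough illustration of the capture side, the sketch below converts the Float32 samples that the Web Audio API delivers into 16-bit PCM before they are streamed. The name floatTo16BitPCM and the choice of 16-bit PCM as the wire format are assumptions made for this example; the project's actual encoding step is not shown here.

```typescript
// Hypothetical helper: convert Web Audio Float32 samples (nominally in [-1, 1])
// into 16-bit signed PCM, a format commonly used by streaming speech APIs.
function floatTo16BitPCM(input: Float32Array): Int16Array {
  const output = new Int16Array(input.length);
  input.forEach((sample, i) => {
    // Clamp first so out-of-range samples saturate instead of wrapping around.
    const clamped = Math.max(-1, Math.min(1, sample));
    // int16 is asymmetric: negatives scale by 32768, positives by 32767.
    output[i] = Math.round(clamped < 0 ? clamped * 0x8000 : clamped * 0x7fff);
  });
  return output;
}

// Usage: one 128-frame block, the size an AudioWorklet processor receives per call.
const silentBlock = floatTo16BitPCM(new Float32Array(128));
```

Clamping before scaling matters because a sample slightly above 1.0 would otherwise overflow into a large negative value and produce an audible click.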

20+ AI Personalities

Curated character pairs from Einstein vs. Bohr to Tony Stark vs. Peter Parker. Each character has a unique voice, personality, and debate style, paired with a contextual DiceBear avatar.

Choose from rivalries like Logic vs. Hype, Detective & Mastermind, or The Relativist & The Quantum.

Multi-Agent Orchestration

Sophisticated state machine managing dual AI sessions with automatic turn-taking and context-aware prompting. Built with custom React hooks for modular state management.

Typed reducer with useReducer ensures predictable conversation flow and prevents state bugs.
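A minimal sketch of what such a typed reducer could look like is shown here. The state shape, action names, and guest identifiers are hypothetical, chosen only to illustrate the pattern; they are not taken from the project's code.

```typescript
// Hypothetical identifiers for the two AI guests.
type GuestId = "guestA" | "guestB";

// A discriminated union makes illegal combinations unrepresentable,
// e.g. there is no "speaking" state without a speaker.
type ConversationState =
  | { phase: "idle" }
  | { phase: "speaking"; speaker: GuestId; turn: number }
  | { phase: "waiting"; next: GuestId; turn: number };

type ConversationAction =
  | { type: "START"; first: GuestId }
  | { type: "SPEECH_ENDED" }
  | { type: "TURN_DELAY_ELAPSED" }
  | { type: "STOP" };

const otherGuest = (g: GuestId): GuestId => (g === "guestA" ? "guestB" : "guestA");

function conversationReducer(
  state: ConversationState,
  action: ConversationAction
): ConversationState {
  switch (action.type) {
    case "START":
      // Only an idle conversation can be started.
      return state.phase === "idle"
        ? { phase: "speaking", speaker: action.first, turn: 1 }
        : state;
    case "SPEECH_ENDED":
      // Hand the floor to the other guest, but wait for the turn delay first.
      return state.phase === "speaking"
        ? { phase: "waiting", next: otherGuest(state.speaker), turn: state.turn }
        : state;
    case "TURN_DELAY_ELAPSED":
      return state.phase === "waiting"
        ? { phase: "speaking", speaker: state.next, turn: state.turn + 1 }
        : state;
    case "STOP":
      return { phase: "idle" };
  }
}
```

In a component this would be wired up with React's useReducer(conversationReducer, { phase: "idle" }). Because out-of-order actions return the existing state unchanged, a late SPEECH_ENDED event cannot push the conversation into an invalid phase.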

Production Audio Pipeline

Web Audio API + AudioWorklet architecture with ScriptProcessor fallback for browser compatibility. Real-time waveform analysis, audience effects, and quality indicators.
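The fallback decision can be sketched as plain feature detection. The function and type names below are hypothetical; the only real API surface assumed is that an audio context exposes an audioWorklet property and a createScriptProcessor method where those features are supported.

```typescript
type CaptureBackend = "audio-worklet" | "script-processor" | "unsupported";

interface AudioCapabilities {
  hasAudioWorklet: boolean;
  hasScriptProcessor: boolean;
}

// Inspect an AudioContext-like object without calling anything on it,
// so the check is safe to run in any browser (or against a test double).
function detectCapabilities(ctx: object): AudioCapabilities {
  const record = ctx as Record<string, unknown>;
  return {
    hasAudioWorklet: "audioWorklet" in ctx,
    hasScriptProcessor: typeof record["createScriptProcessor"] === "function",
  };
}

// Prefer the AudioWorklet path and fall back to ScriptProcessor if needed.
function chooseCaptureBackend(caps: AudioCapabilities): CaptureBackend {
  if (caps.hasAudioWorklet) return "audio-worklet";
  if (caps.hasScriptProcessor) return "script-processor";
  return "unsupported";
}

// Usage with a stand-in object; in a browser this would be a real AudioContext.
const legacyBackend = chooseCaptureBackend(
  detectCapabilities({ createScriptProcessor: () => undefined })
);
```

Keeping detection separate from selection makes the preference order easy to test without a browser.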

Dual-channel audio routing for guest separation with automatic reconnection logic.
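Automatic reconnection is often implemented with capped exponential backoff. The sketch below shows that approach under assumed option names and values; the project's actual retry policy is not documented here.

```typescript
// Hypothetical retry settings; none of these values come from the project.
interface BackoffOptions {
  baseMs: number;      // delay before the first retry
  maxMs: number;       // upper bound on any single delay
  maxAttempts: number; // give up after this many retries
}

// Returns the delay before retry number `attempt` (0-based),
// or null once the retry budget is exhausted.
function reconnectDelayMs(attempt: number, opts: BackoffOptions): number | null {
  if (attempt >= opts.maxAttempts) return null;
  return Math.min(opts.maxMs, opts.baseMs * Math.pow(2, attempt));
}

// Usage: the delay schedule for a session that keeps failing.
const retryOptions: BackoffOptions = { baseMs: 500, maxMs: 8000, maxAttempts: 6 };
const schedule = [0, 1, 2, 3, 4, 5, 6].map((n) => reconnectDelayMs(n, retryOptions));
```

The cap keeps a long outage from producing multi-minute waits, and the null return gives the caller an explicit point at which to surface an error to the user.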

Technical Architecture

  • Audio Latency: <100ms
  • AI Response Time: 1-3s
  • Memory Footprint: 50-100MB

Custom Hook Architecture

Modular state management with focused, testable hooks that separate concerns:

  • useConversationState: Typed reducer for conversation flow
  • useGeminiSessions: Multi-session AI management
  • useAudioPipeline: Audio capture and playback
  • useTranscription: Streaming transcription updates
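To make the streaming transcription idea concrete, here is a small sketch of the kind of pure update function a hook like useTranscription could wrap. The chunk and entry shapes are assumptions for illustration, not the project's types.

```typescript
// Hypothetical shape of one streamed transcription update.
interface TranscriptChunk {
  speaker: string;
  text: string;
  isFinal: boolean; // true when the speaker's utterance is complete
}

interface TranscriptEntry {
  speaker: string;
  text: string;
  isFinal: boolean;
}

// Append a chunk to the transcript without mutating the previous array,
// so the result can be stored directly in React state.
function applyChunk(
  entries: readonly TranscriptEntry[],
  chunk: TranscriptChunk
): TranscriptEntry[] {
  const last = entries[entries.length - 1];
  // Extend the open entry while the same speaker is still talking.
  if (last !== undefined && !last.isFinal && last.speaker === chunk.speaker) {
    const merged: TranscriptEntry = {
      speaker: last.speaker,
      text: last.text + chunk.text,
      isFinal: chunk.isFinal,
    };
    return [...entries.slice(0, -1), merged];
  }
  // Otherwise start a new entry for this speaker.
  return [...entries, { speaker: chunk.speaker, text: chunk.text, isFinal: chunk.isFinal }];
}
```

Because the function is pure, it can be unit-tested without rendering any component, which fits the "focused, testable hooks" goal.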

Conversation Flow

  • State Machine: Typed reducer manages guest turns, speaking states, and prompts
  • Auto Turn-Taking: Guests automatically respond to each other with configurable delays
  • Context Preservation: Conversation history maintained across turns
  • Smart Prompting: Dynamic prompts based on conversation state
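Context preservation and dynamic prompting can be sketched as a function that builds the next guest's prompt from recent history. The prompt wording, the Turn shape, and the maxTurns window are all assumptions for this example.

```typescript
// Hypothetical record of one completed turn.
interface Turn {
  speaker: string;
  text: string;
}

// Build the prompt for the guest who speaks next. With no history the guest
// opens the debate; otherwise the most recent turns are included as context.
function buildPrompt(
  nextSpeaker: string,
  topic: string,
  history: readonly Turn[],
  maxTurns: number = 6
): string {
  if (history.length === 0) {
    return `You are ${nextSpeaker}. Open the debate on "${topic}".`;
  }
  // Keep only a sliding window so the prompt does not grow without bound.
  const recent = history
    .slice(-maxTurns)
    .map((t) => `${t.speaker}: ${t.text}`)
    .join("\n");
  const last = history[history.length - 1];
  const opponent = last !== undefined ? last.speaker : "your opponent";
  return (
    `You are ${nextSpeaker}. The debate topic is "${topic}".\n` +
    `Recent turns:\n${recent}\n` +
    `Respond directly to ${opponent}.`
  );
}
```

The sliding window is one possible trade-off between context and prompt size; a summary of older turns would be another.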

Featured Rivalries

  • Logic vs. Hype: Dr. Orion vs. Luna Nova (Philosophy vs. Futurism)
  • Detective & Mastermind: Sherlock vs. Moriarty (Genius vs. Criminal Mind)
  • The Genius & The Spider: Tony Stark vs. Peter Parker (Mentor vs. Protégé)
  • Jedi Master & Apprentice: Master Yoda vs. Luke Skywalker (Wisdom vs. Youth)
  • The Relativist & The Quantum: Einstein vs. Niels Bohr (Physics Debate)
  • The Teacher & The Student: Walter White vs. Jesse Pinkman (Breaking Bad Dynamics)

Tech Stack

  • React 19: Concurrent features
  • TypeScript: Type safety
  • Gemini 2.5: Live API
  • Web Audio: Real-time audio
  • Tailwind CSS: Styling
  • Vite: Build tool
  • i18next: i18n (EN/EL)
  • DiceBear: Avatars

Key Learnings

What Worked

  • Custom Hook Architecture: Separating concerns made the system maintainable and testable
  • AudioWorklet: Reduced latency from ~200ms to <100ms and eliminated glitches
  • Typed State Machine: Prevented state bugs and made flow predictable
  • Fallback Mechanisms: The ScriptProcessor fallback and auto-reconnection kept audio reliable across browsers
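Part of the latency difference between the two capture paths comes from buffer size alone. The arithmetic below is a sketch: 128 frames is the render quantum an AudioWorklet processor receives per call, while the 4096-frame ScriptProcessor buffer and the 48 kHz sample rate are illustrative assumptions, not measured settings from this project.

```typescript
// Time represented by one audio buffer, in milliseconds.
function bufferLatencyMs(frames: number, sampleRate: number): number {
  return (frames / sampleRate) * 1000;
}

// AudioWorklet: one 128-frame render quantum at 48 kHz, about 2.67 ms.
const workletQuantumMs = bufferLatencyMs(128, 48000);

// ScriptProcessor: an assumed 4096-frame buffer at 48 kHz, about 85.3 ms.
const scriptProcessorMs = bufferLatencyMs(4096, 48000);
```

Buffering is only one contribution to end-to-end latency, so these figures bound the capture stage rather than reproduce the ~200ms and <100ms totals quoted above.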

Challenges Overcome

  • Turn-Taking: State machine with explicit turn management solved guests talking over each other
  • Context Loss: Maintaining conversation history preserved context between turns
  • Audio Echo: Headphone detection and audio routing isolation prevented feedback
  • Memory Leaks: Proper cleanup in useEffect hooks prevented memory growth
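One way to keep useEffect cleanup disciplined is to register every teardown step as the resource is created and run them all from the effect's cleanup function. The helper below is a hypothetical sketch of that pattern, not code from the project.

```typescript
type Cleanup = () => void;

// Collects teardown callbacks and runs them in reverse order of registration,
// mirroring how the resources were set up.
function createCleanupStack() {
  const cleanups: Cleanup[] = [];
  return {
    add(fn: Cleanup): void {
      cleanups.push(fn);
    },
    dispose(): void {
      while (cleanups.length > 0) {
        const fn = cleanups.pop();
        if (fn !== undefined) fn();
      }
    },
  };
}

// Usage: record the order in which two stand-in resources are released.
const released: string[] = [];
const cleanupStack = createCleanupStack();
cleanupStack.add(() => { released.push("audio context closed"); });
cleanupStack.add(() => { released.push("worklet node disconnected"); });
cleanupStack.dispose();
```

Inside a component, the effect would return cleanupStack.dispose so that nodes, streams, and sessions are released on unmount. Calling dispose a second time is a no-op because the stack is already empty.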

Explore the Project

Dive into the code, read the full documentation, or learn more about production voice AI architecture.

© 2026 Systems Engineer | AI Ecosystems Specialist — Built with Next.js & Tailwind
