🚀 Quick Start

Get your Gemini Live voice agent running in under 5 minutes with this comprehensive setup guide.

📋 Prerequisites

Before starting, ensure you have:

  • Google AI API Key: Access to Google’s Generative AI API with Gemini Live enabled
  • Twilio Account: For phone call integration (optional for web-only setups)
  • Node.js Environment: Node.js 18+ for running the voice service
  • TixAE Account: Your TixAE workspace with voice features enabled

🔑 Step 1: Google AI API Setup

Get Your API Key

  1. Visit the Google AI Studio
  2. Create a new API key or use existing one
  3. Important: Ensure your key has access to Gemini Live models
API Key Requirements: Your Google AI API key must have access to the latest Gemini models. Some keys may not have Live API access by default.

Verify API Access

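Test your API key with this quick verification. A successful JSON response confirms that the key is valid for the model; it does not exercise the Live streaming API itself: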
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-exp:generateContent?key=YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{"parts": [{"text": "Hello, are you working?"}]}]
  }'
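
The same check can be run from Node.js. The sketch below mirrors the curl call using the fetch API built into Node 18+. The helper names (buildVerifyRequest, verifyApiKey) are illustrative and not part of any TixAE or Google SDK.

```typescript
// Sketch only: mirrors the curl verification call using Node 18+'s built-in fetch.
// buildVerifyRequest and verifyApiKey are hypothetical helper names.

interface VerifyRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

const API_BASE = "https://generativelanguage.googleapis.com/v1beta";

// Builds the URL and request options without sending anything.
function buildVerifyRequest(apiKey: string, model = "gemini-2.0-flash-exp"): VerifyRequest {
  if (!apiKey.trim()) {
    throw new Error("API key is empty");
  }
  return {
    url: `${API_BASE}/models/${model}:generateContent?key=${encodeURIComponent(apiKey)}`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        contents: [{ parts: [{ text: "Hello, are you working?" }] }],
      }),
    },
  };
}

// Sends the request and reports whether the API accepted the key.
async function verifyApiKey(apiKey: string): Promise<boolean> {
  const { url, init } = buildVerifyRequest(apiKey);
  const response = await fetch(url, init);
  if (!response.ok) {
    console.error(`Verification failed: HTTP ${response.status}`);
  }
  return response.ok;
}
```

Calling verifyApiKey with the value of GOOGLE_GENAI_API_KEY resolves to true when the API returns a 2xx status.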

⚙️ Step 2: Configure Your Agent

1. Create Your Agent

In your TixAE dashboard:
  1. Navigate to Create Agent → Voice Agent
  2. Choose “Google Gemini Live” as your voice provider
  3. Set up your basic agent configuration

2. Model Selection

Choose the correct Gemini Live model:
{
  "llmConfig": {
    "modelId": "gemini-2.0-flash-exp",
    "provider": "google-gemini-live",
    "temperature": 0.7,
    "maxTokens": 4000
  }
}
Model Recommendations:
  • gemini-2.0-flash-exp: Fastest, most reliable (Recommended)
  • gemini-2.0-flash-live-001: Stable alternative
  • Avoid: gemini-2.5-flash-exp-native-audio-thinking-dialog (tool calling issues)
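
A small check can catch a mistyped or discouraged model before the agent is deployed. The sketch below validates an llmConfig object against the recommendations listed above. The function name and the numeric bounds are assumptions for illustration, not documented TixAE limits.

```typescript
// Sketch only: a pre-flight check for the llmConfig object shown above.
// checkLlmConfig is a hypothetical helper; the numeric bounds are assumptions.

interface LlmConfig {
  modelId: string;
  provider: string;
  temperature: number;
  maxTokens: number;
}

// Model names taken from the recommendations in this guide.
const RECOMMENDED_MODELS = ["gemini-2.0-flash-exp", "gemini-2.0-flash-live-001"];
const DISCOURAGED_MODELS = ["gemini-2.5-flash-exp-native-audio-thinking-dialog"];

// Returns a list of human-readable problems; an empty list means nothing was flagged.
function checkLlmConfig(config: LlmConfig): string[] {
  const problems: string[] = [];
  if (config.provider !== "google-gemini-live") {
    problems.push(`unexpected provider "${config.provider}"`);
  }
  if (DISCOURAGED_MODELS.includes(config.modelId)) {
    problems.push(`model "${config.modelId}" has known tool calling issues`);
  } else if (!RECOMMENDED_MODELS.includes(config.modelId)) {
    problems.push(`model "${config.modelId}" is not in the recommended list`);
  }
  // Assumed range; adjust to whatever your provider actually accepts.
  if (config.temperature < 0 || config.temperature > 2) {
    problems.push("temperature should be between 0 and 2");
  }
  if (!Number.isInteger(config.maxTokens) || config.maxTokens <= 0) {
    problems.push("maxTokens should be a positive integer");
  }
  return problems;
}
```

Running checkLlmConfig on the example configuration above returns an empty list.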

🎙️ Step 3: Voice Configuration

System Instructions

Configure your agent’s personality and instructions:
System Prompt Example
You are a helpful AI assistant capable of having natural voice conversations.

Key Guidelines:
- Speak naturally and conversationally
- Use tools when needed to help users
- Keep responses concise for voice interactions
- Ask clarifying questions if unsure

Current date: {current_date}
Available tools: turn_on_lights, get_weather, search_knowledge
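
The {current_date} placeholder in the prompt has to be replaced with a real value before the prompt reaches the model. TixAE may perform this substitution for you; the sketch below only illustrates what it amounts to, using a hypothetical renderPrompt helper.

```typescript
// Sketch only: fills {placeholder} tokens in a system prompt.
// renderPrompt is a hypothetical helper, not a TixAE API.

// Replaces each {key} with values[key]; unknown placeholders are left untouched.
function renderPrompt(template: string, values: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (match: string, key: string) => {
    const value = values[key];
    return value !== undefined ? value : match;
  });
}

const SYSTEM_PROMPT_TEMPLATE = [
  "You are a helpful AI assistant capable of having natural voice conversations.",
  "",
  "Current date: {current_date}",
  "Available tools: turn_on_lights, get_weather, search_knowledge",
].join("\n");

// Formats today's date as YYYY-MM-DD (UTC) and substitutes it into the template.
const renderedPrompt = renderPrompt(SYSTEM_PROMPT_TEMPLATE, {
  current_date: new Date().toISOString().slice(0, 10),
});
console.log(renderedPrompt);
```

Leaving unknown placeholders in place makes a missing value easy to spot in logs, where silently substituting an empty string would hide it.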

Voice Settings

Configure these essential voice parameters:
{
  "speechConfig": {
    "voiceConfig": {
      "prebuiltVoiceConfig": {
        "voiceName": "Puck"
      }
    }
  },
  "responseModalities": ["AUDIO"],
  "inputAudioTranscription": {},
  "outputAudioTranscription": {}
}
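
If you generate this configuration in code, a typed builder keeps the nesting correct. The sketch below reproduces the JSON above. The buildVoiceConfig name is an assumption, and the empty transcription objects are simply carried over from the example, where their presence appears to switch transcription on.

```typescript
// Sketch only: builds the voice settings object shown above.
// buildVoiceConfig is a hypothetical helper name.

interface LiveVoiceConfig {
  speechConfig: {
    voiceConfig: { prebuiltVoiceConfig: { voiceName: string } };
  };
  responseModalities: string[];
  inputAudioTranscription?: Record<string, never>;
  outputAudioTranscription?: Record<string, never>;
}

// Returns a config for the given prebuilt voice, with transcription fields optional.
function buildVoiceConfig(voiceName: string, transcribe = true): LiveVoiceConfig {
  if (!voiceName.trim()) {
    throw new Error("voiceName must not be empty");
  }
  const config: LiveVoiceConfig = {
    speechConfig: {
      voiceConfig: { prebuiltVoiceConfig: { voiceName } },
    },
    responseModalities: ["AUDIO"],
  };
  if (transcribe) {
    // Empty objects, exactly as in the example configuration.
    config.inputAudioTranscription = {};
    config.outputAudioTranscription = {};
  }
  return config;
}

console.log(JSON.stringify(buildVoiceConfig("Puck"), null, 2));
```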

🔧 Step 4: Environment Setup

Environment Variables

Add these to your .env file:
# Google AI API Configuration
GOOGLE_GENAI_API_KEY=your_google_ai_api_key_here

# Twilio Configuration (if using phone calls)
TWILIO_ACCOUNT_SID=your_twilio_account_sid
TWILIO_AUTH_TOKEN=your_twilio_auth_token

# Optional: Advanced Settings
GEMINI_LIVE_DEBUG=false
ULTRA_FAST_AUDIO=true
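
Failing fast on a missing variable is usually easier to debug than a rejected API call later. The sketch below reads the variables listed above from an environment object and validates them. The loadConfig name, the config shape, and the default of false for both optional flags are assumptions.

```typescript
// Sketch only: validates the environment variables listed above at startup.
// loadConfig and VoiceServiceConfig are hypothetical names; defaults are assumptions.

interface VoiceServiceConfig {
  googleApiKey: string;
  twilio?: { accountSid: string; authToken: string };
  debug: boolean;
  ultraFastAudio: boolean;
}

type Env = Record<string, string | undefined>;

// Treats "true" (any casing) as true; unset or blank values use the fallback.
function parseBool(value: string | undefined, fallback: boolean): boolean {
  if (value === undefined || value.trim() === "") return fallback;
  return value.trim().toLowerCase() === "true";
}

function loadConfig(env: Env): VoiceServiceConfig {
  const googleApiKey = env.GOOGLE_GENAI_API_KEY;
  if (!googleApiKey) {
    throw new Error("GOOGLE_GENAI_API_KEY is required");
  }
  const sid = env.TWILIO_ACCOUNT_SID;
  const token = env.TWILIO_AUTH_TOKEN;
  // Twilio is optional, but a half-configured account is almost always a mistake.
  if (Boolean(sid) !== Boolean(token)) {
    throw new Error("Set both TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN, or neither");
  }
  return {
    googleApiKey,
    twilio: sid && token ? { accountSid: sid, authToken: token } : undefined,
    debug: parseBool(env.GEMINI_LIVE_DEBUG, false),
    ultraFastAudio: parseBool(env.ULTRA_FAST_AUDIO, false),
  };
}

// In a Node.js service: const config = loadConfig(process.env);
```

Passing the environment in as a parameter, as opposed to reading process.env inside the function, keeps the loader easy to test.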

Package Dependencies

Ensure these packages are installed:
package.json
{
  "dependencies": {
    "@google/genai": "^0.21.0",
    "twilio": "^4.19.0",
    "ws": "^8.14.0",
    "@langchain/core": "^0.1.0"
  }
}

📞 Step 5: Phone Integration (Optional)

Twilio Setup

If you want phone call capabilities:
  1. Purchase a phone number in Twilio Console
  2. Configure webhook URL to point to your TixAE endpoint:
    https://your-domain.com/api/voice/twilio
    
  3. Set HTTP method to POST
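
For context, a voice webhook typically answers Twilio's POST with TwiML that tells Twilio what to do with the call. How the TixAE endpoint responds is not documented here; the sketch below only assumes a setup built on Twilio Media Streams, where the response connects the call audio to a WebSocket. The stream URL and the helper names are hypothetical.

```typescript
// Sketch only: the kind of TwiML a voice webhook might return when using
// Twilio Media Streams. This is an assumption, not TixAE's documented behavior.

// Escapes characters that are not allowed inside an XML attribute value.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Builds a TwiML response that streams the call audio to a WebSocket URL.
function buildStreamTwiml(streamUrl: string): string {
  if (!streamUrl.startsWith("wss://")) {
    throw new Error("Media stream URL must use wss://");
  }
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    "<Response>",
    "  <Connect>",
    `    <Stream url="${escapeXml(streamUrl)}"/>`,
    "  </Connect>",
    "</Response>",
  ].join("\n");
}

// Hypothetical stream endpoint, for illustration only.
console.log(buildStreamTwiml("wss://your-domain.com/api/voice/stream"));
```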

WebRTC Setup

For browser-based calling:
  1. Enable WebRTC in your agent settings
  2. Configure STUN/TURN servers if behind NAT
  3. Test browser permissions for microphone access
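
STUN/TURN configuration from step 2 ends up as the iceServers list passed to the browser's RTCPeerConnection. The sketch below builds that list. The buildIceServers helper and the TURN credential shape are assumptions; Google's public STUN server is used as a common default, and you would substitute your own TURN details.

```typescript
// Sketch only: builds an iceServers list for RTCPeerConnection.
// buildIceServers and TurnCredentials are hypothetical names.

interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}

interface TurnCredentials {
  url: string;
  username: string;
  credential: string;
}

// Always includes a public STUN server; adds a TURN relay when credentials are given.
function buildIceServers(turn?: TurnCredentials): IceServer[] {
  const servers: IceServer[] = [{ urls: "stun:stun.l.google.com:19302" }];
  if (turn) {
    if (!/^turns?:/.test(turn.url)) {
      throw new Error("TURN URL must start with turn: or turns:");
    }
    servers.push({
      urls: turn.url,
      username: turn.username,
      credential: turn.credential,
    });
  }
  return servers;
}

// In the browser:
// const peer = new RTCPeerConnection({ iceServers: buildIceServers(turn) });
```

STUN alone is often enough on home networks, while restrictive corporate NATs and firewalls generally need the TURN relay.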

🧪 Step 6: Testing & Verification

Test Your Setup

  1. API Key Test: Verify your Google AI API key works with Gemini Live models
  2. Voice Test: Test basic voice input/output functionality
  3. Tool Calling Test: Verify tools can be called during conversation
  4. Phone Test: Make a test call to verify end-to-end functionality

Debug Commands

Use these commands to troubleshoot:
# Test Gemini Live connection
node -e "console.log('Testing Gemini Live...'); /* your test code */"

# Check audio pipeline
curl -X POST "your-endpoint/test-audio" -H "Content-Type: audio/wav" --data-binary @test.wav

# Verify tool configuration  
curl -X GET "your-endpoint/tools" -H "Authorization: Bearer YOUR_TOKEN"

⚑ Performance Optimization

Ultra-Fast Audio Processing

Our implementation includes several optimizations aimed at low audio latency:
  • 20ms chunk processing for minimal latency
  • Loop-unrolled resampling (6x speed improvement)
  • Direct memory operations using bit shifts
  • Minimal validation for maximum throughput
Automatic Optimization: These performance enhancements are automatically applied when using TixAE’s Gemini Live integration. No additional configuration required!
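
To make the ideas above concrete, the sketch below shows 20ms framing of 16-bit PCM audio and a simple 2x upsampler that computes midpoints with a bit shift. It is an illustration of the general techniques only, not TixAE's actual implementation, and it assumes mono PCM16 input such as 8 kHz telephony audio being raised to 16 kHz.

```typescript
// Sketch only: generic 20ms framing and 2x linear-interpolation upsampling
// for mono PCM16 audio. Not TixAE's implementation.

// Number of samples in one frame of the given duration.
function samplesPerFrame(sampleRate: number, frameMs = 20): number {
  return (sampleRate * frameMs) / 1000;
}

// Splits samples into fixed-size frames. Frames are views onto the input
// buffer (no copying); a trailing partial frame is returned as the remainder.
function splitIntoFrames(
  samples: Int16Array,
  frameSize: number
): { frames: Int16Array[]; remainder: Int16Array } {
  const frames: Int16Array[] = [];
  let offset = 0;
  for (; offset + frameSize <= samples.length; offset += frameSize) {
    frames.push(samples.subarray(offset, offset + frameSize));
  }
  return { frames, remainder: samples.subarray(offset) };
}

// Doubles the sample rate by inserting the midpoint between neighbouring samples.
function upsample2x(input: Int16Array): Int16Array {
  const output = new Int16Array(input.length * 2);
  for (let i = 0; i < input.length; i++) {
    const current = input[i];
    // The last sample has no successor, so it is repeated.
    const next = i + 1 < input.length ? input[i + 1] : current;
    output[2 * i] = current;
    output[2 * i + 1] = (current + next) >> 1; // shift right by one halves the sum
  }
  return output;
}
```

At 8 kHz a 20ms frame holds 160 samples, and at 16 kHz it holds 320, so upsampling a frame by 2x preserves its duration.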

🚨 Troubleshooting

Common Issues


🎯 Next Steps


📞 Need Help?

Having trouble with setup?

Check out our troubleshooting guide or contact support for personalized assistance.
Your ultra-fast Gemini Live voice agent is ready to go! 🚀