Gemini Live Setup - Convocore

🚀 Quick Start

Get your Gemini Live voice agent running in under 5 minutes with this comprehensive setup guide.

📋 Prerequisites

Before starting, ensure you have:

Google AI API Key: Access to Google’s Generative AI API with Gemini Live enabled
Twilio Account: For phone call integration (optional for web-only)
Node.js Environment: Node.js 18+ for running the voice service
TixAE Account: Your TixAE workspace with voice features enabled

🔑 Step 1: Google AI API Setup

Get Your API Key

Visit Google AI Studio
Create a new API key or use an existing one
Important: Ensure your key has access to Gemini Live models

API Key Requirements: Your Google AI API key must have access to the latest Gemini models. Some keys may not have Live API access by default.

Verify API Access

Test your API key with this quick verification:

```bash
curl -X POST "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.0-flash-exp:generateContent?key=YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{"parts": [{"text": "Hello, are you working?"}]}]
  }'
```
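The same check can be run from Node.js 18+, which ships a built-in `fetch`. The sketch below sends the request shown in the curl command; `buildVerifyRequest` and `verifyApiKey` are illustrative names, not part of any SDK. A successful reply confirms only that the key works for this model over REST. Live sessions use a separate streaming connection, so treat this as a first sanity check rather than proof of Live access.

```javascript
// Minimal sketch: Node.js equivalent of the curl verification request.
// Function names are hypothetical; the URL and body mirror the curl command.
const BASE_URL = 'https://generativelanguage.googleapis.com/v1beta/models';

function buildVerifyRequest(apiKey, modelId = 'gemini-2.0-flash-exp') {
  if (!apiKey) throw new Error('Missing API key');
  const url =
    `${BASE_URL}/${encodeURIComponent(modelId)}:generateContent` +
    `?key=${encodeURIComponent(apiKey)}`;
  return {
    url,
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        contents: [{ parts: [{ text: 'Hello, are you working?' }] }],
      }),
    },
  };
}

async function verifyApiKey(apiKey, modelId) {
  const { url, options } = buildVerifyRequest(apiKey, modelId);
  const res = await fetch(url, options);
  if (!res.ok) {
    throw new Error(`Verification failed: HTTP ${res.status} ${await res.text()}`);
  }
  const data = await res.json();
  // Return the first text part of the first candidate, if any.
  return data.candidates?.[0]?.content?.parts?.[0]?.text ?? '';
}

// Only call the API when a key is present in the environment.
if (process.env.GOOGLE_GENAI_API_KEY) {
  verifyApiKey(process.env.GOOGLE_GENAI_API_KEY)
    .then((text) => console.log('Model replied:', text))
    .catch((err) => console.error(err.message));
}
```

An HTTP 400 or 403 response here usually points to an invalid key or a model the key cannot access.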

⚙️ Step 2: Configure Your Agent

1. Create Your Agent

In your TixAE dashboard:

Navigate to Create Agent → Voice Agent
Choose “Google Gemini Live” as your voice provider
Set up your basic agent configuration

2. Model Selection

Choose the correct Gemini Live model:

```json
{
  "llmConfig": {
    "modelId": "gemini-2.0-flash-exp",
    "provider": "google-gemini-live",
    "temperature": 0.7,
    "maxTokens": 4000
  }
}
```

Model Recommendations:

gemini-2.0-flash-exp: Fastest, most reliable (Recommended)
gemini-2.0-flash-live-001: Stable alternative
Avoid: gemini-2.5-flash-exp-native-audio-thinking-dialog (tool calling issues)
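A small pre-flight check can catch a mistyped model ID or provider before the agent is saved. The sketch below is hypothetical: `checkLlmConfig` is not a TixAE function, and the model lists simply mirror the recommendations above rather than every model that supports Live.

```javascript
// Hypothetical validation helper for the llmConfig block shown above.
const RECOMMENDED_MODELS = ['gemini-2.0-flash-exp', 'gemini-2.0-flash-live-001'];
const DISCOURAGED_MODELS = ['gemini-2.5-flash-exp-native-audio-thinking-dialog'];

function checkLlmConfig(llmConfig) {
  const problems = [];
  if (llmConfig.provider !== 'google-gemini-live') {
    problems.push(`provider should be "google-gemini-live", got "${llmConfig.provider}"`);
  }
  if (DISCOURAGED_MODELS.includes(llmConfig.modelId)) {
    problems.push(`${llmConfig.modelId} has known tool calling issues`);
  } else if (!RECOMMENDED_MODELS.includes(llmConfig.modelId)) {
    problems.push(`${llmConfig.modelId} is not one of the recommended models`);
  }
  if (!Number.isInteger(llmConfig.maxTokens) || llmConfig.maxTokens <= 0) {
    problems.push('maxTokens must be a positive integer');
  }
  if (typeof llmConfig.temperature !== 'number' || llmConfig.temperature < 0) {
    problems.push('temperature must be a non-negative number');
  }
  return problems; // an empty array means no problems were found
}

console.log(checkLlmConfig({
  modelId: 'gemini-2.0-flash-exp',
  provider: 'google-gemini-live',
  temperature: 0.7,
  maxTokens: 4000,
}));
```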

🎙️ Step 3: Voice Configuration

System Instructions

Configure your agent’s personality and instructions:

System Prompt Example

```text
You are a helpful AI assistant capable of having natural voice conversations.

Key Guidelines:
- Speak naturally and conversationally
- Use tools when needed to help users
- Keep responses concise for voice interactions
- Ask clarifying questions if unsure

Current date: {current_date}
Available tools: turn_on_lights, get_weather, search_knowledge
```
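The `{current_date}` placeholder in the example prompt has to be filled in before the prompt reaches the model. The guide does not say how TixAE performs that substitution, so the following is only a sketch of one way to do it yourself, assuming single-brace placeholders; `renderSystemPrompt` is a hypothetical helper.

```javascript
// Hypothetical template renderer for {placeholder} values in a system prompt.
function renderSystemPrompt(template, values) {
  return template.replace(/\{(\w+)\}/g, (match, name) => {
    if (!Object.hasOwn(values, name)) {
      // Fail loudly rather than send a literal "{current_date}" to the model.
      throw new Error(`No value for placeholder {${name}}`);
    }
    return String(values[name]);
  });
}

const template = [
  'You are a helpful AI assistant capable of having natural voice conversations.',
  'Current date: {current_date}',
].join('\n');

const prompt = renderSystemPrompt(template, {
  current_date: new Date().toISOString().slice(0, 10), // YYYY-MM-DD in UTC
});
console.log(prompt);
```

Throwing on an unknown placeholder is a deliberate choice here: a silent miss would leave raw braces in the spoken output.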

Voice Settings

Configure these essential voice parameters:

```json
{
  "speechConfig": {
    "voiceConfig": {
      "prebuiltVoiceConfig": {
        "voiceName": "Puck"
      }
    }
  },
  "responseModalities": ["AUDIO"],
  "inputAudioTranscription": {},
  "outputAudioTranscription": {}
}
```
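If you connect to Gemini Live directly rather than through the TixAE dashboard, these settings map onto the `config` object of a Live session. The sketch below builds that object and shows how it might be passed to `ai.live.connect` from the `@google/genai` package. `buildLiveConfig` and `openLiveSession` are hypothetical helpers, and the SDK call reflects the SDK as understood at the time of writing, so check it against the version you have installed.

```javascript
// Builds the voice settings shown above; names of the helpers are illustrative.
function buildLiveConfig({ voiceName = 'Puck', systemInstruction } = {}) {
  const config = {
    responseModalities: ['AUDIO'],
    speechConfig: { voiceConfig: { prebuiltVoiceConfig: { voiceName } } },
    // Empty objects enable transcription of user and model audio.
    inputAudioTranscription: {},
    outputAudioTranscription: {},
  };
  if (systemInstruction) config.systemInstruction = systemInstruction;
  return config;
}

// Assumption: @google/genai is installed and exposes GoogleGenAI with live.connect.
async function openLiveSession(apiKey, model, config, onMessage) {
  const { GoogleGenAI } = require('@google/genai'); // loaded lazily on first use
  const ai = new GoogleGenAI({ apiKey });
  return ai.live.connect({
    model,
    config,
    callbacks: {
      onopen: () => console.log('Live session opened'),
      onmessage: onMessage,
      onerror: (e) => console.error('Live session error:', e.message),
      onclose: (e) => console.log('Live session closed:', e.reason),
    },
  });
}

console.log(JSON.stringify(buildLiveConfig({ voiceName: 'Puck' }), null, 2));
```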

🔧 Step 4: Environment Setup

Environment Variables

Add these to your .env file:

```bash
# Google AI API Configuration
GOOGLE_GENAI_API_KEY=your_google_ai_api_key_here

# Twilio Configuration (if using phone calls)
TWILIO_ACCOUNT_SID=your_twilio_account_sid
TWILIO_AUTH_TOKEN=your_twilio_auth_token

# Optional: Advanced Settings
GEMINI_LIVE_DEBUG=false
ULTRA_FAST_AUDIO=true
```
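Missing or misspelled variables are easiest to catch at startup. The sketch below validates the variables listed above and converts the two flags to booleans, since every value read from the environment is a string. `readVoiceEnv` is a hypothetical helper, and it assumes something else (your process manager or a loader such as `dotenv`) has already put the `.env` values into `process.env`.

```javascript
// Hypothetical startup check for the variables listed in the .env example.
const REQUIRED = ['GOOGLE_GENAI_API_KEY'];
const REQUIRED_FOR_PHONE = ['TWILIO_ACCOUNT_SID', 'TWILIO_AUTH_TOKEN'];

function readVoiceEnv(env, { phone = false } = {}) {
  const needed = phone ? [...REQUIRED, ...REQUIRED_FOR_PHONE] : REQUIRED;
  const missing = needed.filter((name) => !env[name] || env[name].trim() === '');
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return {
    googleApiKey: env.GOOGLE_GENAI_API_KEY,
    twilioAccountSid: env.TWILIO_ACCOUNT_SID,
    twilioAuthToken: env.TWILIO_AUTH_TOKEN,
    // "false" is a non-empty string and therefore truthy, so compare explicitly.
    debug: env.GEMINI_LIVE_DEBUG === 'true',
    ultraFastAudio: env.ULTRA_FAST_AUDIO === 'true',
  };
}

try {
  const settings = readVoiceEnv(process.env, { phone: false });
  console.log('Debug logging enabled:', settings.debug);
} catch (err) {
  console.error(err.message);
}
```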

Package Dependencies

Ensure these packages are installed:

package.json

```json
{
  "dependencies": {
    "@google/genai": "^0.21.0",
    "twilio": "^4.19.0",
    "ws": "^8.14.0",
    "@langchain/core": "^0.1.0"
  }
}
```

📞 Step 5: Phone Integration (Optional)

Twilio Setup

If you want phone call capabilities:

Purchase a phone number in the Twilio Console
Configure the webhook URL to point to your TixAE endpoint:
```
https://your-domain.com/api/voice/twilio
```
Set the HTTP method to POST
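The guide does not show what the webhook endpoint returns. For orientation only, the sketch below assumes the call audio is bridged through Twilio Media Streams, in which case the webhook answers with TwiML that connects the call to a WebSocket. The `/api/voice/twilio` path comes from the URL above; `buildStreamTwiml`, `createWebhookServer`, and the `wss://` address are hypothetical, and TixAE's own endpoint may behave differently.

```javascript
// Sketch of a webhook that answers Twilio with <Connect><Stream> TwiML.
const http = require('node:http');

const XML_ESCAPES = { '<': '&lt;', '>': '&gt;', '&': '&amp;', '"': '&quot;', "'": '&apos;' };

function escapeXml(value) {
  return value.replace(/[<>&"']/g, (ch) => XML_ESCAPES[ch]);
}

function buildStreamTwiml(streamUrl) {
  if (!streamUrl.startsWith('wss://')) {
    throw new Error('Media Streams require a wss:// URL');
  }
  return (
    '<?xml version="1.0" encoding="UTF-8"?>' +
    `<Response><Connect><Stream url="${escapeXml(streamUrl)}"/></Connect></Response>`
  );
}

function createWebhookServer(streamUrl) {
  return http.createServer((req, res) => {
    if (req.method === 'POST' && req.url === '/api/voice/twilio') {
      res.writeHead(200, { 'Content-Type': 'text/xml' });
      res.end(buildStreamTwiml(streamUrl));
    } else {
      res.writeHead(404);
      res.end();
    }
  });
}

console.log(buildStreamTwiml('wss://your-domain.com/media'));
```

To try it locally, call `createWebhookServer('wss://your-domain.com/media').listen(3000)` and expose the port through a tunnel so Twilio can reach it. A production handler should also validate the Twilio request signature.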

WebRTC Setup

For browser-based calling:

Enable WebRTC in your agent settings
Configure STUN/TURN servers if behind NAT
Test browser permissions for microphone access
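For the STUN/TURN step, browsers take the server list as the `iceServers` field of the `RTCPeerConnection` configuration. The sketch below builds that object. `buildRtcConfig` is a hypothetical helper, the STUN address is Google's public server used here only as an example, and where TixAE accepts these values is not covered in this guide.

```javascript
// Builds an RTCPeerConnection configuration; helper name is illustrative.
function buildRtcConfig({ turnUrl, turnUsername, turnCredential } = {}) {
  // STUN lets a client discover its public address.
  const iceServers = [{ urls: 'stun:stun.l.google.com:19302' }];
  if (turnUrl) {
    if (!turnUsername || !turnCredential) {
      throw new Error('TURN servers need a username and credential');
    }
    // TURN relays media when a direct path cannot be established.
    iceServers.push({ urls: turnUrl, username: turnUsername, credential: turnCredential });
  }
  return { iceServers };
}

// In the browser: const pc = new RTCPeerConnection(buildRtcConfig({ ... }));
console.log(buildRtcConfig({
  turnUrl: 'turn:turn.example.com:3478', // placeholder address
  turnUsername: 'user',
  turnCredential: 'secret',
}));
```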

🧪 Step 6: Testing & Verification

Test Your Setup

API Key Test: Verify your Google AI API key works with Gemini Live models
Voice Test: Test basic voice input/output functionality
Tool Calling Test: Verify tools can be called during conversation
Phone Test: Make a test call to verify end-to-end functionality
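For the tool calling test, it helps to exercise the dispatch logic without a live call. The sketch below declares `get_weather` (one of the tools named in the example system prompt) and routes a tool call to a local handler. It assumes the message and response shapes used by the `@google/genai` Live session (`toolCall.functionCalls` in, `sendToolResponse` out); the handler, its stub data, and `handleToolCall` are hypothetical.

```javascript
// Function declaration in the schema style used for Gemini function calling.
const toolDeclarations = [
  {
    name: 'get_weather',
    description: 'Get the current weather for a city.',
    parameters: {
      type: 'OBJECT',
      properties: { city: { type: 'STRING', description: 'City name' } },
      required: ['city'],
    },
  },
];

// Hypothetical local implementations, keyed by tool name.
const toolHandlers = {
  get_weather: async ({ city }) => ({ city, forecast: 'sunny' }), // stub data
};

// Runs every requested function and sends all results back in one response.
async function handleToolCall(toolCall, handlers, session) {
  const functionResponses = [];
  for (const call of toolCall.functionCalls ?? []) {
    let response;
    try {
      response = Object.hasOwn(handlers, call.name)
        ? { result: await handlers[call.name](call.args ?? {}) }
        : { error: `Unknown tool: ${call.name}` };
    } catch (err) {
      response = { error: err.message };
    }
    functionResponses.push({ id: call.id, name: call.name, response });
  }
  session.sendToolResponse({ functionResponses });
  return functionResponses;
}

// Dry run with a fake session that just prints what would be sent.
const fakeSession = { sendToolResponse: (msg) => console.log(JSON.stringify(msg)) };
handleToolCall(
  { functionCalls: [{ id: '1', name: 'get_weather', args: { city: 'Paris' } }] },
  toolHandlers,
  fakeSession
);
```

In a real session, `handleToolCall` would be invoked from the message callback whenever an incoming message carries a `toolCall` field.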

Debug Commands

Use these commands to troubleshoot:

```bash
# Test Gemini Live connection
node -e "console.log('Testing Gemini Live...'); /* your test code */"

# Check audio pipeline
curl -X POST "your-endpoint/test-audio" -H "Content-Type: audio/wav" --data-binary @test.wav

# Verify tool configuration
curl -X GET "your-endpoint/tools" -H "Authorization: Bearer YOUR_TOKEN"
```

⚡ Performance Optimization

Ultra-Fast Audio Processing

TixAE’s Gemini Live integration applies the following audio optimizations:

20ms chunk processing for minimal latency
Loop-unrolled resampling (6x speed improvement)
Direct memory operations using bit shifts
Minimal validation for maximum throughput

Automatic Optimization: These performance enhancements are automatically applied when using TixAE’s Gemini Live integration. No additional configuration required!
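To make the chunking and bit-shift ideas concrete, here is a minimal sketch of one audio path: decoding a 20 ms chunk of 8 kHz mu-law audio (the format Twilio Media Streams deliver) and doubling it to 16 kHz 16-bit PCM. It is not TixAE's code. It uses plain linear interpolation rather than the loop-unrolled resampler described above, and the sample rates are assumptions about the phone leg and the model input.

```javascript
// Illustrative only: mu-law decode plus 2x upsampling for one 20 ms chunk.
const TWILIO_RATE = 8000; // Hz, assumed input rate
const CHUNK_MS = 20;
const SAMPLES_PER_CHUNK = (TWILIO_RATE * CHUNK_MS) / 1000; // 160 samples

// Standard G.711 mu-law expansion of one byte to a signed 16-bit sample.
function muLawToPcm16(byte) {
  const u = ~byte & 0xff;
  const exponent = (u >> 4) & 0x07;
  const mantissa = u & 0x0f;
  const magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84;
  return (u & 0x80) ? -magnitude : magnitude;
}

// Decode a chunk and insert the midpoint between each pair of samples.
function decodeAndUpsample(muLawChunk) {
  const n = muLawChunk.length;
  const out = new Int16Array(n * 2);
  if (n === 0) return out;
  let current = muLawToPcm16(muLawChunk[0]);
  for (let i = 0; i < n; i++) {
    // The last sample has no successor, so it is repeated.
    const next = i + 1 < n ? muLawToPcm16(muLawChunk[i + 1]) : current;
    out[i << 1] = current;
    out[(i << 1) + 1] = (current + next) >> 1; // average via bit shift
    current = next;
  }
  return out;
}

// 160 bytes of mu-law silence (0xff) become 320 PCM samples.
const silence = new Uint8Array(SAMPLES_PER_CHUNK).fill(0xff);
console.log(decodeAndUpsample(silence).length);
```

Interpolating across chunk boundaries would need the last sample of the previous chunk carried over; the sketch repeats the final sample instead to stay short.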

🚨 Troubleshooting

Common Issues

❌ Connection Failed

❌ Tool Calls Not Working

❌ Audio Quality Issues

❌ Reconnection Problems

🎯 Next Steps

Add Tool Integration: Integrate external APIs and function calling
Performance Optimization: Ultra-fast audio processing and latency tuning
Advanced Configuration: Reconnection logic, language detection, and more
Production Deployment: Deploy your voice agent to production

📞 Need Help?

Having trouble with setup?

Check out our troubleshooting guide or contact support for personalized assistance.

Your ultra-fast Gemini Live voice agent is ready to go! 🚀
