ai Priority 4/5 6/12/2026, 11:05:16 AM

Google DeepMind Introduces Gemini 3.5 Live Translate for Real-Time Voice Translation

Google DeepMind has unveiled Gemini 3.5 Live Translate, an advanced model designed to facilitate highly fluid, bidirectional voice translation. By processing spoken input and generating translated speech with minimal latency, this update aims to make cross-lingual conversations feel as natural as face-to-face interactions.

Related tools

Comparison

Aspect	Before / Alternative	After / This
Translation Pipeline	Cascaded systems requiring separate Automatic Speech Recognition (ASR), Machine Translation (MT), and Text-to-Speech (TTS) steps	End-to-end multimodal processing within a single unified model architecture
Latency and Speed	High cumulative latency due to sequential processing of speech-to-text-to-speech translation	Near real-time streaming translation with significantly reduced pause intervals between speaker turns
Tone and Emotion	Loss of original vocal nuances, pitch, and emotional context during intermediate text translation phases	Preservation of the speaker's original emotional tone, inflection, and natural voice characteristics

Source: DeepMind Blog

This page summarizes the original source. Check the source for full details.

More English news Open source

Google DeepMind Introduces Gemini 3.5 Live Translate for Real-Time Voice Translation

Recommended tools for this topic

Comparison

Related