ai Priority 4/5 6/10/2026, 11:05:15 AM

ServiceNow AI Releases AU-Harness Benchmark to Evaluate Code-Switching in Automatic Speech Recognition

ServiceNow AI has released AU-Harness, a benchmark dataset and evaluation toolkit for assessing Automatic Speech Recognition (ASR) systems on code-switching speech. Code-switching occurs when bilingual speakers alternate between languages within a single utterance, a common behavior in international customer service centers and multilingual workplaces. Although code-switching is widespread, conventional ASR benchmarks frequently assume a single primary language, so models that score well on them can still degrade in practical deployments.

The benchmark evaluates model performance on four distinct language pairs that mix English with other languages, including Spanish, French, and German. The datasets represent typical IT support and human resources dialogues and contain spoken utterances ranging from 12 to 40 words. Tests of seven modern speech models, including OpenAI's Whisper and several Large Audio-Language Models, showed that recognition accuracy varies significantly depending on the language pair and the length of word embeddings.

Traditional commercial ASR models often struggle when forced to parse multiple languages dynamically, resulting in high Word Error Rates (WER). AU-Harness provides a standardized framework to quantify these error rates under realistic, mixed-language conditions, giving developers and enterprise architects concrete metrics to guide the selection and fine-tuning of speech-to-text models for global applications.
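To make the Word Error Rate metric concrete, the following is a minimal sketch of how WER can be computed and broken down by language pair. It is not the AU-Harness implementation or API: the function names, the group labels (`en-es`, `en-fr`), and the sample transcripts are all hypothetical, and the only text normalization applied is lowercasing and whitespace splitting. WER here is the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words, pooled per group.

```python
from collections import defaultdict


def word_edit_distance(reference, hypothesis):
    """Levenshtein distance between two lists of words."""
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer_by_group(samples):
    """Pooled WER per group: total word edits / total reference words."""
    edits = defaultdict(int)
    words = defaultdict(int)
    for group, reference, hypothesis in samples:
        ref = reference.lower().split()
        hyp = hypothesis.lower().split()
        edits[group] += word_edit_distance(ref, hyp)
        words[group] += len(ref)
    return {g: edits[g] / words[g] for g in edits if words[g] > 0}


# Hypothetical (group, reference transcript, model output) triples,
# invented for illustration; they are not taken from the benchmark.
samples = [
    ("en-es", "please reset my contraseña before the meeting",
              "please reset my contrasena before the meeting"),
    ("en-fr", "my ordinateur will not start",
              "my ordinateur will not start"),
]

scores = wer_by_group(samples)
for group, score in sorted(scores.items()):
    print(f"{group}: WER = {score:.3f}")
```

Pooling edits and reference words per group, rather than averaging per-utterance WER, keeps short utterances from dominating the score. A real evaluation would likely also need language-aware text normalization (accents, punctuation, numerals), which this sketch leaves out.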

Comparison

Aspect	Conventional ASR benchmarks	AU-Harness
Language Assumption	Assumes a single, pre-declared primary language for the audio stream	Accommodates dynamic language switching mid-utterance (code-switching)
Evaluation Context	General-purpose read speech or monolingual conversational datasets	Domain-specific enterprise dialogues (IT support and Human Resources)
Performance Metric Focus	Standard global Word Error Rate (WER)	WER variations analyzed across specific language pairs and embedding lengths

Source: Hugging Face Blog

This page summarizes the original source. Check the source for full details.
