Module 6: Adding Language and Translation
Add Speech Recognition and Translation
This guide explains how to enable speech recognition and translation in your AI Agent, using a hypertension health assistant as the example. These features are especially useful for building multilingual bots, such as an agriculture assistant for Northern India, where Hindi is widely spoken. With these capabilities, users can send voice notes in their preferred language and receive answers in the same language.
Enabling Speech Recognition
In the Capabilities section, enable the Speech Recognition feature. This allows users to record voice messages directly into the AI Agent.
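Under the hood, this capability is a speech-to-text call. The snippet below is a minimal sketch of that step, assuming OpenAI's Whisper API as the transcription backend; the platform's actual provider may differ, and `voice_note.ogg` is a hypothetical file name.

```python
# Minimal sketch of the speech-to-text step, assuming OpenAI's Whisper API.
# The provider behind the platform's Speech Recognition capability may differ.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# "voice_note.ogg" is a hypothetical file name for the user's recording.
with open("voice_note.ogg", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )

print(transcript.text)  # the transcribed question
```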

Test
Click the microphone/clip icon in the message section, record a question (e.g., “What is your recommendation for drug classes as first-line agents?”), and click “Send”.

The LLM processes the transcribed text and generates a response, referencing the knowledge base (e.g., the WHO hypertension guidelines).
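For reference, that step looks roughly like the sketch below. The model name and the `guideline_context` placeholder are illustrative assumptions; in the platform, the knowledge-base lookup happens automatically.

```python
# Sketch of the LLM step: the transcribed question is answered against
# text retrieved from the knowledge base.
from openai import OpenAI

client = OpenAI()

# Hypothetical placeholders: the transcript from the previous step, and an
# excerpt the platform retrieved from the knowledge base.
question = "What is your recommendation for drug classes as first-line agents?"
guideline_context = "...excerpt from the WHO hypertension guidelines..."

response = client.chat.completions.create(
    model="gpt-4o",  # assumed model name; use whichever model your agent runs on
    messages=[
        {"role": "system",
         "content": f"Answer using only this context:\n{guideline_context}"},
        {"role": "user", "content": question},
    ],
)
print(response.choices[0].message.content)
```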
Adding Translation
Many LLMs do not fully understand non-English languages, so translation is an essential step in any multilingual AI Agent pipeline.
Steps to follow:
To support other languages (e.g., Hindi), enable the Translation feature alongside speech recognition.
Select your translation provider (e.g., Google Translate).
Set the target translation language (e.g., Hindi).
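Behind the scenes, this step is a plain machine-translation call. Here is a minimal sketch using the open-source deep-translator package as a stand-in for the platform's Google Translate integration:

```python
# Sketch of the translation step, using the deep-translator package as a
# stand-in for the platform's Google Translate integration.
from deep_translator import GoogleTranslator

# A sample Hindi question: "What are the first-line drugs for hypertension?"
hindi_question = "उच्च रक्तचाप के लिए पहली पंक्ति की दवाएं क्या हैं?"

# Detect the source language automatically and translate to English.
english_question = GoogleTranslator(source="auto", target="en").translate(hindi_question)
print(english_question)

# Translate an English answer back to Hindi for the user.
hindi_answer = GoogleTranslator(source="en", target="hi").translate(
    "Thiazide diuretics are commonly recommended as first-line agents."
)
print(hindi_answer)
```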
Now, when a user submits a voice note in their preferred language, the system will do the following (sketched in code after this list):
Transcribe the audio
Translate the input to English for the LLM (if needed)
Process the question
Translate the LLM’s response back into the user’s language (e.g., Hindi)
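Putting the four steps together, the whole voice pipeline can be sketched as a single function. Every provider choice here (Whisper, gpt-4o, deep-translator) is an illustrative assumption standing in for the platform's built-in capabilities, not its actual internals.

```python
# End-to-end sketch of the voice pipeline described above. All providers
# shown are illustrative assumptions.
from deep_translator import GoogleTranslator
from openai import OpenAI

client = OpenAI()

def handle_voice_note(audio_path: str, user_lang: str = "hi") -> str:
    # 1. Transcribe the audio.
    with open(audio_path, "rb") as f:
        question = client.audio.transcriptions.create(
            model="whisper-1", file=f
        ).text

    # 2. Translate the input to English for the LLM.
    question_en = GoogleTranslator(source=user_lang, target="en").translate(question)

    # 3. Process the question (knowledge-base retrieval omitted for brevity).
    answer_en = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": question_en}],
    ).choices[0].message.content

    # 4. Translate the response back into the user's language.
    return GoogleTranslator(source="en", target=user_lang).translate(answer_en)
```

In the GUI you never write this code yourself; the platform wires these steps together once the capabilities are enabled.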
Next Steps
In the next guide, you’ll learn how to add Text-to-Speech, allowing users to listen to responses in their language instead of reading them.