Module 6: Adding Language and Translation

Add Speech Recognition and Translation

This guide explains how to enable speech recognition and translation in your GUI-built AI Agent, using a hypertension health assistant as the example. These features are especially useful for multilingual bots, such as an agriculture assistant for Northern India, where Hindi is widely spoken. With these capabilities, users can send voice notes in their preferred language and receive answers in the same language.

Enabling Speech Recognition

  • In the Capabilities section, enable the Speech Recognition feature. This allows users to record voice messages directly into the AI Agent.

Test

Click the microphone/clip icon in the message section, record a question (e.g., “What is your recommendation for drug classes as first-line agents?”), and click Send.

  • The LLM processes the transcribed text and generates a response, referencing the knowledge base (e.g., the WHO hypertension guidelines).
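
Under the hood, the flow is: receive the audio, transcribe it to text, then pass the transcript to the LLM together with the knowledge base context. Below is a minimal sketch of that flow using the OpenAI Python SDK; the model names, file name, and system prompt are illustrative assumptions, not the platform's actual internals.

```python
# Sketch only: model names, file name, and the system prompt are assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def answer_voice_note(audio_path: str) -> str:
    # 1. Transcribe the user's voice note to text.
    with open(audio_path, "rb") as audio_file:
        transcript = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file,
        )

    # 2. Ask the LLM, grounding it in the knowledge base
    #    (here, the WHO hypertension guidelines).
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": "Answer using the WHO hypertension guidelines.",
            },
            {"role": "user", "content": transcript.text},
        ],
    )
    return response.choices[0].message.content

print(answer_voice_note("question.ogg"))
```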

Adding Translation

Many LLMs handle non-English languages less reliably than English, so translation is an essential step in a multilingual AI Agent pipeline.

Steps to follow:

  • To support other languages (e.g., Hindi), enable the Translation feature alongside speech recognition.

    • Select your translation provider (e.g., Google Translate).

    • Set the target translation language (e.g., Hindi).

  • Now, when a user submits a voice note in their preferred language, the system will (see the sketch after this list):

    • Transcribe the audio

    • Translate the input to English for the LLM (if needed)

    • Process the question

    • Translate the LLM’s response back into the user’s language (e.g., Hindi)
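
The sketch below wires those four steps together, pairing the transcription and LLM calls from the previous sketch with the Google Cloud Translation client; the model names and language codes are illustrative assumptions, not the platform's actual internals.

```python
# Sketch only: model names and language codes are assumptions.
from google.cloud import translate_v2 as translate
from openai import OpenAI

llm = OpenAI()
translator = translate.Client()  # uses GOOGLE_APPLICATION_CREDENTIALS

def transcribe(audio_path: str) -> str:
    with open(audio_path, "rb") as f:
        return llm.audio.transcriptions.create(model="whisper-1", file=f).text

def ask_llm(question: str) -> str:
    response = llm.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content

def handle_voice_note(audio_path: str, user_language: str = "hi") -> str:
    # 1. Transcribe the audio.
    question = transcribe(audio_path)

    # 2. Translate the input to English for the LLM, if needed.
    result = translator.translate(question, target_language="en")
    if result.get("detectedSourceLanguage", "en") != "en":
        question = result["translatedText"]

    # 3. Process the question with the LLM.
    answer = ask_llm(question)

    # 4. Translate the response back into the user's language (e.g., Hindi).
    back = translator.translate(answer, target_language=user_language)
    return back["translatedText"]
```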

Next Steps

  • In the next guide, you’ll learn how to add Text-to-Speech, allowing users to listen to responses in their language instead of reading them.
