# Building AI for Underrepresented Languages and Creators

## **Language Evals** <a href="#iwnoib5nebga" id="iwnoib5nebga"></a>

In our mission to catalyze AI for global impact, we've been working extensively to assess how well the latest crop of state-of-the-art AI models understand low-resource languages, including Kikuyu, Swahili and Kinyarwanda.

We are now hosting speech recognition (ASR) models from:

* [Sunbird](https://gooey.ai/speech/kinyarwanda-sunbird-asr-vulavula-6qtjeruc6ava/)
* [Jacaranda Health](https://gooey.ai/speech/jacaranda-health-asr-google-translate-swahili-en-nj75qn3smx1q/)
* [Akera](https://gooey.ai/speech/kikuyu-asr-via-akerawhisper-kik-full_v2-fine-tuned-us5dwt521r2l/)
* [Mbaza](https://gooey.ai/speech/mbaza-asr-google-translate-swahili-en-x06smbljck5e/)
* [Vulavula](https://gooey.ai/speech/kinyarwanda-sunbird-asr-vulavula-6qtjeruc6ava/)

Not only are we hosting the best African language models, we've also set up detailed evaluations to answer critical questions for developers building voice services in these languages.

## **Our Evaluation Approach** <a href="#id-6e5luqup9xm1" id="id-6e5luqup9xm1"></a>

Our evaluations use real-world questions from agriculture, health, and general conversation—the domains where voice AI can have the greatest impact. Each question was recorded as natural spoken audio to test end-to-end performance in realistic conditions.

#### **We evaluated three key questions:**

1. **Which AI architectures best understand common Swahili, Kikuyu, and Kinyarwanda questions?** We tested different workflow approaches including chained models with machine translation, fine-tuned ASR paired with GPT/Gemini for translation, and single-model audio-to-audio systems like GPT-realtime.
2. **Do these models understand spoken language well enough for production deployments?** We measured how accurately each architecture could process questions and generate expert-level responses.
3. **Are response times fast enough for real-world use?** We tracked latency to ensure these solutions could power voice-only services on non-smartphones, which is critical for accessibility in many communities.
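
The three architectures in question 1 differ mainly in how many stages sit between the incoming audio and the spoken reply. Here is a minimal sketch of that composition, plus the end-to-end timing behind question 3. Every stage function (`asr`, `to_english`, `to_local`, `llm`, `tts`, `realtime`) is a hypothetical stand-in that only tags its input; none of this is a Gooey.AI API or the harness we used for these evals.

```python
import time
from typing import Callable, Dict, Tuple

# A stage takes a payload (audio or text, modelled as a string) and returns one.
Stage = Callable[[str], str]


def chain(*stages: Stage) -> Stage:
    """Compose stages left to right into a single pipeline."""
    def run(payload: str) -> str:
        for stage in stages:
            payload = stage(payload)
        return payload
    return run


# Hypothetical stand-ins: each one just wraps its input in a label.
def asr(audio: str) -> str:
    return f"text({audio})"


def to_english(text: str) -> str:
    return f"en({text})"


def to_local(text: str) -> str:
    return f"local({text})"


def llm(text: str) -> str:
    return f"answer({text})"


def tts(text: str) -> str:
    return f"audio({text})"


def realtime(audio: str) -> str:
    # A single audio-to-audio model does everything in one call.
    return f"audio(answer({audio}))"


PIPELINES: Dict[str, Stage] = {
    "ASR-MT-LLM-MT-TTS": chain(asr, to_english, llm, to_local, tts),
    "ASR-LLM-TTS": chain(asr, llm, tts),
    "Single Model": chain(realtime),
}


def timed(pipeline: Stage, audio: str) -> Tuple[str, float]:
    """Run a pipeline once and return (output, elapsed seconds)."""
    start = time.perf_counter()
    output = pipeline(audio)
    return output, time.perf_counter() - start


if __name__ == "__main__":
    for name, pipeline in PIPELINES.items():
        output, seconds = timed(pipeline, "question.wav")
        print(f"{name}: {output} in {seconds:.6f}s")
```

Timing the whole pipeline rather than each stage matches what a caller on a voice-only line experiences: the wait between finishing the question and hearing the answer.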

#### **Results** <a href="#id-44fx6eqqqfhh" id="id-44fx6eqqqfhh"></a>

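Our evals show that fine-tuned ASR combined with GPT-5 / Gemini 2.5 (the ASR-LLM-TTS architecture) delivered higher accuracy and lower latency than the other two architectures across all three languages. The table below shows the Swahili results.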

|                                                                                                                                           | ASR-MT-LLM-MT-TTS                                                                                                                            | ASR-LLM-TTS                                                                                                           | Single Model                                                                                                                                                                                                                                  |
| ----------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| <p><a href="https://gooey.ai/bulk/top4-swahili-output-text-eval-8th-sept-4qk762cbmepp/">Quality</a></p><p>Swahili Audio2English Answer</p> | <p>94%</p><p><a href="https://gooey.ai/copilot/swahili-jacaranda-gpt-5-google-mt-a2t-8ps1p54ep74x/">Jacaranda + GPT-5 + Google Translate</a></p> | <p>100%</p><p><a href="https://gooey.ai/copilot/swahili-jacaranda-gpt-5-a2t-pkj3bnha20zu/">Jacaranda + GPT-5</a></p> | <p>49%</p><p><a href="https://gooey.ai/copilot/swahili-jacaranda-gpt-5-google-mt-a2t-8ps1p54ep74x/">GPT-4o realtime</a> beats <a href="https://gooey.ai/copilot/swahili-jacaranda-gpt-5-google-mt-a2t-8ps1p54ep74x/">GPT-realtime</a> 31%</p> |
| <p><a href="https://gooey.ai/bulk/swahili-latency-k5dxwa79044l/">Latency</a></p><p>Swahili Audio2SwahiliAudio</p><p>Mean in seconds</p>   | 6.3                                                                                                                                          | 5.99                                                                                                                  | 6.48                                                                                                                                                                                                                                          |
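
If you want to build a similar table for your own runs, both rows reduce to simple aggregates over per-question records. The sketch below assumes each record holds a pass/fail judgement and an end-to-end latency in seconds. The `EvalRecord` shape and the pass/fail scoring are our illustrative assumptions, not a description of how the linked evals compute their scores, and the sample values are made up.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, Iterable


@dataclass
class EvalRecord:
    """One evaluated question (hypothetical record shape)."""
    correct: bool      # was the answer judged acceptable?
    latency_s: float   # end-to-end response time in seconds


def summarize(records: Iterable[EvalRecord]) -> Dict[str, float]:
    """Return quality as a percentage and mean latency in seconds."""
    rows = list(records)
    if not rows:
        raise ValueError("need at least one record")
    quality_pct = 100 * sum(r.correct for r in rows) / len(rows)
    return {
        "quality_pct": round(quality_pct, 1),
        "mean_latency_s": round(mean(r.latency_s for r in rows), 2),
    }


if __name__ == "__main__":
    # Made-up sample data, only to show the shape of the output.
    sample = [
        EvalRecord(True, 5.0),
        EvalRecord(True, 6.0),
        EvalRecord(False, 7.0),
        EvalRecord(True, 6.0),
    ]
    print(summarize(sample))
```

A mean hides slow outliers, so for a production voice service it may be worth tracking a high percentile alongside it.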

### **Explore the detailed evaluations:**

**KINYARWANDA:**

{% embed url="<https://gooey.ai/bulk/realtime-kinyarwanda-audio2text-prompt-compare-30qs-m8nyfl06njsf/>" %}

**SWAHILI:**

{% embed url="<https://gooey.ai/bulk/top4-swahili-audio2text-comparison-11-sept-30qs-7sqzq9jf0wxn/>" %}

**KIKUYU:**

{% embed url="<https://gooey.ai/bulk/kikuyu-audio2text-comparison-25qs-u5vpyu8ip1r1/>" %}

## Other Updated Language Models <a href="#aeofg1vyaqz2" id="aeofg1vyaqz2"></a>

To make AI more accessible to the impact sector, we are happy to share that we are also hosting [Sealion v4](https://gooey.ai/copilot/english-sea-lion-v4-bot-lysd8rxa6vl7/) and [Apertus](https://gooey.ai/copilot/apertus-copilot-b8a03utn306w/)!

## Video Workflows <a href="#id-6pophgn5omg1" id="id-6pophgn5omg1"></a>

#### Animate Underrepresented Datasets <a href="#id-1lc5smr1qhj7" id="id-1lc5smr1qhj7"></a>

As part of our [**Beyond Bias**](https://gooey.ai/beyondbias) initiative, we're expanding video capabilities to help creators and artists bring visibility to underrepresented communities and datasets. With [Gooey.AI](http://gooey.ai/) you can now:

1. Bring your own image dataset
2. Train a custom Flux LoRA image model
3. Use the LoRA model to create images
4. Animate these images

***ZOOM IN to see the Beyond Bias Workflow!***

{% embed url="<https://www.figma.com/board/sUPT00UJNn2LvF731rxkoO/Untitled?node-id=0-1&t=E5JMiokgVJOuvEOQ-1>" %}

These features emerged from our Beyond Bias workshops, where we identified the need for AI tools that don't just work for everyone, but actively help amplify voices and stories that have been marginalized in AI training data.

**Learn more about Beyond Bias workflows** in the Upcoming section below.

#### Video Models on Gooey.AI <a href="#id-57wyvoimv3a1" id="id-57wyvoimv3a1"></a>

We are thrilled to release our text-to-video and image-to-video models! Start making high-quality videos with:

* Veo3
* Wan 2.5
* Kling and more!

{% @arcade/embed flowId="6XTIJ7DCx0ycv4lHVVxV" url="<https://app.arcade.software/share/6XTIJ7DCx0ycv4lHVVxV>" %}

{% hint style="info" %}
PRO TIP: It can also generate audio!
{% endhint %}

## Upcoming <a href="#jd5whn38njyi" id="jd5whn38njyi"></a>

Finally, we are excited to announce our upcoming Beyond Bias Prompt-a-thon in Delhi, Pune and Bangalore.

{% embed url="<https://cdn.prod.website-files.com/6864c89c210135e40bbd6674%2F68d162922146809732404843_Beyond%20Bias%20Video-transcode.mp4>" %}

> Beyond Bias, a Gooey.AI and Goethe-Institut India partnership, reimagines generative AI through participatory practices, creating inclusive datasets and tools that honor diversity and drive innovation.

Learn more about Beyond Bias:

{% embed url="<https://gooey.ai/beyondbias>" %}

