# Building AI for Underrepresented Languages and Creators

## **Language Evals** <a href="#iwnoib5nebga" id="iwnoib5nebga"></a>

In our mission to catalyze AI for global impact, we've been working extensively to assess how well the latest crop of state-of-the-art AI models understand low-resource languages, including Kikuyu, Swahili and Kinyarwanda.

We are now hosting:

* [Sunbird](https://gooey.ai/speech/kinyarwanda-sunbird-asr-vulavula-6qtjeruc6ava/)
* [Jacaranda Health](https://gooey.ai/speech/jacaranda-health-asr-google-translate-swahili-en-nj75qn3smx1q/)
* [Akera](https://gooey.ai/speech/kikuyu-asr-via-akerawhisper-kik-full_v2-fine-tuned-us5dwt521r2l/)
* [Mbaza](https://gooey.ai/speech/mbaza-asr-google-translate-swahili-en-x06smbljck5e/)
* [Vulavula](https://gooey.ai/speech/kinyarwanda-sunbird-asr-vulavula-6qtjeruc6ava/)

Not only are we hosting the best African language models, we've also set up detailed evaluations to answer critical questions for developers building voice services in these languages.

## **Our Evaluation Approach** <a href="#id-6e5luqup9xm1" id="id-6e5luqup9xm1"></a>

Our evaluations use real-world questions from agriculture, health, and general conversation—the domains where voice AI can have the greatest impact. Each question was recorded as natural spoken audio to test end-to-end performance in realistic conditions.

#### **We evaluated three key questions:**

1. **Which AI architectures best understand common Swahili, Kikuyu, and Kinyarwanda questions?** We tested different workflow approaches including chained models with machine translation, fine-tuned ASR paired with GPT/Gemini for translation, and single-model audio-to-audio systems like GPT-realtime.
2. **Do these models understand spoken language well enough for production deployments?** We measured how accurately each architecture could process questions and generate expert-level responses.
3. **Are response times fast enough for real-world use?** We tracked latency to ensure these solutions could power voice-only services on non-smartphones, which is critical for accessibility in many communities.
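
The three architectures above can be pictured as chains of stages of different lengths. The sketch below is illustrative only: the stage names, `make_stage`, `run_pipeline`, and `mean_latency` are hypothetical placeholders, not Gooey.AI APIs, and each stage simply tags its input where a real system would call a model. It shows how the same set of recorded questions could be run through each chain and timed end to end.

```python
import time
from statistics import mean
from typing import Callable, Dict, List, Tuple

Stage = Callable[[str], str]


def make_stage(name: str) -> Stage:
    """Return a placeholder stage that wraps its input in the stage name.

    Hypothetical stand-in: a real stage would call an ASR, MT, LLM, or TTS model.
    """
    def stage(payload: str) -> str:
        return f"{name}({payload})"
    return stage


# Assumed stage breakdown for each architecture named in the evaluation.
ARCHITECTURES: Dict[str, List[Stage]] = {
    "ASR-MT-LLM-MT-TTS": [
        make_stage(n) for n in ("asr", "mt_to_en", "llm", "mt_from_en", "tts")
    ],
    "ASR-LLM-TTS": [make_stage(n) for n in ("asr", "llm", "tts")],
    "Single Model": [make_stage("audio_to_audio")],
}


def run_pipeline(stages: List[Stage], audio: str) -> str:
    """Feed the output of each stage into the next one."""
    out = audio
    for stage in stages:
        out = stage(out)
    return out


def timed_run(stages: List[Stage], audio: str) -> Tuple[str, float]:
    """Run one question through a pipeline and return (result, seconds elapsed)."""
    start = time.perf_counter()
    result = run_pipeline(stages, audio)
    return result, time.perf_counter() - start


def mean_latency(stages: List[Stage], questions: List[str]) -> float:
    """Mean end-to-end latency in seconds over a list of recorded questions."""
    return mean(timed_run(stages, q)[1] for q in questions)


if __name__ == "__main__":
    questions = ["question_01.wav", "question_02.wav"]  # hypothetical file names
    for name, stages in ARCHITECTURES.items():
        print(name, run_pipeline(stages, questions[0]))
        print(f"  mean latency: {mean_latency(stages, questions):.6f}s")
```

With real model calls in place of the placeholders, fewer stages generally means fewer network round trips, which is one reason to measure latency per architecture rather than assume it.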

#### **Results** <a href="#id-44fx6eqqqfhh" id="id-44fx6eqqqfhh"></a>

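Our evals show that fine-tuned ASR combined with GPT-5 or Gemini 2.5 (the ASR-LLM-TTS architecture) delivered higher accuracy and lower latency than the other architectures across all three languages. The table below shows the Swahili results.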

|                                                                                                                                           | ASR-MT-LLM-MT-TTS                                                                                                                            | ASR-LLM-TTS                                                                                                           | Single Model                                                                                                                                                                                                                                  |
| ----------------------------------------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| <p><a href="https://gooey.ai/bulk/top4-swahili-output-text-eval-8th-sept-4qk762cbmepp/">Quality</a></p><p>Swahili Audio2English Answer</p> | <p>94% </p><p><a href="https://gooey.ai/copilot/swahili-jacaranda-gpt-5-google-mt-a2t-8ps1p54ep74x/">Jacaranda + GPT5 + Google Trans</a></p> | <p>100% </p><p><a href="https://gooey.ai/copilot/swahili-jacaranda-gpt-5-a2t-pkj3bnha20zu/">Jacaranda + GPT-5</a></p> | <p>49%</p><p> <a href="https://gooey.ai/copilot/swahili-jacaranda-gpt-5-google-mt-a2t-8ps1p54ep74x/">GPT4o realtime</a> beats <a href="https://gooey.ai/copilot/swahili-jacaranda-gpt-5-google-mt-a2t-8ps1p54ep74x/">GPT-realtime</a> 31%</p> |
| <p><a href="https://gooey.ai/bulk/swahili-latency-k5dxwa79044l/">Latency</a></p><p>Swahili Audio2SwahiliAudio</p><p>Mean in seconds</p>   | 6.3                                                                                                                                          | 5.99                                                                                                                  | 6.48                                                                                                                                                                                                                                          |
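
As a quick sanity check on the Swahili table, the figures can be compared programmatically. This is a minimal sketch: the numbers are copied from the table above, while the dictionary layout and variable names are our own illustrative choices.

```python
# Swahili results copied from the table above (quality in %, mean latency in seconds).
results = {
    "ASR-MT-LLM-MT-TTS": {"quality_pct": 94, "latency_s": 6.3},
    "ASR-LLM-TTS": {"quality_pct": 100, "latency_s": 5.99},
    "Single Model": {"quality_pct": 49, "latency_s": 6.48},
}

# Architecture with the highest quality score.
best_quality = max(results, key=lambda name: results[name]["quality_pct"])

# Architecture with the lowest mean latency.
fastest = min(results, key=lambda name: results[name]["latency_s"])

print(f"Highest quality: {best_quality}")  # Highest quality: ASR-LLM-TTS
print(f"Lowest latency:  {fastest}")       # Lowest latency:  ASR-LLM-TTS
```

In this table the same architecture leads on both measures, so there is no quality-versus-speed trade-off to weigh for Swahili.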

### **Explore the detailed evaluations:**

**KINYARWANDA:**

{% embed url="<https://gooey.ai/bulk/realtime-kinyarwanda-audio2text-prompt-compare-30qs-m8nyfl06njsf/>" %}

**SWAHILI:**

{% embed url="<https://gooey.ai/bulk/top4-swahili-audio2text-comparison-11-sept-30qs-7sqzq9jf0wxn/>" %}

**KIKUYU:**

{% embed url="<https://gooey.ai/bulk/kikuyu-audio2text-comparison-25qs-u5vpyu8ip1r1/>" %}

## Other Updated Language Models <a href="#aeofg1vyaqz2" id="aeofg1vyaqz2"></a>

To make AI more accessible across the impact sector, we're happy to share that we now host [Sealion v4](https://gooey.ai/copilot/english-sea-lion-v4-bot-lysd8rxa6vl7/) and [Apertus](https://gooey.ai/copilot/apertus-copilot-b8a03utn306w/)!

## Video Workflows <a href="#id-6pophgn5omg1" id="id-6pophgn5omg1"></a>

#### Animate Under-represented Datasets <a href="#id-1lc5smr1qhj7" id="id-1lc5smr1qhj7"></a>

As part of our [**Beyond Bias**](https://gooey.ai/beyondbias) initiative, we're expanding video capabilities to help creators and artists bring visibility to underrepresented communities and datasets. With [Gooey.AI](http://gooey.ai/) you can now:

1. Bring your own image dataset
2. Train a custom Flux LoRA image model
3. Use the LoRA model to create images
4. And finally, animate these images

***ZOOM IN to see the Beyond Bias Workflow!***

{% embed url="<https://www.figma.com/board/sUPT00UJNn2LvF731rxkoO/Untitled?node-id=0-1&t=E5JMiokgVJOuvEOQ-1>" %}

These features emerged from our Beyond Bias workshops, where we identified the need for AI tools that don't just work for everyone, but actively help amplify voices and stories that have been marginalized in AI training data.

**Learn more about Beyond Bias workflows** in the Upcoming section below.

#### Video Models on Gooey.AI <a href="#id-57wyvoimv3a1" id="id-57wyvoimv3a1"></a>

We are thrilled to release our text-to-video and image-to-video models! Start making high-quality videos with:

* Veo3
* Wan 2.5
* Kling and more!

{% @arcade/embed flowId="6XTIJ7DCx0ycv4lHVVxV" url="<https://app.arcade.software/share/6XTIJ7DCx0ycv4lHVVxV>" %}

{% hint style="info" %}
PRO TIP: It can also generate audio!
{% endhint %}

## Upcoming <a href="#jd5whn38njyi" id="jd5whn38njyi"></a>

Finally, we are excited to announce our upcoming Beyond Bias Prompt-a-thon in Delhi, Pune and Bangalore.

{% embed url="<https://cdn.prod.website-files.com/6864c89c210135e40bbd6674%2F68d162922146809732404843_Beyond%20Bias%20Video-transcode.mp4>" %}

> Beyond Bias, a Gooey.AI and Goethe-Institut India partnership, reimagines generative AI through participatory practices, creating inclusive datasets and tools that honor diversity and drive innovation.

Learn more about Beyond Bias:

{% embed url="<https://gooey.ai/beyondbias>" %}
