Nigerian Startup, Intron, Expands AI Speech Recognition Platform to 57 Languages

By Quadri Adejumo
Senior Journalist and Analyst

Nigerian artificial intelligence startup, Intron, has expanded its speech recognition platform, Sahara, to support 57 languages, adding 24 new languages as the company accelerates efforts to build voice technology tailored to African speech patterns.

The latest upgrade, Sahara v2, counts 23 African languages among the 57 and recognises more than 500 African accents, making it one of the most extensive speech recognition systems built specifically for the continent.

Among the newly added languages are Hausa, Swahili, isiZulu, Yoruba, Kinyarwanda, Twi, Igbo, isiXhosa, African French, Amharic, Bemba, Luganda, Oromo, Pidgin, Shona, and Wolof.

According to the company, language selection was largely driven by enterprise demand and commercial use cases across sectors such as healthcare, legal services, financial services, and telecommunications.

What you need to know

The expansion reflects a broader shift within Africa’s technology ecosystem: rather than adapting Western-built speech models, startups are increasingly developing AI infrastructure designed for the linguistic realities of the continent.

Africa is home to roughly 2,000 languages, the vast majority of which exist primarily as spoken languages with limited written datasets. As a result, voice-based technology is emerging as a crucial gateway to digital services.

Globally, the speech and voice recognition market is projected to reach $81.59 billion by 2032, according to industry forecasts. For African developers, however, the challenge has been building models capable of understanding local accents, dialects, and linguistic mixing, which global systems often struggle with.

Intron says Sahara was designed to address that gap by relying heavily on locally sourced voice data to capture the contextual nuances of African speech.

Benchmarks claim stronger performance on African speech

According to internal benchmarking conducted using African voice datasets curated by the company, Sahara v2 significantly outperformed several global speech AI systems.

The company reports that Sahara v2 delivered up to 64% better accuracy when transcribing African names, organisations, and locations, compared with models such as Gemini, GPT-4, Whisper, ElevenLabs, and Azure Speech.

Other reported improvements include:

  • 35% stronger performance with numerical data
  • 20% greater robustness in noisy or multi-speaker environments
  • 25% higher cross-domain accuracy across industries such as healthcare, finance, legal services, and telecommunications

“We curated datasets of African voices, combining publicly available datasets with our in-house collection, and made them available so anyone can test global models on African speech,” Tobi Olatunji, founder and CEO of Intron, told TechCabal.
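
The report does not say which metric sits behind these percentages. Speech recognition accuracy is commonly measured as word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn a system's transcript into the reference, divided by the number of reference words. As a minimal sketch, assuming WER is the metric and using invented transcripts (the sentence, both system outputs, and the function names below are illustrative, not Intron's data or code), a comparison between two systems could be computed like this:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_error_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline's errors that the new system removes."""
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical example: one reference sentence and two made-up transcripts.
reference = "dr adebayo will see ngozi in abeokuta tomorrow"
baseline_output = "dr adebayor will see ngosi in abeokuta tomorrow"  # 2 name errors
improved_output = "dr adebayo will see ngosi in abeokuta tomorrow"   # 1 name error

baseline = word_error_rate(reference, baseline_output)
improved = word_error_rate(reference, improved_output)
print(f"baseline WER: {baseline:.3f}")
print(f"improved WER: {improved:.3f}")
print(f"relative error reduction: {relative_error_reduction(baseline, improved):.0%}")
```

Under that assumption, a claim such as "64% better" would correspond to a relative error reduction of 0.64 on a test set rich in names and places, though the company may define its metric differently.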

From medical transcription to continent-wide voice infrastructure

Founded in 2020 by Tobi Olatunji and Olakunle Asekun, Intron initially focused on clinical documentation tools, using speech recognition to help medical professionals capture patient notes more efficiently.

Since then, the company has expanded its platform into a broader voice infrastructure layer, offering speech-to-text, text-to-speech, and voice authentication systems used by enterprises and government agencies.

Intron says its platform now sees consistent usage in at least six African countries, including Nigeria, Kenya, South Africa, Ghana, Rwanda, and Uganda.

Enterprise clients include the Ogun State Judiciary, which uses the platform for transcription, and Audere, a South African health technology company that relies on Sahara to transcribe WhatsApp voice messages across multiple local accents.

“We are a for-profit company, so we prioritised languages where there is enterprise demand and willingness to pay,” Olatunji said. “There is still a massive gap across the continent, but we have to start where there is both population coverage and clear use cases.”

Training the models with African voices

The latest version of Sahara was trained using more than 14 million audio clips, representing over 50,000 hours of speech from more than 40,000 speakers across 30 African countries.
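
Those totals imply some rough averages. As a back-of-envelope sketch, treating each "more than" figure as if it were exact (an assumption, so the results are only approximate), the arithmetic runs as follows:

```python
# Figures as reported by Intron. Each is stated as "more than",
# so the averages below are approximations, not exact values.
clips = 14_000_000
hours = 50_000
speakers = 40_000

avg_clip_seconds = hours * 3600 / clips   # mean clip length in seconds
hours_per_speaker = hours / speakers      # mean hours of speech per speaker
clips_per_speaker = clips / speakers      # mean number of clips per speaker

print(f"average clip length: about {avg_clip_seconds:.1f} seconds")
print(f"average speech per speaker: about {hours_per_speaker:.2f} hours")
print(f"average clips per speaker: about {clips_per_speaker:.0f}")
```

On these assumptions the corpus works out to clips of roughly 13 seconds each and about an hour and a quarter of speech per speaker. The report does not give the actual distribution across speakers or countries.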

Much of the early data collection was labour-intensive, particularly in specialised fields such as medical speech.

While newer datasets supported by organisations including the Gates Foundation, Lacuna Fund, and Google have contributed to the broader ecosystem, Intron says most of Sahara’s training data remains internally generated.

One of Sahara v2’s most notable upgrades is what Intron describes as the world’s first bilingual Swahili–English automatic speech recognition model built specifically to handle code-switching.

The model was developed in partnership with Penda Health, a Kenyan outpatient healthcare provider, where clinicians frequently switch between languages during consultations.

Code-switching is common across Africa, particularly in clinical, customer service, and public-sector environments, making bilingual models increasingly important.

Intron says additional code-switching models for Yoruba, Hausa, isiZulu, and Kinyarwanda are currently in development.

Expanding into text-to-speech and voice bots

Alongside the speech recognition improvements, the company has launched its first local-language text-to-speech model, in Hausa.

The model is designed to power multilingual voice assistants and call-centre bots, enabling businesses to interact with customers in their native languages.

Potential applications include health advisory services, banking support systems, and automated call-centre agents, particularly in markets where voice communication remains the primary channel.

Recognising that many African organisations operate in low-connectivity environments, Intron has introduced fully offline enterprise deployment options for Sahara.

Through a partnership with Nvidia, the models can now run on Nvidia Jetson edge devices, enabling local processing without relying on cloud connectivity.

According to the company, the entry-level device costs roughly $250 and can support multiple users connected within a local network.

Although full offline processing on smartphones is not currently supported, the platform can operate in caching mode, allowing it to function in areas with intermittent internet access.

Intron says it also complies with local data protection regulations, giving enterprise clients the flexibility to store data either locally or in the cloud.

Talking Points

Intron’s expansion of Sahara to 57 languages is notable for covering a significant number of African languages and accents that have historically been underserved by global AI systems.

By focusing on African speech patterns, dialects, and contextual nuances, the company is addressing a critical gap in the global voice technology ecosystem, where most existing models struggle to accurately interpret African names, locations, and accents.

At Techparley, we see how innovations like Sahara v2 could play a transformative role in sectors such as healthcare, legal services, financial services, and telecommunications, where accurate speech transcription and voice-based interactions can significantly improve efficiency and accessibility.

The introduction of a bilingual Swahili–English model designed to handle code-switching is particularly important. In many African countries, people naturally switch between languages in everyday conversations, especially in clinical and customer service environments. Tools that can understand this linguistic reality have the potential to greatly enhance digital service delivery.

As Intron continues to scale Sahara, there is an opportunity to further deepen collaborations with healthcare providers, financial institutions, telecom companies, and public sector organisations that rely heavily on voice communication.

If successfully executed, platforms like Sahara could help lay the foundation for a new generation of voice-driven digital services built specifically for Africa’s linguistic and technological landscape.

Quadri Adejumo is a senior journalist and analyst at Techparley, where he leads coverage on innovation, startups, artificial intelligence, digital transformation, and policy developments shaping Africa’s tech ecosystem and beyond. With years of experience in investigative reporting, feature writing, critical insights, and editorial leadership, Quadri breaks down complex issues into clear, compelling narratives that resonate with diverse audiences, making him a trusted voice in the industry.