Google Opens WAXAL To Support Over 100 Million African Language Speakers

Google has launched WAXAL, a large open speech dataset designed to support the development of artificial intelligence tools for African languages.

The dataset, which took over three years to develop, aims to address the shortage of high-quality speech data that has limited the use of voice-based technologies across much of Sub-Saharan Africa.

WAXAL contains speech data for 21 African languages, including Hausa, Yoruba, Igbo, Swahili, Luganda, and Acholi. According to Google, the dataset is intended to support more than 100 million speakers whose languages are largely absent from existing speech recognition and voice synthesis systems.

The dataset includes more than 11,000 hours of speech recordings drawn from nearly two million individual audio samples. Of this total, about 1,250 hours are fully transcribed natural speech, which can be used to train automatic speech recognition systems. In addition, the dataset contains over 20 hours of studio-quality recordings suitable for text-to-speech voice generation.
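For a rough sense of scale, the reported figures can be turned into an average clip length and a transcribed share. The short Python sketch below is only a back-of-the-envelope check, not an official breakdown: it assumes "nearly two million" samples can be rounded to 2,000,000 and treats "more than 11,000 hours" as exactly 11,000, so the average clip length it prints is approximate and probably a slight underestimate.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the article.
# Assumptions (not from Google): sample count rounded to 2,000,000 and
# total duration taken as exactly 11,000 hours.
TOTAL_HOURS = 11_000
TOTAL_SAMPLES = 2_000_000
TRANSCRIBED_HOURS = 1_250


def average_clip_seconds(total_hours: float, total_samples: int) -> float:
    """Mean clip duration in seconds, given total hours and clip count."""
    return total_hours * 3600 / total_samples


def percent_share(part: float, whole: float) -> float:
    """Share of `part` in `whole`, expressed as a percentage."""
    return part / whole * 100


avg_seconds = average_clip_seconds(TOTAL_HOURS, TOTAL_SAMPLES)
transcribed_pct = percent_share(TRANSCRIBED_HOURS, TOTAL_HOURS)

print(f"Average clip length: about {avg_seconds:.1f} seconds")
print(f"Transcribed share of total hours: about {transcribed_pct:.1f}%")
```

Under these assumptions, a typical recording runs about 20 seconds, and roughly one hour in nine is fully transcribed.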

Google said WAXAL was developed in partnership with African universities and research organisations. Makerere University in Uganda and the University of Ghana led data collection for 13 languages, while Digital Umuganda in Rwanda coordinated work on five languages. Professional studio recordings were produced with support from Media Trust and Loud n Clear, while the African Institute for Mathematical Sciences (AIMS) contributed multilingual data for future releases.

Unlike many global speech datasets, WAXAL leaves ownership of the collected data with the African institutions that produced it. Google said this structure is intended to ensure that local researchers and developers can build tools independently.

“The ultimate impact of WAXAL is the empowerment of people in Africa,” said Aisha Walcott-Bryant, Head of Google Research Africa. “This dataset provides the critical foundation for students, researchers, and entrepreneurs to build technology in their own languages and reach over 100 million people.”

Speech data was collected by asking volunteers to describe images in their native languages, a method intended to capture natural patterns of everyday speech. High-quality studio recordings were produced by professional voice actors to support realistic text-to-speech applications.

At the University of Ghana, more than 7,000 volunteers contributed voice samples to the project. Isaac Wiafe, an Associate Professor at the university, said the dataset could support innovation in education, healthcare, and agriculture.

“For AI to have a real impact in Africa, it must speak our languages and understand our contexts,” said Joyce Nakatumba-Nabende, a Senior Lecturer at Makerere University. “WAXAL gives researchers access to the quality data needed to build speech technologies that reflect our communities.”

The full WAXAL dataset is released under an open license and is now publicly available on the Hugging Face platform. Google said the dataset is intended for use by researchers, developers, startups, and public institutions working on speech-enabled technologies across Africa.

