African Language Speech Datasets
Commercially licensed Twi, Wolof, and Fon speech data, validated and shipping today, with volume scaling up now. Every African audio dataset is speaker-verified, AI quality-filtered at capture, and delivered under a commercial license with an SLA.
Twi, Wolof & Fon speech datasets shipping today
Twi (Akan) Speech Dataset
Ghana · ISO 639: ak
Natural, code-switched Akan Twi speech for Ghanaian voice and ASR systems.
For: Ghanaian ASR training data & local-language IVR
- Modality: Conversational & image-prompted speech
- Format: WAV · 16 kHz + timestamped transcripts (JSON)
- Licensing: Commercial license + SLA
- Quality: >80% inter-annotator agreement
- Sourcing: Fair-Trade, native speakers
- Delivery: API or S3
Part of our 128-hour speaker-verified Twi, Wolof & Fon corpus, fully annotated and growing.
Wolof Speech Dataset
Senegal · ISO 639: wo
Everyday Wolof speech, including French code-switching, for CPaaS platforms serving Senegal.
For: Wolof IVR Senegal & West African ASR training data
- Modality: Conversational & image-prompted speech
- Format: WAV · 16 kHz + timestamped transcripts (JSON)
- Licensing: Commercial license + SLA
- Quality: >80% inter-annotator agreement
- Sourcing: Fair-Trade, native speakers
- Delivery: API or S3
Part of our 128-hour speaker-verified Twi, Wolof & Fon corpus, fully annotated and growing.
Fon Speech Dataset
Benin · ISO 639: fon
Natural, tonal Fon speech for voice and ASR systems serving Benin and the wider Gbe-speaking belt.
For: Beninese ASR training data & local-language IVR
- Modality: Conversational & image-prompted speech
- Format: WAV · 16 kHz + timestamped transcripts (JSON)
- Licensing: Commercial license + SLA
- Quality: >80% inter-annotator agreement
- Sourcing: Fair-Trade, native speakers
- Delivery: API or S3
Part of our 128-hour speaker-verified Twi, Wolof & Fon corpus, fully annotated and growing.
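For teams planning ingestion, the format line above (WAV at 16 kHz plus timestamped JSON transcripts) can be checked on arrival with a few lines of standard-library Python. This is a minimal sketch, not a description of the actual delivery layout: the transcript schema (a top-level `segments` list with `start`, `end`, and `text` fields, times in seconds) and the function names `wav_info` and `check_clip` are assumptions made for illustration.

```python
import json
import wave


def wav_info(path):
    """Return (sample_rate_hz, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        duration = wav.getnframes() / rate
    return rate, duration


def check_clip(wav_path, transcript_path, expected_rate=16000):
    """Return a list of problems found; an empty list means the clip passed."""
    problems = []
    rate, duration = wav_info(wav_path)
    if rate != expected_rate:
        problems.append(f"sample rate is {rate} Hz, expected {expected_rate} Hz")

    # Assumed (hypothetical) schema:
    # {"segments": [{"start": 0.0, "end": 1.2, "text": "..."}]}
    with open(transcript_path, encoding="utf-8") as f:
        segments = json.load(f)["segments"]

    previous_end = 0.0
    for i, seg in enumerate(segments):
        if not seg["text"].strip():
            problems.append(f"segment {i} has empty text")
        # Segments should be ordered, non-overlapping, and inside the audio.
        if not (previous_end <= seg["start"] < seg["end"] <= duration):
            problems.append(f"segment {i} has out-of-order or out-of-range timestamps")
        previous_end = seg["end"]
    return problems
```

Calling `check_clip("clip.wav", "clip.json")` on each delivered pair and logging any non-empty result gives a quick acceptance check before training data enters a pipeline; the field names should be adjusted to whatever schema the delivered transcripts actually use.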
More African Languages Coming
Need a language that isn't in our catalogue yet? We scope and staff a custom collection mission with our native-speaker network, using the same image-prompted elicitation, AI quality filtering, and >80% agreement standard as our shipping datasets.
- Swahili · Tanzania
- Hausa · Nigeria
- Amharic · Ethiopia
- Yoruba · Nigeria
- Zulu · South Africa
- Fulfulde · Senegal
- Ewe · Togo
- Shona · Zimbabwe
- Lingala · DR Congo
- Igbo · Nigeria
- Oromo · Ethiopia
- Tigrinya · Ethiopia
- Somali · Somalia
- Xhosa · South Africa
- Luganda · Uganda
- Kikuyu · Kenya
- Krio · Sierra Leone
- Bambara · Mali
- Kinyarwanda · Rwanda
Typical lead time: 4–8 weeks from kickoff to first delivery.
Don't see the dataset you need?
Email us at contact@afriklang.com