The world continues to celebrate the proliferation of artificial intelligence (AI) as a revolutionary force in contemporary life. Indeed, it would be unwise to argue otherwise without factual data to back up the case, especially given the numerous advantages attributed to this technology in a variety of academic and popular publications. But as AI systems continue to evolve, a critical question emerges, particularly concerning generative AI and Large Language Models (LLMs): why do these technologies, mostly created in the countries of the Global North, undermine the legitimacy of local African languages such as Swahili and Yoruba, even when they target African consumers? This is despite these languages being among the most widely spoken on the African continent. According to Harvard statistics, the most widely spoken local languages in Africa are Swahili (200 million), Yoruba (45 million), Igbo (30 million), and Fula (35 million), all of which belong to the Niger-Congo family[1]. Further, statistics from the United Nations Educational, Scientific and Cultural Organization (UNESCO) show that Kiswahili is not only a language of the African family but also ranks among the 10 most widely spoken languages in the world, with more than 200 million speakers. Despite these statistics, Swahili and other dominant local languages are not listed among the UN official languages. One wonders: what qualifies a language to be a global official language? The colonial legacies behind this exclusion are equally visible in the way LLMs are conceived and presented in and for Africa. As an African who is enthusiastic about AI, and who speaks Swahili more than any of the foreign languages categorised as official by the UN, such as French, English and Arabic, can I instruct Siri or Alexa, for instance, in Kiswahili? Well, your guess is as good as mine: a resounding NO!
Ngugi wa Thiong’o, a renowned African author, asserts in his book “Decolonising the Mind: The Politics of Language in African Literature” that there is more to language than just communication. According to Thiong’o, language can be used as a weapon to control people’s perceptions of reality and, where necessary, to subjugate them. Thus, the elevation of foreign languages in current LLMs, and their transposition into a context with a rich multilingual history, is done by design.
[1] See https://alp.fas.harvard.edu/introduction-african-languages#:~:text=The%20most%20widely%20spoken%20languages,Congo%20language%20family%20on%20Ethnologue.
AI developers promoting dominant systems have deliberately overlooked African languages in favour of European and Asian languages. While working as a copywriter and content creator for a company in the Global North, I was tasked with rating the responses of a specific AI system against multiple rubrics, including verbosity, truthfulness, style, and overall score, and with providing justifications for my ratings. None of the models I developed and moderated was available in local languages. This exclusion permeates a broad spectrum of technological implementations and applications beyond language processing, and stems from the historical legacies of racial exclusion highlighted by Thiong’o. This subordination reduces African nations to a market for finished technology from the West. This is visible in the way Africa has remained an export destination for companies selling a wide range of products with AI-enabled systems, such as smartphones, automobiles, and various other digital products.
These products are used across multiple sectors on the continent, including healthcare, education, and agriculture. Some of these embedded AI-enabled systems are designed specifically to enhance operational efficiency. In the automobile sector, for instance, the African Union estimates the value of Africa’s automotive imports at US$48 billion. This includes cars imported from China, Japan, South Korea, Europe, and America, most of which have interfaces featuring voice commands and navigation systems in Chinese, Japanese, or English languages and accents, with no options for Africa’s dominant languages such as Swahili, Yoruba, or Igbo. With such a large automotive market, this practice is problematic in two respects: firstly, it calls into question whether tailoring products to the target audience is really treated as a strategic business approach; secondly, it raises concerns about the imposition of foreign cultures and norms on Africa, which could be interpreted as a form of modern colonialism. Regardless of how it is perceived, this practice is wrong and should be addressed as such.
Several other AI systems in use across the continent, in sectors including education, healthcare, and agriculture, have also neglected African languages. Among the many digital tools and platforms that are designed for Africans yet have deliberately overlooked African languages, a few notable ones include:
While it may look benign to some, the exclusion of African languages from the development of AI systems for African consumers goes beyond a technological oversight; it is a socio-cultural and economic issue. Language is one of the fundamental pillars of cultural identity and economic stability. Large parts of African economies run informally and thrive within the rich African languages and cultures. Hence, by failing to acknowledge the centrality of African languages in AI development, developers inadvertently perpetuate digital colonialism, as Western languages and norms are imposed on African people in the technology space, and facilitate economic polarization, as a large part of the population is excluded from the economic benefits of these systems.
In the wake of AI proliferation in Africa, policymakers and developers must prioritize linguistic and cultural diversity to build inclusivity into such digital systems. Considerations here may include collaborating with local linguistic experts to incorporate local African languages in AI development; promoting open-source datasets; engaging with local communities to understand their needs; and advocating for the inclusion of African languages in AI systems, particularly through policy frameworks.
Embracing these measures will encourage linguistic and cultural diversity in the design and development of AI systems for the African continent. Otherwise, the ongoing disregard of African languages in technology development may perpetuate digital isolation, economic polarization, and digital colonialism.
Evans Agembo: A Research Fellow at the Centre for Epistemic Justice Foundation, focusing on the nexus between Artificial Intelligence on the one hand and Food Systems, Climate Change, and Environmental Sustainability on the other. Evans holds an MSc in Food Biotechnology (hons) from ITMO University, Russia, and a BSc in Environmental Health from Kenyatta University, Kenya.
Dr. Angella Ndaka: The CEO of the Centre for Epistemic Justice Foundation and a Gender & Digitisation Expert | Sustainable AI Futures | Women in AI 2023 Awards Winner | Top 100 Women in AI Ethics 2023 | Critical AI Scholar | Author | 2020 International Alumna Award Winner – ANU | Thought Leader | Speaker.