Authors: COAST Project Radio Mining Workstream (Joyce Nakatumba-Nabende, Jonathan Mukiibi, Chodrine Mutebi, Tobius Bateesa, Sudi Murindanyi, Andrew Katumba and Hilda Mirembe)
Thursday, 05 May 2022
The COAST project aims to build end-to-end AI and data systems for targeted surveillance and management of COVID-19 and future pandemics affecting Uganda. To achieve these objectives, the activities are carried out through the collective effort of four workstreams. This post covers the first of them, the Radio Mining Workstream.
Understanding perceptions within offline communities about different facets of the pandemic is a crucial starting point. Workstream 1 of the COAST project focuses on mining existing and prospective near real-time radio broadcast data. This enables the team to collect and analyze community voices for effective surveillance and management of COVID-19.
Screenshot of our informational dashboard illustrating radio recordings with specific keywords
The COAST project team is using Artificial Intelligence techniques, in particular Automatic Speech Recognition systems and Keyword Spotter models, to mine public radio broadcasts. In Uganda, low internet penetration makes radio the preferred medium of social communication by far. Radio broadcasting facilitates two-way information flow: from top to bottom and from bottom to top.
In the top-to-bottom flow, government and local village policies are generally announced first via radio broadcasts in the languages spoken in the local communities. In the bottom-to-top flow, the concerns of citizens, particularly those in rural communities, are mostly voiced on radio talk shows where citizens are able to call in.
The work being undertaken by this workstream centers on automatically analyzing radio conversations about COVID-19 issues affecting males and females in “offline” communities, and understanding their perceptions towards COVID-19 related public policy. The results and emerging insights from this analysis are critical for informing public health planning and response by health experts and policymakers.
Throughout the first year of the project, we have focused on four main objectives:
Objective 1: To collect and transcribe radio data from different regions in Uganda.
(a) Data Collection: We collected data from radio stations that broadcast in the Central region. We also added online streaming channels for radio data from Uganda's Northern and Eastern regions into the radio streaming pipeline. To support this data collection, a dashboard was developed to monitor radio broadcast streams across these three regions.
Screenshot of the informational dashboard used to monitor radio broadcast streams across different regions in Uganda
Objective 2: To use the transcribed radio data to train and build Automatic Speech Recognition and Keyword Spotter models for three of the most common languages in Uganda: Luganda, Acholi, and Lumasaaba. The radio data collected was cleaned, preprocessed, and transcribed using the Praat annotation tool.
Objective 3: To analyze the radio data for COVID-19 mentions. We trained the Luganda Automatic Speech Recognition (ASR) model on transcribed radio data and Common Voice data. Common Voice is a publicly available platform for crowdsourcing speech from the community to create free and open representative datasets.
Objective 4: To further analyze the radio data for COVID-19 mentions with respect to vaccination and COVID-19 awareness use cases. The COVID-19 awareness use case zooms in on mentions of Standard Operating Procedures (SOPs), Nonpharmaceutical Interventions (NPIs), and reactions towards COVID-19. Meanwhile, the vaccination use case focuses on assessing the level of vaccination hesitancy, frequency of vaccination campaigns, and level of misinformation about vaccination.
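As a rough illustration of how mentions could be tallied per use case, the sketch below counts keyword hits in a text transcript. It is a simplification and not the project's pipeline: the Keyword Spotter models described above work on audio, whereas this example assumes a transcript is already available. The keyword lists, the English stand-in terms, and the function name `count_mentions` are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical keyword lists per use case. The project's real lists
# (and their Luganda, Acholi, and Lumasaaba forms) are not given in this post.
USE_CASE_KEYWORDS = {
    "vaccination": ["vaccine", "vaccination", "dose"],
    "awareness": ["mask", "sanitizer", "lockdown", "curfew"],
}


def count_mentions(transcript, keywords_by_use_case=USE_CASE_KEYWORDS):
    """Count whole-word keyword hits per use case in one transcript."""
    # Lowercase and split into word tokens so "Vaccine." matches "vaccine".
    words = re.findall(r"[a-z0-9'-]+", transcript.lower())
    word_counts = Counter(words)
    return {
        use_case: sum(word_counts[keyword] for keyword in keywords)
        for use_case, keywords in keywords_by_use_case.items()
    }


sample = "The vaccine is safe. Wear a mask and get your second dose of the vaccine."
print(count_mentions(sample))
```

Exact whole-word matching keeps the example short; a real system would also need to handle inflected forms and multi-word phrases.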
The data analysis is guided as follows:
Community engagement related to COVID-19 vaccination based on gender
Concerning the COVID-19 vaccination use case, we observed that the public is well sensitized about the importance of vaccination and the various vaccination centers available. This was inferred from the frequency of adverts, news, and feedback from the community during the radio shows. According to radio data from May 2021 to early December 2021, an average of 5 adverts about COVID-19 were aired daily. Of the radio schedule items that featured COVID-19 vaccination-related topics, 43% were news, based on information from the three most listened-to radio stations in Central Uganda, i.e., CBS FM, Akaboozi FM, and Buddu FM.
There are cases of misinformation and misconceptions about COVID-19 vaccines. Some of the views observed within the community were associated with topics such as "COVID vaccines cause infertility and barrenness in people", "COVID-19 vaccines have satanic chips", "COVID-19 vaccines are made generally to kill Africans", amongst others.
We also observed that the Ugandan Ministry of Health has been countering misinformation and misconceptions through continuous fact-driven and awareness-focused advertising to emphasize the safety of the vaccines. The figure below indicates the difference between callers who were misleading listeners and callers who were sharing facts about COVID-19 vaccination.
A comparison between the number of callers sharing misinformation and misconceptions about COVID-19 vaccination with callers who shared factual information
In the two to three days before and after a presidential address, we tend to notice a relative increase in COVID-19 keyword mentions. Using the 22 September 2021 presidential address as an example, mentions rose noticeably for a number of days before and after the address. This is shown in the figure below, where the PA marker refers to the presidential address. The focus here was also on radio broadcast mentions pertaining to COVID-19 vaccines.
Spikes observed in radio discussions before and after a presidential address (marked as PA)
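One simple way to quantify such a spike is to compare the average number of daily mentions inside a window around the address with the average on all other days. The sketch below does this under stated assumptions: the function name `window_vs_baseline`, the three-day window, and the daily counts are all made up for illustration and are not the project's data or method.

```python
from datetime import date, timedelta
from statistics import mean


def window_vs_baseline(daily_counts, event_day, window_days=3):
    """Return (mean mentions inside the window, mean mentions outside it).

    daily_counts maps a date to the number of keyword mentions on that day.
    The window covers window_days before and after event_day, inclusive.
    """
    start = event_day - timedelta(days=window_days)
    end = event_day + timedelta(days=window_days)
    inside = [n for day, n in daily_counts.items() if start <= day <= end]
    outside = [n for day, n in daily_counts.items() if not (start <= day <= end)]
    if not inside or not outside:
        raise ValueError("need at least one day inside and one outside the window")
    return mean(inside), mean(outside)


# Made-up counts: 12 mentions per day near the address, 4 per day otherwise.
address_day = date(2021, 9, 22)
counts = {
    address_day + timedelta(days=offset): (12 if abs(offset) <= 3 else 4)
    for offset in range(-10, 11)
}
inside_mean, outside_mean = window_vs_baseline(counts, address_day)
print(inside_mean, outside_mean)
```

A ratio of the two means well above 1 would be consistent with the kind of spike shown in the figure, although it does not by itself establish that the address caused the increase.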
The insights and data analysis from radio recordings reflect what is happening in the community and are powerful tools that the government can use to inform public policy and the response to the COVID-19 crisis, as well as to future pandemics that may affect Uganda. As part of our future work, we will expand our radio analysis to include other Ugandan languages and integrate the results of our analysis into a web-based visualization dashboard that can be deployed for use by health experts and policy workers.
The COAST project is being implemented with grant support from Canada’s International Development Research Centre (IDRC) and the Swedish International Development Cooperation Agency (Sida) as part of the Global South AI4COVID Program. To learn more about the COAST Project, follow us on Twitter and visit our website.