Building an Efficient Database to Improve Data Use in Côte d’Ivoire

August 6, 2020 DCDJ, Health
Lindsey Fincham

This post originally appeared as “Building an Efficient HIV/AIDS Database for Improved Data Use in Côte d’Ivoire” on the DCLI website.

Ali Diakité, a Data Fellow placed with the National AIDS Control Program (Plan National de la Lutte Contre le SIDA/VIH, PNLS), built a tool that noticeably improved the agency’s process efficiency and data quality. PNLS, the government agency that coordinates the national response to the HIV/AIDS crisis in Côte d’Ivoire, now uses the tool to improve the quality and timeliness of its data and, in turn, its epidemic response.


Des Chiffres et Des Jeunes (DCDJ) aims to bolster the subnational supply and usage of data for citizens of Côte d’Ivoire, engage youth as champions of these services, and fuel innovation to address rising data and information needs. Through the DCDJ Fellowship Program, Ivorians between 18 and 34 years old spend two months in intensive data science and analytics training. Following the training, Fellows are placed in internships to help their host organizations make data-informed decisions. DCDJ is a project of the Data Collaboratives for Local Impact (DCLI) program, a partnership between PEPFAR and the Millennium Challenge Corporation (MCC) that aims to increase expertise and resource availability in Côte d’Ivoire.

The National AIDS Control Program (Plan National de la Lutte Contre le SIDA/VIH, PNLS) is the government agency that coordinates the national response to the HIV/AIDS crisis in Côte d’Ivoire. PNLS is responsible for collecting and reporting logistical and clinical data from 33 regions and 113 districts, as well as creating and executing the National Strategic Plan (Plan National du VIH/SIDA, PNS) to fight HIV/AIDS, which is the national roadmap for the next five years of Côte d’Ivoire’s epidemic response.

Ali Diakité is an alumnus of the first cohort of the DCDJ Fellowship program. Ali holds a degree in statistics from Ecole Nationale Supérieure de Statistique et d’Economie Appliquée d’Abidjan (ENSEA). After the training, Ali was placed in an internship at PNLS to support its efforts in confronting the HIV/AIDS crisis in Côte d’Ivoire. A year later, Ali has implemented and sustained a new data compilation tool that speeds up PNLS’s slow data aggregation and use process.

A Lack of Timely, High-Quality Data

Despite the wealth of data reported to PNLS, department heads lacked access to timely, high-quality data for policy and program decision making. During stakeholder interviews within PNLS, Ali learned that the lack of standardized data formats made compiling the reported logistical and clinical data a time-consuming and error-prone process. Inconsistent categories and Excel table headers from different clinics and pharmacies had to be reconciled by hand into one consolidated data set. Without interoperable data formats, data quality suffered, and compiling the data took ten people over a week of work. Inconsistent data would then be sent back to the districts for corrections, and the process would start again from the beginning. Because compilation and correction were so time- and resource-intensive, some of the data could be up to a year old. Finally, even when the data was accurate, it was stored in a series of spreadsheets, which made analysis difficult without a statistician on staff at PNLS. As a result, PNLS was making decisions and setting long-term strategies based on suboptimal, sometimes outdated, data. Ali explained, “The problem was really having access to the data quickly. They weren’t even dealing with analysis or visualizations – their first challenge was getting the data. So we wanted to address the root causes of the problem.”

Developing a Solution

Ali used the Python skills and other resources gained during the Data Fellowship training to develop an efficient data compilation tool for PNLS. Ali’s program gathers data from a variety of heterogeneous datasets and compiles it into a standard format in a single homogeneous database in a matter of minutes. During the automatic compilation process, the tool also runs data quality checks. In the final stage, the data is provided to users in an easily digestible format.
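The post does not describe how the tool is implemented, so the following is only a minimal sketch of the general idea of reconciling differing headers into one standard format. The alias table, column names, and sample rows are all hypothetical and invented for illustration; they are not taken from PNLS reports.

```python
# Hypothetical alias table: maps header variants that might appear in
# different reports to a single standard column name.
HEADER_ALIASES = {
    "district": "district",
    "district sanitaire": "district",
    "patients treated": "patients_treated",
    "nb patients": "patients_treated",
    "male": "male_patients",
    "hommes": "male_patients",
}


def normalize_row(row):
    """Rename the keys of one reported row to the standard column names."""
    normalized = {}
    for header, value in row.items():
        key = header.strip().lower()
        if key not in HEADER_ALIASES:
            # Surface unknown headers instead of silently dropping data.
            raise KeyError(f"Unrecognized header: {header!r}")
        normalized[HEADER_ALIASES[key]] = value
    return normalized


def compile_reports(reports):
    """Merge rows from differently formatted reports into one list."""
    return [normalize_row(row) for report in reports for row in report]


# Invented sample rows from two reports that use different headers.
clinic_a = [{"District": "District A", "Patients treated": 30, "Male": 12}]
clinic_b = [{"district sanitaire": "District B", "Nb patients": 18, "Hommes": 7}]

national = compile_reports([clinic_a, clinic_b])
```

In this sketch, every row in `national` ends up with the same three keys regardless of which report it came from, which is what makes later checks and analysis straightforward.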

Before creating the tool, Ali researched the data ecosystem of PNLS. With a more complete understanding of the reporting and compilation process, Ali considered several solutions. A Python-based solution was the best fit because it automated and accelerated the data compilation process. The tool was created to systematically format comparable data and, at the same time, check cells for accuracy. For example, if a clinic reports treating 30 patients but also reports that 36 of those patients were male, the tool automatically alerts PNLS to the reporting error.
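The consistency check in that example can be sketched as follows. This is an illustrative assumption about how such a rule might be written, not the tool’s actual code; the function name and field names (`clinic`, `patients_treated`, `male_patients`) are hypothetical.

```python
def find_inconsistencies(rows):
    """Return one alert message per row whose male-patient count
    exceeds the total number of patients treated."""
    alerts = []
    for row in rows:
        if row["male_patients"] > row["patients_treated"]:
            alerts.append(
                f"{row['clinic']}: {row['male_patients']} male patients reported, "
                f"but only {row['patients_treated']} patients treated"
            )
    return alerts


# Invented rows: the first mirrors the 30-versus-36 example in the text.
reports = [
    {"clinic": "Clinic 1", "patients_treated": 30, "male_patients": 36},
    {"clinic": "Clinic 2", "patients_treated": 30, "male_patients": 12},
]

alerts = find_inconsistencies(reports)
```

Here only the first row is flagged, since a subgroup cannot be larger than the total it belongs to. Similar rules could be added for other fields that must add up.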

Impacts for PNLS & Ali

The Python-based program has drastically reduced the amount of time and number of staff necessary to compile data from reports. Before, ten PNLS staff members would manually check each cell of data reported by clinics and pharmacies in 33 regions and 113 districts, a process that could take more than a week. Now, the Python program completes the entire process automatically in a matter of minutes. After compilation, PNLS staff access and analyze one unified file in a single, consistent format, referred to as the National Database. Previously, even though data was compiled every quarter, the slow processing time sometimes meant analyzing data that was six or nine months old. Now that the Python program has automated the process, the data is from the previous one or two months, which makes it significantly more relevant. Ali began his evaluation and work on a solution in January 2019. By May 2019, the Python program had been introduced organization-wide, and it continues to be in use today. Ali has trained several of his colleagues at PNLS to use, maintain, and update the Python program so that it will continue to be useful in the future.

On an organizational level, more accurate and timely data gives PNLS a better understanding of the realities of HIV/AIDS in Côte d’Ivoire. Before, the organization was relying on out-of-date data and information. As a result, PNLS had miscalculated how many HIV/AIDS patients it was treating and could not understand why the number of patients being treated did not match the significant interventions taking place. With data compiled by the Python program, PNLS could see the actual numbers, correct inaccuracies, and understand the true impact.

The PNS currently under development draws heavily on data processed by Ali’s program. Ali said, “I was really honored to be involved in these discussions that international experts also attended. It was a moment of pride for me to see that such an important document for the fight against HIV/AIDS is being developed based on data I helped obtain through the solution that I built.”

The deployment of Ali’s solution noticeably improved PNLS’s process efficiency and data quality. PNLS uses the tool to make direct improvements to its data quality, timeliness, and epidemic response. In addition, PNLS extended Ali’s contract beyond the length of the Data Fellowship. Using the wealth of knowledge he acquired as a DCDJ Fellow, Ali now has the resources to take his career far beyond what he might have imagined before graduating from university.