A few months ago, Development Gateway (DG) and our partner Athena Infonomics (AI) began the groundwork to improve the interoperability, analysis and – ultimately – use of agriculture and nutrition data in Cambodia and Nepal. Under the mSTAR project funded by USAID, DG and AI set out to understand the underlying structure of the data currently being collected and managed by Feed the Future implementers, and how to best support them to open up and share their data through digital tools and best practices. With the goal of accelerating data-driven agriculture development, DG and AI are supporting these partners and the USAID Missions in Cambodia (USAID/Cambodia) and Nepal (USAID/Nepal) to increase the availability of relevant data for research, analysis, and learning across stakeholders.
After assessing the challenges and opportunities for leveraging open agriculture and nutrition data in each country, we reviewed sample datasets to define the key components of a common data structure capable of improving interoperability across datasets and implementers. We also explored options for deploying a data repository – a solution for importing, storing, and retrieving digital content – to improve data sharing. Three main takeaways emerged, which we believe are relevant not just for data managers in Cambodia and Nepal but also for other contexts and sectors:
1. It is possible to achieve a common data structure – provided some common-sense principles are followed.
Our analysis of sample datasets from 13 USAID-funded research groups and implementing partners in both countries revealed similarities in the data they collect. But because datasets are structured differently and variables are defined inconsistently, stakeholders cannot currently use each other’s data – including socioeconomic and experimental data on agricultural practices and nutritional status – to inform their own programming. To help stakeholders discover and reuse relevant data, a number of interoperability standards and principles must be followed. These include adopting frameworks that make datasets easy to find based on their content and that allow any user to interpret and use that content.
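To make the idea of a common data structure concrete, here is a minimal Python sketch of mapping two implementers' records onto shared variable names and units. Everything in it is a hypothetical illustration: the partner names, the column names, the four common fields, and the tonnes-to-kilograms conversion are assumptions for the example, not the structure the project actually defined.

```python
# Hypothetical common schema: field names and units are illustrative assumptions.
COMMON_FIELDS = {"household_id", "district", "crop", "yield_kg_per_ha"}

# Hypothetical mappings from each implementer's local column names to the common names.
MAPPINGS = {
    "partner_a": {
        "hh_id": "household_id",
        "dist": "district",
        "crop_name": "crop",
        "yield_kg_ha": "yield_kg_per_ha",
    },
    "partner_b": {
        "HHID": "household_id",
        "District": "district",
        "Crop": "crop",
        "yield_t_ha": "yield_kg_per_ha",
    },
}

# Unit conversions for columns whose local unit differs from the common one.
CONVERTERS = {
    ("partner_a", "yield_kg_ha"): float,
    ("partner_b", "yield_t_ha"): lambda v: float(v) * 1000.0,  # tonnes -> kilograms
}


def harmonize(source, record):
    """Rename one record's fields to the common names and convert units."""
    mapping = MAPPINGS[source]
    out = {}
    for local_name, value in record.items():
        common_name = mapping.get(local_name)
        if common_name is None:
            continue  # field is outside the common schema; leave it out
        convert = CONVERTERS.get((source, local_name))
        out[common_name] = convert(value) if convert else value
    missing = COMMON_FIELDS - out.keys()
    if missing:
        raise ValueError(f"{source} record is missing common fields: {sorted(missing)}")
    return out


record_a = harmonize("partner_a", {"hh_id": "K-01", "dist": "Battambang",
                                   "crop_name": "rice", "yield_kg_ha": "2500"})
record_b = harmonize("partner_b", {"HHID": "N-17", "District": "Kaski",
                                   "Crop": "rice", "yield_t_ha": "2.5", "notes": "plot 3"})
```

Once both records share the same field names and units, they can be searched, compared, and combined without each user having to decode the other's conventions, which is the practical payoff of agreeing on variable definitions up front.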
2. Existing storage solutions meet stakeholder needs.
Our analysis also explored options for storing partners’ datasets, ranging from building a custom repository to deploying one of hundreds of existing digital data repositories. Many of these meet basic data management and storage needs, and several free or nearly free options have advanced functionalities that meet best practices for uploading, storing, and sharing data.
Drawing on the FAIR principles and guidelines by the Digital Curation Centre and Data Seal of Approval, we considered critical features for any data storage solution, including the ability to upload and share various data types, assign unique identifiers, restrict access to certain data, apply reuse licenses, be discoverable by search engines, and link to APIs. When existing storage options meet user needs, we do not recommend investing the time and significant resources required for designing and building a new repository. In the case of USAID/Cambodia and USAID/Nepal stakeholders, a solution like Harvard’s Dataverse also offers relevant, advanced features that would allow the Missions to further customize the tool – including rich and granular metadata fields; advanced search functions; customized use metrics; auto-generated “preservation” file formats; built-in integrations for data analysis and visualization; and links to other data catalogs.
3. Strong, central administration is crucial for a sustainable data repository.
Finally, a practical implementation strategy requires an effective governance structure that is capable of managing the repository. Defining clear oversight and support tasks means considering organizational roles and capacities. To facilitate smooth setup and implementation, we recommend that USAID/Cambodia and USAID/Nepal determine a set of procedures and guidelines for partners and researchers who will be uploading data, and agree on roles and responsibilities with each stakeholder group.
To ensure compliance, all data managers should agree on data quality, sharing, and usability protocols prior to the rollout of any repository, including sharing all data underlying research publications, establishing timelines for data uploads (e.g. at the time of publication for researchers and within one year of collection for implementing partners), handling sensitive information, and promoting discoverability through linkages with other relevant repositories. USAID should also take on data preparation and curation tasks such as supporting initial repository setup, tracking data submissions, approving and publishing data, and assisting partners with adopting new standards and processes for dataset preparation and upload. DG and AI recommend the Missions prioritize stakeholder buy-in for these new roles and responsibilities, and convene annual in-country open data working groups to sustain stakeholder interest and regularly address issues as they arise.
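For whoever ends up tracking data submissions, the example upload timelines above could be encoded roughly as follows. This is a sketch under stated assumptions: the role labels and the `upload_deadline` function are hypothetical, and the two rules simply mirror the examples given in the text (publication date for researchers, one year after collection for implementing partners) rather than any agreed protocol.

```python
from datetime import date


def upload_deadline(role, collected_on, published_on=None):
    """Return the date by which a dataset should be uploaded (illustrative rules only)."""
    if role == "researcher":
        # Example rule from the text: data is shared at the time of publication.
        if published_on is None:
            raise ValueError("a publication date is needed for researchers")
        return published_on
    if role == "implementing_partner":
        # Example rule from the text: within one year of collection.
        try:
            return collected_on.replace(year=collected_on.year + 1)
        except ValueError:
            # 29 February has no counterpart in the following year; use 28 February.
            return collected_on.replace(year=collected_on.year + 1, day=28)
    raise ValueError(f"unknown role: {role}")


partner_due = upload_deadline("implementing_partner", date(2018, 6, 15))
researcher_due = upload_deadline("researcher", date(2018, 6, 15), published_on=date(2019, 3, 1))
```

A repository administrator could run a check like this against a submissions log to flag overdue datasets, once the actual timelines have been agreed with each stakeholder group.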
Together, these data management recommendations provide actionable strategies for migrating to a common data structure and implementing a practical data storage solution. Recognizing the complexities and nuances involved in managing data and designing country-appropriate systems, our next steps aim to build stakeholder buy-in and capacity to make the most of their data – through in-country workshops and targeted technical assistance to users across both Cambodia and Nepal to improve interoperability, knowledge sharing, and food security outcomes.
Enjoyed this post and our first piece about DG’s work supporting FHI 360’s Mobile Solutions Technical Assistance and Research (mSTAR) project? Stay tuned for updates as we continue working to strengthen data interoperability, knowledge sharing, and food security outcomes in Cambodia and Nepal.
Thumbnail image: Kathryn Alexander, the Feed the Future demonstration plots at CE-SAIN/RUA in Phnom Penh