What does Data Interoperability Require in Practice?

May 30, 2018 Agriculture
Kathryn Alexander
Explainer, Open Data

A few months ago, Development Gateway (DG) and our partner Athena Infonomics (AI) began the groundwork to improve the interoperability, analysis and – ultimately – use of agriculture and nutrition data in Cambodia and Nepal. Under the mSTAR project funded by USAID, DG and AI set out to understand the underlying structure of the data currently being collected and managed by Feed the Future implementers, and how to best support them to open up and share their data through digital tools and best practices. With the goal of accelerating data-driven agriculture development, DG and AI are supporting these partners and the USAID Missions in Cambodia (USAID/Cambodia) and Nepal (USAID/Nepal) to increase the availability of relevant data for research, analysis, and learning across stakeholders.

After assessing the challenges and opportunities for leveraging open agriculture and nutrition data in each country, we reviewed sample datasets to define the key components of a common data structure capable of improving interoperability across datasets and implementers. We also explored options for deploying a data repository – a solution for importing, storing, and retrieving digital content – to improve data sharing. Three main takeaways emerged, which we believe are relevant not just for data managers in Cambodia and Nepal but also for other contexts and sectors:

1. It is possible to achieve a common data structure – provided some common-sense principles are followed.

Our analysis of sample datasets from 13 USAID-funded research groups and implementing partners in both countries revealed similarities in the data they collect. However, because datasets are structured differently and variables are defined differently, stakeholders cannot currently make use of each other's data – including socioeconomic and experimental data on agricultural practices and nutritional statuses – to inform their own programming. To help stakeholders discover and reuse relevant data, a number of interoperability standards and principles must be followed. This includes adopting frameworks that help ensure that datasets are easy to search for based on their content and that any user can interpret and use that content.

Datasets become easier to search for when data publishers capture important metadata on the project and dataset, including scope, data collection methodology, data availability, and terms of use. Adopting a standard ontology or vocabulary – a formal naming scheme, such as AGROVOC, that defines the types and properties of variables – during data collection and reporting helps improve the future usability of a dataset. Similarly, to ease integration with other applications and repositories, stakeholders should agree on naming schemes for country-specific variables (e.g. administrative boundaries like provinces or municipalities), standard units of measurement (e.g. for area, length, and weight), and basic structural hygiene standards (e.g. keeping data belonging to the same dataset within a single file, and using clear column and row names and codes for missing data). Finally, we suggest that all dataset owners create corresponding codebooks documenting all of these elements of the metadata schema.
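As a rough illustration of what such a codebook could look like in machine-readable form, the sketch below describes one dataset and checks its rows against the agreed names, units, and missing-data code. Everything in it is an assumption made for illustration: the field names, the `-999` missing code, the variable list, and the `validate` helper are hypothetical and are not drawn from any partner's actual schema.

```python
# Hypothetical codebook for a single dataset; all names and values are illustrative.
CODEBOOK = {
    "dataset": {
        "title": "Example household survey",
        "scope": "Two provinces, one growing season",
        "methodology": "Household questionnaire",
        "terms_of_use": "CC BY 4.0",
    },
    "missing_code": -999,  # assumed agreed code for missing data
    "variables": {
        "province": {"type": str, "unit": None, "allowed": {"Battambang", "Pursat"}},
        "plot_area_ha": {"type": (int, float), "unit": "hectare"},
        "yield_kg": {"type": (int, float), "unit": "kilogram"},
    },
}


def validate(rows, codebook):
    """Return a list of human-readable problems found in the rows."""
    problems = []
    variables = codebook["variables"]
    missing = codebook["missing_code"]
    for i, row in enumerate(rows):
        # Columns the codebook expects but the row lacks.
        for name in sorted(variables.keys() - row.keys()):
            problems.append(f"row {i}: missing column '{name}'")
        # Columns the row has but the codebook does not document.
        for name in sorted(row.keys() - variables.keys()):
            problems.append(f"row {i}: undocumented column '{name}'")
        for name, spec in variables.items():
            if name not in row or row[name] == missing:
                continue  # absent columns were reported above; missing code is allowed
            value = row[name]
            if not isinstance(value, spec["type"]):
                problems.append(f"row {i}: '{name}' has unexpected type")
            elif "allowed" in spec and value not in spec["allowed"]:
                problems.append(f"row {i}: '{name}' value {value!r} not in agreed list")
    return problems


rows = [
    {"province": "Battambang", "plot_area_ha": 1.5, "yield_kg": -999},
    {"province": "Kandal", "plot_area_ha": "large"},
]
for problem in validate(rows, CODEBOOK):
    print(problem)
```

Keeping the codebook as plain data, rather than burying the rules in code, means the same file can serve as documentation for human readers and as input to automated checks before upload.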

2. Existing storage solutions meet stakeholder needs.

Our analysis also explored options for storing partners’ datasets – looking broadly from building a custom repository to deploying one of hundreds of existing digital data repositories. Many of these meet basic data management and storage needs, and several free or nearly free options have advanced functionalities that meet best practices for uploading, storing, and sharing data.

Drawing on the FAIR principles and guidelines by the Digital Curation Centre and Data Seal of Approval, we considered critical features for any data storage solution, including the ability to upload and share various data types, assign unique identifiers, restrict access to certain data, apply reuse licenses, be discoverable by search engines, and link to APIs. When existing storage options meet user needs, we do not recommend investing the time and significant resources required for designing and building a new repository. In the case of USAID/Cambodia and USAID/Nepal stakeholders, a solution like Harvard’s Dataverse also offers relevant, advanced features that would allow the Missions to further customize the tool – including rich and granular metadata fields; advanced search functions; customized use metrics; auto-generated “preservation” file formats; built-in integrations for data analysis and visualization; and links to other data catalogs.

3. Strong, central administration is crucial for a sustainable data repository.

Finally, designing a practical implementation strategy requires an effective governance structure that is capable of managing the repository. Defining clear oversight and support tasks means taking organizational roles and capacities into account. To facilitate smooth setup and implementation, we recommend that USAID/Cambodia and USAID/Nepal determine a set of procedures and guidelines for the partners and researchers who will be uploading data, and agree on roles and responsibilities with each stakeholder group.

To ensure compliance, all data managers should agree on data quality, sharing, and usability protocols prior to the rollout of any repository, including sharing all data underlying research publications, establishing timelines for data uploads (e.g. at the time of publication for researchers and within one year of collection for implementing partners), handling sensitive information, and promoting discoverability through linkages with other relevant repositories. USAID should also take on data preparation and curation tasks such as supporting initial repository setup, tracking data submissions, approving and publishing data, and assisting partners with adopting new standards and processes for dataset preparation and upload. DG and AI recommend the Missions prioritize stakeholder buy-in for these new roles and responsibilities, and convene annual in-country open data working groups to sustain stakeholder interest and regularly address issues as they arise.
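To make the submission-tracking task concrete, here is a minimal sketch of how a curator might compute upload deadlines from the example timelines mentioned above (at publication for researchers, within one year of collection for implementing partners). The role labels and function names are hypothetical, and real protocols would be whatever the stakeholders actually agree on.

```python
from datetime import date


def upload_deadline(role, publication_date=None, collection_end=None):
    """Return the date by which a dataset should be uploaded.

    Assumed rules, taken from the example timelines in the text:
    researchers upload at publication; implementing partners upload
    within one year of the end of data collection.
    """
    if role == "researcher":
        if publication_date is None:
            raise ValueError("researchers need a publication date")
        return publication_date
    if role == "implementing_partner":
        if collection_end is None:
            raise ValueError("implementing partners need a collection end date")
        try:
            return collection_end.replace(year=collection_end.year + 1)
        except ValueError:
            # 29 February has no counterpart in the following year; use 28 February.
            return collection_end.replace(year=collection_end.year + 1, day=28)
    raise ValueError(f"unknown role: {role}")


def is_overdue(deadline, uploaded_on=None, today=None):
    """True if the upload happened, or is still pending, after the deadline."""
    reference = uploaded_on or today or date.today()
    return reference > deadline


deadline = upload_deadline("implementing_partner", collection_end=date(2018, 5, 30))
print(deadline, is_overdue(deadline, uploaded_on=date(2019, 6, 1)))
```

A simple check like this could feed a periodic report of pending and late submissions, which gives the annual working groups something specific to review.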

Together, these data management recommendations provide actionable strategies for migrating to a common data structure and implementing a practical data storage solution. Recognizing the complexities and nuances involved in managing data and designing country-appropriate systems, our next steps aim to build stakeholder buy-in and capacity to make the most of their data – through in-country workshops and targeted technical assistance to users across both Cambodia and Nepal to improve interoperability, knowledge sharing, and food security outcomes.

Enjoy this post and our first piece about DG’s work supporting FHI 360’s Mobile Solutions Technical Assistance and Research (mSTAR) project? Stay tuned for updates as we continue progressing towards strengthening data interoperability, knowledge sharing, and food security outcomes in Cambodia and Nepal. 

Thumbnail image: Kathryn Alexander, the Feed the Future demonstration plots at CE-SAIN/RUA in Phnom Penh
