• From
    CGIAR Initiative on Breeding Resources
  • Published on
    30.01.25
  • Funders
    Gates Foundation

January 2025, Montpellier – CGIAR received a grant from the Gates Foundation to develop AgPile, a platform aimed at supporting agricultural research through improved data capabilities and computational infrastructure. AgPile will focus on compiling an AI-ready, regionally relevant data corpus, which can support the development of innovations for smallholder farming systems.

What’s AgPile?

Imagine you’re a researcher studying wheat diseases globally, or a breeder preparing for a new wheat study. Before you can even start, you’ll need to gather data from multiple sources: some online, others stored in outdated systems. For instance, you might need to track down old wheat database files from a retired system. This could involve making phone calls or sending emails, hoping a busy colleague finds time to upload the data files to Google Drive, if they can locate them at all. And when you finally receive the data, it may not even be what you needed.

This is where AgPile comes in. The project aims to solve this inherent challenge in the agricultural research world by facilitating the aggregation of relevant datasets for analysis. AgPile will create a federated and collaborative data platform that references and organizes access to datasets from CGIAR, its Centers, and its partners.

It will also make these datasets FAIR (Findable, Accessible, Interoperable, and Reusable) and therefore easier to integrate into AI models. Ultimately, this will help CGIAR develop agricultural products and services tailored to the needs of smallholder farmers.

The implications of AgPile are transformative. Currently, obtaining datasets can take months, delaying critical research. With AgPile, CGIAR staff and partners could potentially find answers to key questions without having to conduct new physical trials. By accessing and comparing data from different organizations, breeders could, for example, analyze past experiences, learn from them, and apply those lessons to their own work, saving time and resources while accelerating innovation.

How will AgPile work, in practice?

While it’s unrealistic to store all agricultural data ever generated, AgPile will employ two practical approaches to collect and manage a substantial portion of it.

The most obvious one is data replication: we will create copies of relevant datasets and host them directly on the AgPile platform. At the same time, we will work on data federation: instead of storing the data itself, the platform will provide a reference to where the data resides, along with metadata, relevant descriptors, and the information needed to access it.
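
To make the difference between the two approaches concrete, here is a minimal sketch of what a catalog record could look like. It is an illustration only, not AgPile's actual design: the `CatalogEntry` class, its field names, the `locate` helper, the dataset identifiers, and the paths and URLs are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class CatalogEntry:
    """Hypothetical catalog record; all field names are illustrative."""
    dataset_id: str
    title: str
    mode: str  # "replicated" (copy hosted on the platform) or "federated" (reference only)
    metadata: dict = field(default_factory=dict)
    hosted_path: Optional[str] = None  # where the hosted copy lives (replicated)
    source_url: Optional[str] = None   # where the data resides (federated)
    access_notes: str = ""             # how to request or obtain access

    def __post_init__(self) -> None:
        # Each mode needs its own kind of pointer to the data.
        if self.mode == "replicated":
            if not self.hosted_path:
                raise ValueError("replicated entries need hosted_path")
        elif self.mode == "federated":
            if not self.source_url:
                raise ValueError("federated entries need source_url")
        else:
            raise ValueError(f"unknown mode: {self.mode!r}")


def locate(entry: CatalogEntry) -> str:
    """Return where a user should go to obtain the data."""
    if entry.mode == "replicated":
        return entry.hosted_path
    return entry.source_url


# Placeholder records: one hosted copy, one reference to an external source.
replicated = CatalogEntry(
    dataset_id="demo-001",
    title="Example trial data (placeholder)",
    mode="replicated",
    metadata={"crop": "wheat"},
    hosted_path="/agpile/hosted/demo-001.csv",
)
federated = CatalogEntry(
    dataset_id="demo-002",
    title="Example partner data (placeholder)",
    mode="federated",
    metadata={"crop": "wheat"},
    source_url="https://example.org/datasets/demo-002",
    access_notes="Request access from the data owner.",
)

for entry in (replicated, federated):
    print(entry.dataset_id, entry.mode, locate(entry))
```

In this sketch, both kinds of record carry the same descriptive metadata, so a researcher could search them in one place; only the pointer to the data differs.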

The project will start by integrating data assets from existing CGIAR projects, such as Tumaini, Artemis, and 1000 Farms, into a centralized, AI-ready corpus.

Beyond data collection, AgPile is meant to leverage artificial intelligence to maximize the value of data. For example, field image data collected via Artemis will allow breeders to extract phenotypic information and digitize it. AI will then allow them to identify characteristics of plants from the images, such as disease symptoms.

AI will also help the AgPile team make decisions about data curation and import. The goal is to streamline the process of compiling datasets, to accelerate the creation of relevant, high-value data products that are tailored to agricultural innovation.

In a nutshell

AgPile is set to change the way CGIAR and its partners manage and use data. By consolidating datasets into a unified, FAIR, and AI-ready platform, AgPile will empower breeders, researchers, and scientists with analytics tools and collaborative workspaces to accelerate innovation cycles.

The project will establish federated data-sharing mechanisms, fostering collaboration across Centers and partner institutions. By addressing existing challenges and seizing opportunities in CGIAR’s data landscape, AgPile will unlock the full potential of the extensive data resources held by CGIAR and its partners, transforming them into a global public good that drives agricultural research, innovation, and impact for smallholder farmers worldwide.

***

Written by Julie Puech, Breeding for Tomorrow. Main image: Safeguarding Africa’s beans: CIAT’s Kawanda genebank and research station. Credit: © 2016 CIAT/Georgina Smith.
