Data and Information: the Backbone of the Global System
Information systems are central to the effective conservation and use of plant genetic resources, whether at the level of the individual collection, using the likes of GRIN-Global, or at a global level, like Genesys. The former helps genebanks to manage their own operations and coordinate with other collections, while the latter allows genebank users and researchers to access information about individual genebank samples and collections as a whole across the globe.
The Crop Trust’s work on both types of system made significant strides in 2021, with six national genebanks sharing their data on Genesys and CGIAR centers beginning testing GRIN-Global Community Edition (GG-CE), a genebank information system that builds on GRIN-Global.
The global system for ex situ conservation relies upon information management and sharing to operate successfully. Genebanks need data to operate efficiently. And to use genebank collections, it is essential to know which contain what crop diversity. This means constantly improving data and data management systems.
In addition to supporting individual genebanks to make their own improvements, the Crop Trust supports initiatives to improve the management and availability of information at the global scale, including through Genesys and GRIN-Global Community Edition.
Genesys is an online portal that allows users to explore the world’s crop diversity conserved in genebanks. Its database collates information from more than 450 genebanks worldwide and now contains entries on more than 4 million samples, around half of all samples conserved in genebanks worldwide. In 2021, more than 3.5 million of these records were confirmed and updated.
Genesys serves two distinct but connected groups of people. There are the genebanks, institutes and research centers that use Genesys to publish data on the material they maintain. Then, there are those—plant breeders and researchers in particular—who use Genesys to discover the material that best meets their needs.
Genesys Continues to Expand…
Six national genebanks joined the Genesys family in 2021: the National Department of Plant Genetic Resources, Ecuador; the Margot Forde Germplasm Center, the national forage genebank of New Zealand; the National Agricultural Research Center, Côte d'Ivoire; the National Plant Genetic Resources Center, Zambia; the Plant Genetic Resources Research Institute, Ghana; and the National Agricultural Research Council, Nepal.
The number of registered Genesys users increased from 1,901 to 2,288 in 2021.
Major Data Update from USDA
Early in 2021, the United States Department of Agriculture National Plant Germplasm System (NPGS) updated the passport data—information such as the name of the species, where and when it was originally collected, and where it is being kept—for their entire collection in Genesys. This covers a total of more than 580,000 samples in active collections—the material that is readily accessible to users such as plant breeders. This was a huge effort, made possible by close collaboration between Crop Trust and NPGS staff over a considerable period of time.
New Collections Added to Genesys
Later in the year, the National Center for Agronomic Research (CNRA), in Côte d’Ivoire, uploaded passport data on its coffee collection—a total of nearly 8,000 accessions representing more than 300 coffee populations. This means that information on the Ivorian collection can now be compared with data on other coffee collections and cross-referenced with weather and climate datasets also available on Genesys, helping researchers identify materials that are likely to have the specific characteristics they are looking for.
In October, the National Plant Genetic Resources Center of Zambia uploaded passport data for its sweetpotato collection. It followed this up with passport data for cassava and wild relatives of rice and sorghum.
CGIAR Genebanks Adopt GRIN-Global
All genebanks need sound data management to keep track of the inventories of seeds in their cold rooms and fields, and of all their interlinked processes.
For a long time, the different CGIAR genebanks have each taken their own unique approaches to data management. The result has been a wide array of software, with some genebanks relying on Excel spreadsheets and others relying on complex, bespoke systems.
In 2020, the genebanks agreed to move to a common information management system that builds on GRIN-Global, addressing some of the gaps in functionality and utility of the original software. GG-CE integrates barcoding and hand-held devices, greatly facilitating the work of genebank staff.
The development of GG-CE is coordinated at the Crop Trust in collaboration with the CGIAR Community of Practice on Data Management.
By the end of the year, all 11 CGIAR genebanks had a test version of GG-CE up and running and were in the process of evaluating the system. This system is specially designed for the daily activities that take place in genebanks, which is a big plus over other solutions. For example, the use of barcoding helps reduce the need for staff to manually write notes or labels, which, for one thing, will increase accuracy and help avoid mislabeling. It can also improve standardization, which will help collaboration between centers.
Finding Similar Seed Samples Just Got Easier
Genesys now includes a “Similarity Search” function to help users compare passport data records across collections—an important tool for genebanks to identify which of their accessions may have previously unknown replicates in other collections.
The CGIAR genebanks and others have developed such tools in the past, involving laborious data standardization and validation. Thankfully, Genesys already takes care of this aspect of the work, and now it can do the matching analysis too.
The system uses both exact text matching—where data entries need to be identical—and fuzzy text matches, which can pick up cases where there might be typos or minor differences in the data. The system uses a hierarchy of data to determine similarity, with matches on accession number and taxonomic data scoring highest, but with all other data fields being taken into account.
22 Jul 2021
15 Jun 2021
9 Jun 2021
13 Jan 2021
22 Jun 2017
8 May 2017
18 Jul 2016
14 Jul 2016
21 Jun 2016