
AI in Crop Breeding for Sustainable Food Security

  • Writer: Runjhun Saran
  • Nov 11, 2024
  • 6 min read

Updated: Dec 3, 2024

November 10th, 2024



Agriculture, the backbone of human life, is undergoing a transformative shift as technology reshapes traditional practices on a global scale. With a constantly changing climate, limited resources, and a rising global population (expected to reach 9.8 billion by 2050), agriculture faces the immense challenge of sustainably producing 70% more food by mid-century. In this critical situation, AI can potentially have a game-changing impact on global food security: optimizing resources through better water management and sustainable land use, reducing waste and harmful emissions, sequestering carbon, forecasting weather and climate in real time, boosting crop yield and quality, continuously monitoring crop health, improving soil health, and enhancing agricultural biotechnology techniques. Among those techniques, 'Crop Breeding' is one of the main weapons for raising crop productivity and quality to achieve sustainable food security. Data-driven, speedy breeding of crop plants with improved productivity, resilience, sustainability, and adaptability has been a long-standing pain point. However, the dawn of AI-assisted precision breeding and big data analysis is fuelling rapid progress in the field. Let's read how!



Plant phenomics platforms equipped with AI-enabled remote sensors & imaging technologies: Large-scale phenotyping of plants/crops for traits such as increased yield, disease resistance, stress tolerance, shelf-life, nutritional value, and flavor, as well as reduced toxin content, fungal contamination, and pesticide residues, is crucial for crop breeding. Traditionally, this was limited by low data acquisition capacity. Now, the advent of AI-enabled advanced imaging platforms has brought rapid progress in curating phenotypic databases [1]. These platforms combine various stationary or mobile sensors (e.g., 3D laser scanners, hyperspectral cameras, thermal IR cameras, and chlorophyll fluorescence sensors) and formats (fixed towers, rail-based systems, movable gantries) to monitor plant traits (e.g., plant height, tiller density, grain yield, moisture content, and leaf color) and to quantify 3D plant and leaf structures and leaf temperature across different growth phases and environmental conditions. Mobile phenotyping platforms expand field coverage and deliver high-resolution imaging, though environmental factors such as light intensity can pose challenges in open areas. Innovations like the BreedVision system address such issues by blocking ambient light and conducting imaging within a movable dark chamber. Unmanned aerial vehicles (UAVs) also enable rapid, high-resolution data collection over large areas, e.g., capturing detailed canopy color and texture features at a resolution of ~1 mm per pixel [2, 3]. Machine learning (ML) and deep learning (DL) further enhance phenomics by analyzing vast, complex datasets to identify crop traits and stress responses. Techniques like random forests, support vector machines, and advanced CNN architectures (e.g., ResNet, YOLO) extract and classify phenotypic data for traits such as nitrogen levels, biomass, and stress resilience.
AI-driven integration of multiple data sources (RGB, thermal, and multispectral) boosts predictive accuracy in models for yield and disease identification [1].
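
To make the idea of fusing several sensor streams concrete, here is a minimal sketch in Python. It is not the method of any platform named above: it uses a deliberately simple nearest-centroid classifier as a stand-in for the random forests, SVMs, or CNNs mentioned in the text, and every feature name and value (greenness, canopy temperature, NIR reflectance) is an invented assumption for illustration.

```python
from statistics import mean

# Hypothetical per-plot features from three sensors (all values invented).
# Each record: (rgb_greenness, canopy_temp_c, nir_reflectance, label)
PLOTS = [
    (0.62, 24.1, 0.55, "healthy"),
    (0.58, 24.8, 0.51, "healthy"),
    (0.65, 23.9, 0.57, "healthy"),
    (0.31, 29.5, 0.28, "stressed"),
    (0.35, 28.7, 0.33, "stressed"),
    (0.29, 30.2, 0.25, "stressed"),
]


def fit_scaler(rows):
    """Record (min, max) per feature so sensors with different units are comparable."""
    return [(min(col), max(col)) for col in zip(*rows)]


def scale(row, scaler):
    """Min-max scale one feature vector into the 0..1 range of the training data."""
    return [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, (lo, hi) in zip(row, scaler)]


def fit_centroids(plots):
    """Fuse the three sensor features and compute one mean vector per class."""
    scaler = fit_scaler([p[:3] for p in plots])
    by_label = {}
    for *features, label in plots:
        by_label.setdefault(label, []).append(scale(features, scaler))
    centroids = {
        label: [mean(col) for col in zip(*rows)] for label, rows in by_label.items()
    }
    return scaler, centroids


def predict(features, scaler, centroids):
    """Assign the class whose centroid is closest in the fused feature space."""
    x = scale(features, scaler)
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(x, centroids[label])),
    )


scaler, centroids = fit_centroids(PLOTS)
print(predict((0.60, 24.5, 0.53), scaler, centroids))
print(predict((0.30, 29.8, 0.27), scaler, centroids))
```

The scaling step is the part that matters for fusion: without it, the temperature column (tens of degrees) would dominate the distance over the reflectance columns (fractions of one).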



Plant genomics data generation from germplasm resources using AI: AI algorithms are now being used in pre-breeding strategies to identify novel or previously known beneficial alleles linked to phenotypic diversity, and for genomics-assisted selection in crop breeding, which is crucial for crop adaptation to environmental changes. Building high-quality reference genome databases to train and fine-tune such AI algorithms is a foundational step in this process. A major untapped resource in this field is the vast germplasm collections held in over 1,750 genebanks globally, housing more than 7 million germplasm accessions, including cultivars, landraces, and wild relatives [4]. Genebank genomics refers to genome-wide genotyping of stored germplasm. With the help of high-throughput sequencing techniques, expansive datasets for crops like wheat, maize, and barley have already been generated. Using such data, advanced AI-driven predictive genomics models are being trained to gain crucial insights into beneficial traits from genotyped wild populations as well as cultivated crops. Integrating such AI-driven models with passport data (capturing genotypic crop identity and origin) as well as agroclimatic and soil data makes it possible to evaluate the breeding potential of germplasm collections and simulate germplasm performance, even without extensive phenotypic data.



Bridging the genome-phenome knowledge gap using AI: One of the major limitations in genebank utilization is the lack of phenotypic data across target environments. The integration and mapping of genomic and phenotypic data are pivotal for advancing crop breeding, as they provide a comprehensive understanding of how genetic variations translate into observable traits. This approach enhances our ability to identify specific genes and alleles associated with desirable traits, such as disease resistance, yield, and climate resilience, which are essential for developing robust, high-performing cultivars. AI-enabled mapping of genomic data to phenotypic observations (using approaches such as multiple-instance learning, leave-one-environment-out (LOEO) cross-validation, support vector regression (SVR), and gradient boosting machine (GBM) models) is unravelling complex trait heritability [1]. This allows for more precise trait predictions and faster identification of quantitative trait loci (QTLs) and novel genes, and helps address issues like "missing heritability", where known genetic factors alone cannot explain certain traits.
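
As a rough illustration of the leave-one-environment-out (LOEO) idea mentioned above, the sketch below holds out one trial site at a time, fits a model on the remaining sites, and scores it on the held-out site. A one-feature least-squares line stands in for SVR or GBM purely to keep the example self-contained; the site names, the single `marker_score` feature, and all numbers are invented assumptions.

```python
from statistics import mean

# Hypothetical records: (environment, marker_score, observed_yield). Values invented.
RECORDS = [
    ("site_A", 1.0, 3.1), ("site_A", 2.0, 5.0), ("site_A", 3.0, 7.2),
    ("site_B", 1.0, 2.8), ("site_B", 2.0, 5.1), ("site_B", 3.0, 6.9),
    ("site_C", 1.0, 3.3), ("site_C", 2.0, 4.9), ("site_C", 3.0, 7.0),
]


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def leave_one_environment_out(records):
    """Return {environment: RMSE obtained when that environment is held out}."""
    results = {}
    for held_out in sorted({env for env, _, _ in records}):
        train = [(x, y) for env, x, y in records if env != held_out]
        test = [(x, y) for env, x, y in records if env == held_out]
        slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])
        squared_errors = [(y - (slope * x + intercept)) ** 2 for x, y in test]
        results[held_out] = mean(squared_errors) ** 0.5
    return results


rmse_by_env = leave_one_environment_out(RECORDS)
for env, rmse in rmse_by_env.items():
    print(env, round(rmse, 3))
```

The point of splitting by environment rather than at random is that the score then reflects how well the model transfers to a site it has never seen, which is the situation a breeder faces with a new target environment.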



AI-enabled functional genomics (transcriptomics, proteomics, and metabolomics): An exciting frontier for AI in plant breeding lies in its application to single-cell RNA sequencing, enabling the detailed exploration of developmental processes and environmental responses within complex, heterogeneous tissues [5, 6]. Such datasets, encompassing thousands of cells and tens of thousands of genes, are often analyzed using unsupervised ML techniques to identify patterns without predefined labels. Clustering and manifold learning methods, for example, uncover nonlinear relationships within the data, aiding in its organization and interpretation. AI models have also been used to facilitate the integration and analysis of metabolomic data, enabling metabolic pathway prediction in crops [7, 8]. Altogether, AI is playing a transformative role in integrating transcriptomic, proteomic, and metabolomic data to provide a comprehensive view of plant physiology. For instance, in a study on maize autophagy mutants, multi-omics analysis under nitrogen-replete and nitrogen-starvation conditions revealed autophagy's impact on cellular function [7]. These findings were backed by notable changes in the transcriptome and proteome. By analyzing the abundances of mRNA and proteins, researchers identified specific protein targets for autophagic clearance, as well as protein complexes, organelles, and various processes regulated by this catabolic pathway. This capability of AI can be used to identify critical molecular targets and metabolic pathways that play a role in crop resilience and productivity.
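
The unsupervised clustering step described above can be sketched with plain k-means. This is only a toy under stated assumptions: real single-cell pipelines work on thousands of genes and usually add normalization and dimensionality reduction first, and the two-gene expression values and starting centroids below are invented for illustration.

```python
from statistics import mean

# Hypothetical expression matrix: one row per cell, two marker genes per row.
CELLS = [
    (9.1, 0.8), (8.7, 1.1), (9.4, 0.5),   # cells expressing mostly gene 1
    (1.0, 7.9), (0.6, 8.4), (1.3, 8.1),   # cells expressing mostly gene 2
]


def dist2(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, initial_centroids, max_iter=20):
    """Plain k-means: assign each cell to its nearest centroid, then recompute centroids."""
    centroids = list(initial_centroids)
    assignment = []
    for _ in range(max_iter):
        new_assignment = [
            min(range(len(centroids)), key=lambda k: dist2(p, centroids[k]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # assignments stopped changing, so the clustering has converged
        assignment = new_assignment
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:
                centroids[k] = tuple(mean(col) for col in zip(*members))
    return assignment, centroids


# Start from one cell of each apparent group (an assumption that keeps the run deterministic).
labels, centers = kmeans(CELLS, [CELLS[0], CELLS[3]])
print(labels)
```

No cell-type labels are supplied anywhere: the grouping comes only from similarity between expression profiles, which is what "without predefined labels" means in practice.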



AI-driven integration of multi-omics data: Generating genetic networks from integrated multi-omics data (genomics, phenomics, proteomics, metabolomics, transcriptomics, etc.) is more effective than single-omics approaches. However, the ultra-high dimensionality of omics data often makes integrated multi-omics analysis a difficult task. AI models help to overcome this "curse of dimensionality". Autoencoder models are being used to denoise extensively integrated datasets, and deep learning models are now able to extract significant features from multi-omics datasets for accurate predictions. Powerful AI-enabled data integration methods can incorporate and process complex interactions and dependencies across various biological data types, providing a deeper and more holistic understanding of crop systems.
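
To show what an autoencoder does at the smallest possible scale, here is a sketch of a linear autoencoder with a single-unit bottleneck, trained by plain gradient descent. It is far simpler than the deep models used on real multi-omics data, and the three "features" per sample, the learning rate, and the starting weights are all assumptions chosen for illustration. The samples lie roughly along one direction with small noise, so squeezing them through one latent value keeps the shared signal and discards most of the noise.

```python
# Hypothetical samples: three omics-derived features per plant (values invented).
DATA = [
    (1.02, 1.97, -1.01),
    (0.49, 1.03, -0.52),
    (-0.51, -0.98, 0.50),
    (-1.00, -2.03, 0.98),
    (0.76, 1.49, -0.74),
    (-0.74, -1.52, 0.77),
]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def reconstruct(x, enc, dec):
    z = dot(enc, x)               # encode: 3 features -> 1 latent value
    return [d * z for d in dec]   # decode: 1 latent value -> 3 features


def mse(data, enc, dec):
    """Mean squared reconstruction error over all samples."""
    total = 0.0
    for x in data:
        total += sum((xh - xi) ** 2 for xh, xi in zip(reconstruct(x, enc, dec), x))
    return total / len(data)


def train(data, lr=0.02, epochs=500):
    """Full-batch gradient descent on the reconstruction error."""
    enc = [0.1, 0.1, 0.1]
    dec = [0.1, 0.1, 0.1]
    n = len(data)
    for _ in range(epochs):
        g_enc = [0.0, 0.0, 0.0]
        g_dec = [0.0, 0.0, 0.0]
        for x in data:
            z = dot(enc, x)
            resid = [d * z - xi for d, xi in zip(dec, x)]
            back = dot(resid, dec)  # error signal passed back through the decoder
            for i in range(3):
                g_dec[i] += 2 * resid[i] * z / n
                g_enc[i] += 2 * back * x[i] / n
        enc = [e - lr * g for e, g in zip(enc, g_enc)]
        dec = [d - lr * g for d, g in zip(dec, g_dec)]
    return enc, dec


initial_loss = mse(DATA, [0.1, 0.1, 0.1], [0.1, 0.1, 0.1])
enc, dec = train(DATA)
final_loss = mse(DATA, enc, dec)
print(initial_loss, final_loss)
```

The design choice to illustrate is the bottleneck: because the model must pass everything through fewer values than it receives, it can only reproduce structure that is shared across samples.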



AI-empowered gene-editing technologies: Traditional breeding methods, such as mutagenesis, hybridization, and transgenic breeding approaches, have significantly advanced crop yield and quality. However, they face challenges like lengthy breeding cycles, incomplete gene function loss, limited precision, and labor-intensive screening. Genome sequencing technologies have paved the way for more precise molecular breeding. Gene editing systems, including homing endonucleases (HEs), meganucleases (MNs), zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and CRISPR/Cas9, are central to genome-editing advancements, allowing specific and efficient DNA modifications. AI technologies (e.g., AlphaFold2 and OpenCRISPR-1) are revolutionizing the design and bioengineering of genome editors, yielding more compact structures and substantially improved precision and efficiency in genetic manipulation.


In my next blog, I am excited to elaborate on how Generative Artificial Intelligence is transforming the molecular toolkit of gene editing!



References:
1) Trends in Genetics. Volume 40, Issue 10, October 2024, Pages 893-908
2) Plant Physiology. Volume 187, Issue 4, December 2021, Pages 2623-2636
3) ISPRS Journal of Photogrammetry and Remote Sensing. Volume 150, April 2019, Pages 226-244
4) Molecular Plant. Volume 13, Issue 10, 5 October 2020, Pages 1341-1344
5) Developmental Cell. Volume 48, Issue 6, 25 March 2019, Pages 840-852.e5
6) Plant Cell. Volume 31, Issue 5, May 2019, Pages 993-1011
7) Communications Biology. Volume 2, 18 June 2019, Article 214
8) Nature Plants. Volume 4, 2018, Pages 1056-1070
9) Molecular Biology Reports. Volume 49, 2022, Pages 11385-11402
10) Genes. Volume 14, Article 777
11) Genomics, Proteomics & Bioinformatics. Volume 22, Issue 4, 2024, Article qzae051



 
 
 



© 2025 MOLwise Biosciences Inc
