The Golden Era of AI in Biotechnology: A Bird’s-Eye View
- Runjhun Saran
- Oct 24, 2024
Updated: Dec 3, 2024
The convergence of artificial intelligence (AI) and biotechnology has catalyzed multiple breakthroughs and continues to create unprecedented opportunities for innovation in healthcare, agroecology, and environmental sustainability. This powerful synergy has ushered in a Golden Era, empowering scientists to engineer biological molecules and living systems with outstanding precision. A strong testament to this is the 2024 Nobel Prize in Chemistry, awarded to John Jumper and Demis Hassabis of Google DeepMind in London, and David Baker of the University of Washington in Seattle, for developing cutting-edge computational tools that can predict and design protein structures.

David Baker, Demis Hassabis and John Jumper (left to right) won the chemistry Nobel for developing computational tools that can predict and design protein structures. Credit: X/@NobelPrize, canva.com
Today’s post offers a quick bird’s-eye view of how AI’s rapidly expanding potential is reshaping the landscape of biotechnology at various levels of life. The impact of AI-driven methods on biological programmability is strongly shaped by the vast scale of genomic, protein, and cellular data. Thanks to significant progress in high-throughput sequencing technologies, the National Center for Biotechnology Information (NCBI) now holds sequences for over 2,000,000 prokaryotic genomes and 40,000 eukaryotic genomes, while the Human Cell Atlas has recorded more than 60,000,000 single-cell sequences [1]. This wealth of curated and standardized data provides an unparalleled foundation for building advanced computational models that reveal patterns, forecast outcomes, and guide decision-making, especially in the uncharted territories of biotechnology research.

Molecular level: Biomolecules include carbohydrates, proteins, nucleic acids, and lipids. These quintessential organic molecules and their interactions are vital for life. Biomolecular functions are structure-dependent, and deciphering or predicting molecular structures has long been a challenging research problem. However, machine learning (ML) based tools for protein structure prediction, design, and docking, such as RoseTTAFold-AllAtom, the three versions of AlphaFold (which has predicted over 200 million protein structures), OpenFold, TomoDRGN, CryoDRGN-ET, ProteinMPNN, EvoDiff, RFDiffusion, ProGen, and DiffDock, have revolutionized this research area. Apart from predicting the structure of naturally occurring biomolecules with outstanding accuracy, AI models can now engineer or generate a plethora of novel molecules bearing specific properties or functionalities, either by optimizing existing molecules or through de novo design or directed evolution of new molecular structures. For instance, advanced AI-augmented strategies have been used to engineer more efficient, specific, or flexible CRISPR enzymes (molecular tools for gene editing) [2]. Generative AI models are increasingly being used to design novel small molecules as drugs [3]. Yet another example is a deep learning powered approach to designing novel lipid molecules for building highly efficient lipid nanoparticles as drug carriers [4]. Deep neural network approaches are also accelerating the directed evolution and identification of novel DNA/RNA aptamers for developing highly sensitive biosensors [5]. These novel biomolecules and small molecules are proving to be game-changers in drug discovery, drug targeting, personalized medicine, bio-diagnostics, biotherapeutics, environmental monitoring, and more.

Cellular level: The rapid advancement of high-throughput omics technologies, particularly at the single-cell level, has generated enormous datasets that capture various molecular modalities from millions of cells, making them excellent resources for training AI models. These modalities include transcriptomics, lipidomics, proteomics, functional and structural genomics, pharmacogenomics, and metabolomics. Deep generative models for spatial omics are one of the latest highlights of this field [6]. Two single-cell foundation models, scGPT and scFoundation, capable of tasks such as cell-type annotation and perturbation prediction, are now available for public use. A variational autoencoder-based framework is also available for modeling transcription as well as splicing kinetics [7]. Additionally, RNA-sequencing datasets, cellular gene expression profiles, and protein embeddings learned by large protein language models are widely used for building universal cell embeddings [8]. Altogether, AI has significantly accelerated the mapping of sequences to phenotypes and greatly enhanced our understanding of the interplay between the key molecular players underlying multiple cellular processes.

Tissue/organ level: The emergence of tissue engineering and regenerative medicine (TERM) and organ-on-a-chip technology presents exciting opportunities for biotherapeutics and personalized medicine. Simulating the complex 3D microenvironments of human organs (by integrating stem cells, biochemical factors, and biomaterials) has enabled the investigation of disease mechanisms and the testing of biotherapies in a more relevant context. AI is becoming increasingly vital in this field. Multiple AI approaches (supervised and unsupervised deep learning, reinforcement learning, agent-based models, etc.) are being leveraged to analyze large datasets and uncover patterns, predict and simulate ideal conditions for cell growth and differentiation, identify suitable growth factors, evaluate novel biotherapies, develop intelligent, adaptable biomaterials and bioinks, and design optimized scaffolds for various tissue types [9]. Furthermore, integrating these microenvironments with AI-powered image and data analysis is allowing scientists to gain crucial insights into drug efficacy, toxicity, and tailored treatment strategies.

Community level: The growing biological threats to public health necessitate the development of early detection systems that leverage advancements in biotechnology and AI to identify and mitigate risks before they escalate. Antibody-, aptamer-, nanomaterial-, or cell-based biosensors play a crucial role in the swift and sensitive detection of biological threats. Integrating AI with these biosensors offers exciting possibilities for early warning systems, through the analysis of large datasets to uncover patterns and anomalies that indicate disease outbreaks. Furthermore, autonomous AI systems for biological threat detection, together with AI-driven wearable devices and IoT sensors, could potentially create personalized early warning systems that continuously monitor physiological parameters for signs of infection, even in remote or high-risk areas. Such real-time data analysis would enable timely alerts to healthcare providers and public health authorities, enhancing our ability to detect and respond to biological threats at the level of the individual. Realizing such systems is a future goal deserving serious consideration.

Ethical concerns: The intersection of AI and biotechnology raises significant ethical concerns that warrant careful consideration. With accelerated advancements in AI, there are serious concerns about AI-guided development of biological hazards or biological weapons [10]. Further, the use of AI in biomedicine, particularly in areas like genetic data analysis and personalized medicine, raises questions about privacy and data security. There is also concern over bias in AI algorithms, which could result in unequal access to healthcare or flawed predictions that disproportionately affect certain populations. Moreover, the prospect of autonomous AI systems in decision-making roles, such as in health diagnostics or resource allocation, challenges traditional notions of accountability and responsibility. As these technologies continue to evolve, it is crucial to establish ethical frameworks and regulatory measures that ensure their responsible use while prioritizing global safety, public welfare, and individual rights.
In my next blog, I am excited to delve deep into how AI is addressing some of the long-standing pain-points in Agricultural Biotechnology!
References:
[1] Abudayyeh, O.O., Gootenberg, J.S. Programmable biology through artificial intelligence: from nucleic acids to proteins to cells. Nat Methods 21, 1384–1386 (2024).
[2] Boger, R. et al. In 2023 ICML Workshop on Computational Biology (2023).
[3] https://www.elsevier.com/industry/ai-in-small-molecule-drug-discovery
[4] Xu, Y., Ma, S., Cui, H. et al. AGILE platform: a deep learning powered approach to accelerate LNP development for mRNA delivery. Nat Commun 15, 6305 (2024).
[5] Shin, I., Kang, K., Kim, J. et al. AptaTrans: a deep neural network for predicting aptamer-protein interaction using pretrained encoders. BMC Bioinformatics 24, 447 (2023).
[6] Tian, T., Zhang, J., Lin, X. et al. Dependency-aware deep generative models for multitasking analysis of spatial omics data. Nat Methods 21, 1501–1513 (2024).
[7] Carilli, M., Gorin, G., Choi, Y. et al. Biophysical modeling with variational autoencoders for bimodal, single-cell RNA sequencing data. Nat Methods 21, 1466–1469 (2024).
[8] Rosen, Y., Brbić, M., Roohani, Y. et al. Toward universal cell embeddings: integrating single-cell RNA-seq datasets across species with SATURN. Nat Methods 21, 1492–1500 (2024).
[9] Gharibshahian, M., Torkashvand, M., Bavisi, M., Aldaghi, N., Alizadeh, A. Recent advances in artificial intelligent strategies for tissue engineering and regenerative medicine. Skin Res Technol 30, e70016 (2024).
[10] de Lima, R.C., Sinclair, L., Megger, R., Maciel, M.A.G., Vasconcelos, P.F.D.C., Quaresma, J.A.S. Artificial intelligence challenges in the face of biological threats: emerging catastrophic risks for public health. Front Artif Intell 7, 1382356 (2024).