Artificial Intelligence (AI) for Genome Editing Technologies

Runjhun Saran
Nov 26, 2024

Updated: Dec 3, 2024


What is genome editing?

Genome editing (GED) technologies enable precise modifications to DNA sequences within living cells, revolutionizing the study of gene function and paving the way for innovative therapeutic approaches. Among the most advanced GED tools are zinc-finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and CRISPR-associated (Cas) nucleases such as CRISPR/Cas9. CRISPR/Cas9 stands out as the most widely used due to its adaptability, efficiency, and simplicity. Today, GED's scope has expanded to include base editing (BED), which allows single-nucleotide modifications; prime editing (PED), which leverages reverse transcription to introduce precise DNA insertions, deletions, and substitutions; and epigenome editing (epi-GED) together with the CRISPRi and CRISPRa techniques, which regulate gene expression without introducing DNA breaks or altering the DNA sequence.

The field of cell and gene therapy is advancing rapidly, with significant progress in the development of CRISPR-based treatments, culminating in numerous clinical trials [1]. GED technologies hold immense potential for treating human diseases. They can be used to correct harmful mutations in genes linked to conditions such as cancer and cardiovascular diseases, including long QT syndrome and hypertrophic cardiomyopathy. These technologies can also deactivate defective genes, introduce functional genes, and address genetic disorders such as sickle cell anemia and cystic fibrosis. Additionally, GED has applications in targeting genes involved in neurodegenerative diseases, such as Alzheimer’s and Huntington’s, and in engineering cells to resist infections like HIV and Hepatitis B.

How is AI advancing the field of genome editing?

Genome editing (GED) technologies face several challenges, the most notable being the efficiency of on-target editing and the risk of unintended off-target effects. However, with the help of ever-growing GED datasets, e.g. GEM (Genome Editing Meta-database) and the other databases listed by Dixit et al. (2024), it is now possible to develop AI-driven models and tools that enhance the precision, efficiency, robustness, and cost-effectiveness of GED techniques. AI is augmenting GED pipelines in the identification of appropriate gene targets, the prediction of editing outcomes and efficiencies, personalised gene therapy, and the development of more effective GED molecular tools.

AI for designing efficient GED Molecular Toolkit

Guide RNAs: AI is playing a growing role in designing the molecular components of GED. In CRISPR-Cas and related nuclease-mediated genome editing, the Cas protein machinery recognises its genomic target with the help of guide RNAs (gRNAs). gRNAs are complementary to specific DNA regions and, depending on their design, can lead the CRISPR-Cas machinery to a unique locus or to multiple target sites. Multiple tools have been developed to find targets and design gRNAs that minimize off-target effects while maximizing on-target editing efficiency. While many procedural tools exist, tools that rely on AI methods (linear regression, support vector machines, gradient boosted trees, convolutional neural networks (CNNs), and other deep learning (DL) frameworks) include CRISPRscan, WU-CRISPR and the library-on-library approach, the tool by Doench et al. (2016), sgRNAScorer2, TUSCAN, SSC, CHOPCHOP, and CRISPR-broad [2, 3]. Similarly, the prime editing (PED) toolkit comprises three core components: a reverse transcriptase, a prime editing guide RNA (pegRNA), and a Cas9 nickase. Achieving high editing efficiency with PED requires an optimized pegRNA design. Computational tools for designing efficient pegRNAs include Easy-Prime and PrimeDesign.
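To make the target-finding step concrete, here is a minimal Python sketch of the candidate enumeration that precedes any scoring model. It assumes the commonly used SpCas9 convention of a 20-nt spacer followed by an NGG PAM and scans the forward strand only. The function name `find_grna_candidates` and the GC-fraction feature are illustrative choices, not the method of any tool listed above; real tools also scan the reverse strand and feed many more features into their models.

```python
import re

def find_grna_candidates(seq, spacer_len=20):
    """List candidate gRNA sites on the forward strand.

    Assumption: SpCas9-style targets, i.e. a spacer of `spacer_len`
    nucleotides immediately followed by an NGG PAM.
    Returns tuples of (start_index, spacer, pam, gc_fraction).
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % spacer_len)
    candidates = []
    for match in pattern.finditer(seq):
        spacer, pam = match.group(1), match.group(2)
        # GC fraction is one simple sequence feature a scoring model might use.
        gc = (spacer.count("G") + spacer.count("C")) / spacer_len
        candidates.append((match.start(), spacer, pam, gc))
    return candidates

# Hypothetical toy sequence, for illustration only.
demo = "TTACGTACGTACGTACGTACGTAGGCA"
for start, spacer, pam, gc in find_grna_candidates(demo):
    print(start, spacer, pam, round(gc, 2))
```

An ML-based design tool would then rank these candidates with a trained model rather than a hand-written rule.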

GED Enzymes: At the same time, AI-based protein structure prediction tools such as AlphaFold2, which predict protein structures with remarkable accuracy, are being used to develop new and efficient base editing enzymes with distinct features, e.g. the cytosine base editing enzymes reported by Huang et al. (2023). Another breakthrough is OpenCRISPR-1, a gene editor designed with a generative protein language model trained on millions of protein sequences [4]. Such models can design novel variants of CRISPR gene-editing proteins, some of which have been shown to work as expected in the laboratory [4]. Additionally, an AI foundation model called Evo has been developed that enables prediction and generation tasks from the molecular to the genome scale. It is capable of designing novel CRISPR systems, which consist of a DNA- or RNA-cutting enzyme paired with RNA molecules that guide the molecular scissors to specific target sites [5].

AI for Target Identification & Prediction of GED Outcome Efficiency

CRISPR/Cas: The impact of AI on gene target identification and on the design of efficient gRNA sequences has been described above. In addition, several computational frameworks are available for predicting the off-targeting of gRNAs, i.e. whether a gRNA leads the GED machinery to the intended target (on-target binding) or to an unintended one (off-target binding). These include CasOT, Cas-OFFinder, OffScan, DeepCRISPR, CROP-IT, CRISPR-Local, CRISPR MultiTargeter, and CRISPR-broad. A few AI tools developed to predict the efficiency of these guides are the CNN-based TIGER, the attention-based CNNs CRISPR-ONT and CRISPR-OFFT, and the DL-based CRISPRon, DeepCas9, piCRISPR, and DeepHF.
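The on-target versus off-target distinction can be illustrated with a small Python sketch that scans a sequence for PAM-adjacent sites resembling a guide. This is a simplified stand-in, not the algorithm of any tool named above: it assumes an NGG PAM, checks the forward strand only, counts plain mismatches, and ignores bulges and position-dependent mismatch tolerance. The names `scan_off_targets` and `max_mismatches` are hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))

def scan_off_targets(guide, genome, max_mismatches=3):
    """Report PAM-adjacent sites within `max_mismatches` of the guide.

    Assumptions: NGG PAM directly 3' of the site, forward strand only.
    Returns tuples of (start_index, site, pam, mismatches); a hit with
    0 mismatches is a perfect (on-target-like) match.
    """
    guide, genome = guide.upper(), genome.upper()
    n = len(guide)
    hits = []
    # The last valid start leaves room for the site plus a 3-nt PAM.
    for i in range(len(genome) - n - 2):
        site = genome[i:i + n]
        pam = genome[i + n:i + n + 3]
        if pam[1:] != "GG":
            continue
        mismatches = hamming(guide, site)
        if mismatches <= max_mismatches:
            hits.append((i, site, pam, mismatches))
    return hits

# Hypothetical toy data: one perfect site and one site with a single mismatch.
guide = "ACGT" * 5
genome = guide + "TGG" + "TTTT" + "ACGTACGTACGTACGTACGA" + "AGG"
for hit in scan_off_targets(guide, genome):
    print(hit)
```

Learned predictors go further than this kind of counting by estimating how likely each near-match is to actually be cut.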

BED: Base editing is a highly effective genome editing technique that enables precise alteration of individual nucleotides in the genome without introducing double-stranded breaks. Numerous machine learning (ML) and deep learning (DL) models have been developed to enhance the efficiency of base editors, with a primary focus on predicting and optimizing editing outcomes. BE-Hive (a deep conditional autoregressive model) predicts edited sequence outcomes and base editing efficiency. BE-DICT (an attention-based DL model) predicts the outcomes of adenine and cytosine base editors, while the DL-based web tools DeepABE and DeepCBE predict the efficiencies and outcomes of adenine base editors and cytosine base editors, respectively. CAELM, which is based on an XGBoost regressor and predicts the efficiency of cytosine base editors, has been shown to accurately forecast BED outcomes in situ as well.
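The prediction problem these models address can be sketched in a few lines of Python. A base editor may convert any target base that falls inside its activity window, so a single guide can yield several distinct edited sequences. The sketch below enumerates those possible outcomes for a cytosine base editor (C to T). The window of protospacer positions 4 to 8 is an assumed, adjustable placeholder, and the function names are hypothetical. Outcome predictors of the kind listed above assign a frequency to each such sequence, which this sketch does not attempt.

```python
from itertools import product

def editable_bases(protospacer, target_base="C", window=(4, 8)):
    """Return (position, base) for target bases inside the editing window.

    Positions are 1-based, counted from the PAM-distal end.
    Assumption: the window bounds are illustrative, not editor-specific.
    """
    start, end = window
    return [
        (pos, base)
        for pos, base in enumerate(protospacer.upper(), start=1)
        if start <= pos <= end and base == target_base
    ]

def enumerate_outcomes(protospacer, target_base="C", edited_base="T", window=(4, 8)):
    """List every sequence obtainable by editing any subset of window bases."""
    protospacer = protospacer.upper()
    positions = [pos for pos, _ in editable_bases(protospacer, target_base, window)]
    outcomes = []
    for choice in product([False, True], repeat=len(positions)):
        bases = list(protospacer)
        for pos, edit in zip(positions, choice):
            if edit:
                bases[pos - 1] = edited_base
        outcomes.append("".join(bases))
    return outcomes

# Hypothetical protospacer with three cytosines inside the assumed window.
site = "AAACCAGCAAAAAAAAAAAA"
print(editable_bases(site))
print(len(enumerate_outcomes(site)))
```

With three editable cytosines there are 2^3 = 8 candidate outcomes (including the unedited sequence), which is why bystander edits make outcome prediction a non-trivial modelling task.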

PED: Prime editing is a groundbreaking technique for introducing precise DNA insertions, deletions, and substitutions. AI tools have been developed to predict PED outcomes, for example the DL-based framework published by Koeppel et al. (2023) and the recurrent neural network (RNN)-based PRIDICT, which was trained on a large dataset of over 90,000 PED experiments.
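As a rough illustration of what a pegRNA design involves, the Python sketch below derives the 3' extension of a pegRNA, i.e. the reverse-transcription template (RTT) followed by the primer-binding site (PBS), from the target sequence around the nick. It relies on the standard layout in which the PBS pairs with the nicked strand just upstream of the nick and the RTT encodes the edited sequence downstream of it. The function names, the default lengths (13 and 15 nt), and the DNA-letter output are illustrative assumptions, not recommendations from any tool cited here. Efficiency predictors exist precisely because choosing these lengths well is hard.

```python
def reverse_complement(seq):
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))

def pegrna_extension(upstream_of_nick, edited_downstream, pbs_len=13, rtt_len=15):
    """Derive (rtt, pbs, extension) for a pegRNA, all written 5'->3' as DNA.

    upstream_of_nick:  PAM-strand sequence ending exactly at the nick.
    edited_downstream: desired PAM-strand sequence starting right after
                       the nick, with the intended edit already included.
    Assumption: pbs_len and rtt_len are placeholder values.
    """
    if len(upstream_of_nick) < pbs_len or len(edited_downstream) < rtt_len:
        raise ValueError("flanking sequences are shorter than the requested lengths")
    # The PBS anneals to the 3' end of the nicked strand.
    pbs = reverse_complement(upstream_of_nick[-pbs_len:])
    # The RTT is the template for the edited strand downstream of the nick.
    rtt = reverse_complement(edited_downstream[:rtt_len])
    return rtt, pbs, rtt + pbs

# Hypothetical toy sequences, shortened for readability.
rtt, pbs, ext = pegrna_extension("GGGGAAAAACCCC", "TTTA", pbs_len=4, rtt_len=4)
print(rtt, pbs, ext)
```

A design tool would generate many such extensions with different PBS and RTT lengths, and a model trained on editing data would then rank them.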

epi-GED: Epigenome editing enables precise modifications to gene regulation without altering the underlying DNA sequence. This technique allows manipulation of DNA methylation patterns, histone modifications, and RNA editing to influence gene expression. Epi-GED holds significant promise for applications in disease treatment, functional genomics, and stem cell therapies. CRISPR/Cas-based epi-GED involves targeting specific genomic DNA sequences with CRISPR/Cas nucleases with the help of small guide RNAs (sgRNAs). EpiCas-DL is a deep learning tool designed to predict sgRNA activity for CRISPR-based epi-GED. It also identifies key factors that affect sgRNA effectiveness in gene activation and silencing. Compared to base and prime editing, the application of AI in epi-GED remains a relatively new and underexplored field.

Personalised Precision Gene Therapy

Precision medicine focuses on personalizing treatment by tailoring medical interventions based on biological or molecular profiling, either for specific populations or for individual patients. All of the above research in GED techniques, together with state-of-the-art omics research, is culminating in the prospect of personalised gene therapy. Taking into account a specific patient's genetic makeup (using genome, transcriptome, epigenome, or proteome profiles), AI’s data analysis and predictive capabilities can be harnessed to design tailored, patient-specific gene editing treatments and to predict the patient’s response to the treatment as well. The future of healthcare looks bright!

In my next blog, I am excited to elaborate upon the application of Artificial Intelligence for Antibody Discovery!

References:

1) Front. Bioeng. Biotechnol. 11 (2023), Sec. Cell and Gene Therapy, published 07 January 2024. https://doi.org/10.3389/fbioe.2023.1335901

2) Bradford J, Perrin D (2019) A benchmark of computational CRISPR-Cas9 guide design methods. PLoS Comput Biol 15(8): e1007274. https://doi.org/10.1371/journal.pcbi.1007274

3) Veluchamy, A., Teles, K. & Fischle, W. CRISPR-broad: combined design of multi-targeting gRNAs and broad, multiplex target finding. Sci Rep 13, 19717 (2023). https://doi.org/10.1038/s41598-023-46212-x

4) Ruffolo JA, Nayfach S, Gallagher J, Bhatnagar A, Beazer J, Hussain R, Russ J, Yip J, Hill E, Pacesa M, Meeske AJ, Cameron P, Madani A. Design of highly functional genome editors by modeling the universe of CRISPR-Cas sequences. bioRxiv 2024.04.22.590591. https://doi.org/10.1101/2024.04.22.590591

5) Nguyen E, Poli M, Durrant MG, Thomas AW, Kang B, Sullivan J, Ng MY, Lewis A, Patel A, Lou A, Ermon S, Baccus SA, Hernandez-Boussard T, Ré C, Hsu PD, Hie BL. Sequence modeling and design from molecular to genome scale with Evo. bioRxiv 2024.02.27.582234. https://doi.org/10.1101/2024.02.27.582234
