Bacteria Defense Insights Could Revolutionize Genetic Editing

3 April 2026
colind88

Bacteriophage Virus attacking Bacterium — Credit: fpm/Getty Images

A computer program that can predict which genes help bacteria to defend themselves against viruses could lead to the next generation of precision genetic engineering tools.

The artificial intelligence model recognizes genetic sequences involved in defenses that act against bacteriophages—viral invaders that infect bacteria.

These antiviral immune systems have already been repurposed into powerful gene-editing technologies, such as CRISPR-Cas, that enable DNA sequences to be precisely cut, modified, or deleted within an organism.

DefensePredictor, described in Science, is available as open-source software to enable the discovery of more prokaryotic immune systems.

“Identifying new antiphage defense systems may yield the next generation of precision molecular tools while also shedding important light on the ongoing arms race between bacteria and phages,” said MIT-based molecular biologist Michael Laub, PhD, and co-workers.

Intense selective pressure to evade or survive infection has driven the evolution of numerous antiphage defense mechanisms, including restriction enzymes and the CRISPR-Cas systems.

While antiphage immunity genes often cluster into “defense islands” in prokaryotic genomes, this does not always occur and many systems are dispersed or carried on mobile elements such as plasmids, prophages, and transposons.

In an attempt to create a model to identify antiphage proteins, Laub and team first looked at around 17,000 genomes of prokaryotic organisms.

They labelled homologs of known defense and nondefense genes and built representations of the proteins coded by these genes as well as their four nearest neighbors on the genome.

DefensePredictor was then trained on these representations to classify whether a gene is involved in a defense system.
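
The idea of describing a gene together with its genomic neighbors can be pictured with a small Python sketch. This is an illustration only, not the authors' code: the function name `context_window`, the choice of two neighbors on each side (one reading of "four nearest neighbors"), and the `None` padding at contig ends are all assumptions made for the example.

```python
from typing import List, Optional


def context_window(genes: List[str], index: int, flank: int = 2) -> List[Optional[str]]:
    """Return the gene at `index` plus `flank` neighbors on each side.

    Assumption for illustration: the four nearest neighbors are taken as
    two upstream and two downstream. Positions that fall off the end of
    the contig are filled with None.
    """
    window: List[Optional[str]] = []
    for pos in range(index - flank, index + flank + 1):
        window.append(genes[pos] if 0 <= pos < len(genes) else None)
    return window


# Hypothetical gene order along one contig.
contig = ["geneA", "geneB", "geneC", "geneD", "geneE", "geneF"]

print(context_window(contig, 2))  # ['geneA', 'geneB', 'geneC', 'geneD', 'geneE']
print(context_window(contig, 0))  # [None, None, 'geneA', 'geneB', 'geneC']
```

In a real pipeline each entry in the window would be replaced by a numerical representation of the encoded protein before being passed to a classifier; the sketch only shows how the neighborhood itself might be collected.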

After performing well in silico, it was applied to 69 diverse Escherichia coli genomes, where it identified 624 different proteins that it confidently predicted to be involved in defense, including 154 that shared no detectable homology to known defense proteins.

Nearly half of the defense proteins identified were not encoded in plasmids, prophages, or defense islands, showing that the model was able to identify systems in a wide range of genomic contexts.

Of 94 predicted genes tested in the lab, 42 provided protection against at least one of 24 phages tested, giving a validation rate of around 45%.
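
The validation rate quoted above follows directly from the two counts in the article, as this short Python check shows. The variable names are illustrative.

```python
# Counts reported in the article.
validated = 42   # predicted genes that protected against at least one phage
tested = 94      # predicted genes tested in the lab

validation_rate = validated / tested
print(f"{validation_rate:.1%}")  # 44.7%
```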

Fifteen protein domains across these 42 systems had not previously been validated as defensive, suggesting that many immune systems remain undiscovered.

Expanding the predictive capacity of DefensePredictor beyond E. coli to 1,000 diverse prokaryotic genomes revealed more than 5,000 predicted defense proteins that were not clear homologs of those already known.

A second research article in the same issue of Science also showed how AI could uncover unexplored diversity in bacterial immunity.

Ernest Mordret, PhD, from the Pasteur Institute, and co-workers demonstrated how deep-learning frameworks could lead to the large-scale discovery of antiphage systems and a vast atlas of bacterial antiviral immunity.

The team developed three complementary deep-learning models to predict antiphage proteins by leveraging genomic context (ALBERTDF), amino acid sequence (ESMDF), or both (GeneCLRDF).

Twelve newly predicted antiphage systems were then experimentally validated in Escherichia coli and Streptomyces albus.

When applied to more than 30,000 bacterial genomes, the models predicted 2.39 million antiphage proteins, 85% of which had no previously known link to immunity, corresponding to an estimated 23,000 or more predicted antiphage operon families.
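
To put the 85% figure in absolute terms, the rounded numbers above can be multiplied out. This is a back-of-the-envelope Python calculation from the article's rounded values, not a count reported by the authors.

```python
# Rounded figures from the article.
predicted_proteins = 2_390_000   # 2.39 million predicted antiphage proteins
novel_fraction = 0.85            # share with no previously known link to immunity

novel_proteins = predicted_proteins * novel_fraction
print(f"{novel_proteins / 1e6:.2f} million")  # 2.03 million
```

On these figures, roughly two million of the predicted proteins had no previously known link to immunity.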

All predictions have been made freely available through an interactive antiphage atlas.

“We developed deep learning models to predict antiphage systems,” the authors summarized.

“These methods extract cues about the ‘defensiveness’ of a protein from two seemingly orthogonal sources: its genomic context across thousands of genomes, and its own amino acid sequence.

“By combining these complementary signals, we move from a fragmented, incomplete view of bacterial immunity toward a more resolved and quantitative understanding of its repertoire.”
