Nature Reveals Characteristics of Genomes Engineered by Artificial Intelligence

Technologies
BB.LV
Publication date: 27.03.2026 13:47
Such a scientist knows no moral doubts.

Various tasks were set before the trained neural network.

Last year, the media reported on genomes constructed by AI: at that time, the focus was on the genomes of bacteriophages, which, despite their AI origin, were capable of infecting bacterial cells.

Recently, the same researchers published an article in Nature demonstrating that it is possible to create genomes not only of viruses but also of bacteria and mitochondria, and in part of yeast. However, the new genomes currently exist only in digital form, and their functionality in living cells has not been tested.

Every genome contains a complex system of interactions and interdependencies. It involves not only the coding sequences that store information about proteins (which we usually refer to as genes) but also numerous regulatory regions, whose total volume significantly exceeds that of the protein-coding sequences themselves. Genes can influence each other through the proteins they encode; several genes can be governed by a single regulatory block; different regulatory blocks can affect the same sequence in the genome; and so on. The effect of a mutation in a more or less significant region of the genome depends on these interdependencies.

Although reading genomes has become relatively straightforward, the interactions of DNA regions within them are not always clear. At the same time, genomes of different organisms are related to each other to varying degrees, although this relationship can be very distant. Evolutionary kinship is manifested not only in the sequences of genes but also in how genes and regulatory regions are organized within the genome, how they are positioned relative to each other. If we know the common features of the genomes of bacteriophages or, say, a group of bacteria, we can attempt to construct a new bacteriophage—or bacterium—with some new characteristics that are not present in its relatives.

What does it mean to "know the common features"? We can work out why the genes in the genomes of these organisms are arranged in a particular way and what the biological significance is of deviations from the general bacteriophage or general bacterial order. Alternatively, we can feed the genomic data to algorithms similar to those that predict protein structures or construct sentences from words. Such models operate on entire genomes, on the RNA sequences that copy information from a particular region of the genome, and on the proteins into which this information is ultimately translated. With artificial intelligence, we can make use of genomic patterns without struggling with the question of what exactly these patterns mean.
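To make the idea of "using patterns without understanding them" concrete, here is a minimal sketch in Python. It is not Evo2 or anything resembling its architecture: it is a toy Markov model that only counts which letter tends to follow each three-letter context and then samples a new sequence from those counts. The training sequences, the function names and the context length are all made up for illustration.

```python
import random
from collections import Counter, defaultdict


def train_markov(sequences, k=3):
    """Count which letter follows each k-letter context in the training data."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(len(seq) - k):
            counts[seq[i:i + k]][seq[i + k]] += 1
    return counts


def generate(counts, seed, length, rng):
    """Extend `seed` letter by letter, sampling from the learned counts.

    The seed length is used as the context length, so it must match
    the k that the model was trained with.
    """
    k = len(seed)
    out = seed
    while len(out) < length:
        followers = counts.get(out[-k:])
        if not followers:  # context never seen in training: stop early
            break
        letters = list(followers)
        weights = [followers[letter] for letter in letters]
        out += rng.choices(letters, weights=weights)[0]
    return out


# Made-up "genomes" used only as training material for the toy model.
training = [
    "ATGGCGTACGATCGATGGCGTTAGC",
    "ATGGCGTTCGATCGTTGGCGTACGA",
]

model = train_markov(training, k=3)
sample = generate(model, seed="ATG", length=20, rng=random.Random(0))
print(sample)
```

The model never learns what a gene or a regulatory region is, yet every four-letter stretch of its output is one it has seen in the training data. Real genomic language models work with vastly longer contexts and far richer statistics, but the principle of learning from the text alone is the same.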

This is how phages were constructed in the study we mentioned at the beginning, and this is how the genomes described in the new article in Nature were constructed. An algorithm called Evo2, related to AlphaFold and ChatGPT, was trained on complete genomes of more than 100,000 species from across the tree of life, from bacterial viruses to humans; the total number of genetic letters processed by Evo2 exceeded 9.3 trillion.

Next, various tasks were set before the trained neural network. In particular, it had to predict how a particular mutation would affect the organism, whether the mutation fell in a coding or a non-coding region of the genome. One of the coding regions on which Evo2 was tested was the BRCA1 gene, long known for its association with breast cancer. Mutations in it, as in any gene, can be either dangerous or harmless, and Evo2 distinguished them with 90% accuracy.
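A figure such as "90% accuracy" comes from comparing the model's calls with mutations whose effect is already known. The sketch below shows that bookkeeping in Python. The article does not say how Evo2 scores a mutation, so the scores here are an assumption: each is imagined as the model's log-likelihood of the mutated sequence minus that of the reference, a common approach with sequence models. The variants, scores, labels and threshold are all invented.

```python
def classify(score, threshold=-1.0):
    """Call a variant 'pathogenic' if the model finds the mutated sequence
    much less likely than the reference (score below the threshold)."""
    return "pathogenic" if score < threshold else "benign"


def accuracy(predicted, actual):
    """Fraction of variants for which the prediction matches the known label."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Hypothetical variants: (model score, known label). Not real BRCA1 data.
variants = [
    (-3.2, "pathogenic"),
    (-0.1, "benign"),
    (-2.5, "pathogenic"),
    (-0.4, "benign"),
    (-1.8, "pathogenic"),
    (-1.3, "benign"),      # the toy threshold gets this one wrong
    (0.2, "benign"),
    (-0.7, "pathogenic"),  # and this one
    (-2.1, "pathogenic"),
    (-0.3, "benign"),
]

predicted = [classify(score) for score, _ in variants]
actual = [label for _, label in variants]
acc = accuracy(predicted, actual)
print(f"accuracy: {acc:.0%}")  # prints "accuracy: 80%"
```

Accuracy alone says nothing about which kind of error is made; for a cancer-related gene, calling a dangerous mutation harmless is a different matter from the reverse.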

In other tests, the algorithm was supposed to generate a genome similar to a real one. According to the authors of the study, they produced the genome of the bacterium Mycoplasma genitalium, the genome of a mitochondrion, and one of the chromosomes of baker's yeast. The generated genomes were then checked for plausibility, and for the genome of the bacterium, the proportion of genes resembling real ones was about 70%.

However, if it is assumed that a cell with such a genome should live and function, one must consider the remaining 30%, because it is unlikely that something can be half or two-thirds alive—life is either present or absent. Additionally, the arrangement of genes must be taken into account: even if all the necessary genes are present in the synthetic genome, if they are positioned incorrectly, they will function incorrectly.

Questions about Evo2 arose while the article was still being prepared for publication: the genomes generated by the algorithm differ in architecture from natural ones. This does not mean that an AI-generated genome will not work, but it still needs to be tested in experiments. And perhaps it is not worth aiming straight away for complete genomes, even relatively simple ones such as those of bacteria or mitochondria. For now, it would be quite sufficient if the neural network could accurately predict the effects of mutations, relieving researchers of unnecessary experimental work.

The bacterium M. genitalium was chosen for a reason. It has only 525 genes, of which 470 code for proteins; 375 of them are vital. Its genome is one of the smallest among living organisms (excluding viruses). M. genitalium and other mycoplasmas have long been used in experiments where the natural DNA of a cell is replaced with a synthetic, edited copy. Another way of intervening in the genome is to change the genetic code itself, rewriting the dictionary in which each triplet of letters corresponds to a particular amino acid. Designing a genome using AI appears to be an even more ambitious undertaking in terms of the changes that can be made to the genetic text. However, it is worth repeating once again that the achievements of AI still need to be verified in "live" experiments.
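The "dictionary of letter triplets" can be shown in a few lines of Python. The sketch below builds the standard genetic code and then a modified copy in which the triplet TGA, normally a stop signal, is read as tryptophan (W). Mycoplasmas happen to read TGA this way in nature, but the example is only an illustration of what changing one dictionary entry does; the DNA string is invented and the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, with codons listed in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}

# Modified dictionary: one entry changed, TGA now means tryptophan instead of stop.
RECODED = dict(STANDARD, TGA="W")


def translate(dna, table):
    """Read the DNA three letters at a time until a stop codon ('*') is reached."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = table[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


dna = "ATGTGAAAATAA"  # made-up sequence: ATG TGA AAA TAA
print(translate(dna, STANDARD))  # prints "M"
print(translate(dna, RECODED))   # prints "MWK"
```

The same DNA text yields a one-letter protein under the standard code and a three-letter one under the modified code, which is why a genome written for one dictionary cannot simply be moved into a cell that uses another.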

