Scientists used AI to rewrite part of life’s alphabet
An engineered E. coli strain survived after one amino acid was designed out of many of its ribosomal proteins—an early test of whether life’s chemistry can be simplified

Nearly all known life builds proteins from the same alphabet of 20 canonical amino acids. Strung together in different orders, those building blocks form the proteins that make cells work. In a new Science study, researchers at Columbia University, the Massachusetts Institute of Technology and Harvard University used artificial-intelligence-guided protein design to test how much of that alphabet can be pared back: they engineered an Escherichia coli strain that survived after one specific amino acid was designed out of many of its ribosomal proteins.
The team did not create a true 19-amino-acid organism. The engineered strain still uses the targeted amino acid, isoleucine, throughout most of its genome. But the result suggests that one of life’s most ancient and essential machines can tolerate at least partial simplification—and that AI may help biologists test the limits of life’s chemistry.
“The underlying question that we seek to ask is what early life looks like,” says Harris H. Wang, a professor of systems biology at the Columbia University Irving Medical Center and senior author of the study. Researchers think all life today descends from an ancient, single-celled organism that lived more than four billion years ago. But some suspect that earlier, simpler life-forms that predate even this common ancestor may have run on a leaner chemistry. Wang’s team wanted to find out whether modern cells could be engineered in that direction.
“Think about language. There are 26 letters in the English alphabet, but do you really need 26, or can you simplify that to 25 or 24?” Wang says. The team chose to remove isoleucine because it resembles the amino acids valine and leucine closely enough that, in principle, some proteins might tolerate the loss of isoleucine if it were replaced with one of them. They worked with E. coli, one of biology’s best-studied organisms, and targeted its ribosomes, the molecular machinery that builds proteins and is itself a sprawling complex of more than 50 proteins. “Like in a video game, we just pushed the ‘skip to the final boss’ button,” Wang says.
The first attempt was brute force. The researchers took 39 essential or highly expressed E. coli genes and replaced every isoleucine with valine or leucine, like a genetic find-and-replace. The engineered bacteria survived, but poorly: their fitness dropped to about 40 percent of that of wild-type E. coli, well short of the team’s 90 percent target. To close the gap, the researchers turned to AI.
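To make the find-and-replace idea concrete, here is a minimal Python sketch at the level of a protein sequence written in one-letter amino acid codes (I for isoleucine, V for valine, L for leucine). It is an illustration, not the team's pipeline: the function name, the toy sequence and the rule for choosing between valine and leucine (valine by default, with optional per-position overrides) are all assumptions, and the study's actual edits were made in the genes that encode these proteins.

```python
ISOLEUCINE = "I"
ALLOWED_SUBSTITUTES = {"V", "L"}  # valine or leucine


def replace_isoleucine(protein, choices=None, default="V"):
    """Return a copy of `protein` with every isoleucine swapped out.

    `choices` (hypothetical) maps a 0-based position to "V" or "L";
    positions not listed fall back to `default`. How the real study
    picked between the two is not described in the article.
    """
    choices = choices or {}
    redesigned = []
    for position, residue in enumerate(protein):
        if residue == ISOLEUCINE:
            substitute = choices.get(position, default)
            if substitute not in ALLOWED_SUBSTITUTES:
                raise ValueError(f"unsupported substitute: {substitute!r}")
            redesigned.append(substitute)
        else:
            redesigned.append(residue)
    return "".join(redesigned)


toy_protein = "MKIAILGI"  # made-up sequence, not a real ribosomal protein
print(replace_isoleucine(toy_protein))            # MKVAVLGV
print(replace_isoleucine(toy_protein, {2: "L"}))  # MKLAVLGV
```

A swap like this treats every position independently, which is one way to see why it can miss the compensating changes a protein needs elsewhere.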
They combined two kinds of models. First, sequence-based protein language models such as ESM2 and MSA Transformer read protein sequences and suggested evolutionarily plausible mutations that a simple swap would miss. Then structure-based AI models such as AlphaFold2 and ProteinMPNN checked that the redesigned proteins would fold into the correct shapes and fit alongside neighboring molecules.
The proposals were stranger than the team expected. “Some of these AI designs were really surprising,” Wang says. “They didn’t look like anything we would have anticipated.” In one case, while redesigning a ribosomal protein called RpsJ, the AI remodeled an alpha helix—a structural element bridging different parts of the ribosome—and introduced eight new nearby mutations to compensate for the substitution of just two isoleucines. “Maybe these machine-learning systems know some aspects of biology we can experimentally verify but we don’t yet understand,” Wang says.
“A noteworthy part of the project is the evolving contribution of AI to this work,” says Tom Ellis, a professor of synthetic genome engineering at Imperial College London, who was not involved in the study. “In the last seven years, the AI-enabled modeling of proteins and mutations in proteins has come on leaps and bounds.”
The team first tested each AI-suggested change one at a time, confirming individual edits could meet the 90 percent fitness goal. Combined, the changes killed the cells. So the researchers debugged the genome by hand. Starting fresh from the natural E. coli sequence, they added the AI-designed pieces in small batches until the cells stopped growing, narrowing down the lethal interaction to a single region so they could fix it.
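The batch-by-batch debugging described above resembles an incremental search, which can be sketched in Python under heavy simplification. Everything here is hypothetical: `grows` stands in for a laboratory growth assay, the design labels are placeholders, and the toy rule that two particular designs are lethal together is invented purely to exercise the code.

```python
def first_failing_batch(batches, grows):
    """Add batches of designs cumulatively, starting from the wild type.

    Returns (index, batch) for the first batch whose addition makes
    `grows` report failure, or None if every batch is tolerated.
    """
    applied = set()  # the wild-type starting point carries no redesigns
    for index, batch in enumerate(batches):
        candidate = applied | set(batch)
        if not grows(candidate):
            return index, list(batch)
        applied = candidate
    return None


# Toy stand-in for the growth assay: designs "d2" and "d5" are each fine
# alone but lethal in combination (an invented rule for illustration).
def toy_grows(designs):
    return not {"d2", "d5"} <= designs


batches = [["d1", "d2"], ["d3", "d4"], ["d5", "d6"]]
print(first_failing_batch(batches, toy_grows))  # (2, ['d5', 'd6'])
```

Once a failing batch is found, the same procedure can be repeated on smaller batches to narrow the problem to a single region.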
The final strain, Ec19, carries 21 isoleucine-free ribosomal proteins out of 52, alongside AI-redesigned versions of the others that the team validated individually but could not yet combine. The strain is robust: fitness stays above 90 percent of wild-type E. coli, and natural selection did not revert the changes over 450 generations.
“The paper is a tour de force of synthetic biology to address a really interesting question that’s fundamental to the origin of life on Earth,” Ellis says. He adds that this work could eventually inform biotechnology beyond Earth, in environments where not every amino acid is available.
For now, Ec19 remains a 20-amino-acid organism. Wang and his colleagues purged 382 isoleucine residues from ribosomal proteins, but the rest of its genome still contains more than 81,000 isoleucine residues across thousands of other proteins. A truly 19-amino-acid organism will require cheaper, faster DNA synthesis and more capable AI models, including genomic language models trained on whole genomes rather than just proteins.
Still, showing that ribosomal proteins can survive even partial simplification gives researchers a template for the rest of E. coli. “Considering the ribosome is probably the oldest remnant of the original common ancestor organism that first evolved protein synthesis, it’s also a poetic thing to demonstrate this ambitious work on,” Ellis says.
