A new study peers into the structure of hundreds of protein complexes, tiny machines in cells that control everything from energy use to DNA replication. Researchers at the University of Washington’s Institute for Protein Design helped lead the study, which leverages new deep learning tools and could lead to new ways to treat disease.
This spring, the IPD unveiled its deep learning software to predict protein folding, on the heels of a similar tool built by Alphabet subsidiary DeepMind. The tools have stunned researchers with their speed and accuracy at predicting how proteins fold into three-dimensional shapes.
Proteins are made up of strings of amino acid building blocks, but they need to fold correctly to work. IPD’s RoseTTAFold and DeepMind’s AlphaFold have been used to predict the shapes of thousands of proteins since their release.
Inside cells, proteins often interact with each other in machine-like protein complexes that perform a variety of tasks. Many approved drugs also interfere with protein complexes, such as chemotherapies that hijack machinery involved in DNA replication and cell division.
“To really understand the cellular conditions that give rise to health and disease, it’s essential to know how different proteins in a cell work together,” said co-lead author Ian Humphreys, a graduate student in the lab of IPD head David Baker, in a press release.
In the new study, Humphreys, Baker and their colleagues model most of the protein interactions that occur in the yeast Saccharomyces cerevisiae. This single-celled organism resembles human cells in how it carries out basic functions like growth, division, waste disposal and environmental sensing — all controlled by protein complexes.
The yeast has about 6,000 proteins. To predict which of these proteins might interact, the researchers turned to evolutionary biology. As proteins evolve, they often accumulate mutations in tandem: if a building block is changed in one protein, a corresponding building block is changed in a partner protein. Such tandem changes ensure that the complex stays intact.
The researchers identified pairs of proteins that acquired mutations in such a linked way, suggesting that they might physically interact. They then deployed RoseTTAFold and AlphaFold to model the three-dimensional shape of these interacting proteins.
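For readers who want a feel for how linked mutations can be spotted, here is a toy sketch in Python. It is not the method used in the study. It assumes a simple stand-in statistic, mutual information between alignment columns, and the sequences and the function names `mutual_information` and `coevolution_score` are made up for illustration. Each row in a list represents one species, and the same row position in the two lists holds that species' versions of the two proteins.

```python
from collections import Counter
from math import log2


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two equal-length alignment columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        p_ab = c / n
        # p_ab / (p_a * p_b), written with raw counts
        mi += p_ab * log2(p_ab * n * n / (count_a[a] * count_b[b]))
    return mi


def coevolution_score(msa_x, msa_y):
    """Highest mutual information over all column pairs of two paired alignments.

    Assumption: row i of msa_x and row i of msa_y come from the same species.
    """
    cols_x = list(zip(*msa_x))
    cols_y = list(zip(*msa_y))
    return max(mutual_information(cx, cy) for cx in cols_x for cy in cols_y)


# Hypothetical three-residue sequences from four species.
protein_x = ["AKL", "AKL", "AEL", "AEL"]
protein_y = ["GRV", "GRV", "GDV", "GDV"]  # position 2 changes whenever protein_x's does
protein_z = ["GRV", "GDV", "GRV", "GDV"]  # position 2 changes independently of protein_x

linked = coevolution_score(protein_x, protein_y)
unlinked = coevolution_score(protein_x, protein_z)
print(linked, unlinked)
```

In this made-up example the first pair scores higher than the second, because the changes in one protein track the changes in the other. Real analyses work with far longer sequences from many more species and must correct for shared ancestry, which this sketch ignores.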
After sifting through millions of potential pairings, the deep learning tools pulled out 1,506 proteins that were likely to interact. From these proteins, the tools successfully modeled 712 protein complexes.
More than 100 of the protein interactions had not been identified before. One of the new complexes contains a protein involved in DNA repair and cancer progression, and another contains an enzyme implicated in neurodevelopmental disorders and cancer.
The new findings open the door to future investigations examining the complexes and how they work. And they could ultimately lead to drugs that interfere with cellular machinery involved in disease.
“These models give hypotheses for experimentalists to test,” Qian Cong told Science, which published the study Thursday. Cong is co-corresponding author with Baker and was a UW research fellow before becoming an assistant professor at the University of Texas Southwestern Medical Center last year.
The new findings also set the stage for later studies deploying RoseTTAFold and AlphaFold to map the universe of human protein complexes.
The study involved computational experts, researchers in evolution, and structural biologists who helped interpret the three-dimensional protein models.
“As computer methods become more powerful, it is easier than ever to generate large amounts of scientific data, but to make sense of it all still requires scientific experts,” said Baker in the release. “This is community science at its best.”