“I have been working in this field throughout my entire career, fully aware that my peers share the same belief — in our lifetime, the issue of 'protein folding' is unlikely to be solved, especially the problem of predicting protein structures. And then AlphaFold came along!” On October 9, Professor Ma Jianpeng, a doctoral advisor, internationally renowned computational biologist, and the director of the Multiscale Research Institute of Complex Systems at Fudan University, told The Paper's Tech Division.
On October 9, 2024, in Stockholm, Sweden, Nobel Committee for Chemistry member Johan Åqvist, permanent secretary Hans Ellegren, and committee chair Heiner Linke announced at the Royal Swedish Academy of Sciences that this year's Nobel Prize in Chemistry goes to David Baker, Demis Hassabis, and John Jumper. Image Credit: Visual China
On October 9 local time, the Royal Swedish Academy of Sciences announced that the 2024 Nobel Prize in Chemistry was awarded to three scientists. Half of the prize was awarded to David Baker from the University of Washington for his contributions to computational protein design, while the other half was jointly awarded to Demis Hassabis and John Jumper from the British artificial intelligence company Google DeepMind for their contributions to protein structure prediction.
This is the second Nobel Prize this year to go to AI scientists, following the award of the 2024 Nobel Prize in Physics to two AI pioneers on October 8.
In 2021, Professor Ma Jianpeng’s team at the Multiscale Research Institute of Complex Systems published OPUS-Rota4, an algorithm that predicts side-chain structures from the polypeptide backbone. It significantly improved the accuracy of protein side-chain structure prediction, addressing a weakness of Google’s AlphaFold.
It was reported that the aforementioned prediction algorithm has “currently advanced to OPUS-Rota6, with precision exceeding that of AlphaFold 2 and 3.”
Professor Ma Jianpeng, director of the Multiscale Research Institute of Complex Systems at Fudan University.
Regarding the 2024 Nobel Prize in Chemistry, Ma Jianpeng acknowledged that while there are doubts about AI scientists receiving the Nobel Prize, it actually involves two separate questions: first, whether predicting protein structures is worthy of a Nobel Prize; and second, whether the contributions of AI in this field merit such recognition.
“That is essentially why they were awarded this prize. Although the problem has not been entirely solved, significant progress has been made, already exceeding what we thought possible in our lifetime. It is now usable,” said Ma Jianpeng.
Pointing to Fudan University's announcement that it will launch at least 100 AI-related courses, he argues that we must embrace AI: one does not necessarily need to know how to write algorithms, but one should at the very least know how to use them. He suggests teaching this from a young age so that children can use AI effectively.
He also stressed that the question of why Google DeepMind was able to achieve its breakthrough with AlphaFold holds practical significance for China.
“The Crown Jewel”: An Age-Old and Extremely Difficult Scientific Problem
Why are proteins important?
“Within each cell in your body, billions of microscopic machines — proteins — are working tirelessly.”
Some even say that almost every characteristic of life is related to proteins.
Indeed, proteins are fundamental to every biological process in every living organism; they are the cornerstone of life. Without proteins, life could not exist. Their complex and varied structures correspond to a multitude of astonishing functions, contributing to the rich diversity of life. The mystery of life lies behind these structures.
“When I teach my students, the first thing I explain is why predicting protein structures is so difficult and complex,” said Ma Jianpeng.
Amino acids connect to form polypeptides, and these polypeptides, akin to long chains, fold into stable three-dimensional structures, resulting in functional proteins. The prediction of a protein’s final “folded structure” based on an amino acid sequence is what defines the protein structure prediction problem. It is considered the “crown jewel” of modern molecular biology.
Ma Jianpeng stated, “This is not a new problem. It is an age-old conundrum, yet it remains extremely challenging.”
He cited an example: a protein composed of 100 amino acids is quite small, but if each amino acid has only two states — folded and unfolded (though in truth there are infinitely many states), that protein would have 2 raised to the power of 100 possible states.
“This number is so enormous that if humanity were to exhaustively enumerate it using any computer or to search for a correct answer, it would take longer than the age of the universe. Yet proteins can fold instantly,” Ma Jianpeng noted.
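The scale of that counting argument can be checked with a few lines of arithmetic. The sketch below is only illustrative: the rate of one billion conformations checked per second and the rounded age of the universe are assumptions introduced here, not figures given by Ma Jianpeng.

```python
# Back-of-the-envelope check of the two-state counting argument.
N_RESIDUES = 100                  # protein length used in the example
STATES_PER_RESIDUE = 2            # simplified: "folded" or "unfolded"
CHECKS_PER_SECOND = 1e9           # assumption: a hypothetical enumeration rate
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 1.38e10   # assumption: rounded estimate

# Total number of conformations under the two-state simplification.
total_states = STATES_PER_RESIDUE ** N_RESIDUES

# Time needed to visit every conformation once at the assumed rate.
years_needed = total_states / CHECKS_PER_SECOND / SECONDS_PER_YEAR

# How many times longer than the age of the universe that would be.
ratio = years_needed / AGE_OF_UNIVERSE_YEARS

print(f"conformations: {total_states:.3e}")
print(f"years to enumerate: {years_needed:.3e}")
print(f"multiples of the age of the universe: {ratio:.0f}")
```

Under these assumptions the exhaustive search takes thousands of times the age of the universe, and a faster machine only shifts the result by a constant factor while each additional amino acid doubles the count.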
Scientists use experimental techniques like X-ray crystallography or cryo-electron microscopy to determine protein structures, but these methods are time-consuming and labor-intensive.
In the late 1950s, researchers John Kendrew and Max Perutz at Cambridge University made a groundbreaking discovery when they used a method called X-ray crystallography to present the first three-dimensional models of proteins. They were awarded the Nobel Prize in Chemistry in 1962 for this achievement.
“In 2020, AlphaFold solved one of the greatest scientific challenges of the past 50 years,” states DeepMind’s website, “achieving a fundamental breakthrough in predicting protein structures.”
To date, AlphaFold has predicted the structures of over 200 million proteins, nearly all proteins known to science, and has helped scientists understand how the molecules of life interact.
AlphaFold software has released three major versions. In December 2018, a research team utilizing AlphaFold 1 ranked first in the overall standings of the 13th Critical Assessment of protein Structure Prediction (CASP13). In November 2020, another team using AlphaFold 2 again claimed first place in the CASP14 competition.
On July 15, 2021, a research paper on AlphaFold 2 was published online in the prestigious journal Nature, titled “Highly Accurate Protein Structure Prediction with AlphaFold.” John Jumper and Demis Hassabis were co-corresponding authors.
AlphaFold 3 was released on May 8, 2024. It can predict the structures of complexes formed by proteins with DNA, RNA, various ligands, and ions. Related research papers were also published that same day in the journal Nature.
According to DeepMind's website, millions of researchers around the world have utilized AlphaFold 2 to achieve discoveries in fields such as malaria vaccines, cancer treatment, and enzyme design. AlphaFold 3 expands beyond proteins, entering a broader realm of biomolecules. This leap could open doors to transformative sciences, from developing biorecyclable materials and more resilient crops to accelerating drug design and genomic studies.
Ma Jianpeng remarked, “If we look solely at the modeling of protein structures or drug design in the pharmaceutical industry, the precision of AlphaFold is far from ideal. However, it is undoubtedly superior to previous tools!”
Structure Prediction is a Technique, Design is an Art
Ma Jianpeng explained that the problem of predicting protein structures actually involves two distinct questions: the process of protein folding and the prediction of the final structure. “One question is: how exactly does a protein fold? It’s about how those 100 amino acids form a polypeptide and fold into a stable structure. How do you navigate the journey from start to finish? This problem remains unresolved. However, biologists can bypass the first question; they don’t care about the folding process as long as you can provide the final protein structure given the sequence. They do not concern themselves with the pathway. In fact, the pathway is the bigger headache.”
In comparison to predicting structures, Ma Jianpeng noted that designing a new protein is significantly more challenging. The former involves solving the problem of predicting the structure of a protein already existing in nature, while the latter entails creating a completely novel structure. “So I always say that folding is a technique, and design is an art.”
One of the 2024 Nobel Prize winners in Chemistry, David Baker, earned his Ph.D. in Biochemistry from the University of California, Berkeley, under the supervision of Randy Schekman and pursued postdoctoral research in Biophysics under David Agard at the University of California, San Francisco. He is currently a Professor of Biochemistry at the University of Washington and the Director of the Institute for Protein Design at the University of Washington. Baker’s lab develops protein design software and employs it to create molecules to address challenges in medicine, technology, and sustainability. One of his recent works involves developing robust machine learning methods for generating functional proteins.
Baker is also an adjunct professor in genomic science, bioengineering, chemical engineering, computer science, and physics at the University of Washington. He has published over 600 research papers, co-founded 21 companies, and holds more than 100 patents.
Ma Jianpeng explained that Baker was involved in protein structure prediction long before the emergence of AlphaFold, often winning the CASP competitions. His prediction accuracy reached over forty percent. “Baker’s outstanding strength is that he not only excels at computation and prediction, but he also conducts experiments and designs. He comes from an experimental background, and his team is a quintessential ‘wet and dry’ combined team, which leads to greater success.”
In the late 1990s, David Baker began developing Rosetta, software capable of predicting protein structures. His research group later drew up a protein with an entirely novel structure and then used Rosetta to calculate which amino acid sequence could produce it. When the sequence was produced in the lab, the resulting protein, Top7, closely matched the structure they had designed, showing that Rosetta could indeed be used to design proteins.
This means the software can be used to design proteins for applications in drugs, vaccines, nanomaterials, and tiny sensors.
AI is Everywhere in Life: An Indispensable Part, Start from Childhood; Ignorance is No Longer an Option
“I have a perspective: I believe the success of AlphaFold may have a more significant impact on the fields of AI and computer science than on protein structure prediction itself,” Ma Jianpeng expressed.
This viewpoint stems from his long-term observations. In 1997, the chess-playing computer “Deep Blue” defeated world chess champion Garry Kasparov. At that time, some believed the sky would fall and computers would upend the world, yet nothing of the sort happened; people reasoned that the chessboard was small enough to be conquered, but that a computer could never win at Go. In March 2016, AlphaGo defeated Korean Go master and world champion Lee Sedol by a score of 4 to 1. Again, people thought the world was ending, although some dismissed it as merely a game of Go. That changed when DeepMind invested heavily to create AlphaFold.
“Once people in the AI domain saw that even a problem as complicated as predicting protein structures could be solved, they asked whether face recognition and self-driving cars were still significant challenges. Consequently, it truly became 'AI is everywhere in life.'” “Although AlphaFold is not perfect, it is genuinely usable; it can accelerate scientific research,” Ma Jianpeng stated, noting that AlphaFold's success has given rise to a term we now hear daily: AI for science, the integration of AI tools into scientific research.
At a press conference on Fudan University's 2024 admission and training policy, it was announced that Fudan University will introduce at least 100 AI-related courses in the 2024-2025 academic year, starting from the fall semester of 2024. AI courses will be included in all students' academic schedules. “We must start from childhood; you cannot remain uninformed about AI nor unskilled in using it.” “Not everyone needs to work on algorithms every day, but it’s crucial for the vast number of tech workers, even those merely conducting experiments, to at least know how to use AI,” Ma Jianpeng emphasized. He argued that AI algorithms are indeed powerful and that AlphaFold has genuinely practical value, unlike earlier theoretical pursuits that were self-serving. The existence of technology like AlphaFold means that experimental scientists, including renowned researchers like Yan Ning and Shi Yigong, could analyze protein structures more rapidly; however, this does not mean experiments are no longer necessary. “It still cannot replace experimentation. At least for today, the 'gold