AI advance could transform battle against disease
|bbc.com 22 Jul 2021 at 13:33|
Understanding protein structures is critical for advancing medicine, but until now, only a fraction of these have been worked out.
Researchers used a program to predict 350,000 protein structures belonging to humans and other organisms.
The instructions for making human proteins are contained in our genomes - the DNA found in the nuclei of human cells.
There are around 20,000 of these proteins expressed by the human genome. Collectively, biologists refer to this full complement as the "proteome".
The AI program used for the work is called AlphaFold. It was able to make a confident prediction of the structural positions for 58% of the amino acids (the constituents of proteins) in the human proteome.
Of this, the positions of 35.7% were predicted with a very high degree of confidence, which is double the number of structures confirmed by experiment.
"We believe it's the most complete and accurate picture of the human proteome to date," said Dr Demis Hassabis, chief executive and co-founder of DeepMind.
"We believe this work represents the most significant contribution AI has made to advancing the state of scientific knowledge to date.
"And I think it's a great illustration and example of the kind of benefits AI can bring to society." He added: "We're just so excited to see what the community is going to do with this."
[Image caption: The research could lead to enzymes that can break down the plastic polluting our environment]
DeepMind researchers detailed how AlphaFold predicted the structures for 350,000 different proteins, including not only the 20,000 in the human proteome, but also those of so-called model organisms used in scientific research, such as E. coli, yeast, the fruit fly and the mouse.
The structural layout of different proteins can be worked out using various techniques, including X-ray crystallography, cryogenic electron microscopy (Cryo-EM) and others. But none of these is easy to do: "It takes a huge amount of money and resources to do structures," Prof John McGeehan, a structural biologist at the University of Portsmouth, told BBC News.
Therefore, structures are often determined as part of targeted scientific investigations, but no successful project until now had set out to systematically determine structures for all the proteins made by the body.
In fact, just 17% of the proteome is covered by a structure confirmed experimentally.
Commenting on the predictions from AlphaFold, Prof McGeehan said: "It's just the speed - the fact that it was taking us six months per structure and now it takes a couple of minutes. We couldn't really have predicted that would happen so fast."
"When we first sent our seven sequences to the DeepMind team, two of those we already had the experimental structures for. So we were able to test those when they came back. It was one of those moments - to be honest - where the hairs stood up on the back of my neck because the structures [AlphaFold] produced were identical."
Prof Edith Heard, from the European Molecular Biology Laboratory (EMBL), said: "We at EMBL believe this will be transformative for our understanding of how life works. That's because proteins represent the fundamental building blocks from which living organisms are made."
"The applications are limited only by our understanding."
The applications we can envisage now range from developing new drugs and treatments for disease to designing future crops that can resist climate change, or enzymes that can break down the plastic that pervades the environment.
Prof McGeehan's group is already using AlphaFold's data to help develop faster enzymes for degrading plastic. He said the program had provided predictions for proteins of interest whose structures could not be determined experimentally - helping accelerate their project by "multiple years".
Dr Ewan Birney, director of EMBL's European Bioinformatics Institute, said the AlphaFold-predicted structures were "one of the most important datasets since the mapping of the human genome".
DeepMind has teamed up with EMBL to make the AlphaFold code and protein structure predictions openly available to the global scientific community.
Dr Hassabis said DeepMind planned to vastly expand the coverage in the database to almost every sequenced protein known to science - over 100 million structures.