One in a Googol: DeepMind's Protein Folding AI
A couple of weeks ago I talked about all the hard science that’s being done by intrepid gamers and idle PCs. One of the coolest was, and is, Folding@Home, a groundbreaking research project that lets millions of volunteers donate their spare computing power to help scientists gain insight into diseases like Alzheimer’s, cancer, and even COVID-19 by simulating how proteins fold.
Hang on, I’ll explain.
Most biological functions, whether desired or destructive, revolve around proteins. And the shape of a protein determines its function. The problem is that while scientists have managed to identify something like 200 million proteins over the past six decades, the structures (and therefore the behavior) are known for only a small fraction.
Proteins are the most functionally and structurally complex molecules we know of, and inferring their final shape from their contents is a nearly impossible task.
These tiny chains of amino acids can bend and shift and fold themselves into a near-incomprehensible number of different conformations, or “shapes.” How many? By one classic estimate, 1 googol* cubed. That’s 1 and then 300 zeroes.
*Yes, that’s where they got the name.
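If you want to check that arithmetic yourself, here is a minimal Python sketch. It only assumes the usual definition of a googol (10 to the 100th power); the variable names are mine, not anything from DeepMind.

```python
# A googol is 10**100. Python integers have arbitrary precision,
# so the cube can be computed exactly rather than approximated.
GOOGOL = 10 ** 100
conformations = GOOGOL ** 3

# Count the zeroes by looking at the decimal digits directly.
digits = str(conformations)
zero_count = digits.count("0")

print(digits[0], "followed by", zero_count, "zeroes")
# prints: 1 followed by 300 zeroes
```

Cubing a power of ten just triples the exponent, which is why a googol cubed lands on 10 to the 300th.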
In 1972, Christian Anfinsen won the Nobel Prize in Chemistry for work suggesting that a protein’s shape could, in theory, be determined from the sequence of its amino acids.
Since 1994, groups of scientists around the world have challenged one another every two years to predict the shapes of about 100 proteins from their amino acid sequences alone. Their predictions are compared against experimentally determined structures by the Critical Assessment of Protein Structure Prediction (CASP) team.
CASP measures the accuracy of the results, on a scale of 0-100.
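For the curious, CASP's headline score is called GDT (Global Distance Test). Roughly speaking, it asks what share of a predicted protein's residues land within a few distance cutoffs of where experiments say they should be. The sketch below is a simplified illustration, not CASP's actual software: it assumes the two structures are already lined up on top of each other (the real test searches over many superpositions), and the function name and coordinates are made up for the example.

```python
import math

# Distance cutoffs, in angstroms, used by the GDT_TS flavor of the score.
CUTOFFS = (1.0, 2.0, 4.0, 8.0)

def gdt_ts_like(predicted, reference):
    """Score pre-aligned (x, y, z) points, one per residue, on a 0-100 scale."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty coordinate lists of equal length")
    # How far each predicted residue sits from its experimental position.
    distances = [math.dist(p, r) for p, r in zip(predicted, reference)]
    # Fraction of residues within each cutoff, then averaged over the cutoffs.
    fractions = [sum(d <= c for d in distances) / len(distances) for c in CUTOFFS]
    return 100.0 * sum(fractions) / len(CUTOFFS)

# Hypothetical four-residue "protein": invented coordinates, not real data.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 9.0, 0.0)]

score = gdt_ts_like(predicted, reference)
print(score)  # prints: 56.25
```

A perfect prediction scores 100, and a prediction with every residue more than 8 angstroms off scores 0.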
This week, Google-owned AI group DeepMind (which up until now was best known for being super good at StarCraft) changed the game.
AI to the Rescue
DeepMind’s latest AI program, AlphaFold, has shown it can predict a protein’s shape knowing only its amino acid sequence, with accuracy comparable to laboratory methods but leagues faster.
DeepMind trained AlphaFold on a database of 170,000 protein sequences and their shapes. After a few weeks of training, DeepMind entered AlphaFold into the CASP test, where it not only outperformed other computer programs, but performed at a level rivaling the best experimental laboratory methods.
Professor John Moult, co-founder of CASP, said:
“We have been stuck on this one problem – how do proteins fold up – for nearly 50 years. To see DeepMind produce a solution for this, having worked personally on this problem for so long and after so many stops and starts, wondering if we’d ever get there, is a very special moment.”
For now, AlphaFold will focus on malaria, sleeping sickness and leishmaniasis. But in the future, AI could unlock secrets like how some viruses’ proteins interact with the receptors on human cells.
Viruses like COVID-19.
I, for one, welcome our robot overlords.