Consortium aims to accelerate drug discovery process
DOI: 10.1063/PT.3.3814
A new national laboratory-industry-university collaboration will bring some of the world’s most powerful supercomputing resources to bear in the search for new cancer-fighting drug therapies. The effort moves beyond the pharmaceutical industry’s screening of thousands of candidate drug compounds against biological targets to screening of virtual molecules using physics-based biological models, according to key architects of the Accelerating Therapeutics for Opportunities in Medicine (ATOM) partnership.
The three-year collaboration announced this past October teams GlaxoSmithKline (GSK) with Lawrence Livermore National Laboratory; the University of California, San Francisco (UCSF); and the National Cancer Institute’s (NCI’s) Frederick National Laboratory for Cancer Research. ATOM aims to develop algorithms that greatly reduce the time needed to proceed from the identification of drug targets such as proteins to the development of compounds that can be clinically shown to affect those targets. The process currently takes four to six years; ATOM’s goal is 12 months.
“There are four concurrent challenges that any drug has to meet,” says Eric Stahlberg, director of strategic and data science initiatives at the Frederick lab. “It has to be effective, safe, able to be made, and able to be delivered.” ATOM will tackle them together, he says, using computational models that can provide “much earlier and advanced insight than one could get through the traditional biological experimental approach.”
John Baldoni, GlaxoSmithKline’s senior vice president for R&D, was the driving force behind the ATOM partnership, which seeks to develop algorithms to speed drug discovery.
GLAXOSMITHKLINE
Needle in a haystack
The number of possible drug compounds is almost unlimited, but only around 1700 molecules have been approved as drugs by the Food and Drug Administration. Any search must begin with a hypothesis to narrow the scope of possibilities. Every feature needed to make a drug effective—selective binding, target modulation, nontoxicity, and so on—is encoded in the molecule’s structure, says Michelle Arkin, codirector of the Small Molecule Discovery Center at UCSF. “It’s virtually impossible to take some molecule off the shelf and expect it to have all those properties. What we are trying to do is find one that has some of those properties and iteratively optimize it for all those properties.”
In addition to an undisclosed amount of funding and staff, GSK is providing ATOM developers with access to its library of 2 million compounds. The molecules were found through in vitro experiments to bind to protein targets that are believed to be involved in disease, but they were abandoned during preclinical or early clinical testing for various reasons. A molecule might be toxic, for example, or have other issues involving absorption, distribution, metabolism, or excretion, says John Baldoni, GSK senior vice president for R&D. The company is also contributing information on an additional 500 molecules that failed in later development stages, and data on some that became drugs.
“The pharmaceutical industry is predicated on failure,” says Baldoni. “Whatever successes we have in the industry, there is a lot of failure before that. The vast majority of the data we have is failed molecules from hypothesis-testing experiments.”
The data from failed compounds, though archived, typically aren’t used again. But Baldoni says GSK’s failures—and hopefully those of other companies that will join—provide valuable information on the underlying biology of candidate compounds and how they might interact with the human body. “Consider them informative building blocks for future molecules. More importantly, they create a learning set for a computer-based algorithm,” he says. The algorithm, it’s anticipated, will generate new dynamic models that can better predict how molecules will interact with targets, compared with current, time-consuming iterative practices.
The Sierra high-performance computer will be the third fastest in the world when its assembly is completed early this year at Lawrence Livermore National Laboratory. The computer will be used in the ATOM collaboration.
LAWRENCE LIVERMORE NATIONAL LABORATORY
Jim Brase, Livermore’s deputy associate director for data science, says the machine learning to be developed by ATOM differs from the purely data-driven approach used by internet search engines. “If you’re trying to predict whether a user is going to click on a link to look at an ad, you can’t write down any theory; there’s no partial differential equation to describe that,” he notes. But there is theory that can describe whether a particular molecular interaction will occur. “We can use a combination of data-driven approaches where we have a lot of experimental data, coupled with constraints based on what we know theoretically about what should happen, to do a better job of prediction.”
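The hybrid approach Brase describes—fitting experimental data while penalizing predictions that conflict with theory—can be expressed as a combined loss function. The sketch below is purely illustrative; ATOM's actual models are not public, and the "theory" term here is a toy constraint (that a bound complex should have a negative binding energy):

```python
import numpy as np

def data_loss(pred, measured):
    """Mean squared error against experimental measurements."""
    return np.mean((pred - measured) ** 2)

def physics_penalty(pred):
    """Penalize physically implausible predictions. Toy constraint:
    a bound protein-ligand complex should have binding energy <= 0."""
    return np.mean(np.maximum(pred, 0.0) ** 2)

def combined_loss(pred, measured, weight=1.0):
    # weight balances experimental fit against the theoretical constraint
    return data_loss(pred, measured) + weight * physics_penalty(pred)

pred = np.array([-5.2, -1.1, 0.8])       # model's predicted binding energies
measured = np.array([-5.0, -1.3, -0.2])  # experimental values
print(combined_loss(pred, measured))
```

In a real model the penalty would encode physical chemistry rather than a sign check, but the structure is the same: data where data exist, theory where it constrains.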
Whenever the virtual screening finds a molecule that binds to the target, or the predictions carry significant uncertainty, laboratory experiments will be needed to validate the interaction or reduce the uncertainty. “The idea is to use the data about past success and failure to help guide us away from things that are never going to work and toward things that will work,” says Arkin.
A designer drug
Initially, ATOM’s focus will be on “precision oncology”—drugs for populations of patients with cancers for which there are no effective therapies or for whom current drugs don’t work. For its demonstration, ATOM will strive to find a new drug to treat a single patient with one disease phenotype within one of those populations. If the therapy is effective, it may well work for others in that phenotype, says Baldoni. Further drug development using ATOM tools will be left to the industry partners to perform outside the venture; that arrangement was essential to avoid thorny intellectual property issues, he says.
The ATOM collaboration is an element of a broad NCI–Department of Energy collaboration to accelerate cancer research with high-performance computing—part of former vice president Joe Biden’s 2016 cancer-research “moonshot” initiative. Livermore has expertise in simulating molecular interactions, data analytics, and machine learning at large scales, says Brase. The lab will also bring in the field of research known as active learning, in which the machine-learning process suggests additional experiments or new data to improve the certainty of the predictions that are made computationally.
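The active-learning loop mentioned above can be sketched in a few lines. This is a minimal illustration, not ATOM's method: the model, the uncertainty measure (distance to the nearest compound that already has experimental data), and the "experiment" are all stand-ins:

```python
import numpy as np

def uncertainty(labeled, candidates):
    """Toy proxy: a candidate far from every measured compound
    is the one the model knows least about."""
    return np.min(np.abs(candidates[:, None] - labeled[None, :]), axis=1)

def run_experiment(x):
    """Stand-in for a wet-lab measurement, e.g. a binding assay."""
    return np.sin(x)  # placeholder ground truth

labeled_x = np.array([0.0, 1.0])        # compounds measured so far
candidates = np.array([0.4, 2.5, 5.0])  # virtual compounds, unmeasured

for _ in range(2):
    u = uncertainty(labeled_x, candidates)
    pick = np.argmax(u)                  # most uncertain candidate
    new_x = candidates[pick]
    run_experiment(new_x)                # "send it to the lab"
    labeled_x = np.append(labeled_x, new_x)
    candidates = np.delete(candidates, pick)

print(labeled_x)  # the measured set now includes the most informative points
```

Each pass through the loop spends scarce laboratory effort on the compound whose prediction is least certain, which is the essence of what Brase describes: the computation itself suggests which experiment to run next.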
DOE’s National Nuclear Security Administration is contributing $2 million to the effort as part of an initiative to develop applications for computing at the exaflop scale (10¹⁸ floating-point operations per second)—a hundred times as capable as today’s 10-petaflop systems available for unclassified use, says Brase.
Maintaining and understanding nuclear weapons without the ability to test them requires working on other difficult computational problems, Brase says. “Working on problems that are this hard computationally provides strong feedback to our computing capabilities in general,” such as the development of new molecular dynamics simulations and applications in materials design and synthesis. The knowledge gained also benefits NNSA’s core mission of maintaining nuclear weapons. (For more on stockpile stewardship, see the article by Victor Reis, Robert Hanrahan, and Kirk Levedahl, Physics Today, August 2016, page 46.)
More members sought
Baldoni, who first proposed ATOM early in 2016, says that although high-performance computers are needed to develop the algorithms, individual companies will be able to run them on their own computers. Other pharmaceutical, biotech, and technology companies involved in modeling, artificial intelligence, and characterization of phenotypic changes at the cell level are being sought to join ATOM, he says. As an incentive, for one year the partners will have exclusive use of the tools that are developed; they will become open source after that.
ATOM’s 20–30 full-time staff, both employees and postdocs, will be housed in space adjacent to UCSF’s Mission Bay campus. GSK expects to contribute a dozen individuals; Livermore, seven or eight; and Frederick Lab, four.
The ATOM partnership is emblematic of a growing role for physics in drug discovery, Baldoni says, a trend that has been accompanied by an explosion of information about protein structures and protein dynamics. “As biology is better defined in terms of physics, mathematics, and engineering principles, less and less empiricism is going to be used,” he says. “More physics-based modeling is going to be used to predict biology, and empirical laboratory work will be done to validate” the results. That change in skill sets is happening now at GSK, he says. “We are hiring people we never thought we’d hire. We’ve hired astrophysicists.”
“Computing power is a real revolution in biomedicine,” says UCSF’s Arkin. Capturing and processing data and moving toward more personalized medicine require enormous computing resources, not to mention sophistication, she adds. “Livermore has sophistication that they gained in other areas, and applying that kind of power to biological problems is synergistic.”
More about the Authors
David Kramer. dkramer@aip.org