Leiserson Using Machine Learning to Advance Cancer Treatments

Jan 28, 2019

Immunotherapies, which use a person’s own immune system to fight cancer, have produced revolutionary results in recent years, including curing people with previously inoperable, advanced disease. Such stellar successes have propelled some of these drugs to become the standard of care for treating many forms of cancer. And yet, most of them only work for a minority of patients. In some cases, fewer than 20 in 100 patients will benefit.

Predicting which patients will benefit and which ones won’t has been an elusive but very important goal, because immunotherapies can have significant, even life-threatening side effects, and costs can run into the hundreds of thousands of dollars per treatment.

University of Maryland computer scientist Mark (Max) Leiserson and colleagues from Microsoft Research and Memorial Sloan Kettering Cancer Center believe a solution lies in a new approach to the data, one that analyzes multiple facets of patients and their cancer simultaneously rather than seeking a handful of key factors that can predict success.

In a study published in the journal PLOS One on December 31, 2018, Leiserson and his colleagues used data from a clinical trial of bladder cancer patients to demonstrate that their approach could identify a suite of features that accurately predicted a key immune system response to treatment while reducing overtreatment by half.

“If your goal is to treat everyone in that particular dataset who will respond, the type of multifactorial modeling we show in this paper will let you do that while treating many fewer people who won’t respond,” said Leiserson, the paper’s lead author and an assistant professor in the Department of Computer Science at UMD. Leiserson began conducting this study while he was a postdoctoral researcher at Microsoft Research, New England and continues to consult for the company.

“What’s also exciting about this study is that we were not just looking at patient outcome, but at a specific marker of immune response, which gave us a much better picture of what’s going on,” said Leiserson, who also has a joint appointment in the University of Maryland Institute for Advanced Computer Studies.

Leiserson and his team analyzed data from a clinical trial of the drug atezolizumab on patients with advanced bladder cancer (metastatic urothelial cancers). Currently, the primary methods for predicting treatment success with this class of immunotherapy drugs (known as immune checkpoint therapy) rely on two key biomarkers—features of a patient’s disease that correspond to treatment outcomes.

One biomarker is the number of genetic mutations in the tumor cells; the other is the presence of a protein called PD-L1 that prevents immune cells from attacking cancer cells. In this case, using those two biomarkers to identify patients for treatment casts an overly wide net. In order to reach 100 percent of the patients who will benefit, clinicians would also have to treat 77 percent of the patients who did not benefit from treatment.

In contrast, Leiserson and colleagues showed that their multifactorial computer model predictions of which patients would benefit could include as few as 38 percent of those who did not benefit while still capturing 100 percent of the patients who did. The key, they found, was to include three distinct types of data, something not currently standard in cancer studies or treatments.
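The comparison above can be made concrete with a small sketch. It assumes each patient receives a numeric predicted-benefit score and that the treatment cutoff is lowered just far enough to include every responder. The function name, the scores, and the labels are hypothetical illustrations, not the trial data or the authors' code. The last line works out the relative reduction implied by the two percentages quoted in this article.

```python
def overtreatment_at_full_sensitivity(scores, responded):
    """Fraction of non-responders treated when the cutoff is set just low
    enough that every responder is still treated."""
    responder_scores = [s for s, r in zip(scores, responded) if r]
    nonresponder_scores = [s for s, r in zip(scores, responded) if not r]
    if not responder_scores or not nonresponder_scores:
        raise ValueError("need at least one responder and one non-responder")
    # The lowest-scoring responder determines the cutoff.
    cutoff = min(responder_scores)
    treated = sum(1 for s in nonresponder_scores if s >= cutoff)
    return treated / len(nonresponder_scores)


# Hypothetical toy scores and outcomes, NOT the trial data.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
responded = [True, True, False, True, False, False, False, False]
toy_rate = overtreatment_at_full_sensitivity(scores, responded)

# Relative reduction implied by the figures quoted in the article
# (77 percent versus 38 percent of non-responders treated).
relative_reduction = (0.77 - 0.38) / 0.77
print(f"toy overtreatment rate: {toy_rate:.2f}")
print(f"relative reduction: {relative_reduction:.1%}")
```

Going from 77 percent to 38 percent is a relative reduction of roughly 51 percent, which matches the description of cutting overtreatment by about half.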

Although immunotherapy researchers are beginning to collect more information about cancer patients and their responses to therapy, the focus is still largely on finding a few key markers that stand out as important predictors of success. The solution, however, may be far more complex. There may not be just a handful of important features or markers for all patients, and those that exist are likely to function in some complicated combination.

“People are realizing that predicting response is more and more appropriate and needed, and to be able to do this, the traditional kind of single biomarker approach isn’t always enough,” Leiserson said.

To generate their computer model, Leiserson and his team analyzed data from a clinical trial with a uniquely rich data set that captured information about tumor cells, immune cells, and patient information such as demographics and medical history. Like many studies, the trial was aimed at finding key features associated with a specific response to the drug. Recognizing the potential in such a multi-modal data set, the researchers saw an opportunity to apply machine learning to the problem. They fed 36 different features into their model and allowed the computer to identify patterns that could predict increases in potential tumor-fighting immune cells in a patient’s blood after treatment. (In the study patients, expansion of T cells in the blood post-therapy was associated with progression-free survival.)

The resulting algorithm identified 20 features that, when analyzed together, explained 79 percent of the variation in patient immune responses. According to Leiserson, this means that the unusually comprehensive set of features gathered for these patients is sufficient to predict the patient immune response with high accuracy.
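As a rough illustration of what "explained 79 percent of the variation" can mean, the sketch below computes the coefficient of determination, the usual measure of variance explained. It is an assumption that this is the metric behind the quoted figure; the article does not spell out the study's exact metric or validation procedure. The function name and the numbers are hypothetical.

```python
def variance_explained(observed, predicted):
    """Coefficient of determination: 1 - (residual SS / total SS)."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = sum(observed) / len(observed)
    ss_total = sum((y - mean_obs) ** 2 for y in observed)
    ss_resid = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    if ss_total == 0:
        raise ValueError("observed values have no variation to explain")
    return 1.0 - ss_resid / ss_total


# Hypothetical immune-response values and model predictions, NOT study data.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 1.5, 3.0, 4.5, 4.5]
r2 = variance_explained(observed, predicted)
print(f"variance explained: {r2:.2f}")
```

A value of 1.0 would mean the predictions match the observations exactly, while a model that always predicts the average response scores 0.0.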

Even more importantly, they found that if they eliminated any one of the three categories of data from the model (tumor data, immune cell data or patient clinical data), the immune response was no longer predictable: the model could then explain at most 23 percent of the variation. Leiserson stresses that it’s not necessarily the 20 characteristics that are important, but rather the reliance on a multifactorial approach.
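The leave-one-category-out comparison described above can be sketched as follows. This is a minimal illustration under loud assumptions: it uses plain ordinary least squares, one made-up feature per category, in-sample scoring, and synthetic data. It is not the authors' model, and it will not reproduce the study's numbers. All names (`fit_least_squares`, `GROUPS`, and so on) are hypothetical.

```python
from itertools import product


def fit_least_squares(rows, targets):
    """Ordinary least squares with an intercept, solved from the normal
    equations by Gauss-Jordan elimination. Returns [intercept, coef_1, ...]."""
    X = [[1.0] + list(r) for r in rows]
    n = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(n)]
         + [sum(x[i] * y for x, y in zip(X, targets))] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def in_sample_r2(rows, targets):
    """Fraction of variance explained on the same data used for fitting."""
    coef = fit_least_squares(rows, targets)
    preds = [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in rows]
    mean_y = sum(targets) / len(targets)
    ss_total = sum((y - mean_y) ** 2 for y in targets)
    ss_resid = sum((y - p) ** 2 for y, p in zip(targets, preds))
    return 1.0 - ss_resid / ss_total


def drop_columns(rows, dropped):
    """Copy of the feature rows without the given column indices."""
    return [[v for j, v in enumerate(r) if j not in dropped] for r in rows]


# Hypothetical layout: one column per data category.
GROUPS = {"tumor": [0], "immune": [1], "clinical": [2]}

# Synthetic data, NOT study data: the response depends equally on all three.
rows = [[float(v) for v in combo] for combo in product([0, 1], repeat=3)]
targets = [sum(r) for r in rows]

full_r2 = in_sample_r2(rows, targets)
ablated_r2 = {name: in_sample_r2(drop_columns(rows, cols), targets)
              for name, cols in GROUPS.items()}
print(f"all categories: {full_r2:.2f}")
for name, value in ablated_r2.items():
    print(f"without {name}: {value:.2f}")
```

In this toy setup the full model explains all of the variation and removing any one category lowers that to about two thirds. A real analysis with few patients would need held-out evaluation, such as cross-validation, rather than the in-sample scoring used here for brevity.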

“These features we identified may not be the only features that can be used to predict how a patient will respond,” he said. “There may be others that you could replace these with, but it’s about the method and the inclusion of all three categories of features.”

Another unique aspect of the team’s approach is that they focused on predicting changes in T cells in patient blood rather than simply predicting overall patient outcome.

“We looked at immune cells in the tumor and in the blood before and after treatment, and we asked the question, ‘Which of the immune cells that were in the tumor before treatment expanded in the blood after, potentially meaning they were responding to something?’” Leiserson said.

Understanding that response provides a much finer-grained picture of the interactions between tumor, therapy and immune system.

“Progression-free survival is a standard measure of response to a cancer treatment,” Leiserson said. “But in our case, we have a more continuous value that is closer to a direct readout of how much of an effect the treatment has on the immune system. And, in this data set, the predicted immune response is strongly associated with durable clinical benefit.” Durable clinical benefit here means that the cancer had not progressed in the six months after treatment.

Leiserson sees this work as a natural parallel to current efforts in precision oncology, which aims to tailor treatments to the genetics and molecular profiles of individual patients’ tumors.

“We are trying to predict what’s going to happen for a single patient by looking at their molecular profile and clinical history,” he said. “It’s about building an understanding of the molecular landscape of the tumor, which provides additional information beyond which tissue it’s in or what the tumor looks like under the microscope.”

The model the scientists developed isn’t ready to be used as a diagnostic tool because it only incorporated data from 21 patients, which is far too few to be predictive for the general population. Leiserson said they are hoping to add patients to the model as more data comes in. He hopes the study’s success will encourage hospitals and other researchers to invest the time and effort into gathering more information than they traditionally have.

“One of the goals of this work was to ask the question, ‘Should hospitals prioritize gathering this type of data?’” Leiserson said. “And now we can say that this multifactorial approach lets us better predict the response to these immunotherapies. I hope that it motivates the effort and expenditure of continuing to collect this data.”

Both the data used for the study and the algorithm Leiserson and his team developed are open source and available on GitHub.

###

The research paper “A multifactorial model of T cell expansion and durable clinical benefit in response to a PD-L1 inhibitor,” by Mark D. M. Leiserson, Vasilis Syrgkanis, Amy Gilson, Miroslav Dudik, Sharon Gillett, Jennifer Chayes, Christian Borgs, Dean F. Bajorin, Jonathan E. Rosenberg, Samuel Funt, Alexandra Snyder and Lester Mackey, was published in the journal PLOS One on December 31, 2018.