MIDS Capstone Project Summer 2021

Screen Ahead Rx

Screen Ahead Rx is a tool designed for individualized cancer treatment. Its model predicts which anti-cancer drugs will elicit an "exceptional response" in a cancer patient, based on the tumor's genomic profile and the drug's chemical structure. The Screen Ahead Rx team hopes both to improve individual cancer outcomes and to identify potential new cancer treatments by employing novel machine learning techniques.

What’s the problem?

In precision medicine, teams of doctors and researchers use information from all aspects of a patient's health, including DNA, to tailor treatments and make the best health decisions possible. A branch of precision medicine called precision oncology holds promise for improving cancer treatment. However, predicting whether a cancer drug will be effective in treating a given patient's cancer remains very difficult. In the US today, only 3 in 10 patients' cancers will be responsive to the first drug that is tried. The problem is difficult because any combination of mutations in any of the cancer cells' twenty thousand genes might make a cancer's underlying biology very different from that of a cancer that outwardly appears similar. Humans cannot reason over such a vast feature space, and even computer models struggle with such extreme dimensionality.

What did we do?

The goal of this project was to develop a machine learning model that successfully predicts how effective a drug will be in treating a given cancer, based on the underlying properties of the cancer drug and the alterations to the cancer cells' DNA. We created a number of novel embeddings to represent drug chemical structure, employing techniques used in natural language processing (NLP). We performed extensive feature engineering to characterize the molecular features of the cancer cells, and we encoded features based on biological knowledge of drug classes and cancer subtypes. We then trained a number of different machine learning architectures on different combinations of drug embeddings and cancer DNA feature sets. All models were evaluated on the same set of never-seen cancer cell lines. The performance of these models and their predictions can be explored in interactive data visualizations on our website. This project will serve as the basis for a peer-reviewed paper that will identify the best features and architectures for modeling cancer drug response.
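
To illustrate what evaluating on never-seen cancer cell lines involves, here is a minimal sketch of a split that holds out whole cell lines, so that no cell line contributes records to both training and testing. It is not the project's actual pipeline: the record layout, the function name split_by_cell_line, the test fraction, and the toy data are all assumptions made for illustration.

```python
import random
from typing import List, Tuple

# Hypothetical record layout: (cell_line_id, drug_id, measured_response).
Record = Tuple[str, str, float]


def split_by_cell_line(
    records: List[Record], test_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[Record], List[Record]]:
    """Split records so that no cell line appears in both train and test."""
    # Collect the distinct cell lines and shuffle them reproducibly.
    cell_lines = sorted({cell_line for cell_line, _, _ in records})
    rng = random.Random(seed)
    rng.shuffle(cell_lines)

    # Hold out a fraction of the cell lines (at least one) for testing.
    n_test = max(1, round(len(cell_lines) * test_fraction))
    held_out = set(cell_lines[:n_test])

    train = [r for r in records if r[0] not in held_out]
    test = [r for r in records if r[0] in held_out]
    return train, test


# Toy data, invented purely for illustration.
records: List[Record] = [
    ("line_A", "drug_1", 0.91),
    ("line_A", "drug_2", 0.12),
    ("line_B", "drug_1", 0.40),
    ("line_B", "drug_2", 0.77),
    ("line_C", "drug_1", 0.05),
    ("line_C", "drug_2", 0.63),
    ("line_D", "drug_1", 0.58),
    ("line_D", "drug_2", 0.33),
]

train, test = split_by_cell_line(records, test_fraction=0.25, seed=42)
print("held-out cell lines:", sorted({r[0] for r in test}))
```

Splitting by cell line rather than by individual record matters because a random split over records would let the same cell line appear on both sides, which tends to overstate how well a model generalizes to a new patient's tumor.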

More Information

Last updated:

September 7, 2021