MIDS Capstone Project Summer 2020

COVID-EXP: Pipeline for X-Ray Examination of COVID

Team members

The COVID-19 pandemic has taken the global community by surprise and strained healthcare systems across the world to unprecedented levels. This is particularly true in densely populated regions such as India, where physicians and healthcare professionals face extreme workloads in responding to this emergency and are therefore exposed to fatigue and error. In addition, there is an urgent need to expedite the early detection of COVID-19 cases and to distinguish them from other possible conditions, both for the safety of the infected individual and for containment of the virus. When we began this project, India accounted for only 3% of the world's COVID-19 cases while being home to over 17% of the world's population. At the time of this writing, India ranks 3rd worldwide in number of confirmed COVID-19 cases (source: Johns Hopkins University).

To assist in this challenge, this project has developed and implemented a fully operational, two-stage inference pipeline for the early detection and prioritization of potential COVID-19 cases among six conditions commonly seen in X-ray imaging (Normal, Typical Pneumonia, Atypical Pneumonia, Tuberculosis, Acute Respiratory Distress Syndrome, Other). COVID-19 is a subset of Atypical Pneumonia. The pipeline leverages CLARA, a cloud-based pipeline for X-ray image classification training and inference developed by NVIDIA. The project deliverables are: an Orthanc DICOM viewer and manager that lets radiologists upload DICOM images to the inference service, anonymize them, and manage patients' clinical histories; an Argo DAG workflow monitor for continuous monitoring of the state of the pipeline and of active runs; and the training and inference pipelines. The solution rests on a binary model and a multiclass model, both built on the DenseNet121 architecture, and provides a LIME explanation of each prediction to help users interpret and analyze the output. DenseNet121 achieved an accuracy and AUC of 0.90 on multiclass prediction, compared with current human standards, which hover around 0.80.
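The two-stage flow described above can be illustrated with a minimal Python sketch. The summary does not say how the binary and multiclass models divide the work, so this sketch assumes that the binary stage screens each image as normal versus abnormal and that only flagged images reach the multiclass stage. The function names, the 0.5 threshold, and the stand-in models are all hypothetical; they are not the project's CLARA code.

```python
from typing import Callable, Dict, List, Sequence

# The six conditions named in the project summary.
CONDITIONS = [
    "Normal",
    "Typical Pneumonia",
    "Atypical Pneumonia",  # COVID-19 is a subset of this class
    "Tuberculosis",
    "Acute Respiratory Distress Syndrome",
    "Other",
]

Image = Sequence[float]  # placeholder for a preprocessed X-ray


def two_stage_predict(
    image: Image,
    binary_model: Callable[[Image], float],
    multiclass_model: Callable[[Image], Sequence[float]],
    threshold: float = 0.5,  # hypothetical cut-off, not from the source
) -> Dict:
    """Assumed flow: stage 1 screens, stage 2 classifies flagged images."""
    p_abnormal = binary_model(image)
    if p_abnormal < threshold:
        # Screened out by stage 1; the multiclass model is never called.
        return {"label": "Normal", "p_abnormal": p_abnormal, "probs": {}}
    probs = dict(zip(CONDITIONS, multiclass_model(image)))
    label = max(probs, key=probs.get)
    return {"label": label, "p_abnormal": p_abnormal, "probs": probs}


def prioritize(results: List[Dict]) -> List[Dict]:
    """Order cases so likely Atypical Pneumonia cases are reviewed first."""
    return sorted(
        results,
        key=lambda r: r["probs"].get("Atypical Pneumonia", 0.0),
        reverse=True,
    )


if __name__ == "__main__":
    # Stand-in models returning fixed scores, for illustration only.
    flagged = two_stage_predict(
        [0.9], lambda img: 0.9, lambda img: [0.05, 0.1, 0.7, 0.05, 0.05, 0.05]
    )
    cleared = two_stage_predict([0.1], lambda img: 0.1, lambda img: [1, 0, 0, 0, 0, 0])
    for case in prioritize([cleared, flagged]):
        print(case["label"], case["p_abnormal"])
```

In a real deployment the two callables would wrap the trained DenseNet121 models, and the ordering rule would be chosen together with the clinicians who use the output.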

The project was carried out in close collaboration with Dr. Parthiv Mehta, a leading pulmonologist in India. He and his physician colleagues from various centers in the state of Gujarat sourced the data, informed the design of the pipeline to meet their needs, and are currently pilot-testing the solution. 1,800 doctors are testing the solution and will begin using it routinely shortly afterward. Our solution is not meant to replace doctors; rather, it will help them assess the urgency of each patient's needs based on the results of our algorithm. We are immensely thankful for the dedication and support of these tireless COVID-19 warriors.

Course

Data Science 210. Capstone, Summer 2020

Class Project Gallery

More Information

CLARA

Orthanc DICOM viewer/manager

Argo DAG workflow monitor

Active pipelines / Jobs

Video Demo

Presentation.pdf

Last updated: August 10, 2020