Performant Deep Learning at Scale with Apache Spark & Apache SystemML
In this talk, we will present our work on creating a deep learning library for Apache SystemML, allowing for large-scale learning on the underlying Apache Spark platform while maintaining the simple, modular, high-level mathematics at the core of the field. Deep learning is an exciting area of machine learning that has enabled state-of-the-art performance in areas including, but not limited to, computer vision, speech recognition, and natural language processing. It focuses on creating large, complex, nonlinear functions that map raw inputs to predictions; in doing so, deep learning aims to automatically learn complex representations of data with minimal human-driven feature engineering. A key feature is that these complex functions are built via a deep composition of simple, modular units, which are easily expressed with SystemML. As datasets grow in size and complexity, the need for performant deep learning at scale is increasing.
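The core idea, that a complex nonlinear function is built as a deep composition of simple, modular units, can be sketched in plain NumPy. This is a hypothetical illustration of the concept only, not SystemML's actual API; the function names `affine` and `relu` are assumptions chosen for clarity.

```python
import numpy as np

def affine(X, W, b):
    """Simple modular unit: an affine (fully-connected) transform."""
    return X @ W + b

def relu(Z):
    """Simple modular unit: an elementwise nonlinearity."""
    return np.maximum(0, Z)

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 3))  # 4 examples, 3 raw input features
W1, b1 = rng.standard_normal((3, 5)), np.zeros(5)
W2, b2 = rng.standard_normal((5, 2)), np.zeros(2)

# Deep composition of the units above into one nonlinear function
# mapping raw inputs to predictions.
hidden = relu(affine(X, W1, b1))
preds = affine(hidden, W2, b2)
print(preds.shape)  # (4, 2)
```

Stacking more such units deepens the composition; the appeal for a system like SystemML is that each unit stays a small, high-level piece of linear algebra that the engine can optimize and scale.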
Mike Dusenberry is an engineer at the IBM Spark Technology Center, where he is creating a deep learning library for SystemML with a focus on performance at scale. He was on his way to an M.D. and a career as a physician in his home state of North Carolina when he teamed up with professors on a medical machine learning research project. Two years later, in San Francisco, Mike is contributing to Apache SystemML as a committer and researching medical applications of deep learning.