Performant Deep Learning at Scale with Apache Spark & Apache SystemML

Wednesday, August 24, 2016
12:00 pm to 1:00 pm PDT
Mike Dusenberry

In this talk, we will present our work on creating a deep learning library for Apache SystemML, allowing for large-scale learning on the underlying Apache Spark platform while maintaining the simple, modular, high-level mathematics at the core of the field. Deep learning is an exciting area of machine learning that has allowed for state-of-the-art performance in areas including, but not limited to, computer vision, speech recognition, and natural language processing. Essentially focused on creating large, complex, nonlinear functions that map raw inputs to predictions, deep learning aims to automatically learn complex representations of data with minimal human-driven feature engineering. A key feature is that these complex functions are built via a deep composition of simple, modular units, which are easily expressed with SystemML. As datasets grow in size and complexity, the need for performant deep learning at scale is increasing.

To connect to the webinar:

  1. Visit http://berkeleydatascience.adobeconnect.com/myseminar/

  2. Type in your name and enter the room as a guest. There may be a short waiting period; please be patient.

  3. When you get into the room, a popup window will appear with instructions to connect your audio.

    • If you are accessing the session from within the United States, click on the “Dial-Out” option, type in your phone number and choose “Join.” You will receive a call directly from the room's conference line. Answer it to connect to the lecture's audio.

    • If you are accessing the session from outside of the United States, click on the “Dial-In” option. The conference number and participant code will appear so that you can access the audio from your phone or Internet phone service.

To run a compatibility test to ensure that your system is properly configured, please go to http://link.datascience.berkeley.edu/907SFM2620001Lz004c9Q00

Mike Dusenberry is an engineer at the IBM Spark Technology Center, where he is creating a deep learning library for SystemML and working on performant deep learning at scale. He was on his way to an M.D. and a career as a physician in his home state of North Carolina when he teamed up with professors on a medical machine learning research project. Two years later, now in San Francisco, Mike is contributing to Apache SystemML as a committer and researching medical applications for deep learning.

Last updated: August 15, 2016