Data Science 261
Machine Learning at Scale
3 units
Course Description
This course teaches the underlying principles required to develop scalable machine learning pipelines for structured and unstructured data at the petabyte scale. Students will gain hands-on experience with Apache Hadoop and Apache Spark.
Skill Sets
Coding machine learning algorithms on single machines and on clusters of machines / Amazon Web Services (AWS) / Working on problems with terabytes of data / Building machine learning pipelines for petabyte-scale data / Algorithmic design / Parallel computing
Tools
Apache Hadoop / Apache Spark
Current Course Designers
Original Course Designer
Previously listed as DATASCI W261.
Prerequisites
Course History
Spring 2022
Fall 2021
Summer 2021
Spring 2021