Data scientist interested in machine learning. Eleven years of experience as a technical project manager in the software industry managing complex, cross-functional projects. Dynamic and passionate. Able to work in both start-up and established companies.
Technical: R, Shiny, Python, Apache Spark, Hadoop, D3, Tableau, SQL and MongoDB
Data Science: Machine Learning, Statistical Analysis & Research Design and Data Management
Master of Information and Data Science (MIDS), University of California, Berkeley, graduation August 2016
Machine Learning at Scale
Advanced Data Storage/Retrieval
Data Analysis and Data Visualization
Project Management Professional (PMP) certification, 2004
Master of Science in Computer Science, Massachusetts Institute of Technology (MIT), 1995
Bachelor of Science in Computer Science, Massachusetts Institute of Technology (MIT), 1994
Kaggle Facial Keypoints Detection Competition, May 2016
Created a convolutional neural network to predict 15 locations (for example, the tip of the nose) on the human face from 96x96-pixel digital images of faces. Used Python with the Theano library on an AWS GPU server.
Video Rating and Prediction System, May 2016
Created a video collection, rating and prediction system. Collected Twitter, Facebook and YouTube video data using each source's Python API. Stored data in a MongoDB database using the PyMongo Python library. Used Apache Spark’s MLlib machine learning library and Python to create a logistic regression model to predict video popularity.
Million Song Dataset Analysis, December 2015
Parsed Million Song Dataset files (HDF5 format) and stored them in AWS S3 as comma-separated values (CSV) files. Used Apache Hive to aggregate additional song data from Billboard charts and stored it in a MySQL database. Used R and Shiny to display histograms, scatter plots and linear regressions of the song data.
Improv, biking, traveling, playing Scrabble, cooking and baking pies