Open Data Project Exhibition

Tuesday, May 7, 2013
2:15 pm - 3:30 pm
210 South Hall

Join us for an exhibition of the final projects from the course “Info 290T. Working with Open Data.”

Open data — data that is free for use, reuse, and redistribution — is an intellectual treasure-trove that has given rise to many unexpected and often fruitful applications. In this course, students will 1) learn how to access, visualize, clean, interpret, and share data, especially open data, using Python, Python-based libraries, and supplementary computational frameworks and 2) understand the theoretical underpinnings of open data and their connections to implementations in the physical and life sciences, government, social sciences, and journalism.

Student Projects

Stock Performance of Product Releases
Edward Lee, Eugene Kim

We draw connections among open data sources about Apple to examine how a given product release affected Apple's stock performance. We use Wikipedia data for detailed information on Apple's product releases, Yahoo Finance's API for specific stock performance metrics, and openly available Form 10-Q filings for internal (financial) changes at Apple. The main purpose is to examine the available data and draw new conclusions centered on the period around each product release date.

Education First
Carl Shan, Bharathkumar Gunasekaran, Haroon Rasheed Paul Mohammed, Sumedh Sawant

Most parents nowadays have a general sense of the significant factors in choosing a school for their children. However, given the lack of existing tools and information sources, most of them have a hard time measuring, weighing, and comparing these factors across geographical areas when trying to pick the best place to live with the best schools. Our team aims to address this problem by combining statistical data from the NCES with geo-data to help parents through that process. Parents can specify exactly which parameters they consider important in their decision, and we will generate a heat map of the state they’re interested in living in, dynamically colored according to how closely each county matches their preferences. The heat map will be displayed in a web browser.
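As a rough illustration of how a county could be scored against a parent's preferences, the sketch below min-max normalizes each metric across counties and takes a weighted average. This is not the team's actual method: the metric names, the weights, and the assumption that a higher value is always better are all hypothetical.

```python
def match_scores(counties, weights):
    """Return a 0-1 preference-match score per county.

    counties: {county_name: {metric_name: value}}
    weights:  {metric_name: importance}, as chosen by the parent
    Assumption: for every metric, a higher value is better.
    """
    metrics = [m for m, w in weights.items() if w > 0]
    if not metrics:
        return {name: 0.0 for name in counties}
    total_weight = sum(weights[m] for m in metrics)

    # Range of each metric across all counties, used for normalization.
    lowest = {m: min(c[m] for c in counties.values()) for m in metrics}
    highest = {m: max(c[m] for c in counties.values()) for m in metrics}

    scores = {}
    for name, values in counties.items():
        weighted_sum = 0.0
        for m in metrics:
            span = highest[m] - lowest[m]
            # If every county has the same value, treat them all as a full match.
            normalized = (values[m] - lowest[m]) / span if span else 1.0
            weighted_sum += weights[m] * normalized
        scores[name] = weighted_sum / total_weight
    return scores


# Hypothetical counties and metrics, for illustration only.
example_counties = {
    "County A": {"grad_rate": 90, "teachers_per_100": 8},
    "County B": {"grad_rate": 70, "teachers_per_100": 4},
    "County C": {"grad_rate": 80, "teachers_per_100": 8},
}
example_weights = {"grad_rate": 2, "teachers_per_100": 1}
print(match_scores(example_counties, example_weights))
```

A score like this could then be mapped to a color scale to shade each county on the map.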

The League of Champions
Natarajan Chakrapani, Mark Davidoff, Kuldeep Kapade

In the soccer world, a lot of money is involved in the transfer of players in the premier leagues around the world. Focusing on the English Premier League, our project, “The League of Champions,” aims to analyze the return on investment of player transfers made by teams in that league. It measures each club's return on every dollar spent on acquired players over a season, using parameters such as goals scored, active time on the field, and assists. In addition, we analyze how big a factor player age is in commanding a high transfer fee, and whether clubs prefer to pay large amounts for specialist players in specific field positions.
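One simple way to express "return on each dollar spent" is to total a club's transfer fees and the output of its acquired players, then divide. The sketch below assumes a hypothetical `Transfer` record with made-up field names and data; it is not the project's actual code or dataset.

```python
from dataclasses import dataclass


@dataclass
class Transfer:
    # Hypothetical record layout, for illustration only.
    player: str
    club: str
    fee: float     # transfer fee, all in the same currency
    goals: int
    assists: int
    minutes: int   # active time on the field during the season


def return_per_fee(transfers):
    """Return {club: {"goals": ..., "assists": ..., "minutes": ...}} per unit of fee."""
    totals = {}
    for t in transfers:
        club = totals.setdefault(
            t.club, {"fee": 0.0, "goals": 0, "assists": 0, "minutes": 0}
        )
        club["fee"] += t.fee
        club["goals"] += t.goals
        club["assists"] += t.assists
        club["minutes"] += t.minutes

    result = {}
    for club, s in totals.items():
        if s["fee"] <= 0:
            # Clubs with only free transfers have no defined return per fee.
            continue
        result[club] = {k: s[k] / s["fee"] for k in ("goals", "assists", "minutes")}
    return result


# Made-up example data (fees in millions).
example = [
    Transfer("Player A", "Club X", 10.0, 5, 2, 900),
    Transfer("Player B", "Club X", 10.0, 5, 4, 1100),
    Transfer("Player C", "Club Y", 0.0, 3, 1, 500),
]
print(return_per_fee(example))
```

Aggregating per club before dividing avoids letting a single cheap, high-scoring player dominate the ratio.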

All About TEDx
Chan Kim, JT Huang

TED is a nonprofit devoted to Ideas Worth Spreading. It started out (in 1984) as a conference bringing together people from three worlds: Technology, Entertainment, Design. The TED Open Translation Project brings TED Talks beyond the English-speaking world by offering subtitles, interactive transcripts, and the ability for any talk to be translated by volunteers worldwide. The project was launched with 300 translations, 40 languages, and 200 volunteer translators; now there are more than 32,000 completed translations from the thousands-strong community. The TEDx program is designed to give communities the opportunity to stimulate dialogue through TED-like experiences at the local level. Our project aims to encourage people to translate TEDx Talks as well, by showing how TEDx Talk videos are translated and spread across different languages, places, and topics, and by comparing that spread with that of TED Talk videos.

Environmental Health Gap
Rohan Salantry, Deborah Linton, Alec Hanefeld, Eric Zan

There is growing evidence that environmental factors trigger chronic diseases such as asthma, which result in billions in health care costs. However, a gap in knowledge exists concerning the extent of the link and where it is most prevalent. We aim to create a framework for closing this gap by integrating health and environmental condition data sets. Specifically, the project will link emissions data from the EPA with data from the California Department of Public Health in an attempt to find a correlation between the incidence of asthma treatments and emissions regarded as asthma triggers. The project hopes to be a stepping stone for policy decisions concerning the value tradeoff between health care treatment and environmental regulation, as well as where to concentrate resources based on severity of need.

World Bank Data Analysis
Aisha Kigongo, Sydney Friedman, Ignacio Pérez

Our goal was to use a variety of tools to investigate the impact of project funding in developing countries. To do so, we looked at open data from the World Bank, which keeps close track of every project that gets funded, who funds it, and the goal of the project, whether agricultural, economic, or health-related. Using Python, we used an index of the World Bank to see which countries received the most funding and how they related to various indicators such as the Human Development Index, the Freedom Index, and, in future work, health, educational, and other economic indexes. Our secondary goal is to analyze what insight open data can give us into how effective initiatives and funding actually are, as opposed to what they are meant to be.

Dr. Book
AJ Renold, Shohei Narron, Alice Wang

When we read a book, all the information is contained in that resource. But what if you could learn more about a concept, historical figure, or location presented in a chapter? Dr. Book expands your reading experience by connecting people, places, topics and concepts within a book to render a webpage linking these resources to Wikipedia.

Book Hunters
Fred Chasen, Luis Aguilar, Sonali Sharma

When we search for books on the internet, we are often overwhelmed with results coming from various sources. It’s difficult to get direct, trusted URLs to books. Project Gutenberg, HathiTrust, and Open Library all provide an extensive library of books online, each with its own large repository of titles. By combining their catalogs, Book Hunters enables querying for a book across those different sources. Our project will also highlight key statistics about the three datasets, including the number of books in each data source, formats, languages, and publishing dates. In addition, users can search for a particular book of interest, and we will return combined results from all three sources along with a direct link to the PDF, text, or EPUB format of the book. This will be an exercise in filtering results for users and providing them with easy access to the books they are looking for. An example of analysis on a subset of data can be found at:

March 26, 2015