Exploring the menu collection of the New York Public Library
With approximately 17,500 transcribed menus dating from the 1850s to the present, The New York Public Library’s restaurant menu collection is one of the largest in the world, used by historians, chefs, novelists and everyday food enthusiasts.
The menus record specific information about dishes, their prices, and where each item appears on the page. From there, we can piece together the stories these items tell about how various dishes evolved over time.
Through our exploration of this extensive collection, we learned that the data included menus from special occasions such as Thanksgiving, Christmas, and birthdays, and that a single menu item such as 'fried sweet potatoes' could be listed in hundreds of different ways.
As such, we began by cleaning the data and clustering common dishes together to gain a better understanding of overall trends.
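To illustrate the kind of cleaning involved, here is a minimal sketch of one way to fold spelling and punctuation variants of a dish name into a single key. The function names and the normalization rules are assumptions for illustration, not the project's actual pipeline.

```python
import re
from collections import defaultdict


def normalize_dish_name(name):
    """Lowercase a dish name, drop punctuation and digits, collapse whitespace.

    Assumption: names are plain ASCII; accented letters would need extra handling.
    """
    lowered = name.lower()
    letters_only = re.sub(r"[^a-z\s]", " ", lowered)
    return " ".join(letters_only.split())


def group_variants(names):
    """Map each normalized name to the list of raw spellings that produced it."""
    groups = defaultdict(list)
    for raw in names:
        groups[normalize_dish_name(raw)].append(raw)
    return dict(groups)


# Made-up variants of the example dish mentioned in the text.
variants = ["Fried Sweet Potatoes", "fried sweet potatoes.", "FRIED  SWEET-POTATOES"]
grouped = group_variants(variants)
```

With these rules, all three sample spellings end up under the single key `fried sweet potatoes`.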
WHAT'S IN THE ARCHIVE
In total, here are the numbers of menus, dishes, and years that the NYPL collection contains.
- 17,544 Menus
- 1,331,458 Dishes
- 157 Years
MENU ITEM CLUSTERING
How can we efficiently use the menu collection?
From the 1.3 million dishes on menus from around the globe, we narrowed the scope of our project to the U.S., keeping roughly 13,000 menus and 90,000 dishes chosen by "popularity," meaning how frequently they appeared. Then, we grouped those dishes into 25 clusters based on their common names. Finally, we placed the clusters into food groups analogous to those of the food pyramid. The accompanying tree map lets you explore our clustering methodology.
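The grouping step can be sketched roughly as follows. This is a simplified keyword-matching approach under stated assumptions: the keyword list, the cluster labels, the food-group mapping, and the sample counts are all hypothetical, since the project's actual 25 clusters and matching rules are not listed here.

```python
from collections import Counter

# Hypothetical keyword -> cluster mapping (illustrative only).
CLUSTER_KEYWORDS = {"coffee": "coffee", "potato": "potatoes", "chicken": "chicken"}

# Hypothetical cluster -> food-group mapping, loosely modeled on a food pyramid.
FOOD_GROUPS = {"coffee": "beverages", "potatoes": "vegetables", "chicken": "protein"}


def assign_cluster(dish_name):
    """Return the first cluster whose keyword occurs in the dish name, else None."""
    lowered = dish_name.lower()
    for keyword, cluster in CLUSTER_KEYWORDS.items():
        if keyword in lowered:
            return cluster
    return None


def cluster_counts(dish_appearances):
    """Sum appearance counts per cluster from (dish_name, times_appeared) pairs."""
    counts = Counter()
    for name, times_appeared in dish_appearances:
        cluster = assign_cluster(name)
        if cluster is not None:
            counts[cluster] += times_appeared
    return counts


# Made-up appearance counts for demonstration.
sample = [
    ("Fried sweet potatoes", 120),
    ("Mashed potatoes", 300),
    ("Coffee, per cup", 500),
    ("Roast chicken", 200),
    ("Consomme", 80),
]
counts = cluster_counts(sample)
by_food_group = Counter()
for cluster, total in counts.items():
    by_food_group[FOOD_GROUPS[cluster]] += total
```

Substring matching is crude (it would, for example, place "chicken and potatoes" in whichever cluster is checked first), so a real pipeline would likely need priority rules or manual review.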
PLACEMENT OF DISHES ON MENUS
How did dishes rise and fall on the menu pages themselves?
One exciting feature of the NYPL dataset was the way each menu item was tagged with an (X,Y) coordinate indicating its spatial placement on a menu page. As you read a menu from top to bottom, you generally see starters first, then main courses or special dishes, then sides, and finally desserts and beverages. A dish's vertical position is therefore also a proxy for how special or trendy it might be. We wondered whether the Top 25 menu items shifted roles, from starter to main course or from main course to after-dinner fare, so we plotted the change in Y-coordinate for each dish over time.
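One way to summarize this kind of shift is to average each dish's Y-coordinate per decade. The sketch below assumes each record is a `(dish, year, ypos)` tuple and that `ypos` is a fraction of page height with 0 at the top; both the record layout and the sample values are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean


def mean_ypos_by_decade(records):
    """Average vertical position per (dish, decade).

    records: iterable of (dish, year, ypos) tuples, where ypos is assumed to be
    a 0-1 fraction of page height measured from the top.
    """
    buckets = defaultdict(list)
    for dish, year, ypos in records:
        decade = (year // 10) * 10
        buckets[(dish, decade)].append(ypos)
    return {key: mean(values) for key, values in buckets.items()}


# Made-up records for demonstration.
records = [
    ("coffee", 1901, 0.80),
    ("coffee", 1905, 0.90),
    ("coffee", 1952, 0.60),
]
trend = mean_ypos_by_decade(records)
```

Under these assumptions, a falling average over the decades would mean the dish is drifting toward the top of the page.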
MODERN MENU COMPARISONS
How do the NYPL's top dishes stack up against modern menus?
Finally, we wondered how the historical menu data might match up with today's cuisine. So we scraped menu data from more than 1,000 San Francisco restaurants in spring 2015, ranging from top spots like Gary Danko all the way down to corner sandwich shops. We then grouped dish names using the same criteria we applied to the NYPL dataset to see how prevalent the NYPL's Top 25 dishes are on a modern set of menus.
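A comparison like this can be expressed as the share of modern menus that contain at least one item matching each cluster keyword. The sketch below is illustrative only: the `prevalence` function, the keyword matching, and the sample menus are assumptions, not the scraped San Francisco data or the project's actual criteria.

```python
def prevalence(menus, keywords):
    """Return, for each keyword, the fraction of menus with a matching item.

    menus: list of menus, each a list of dish-name strings.
    keywords: lowercase substrings standing in for dish clusters.
    """
    shares = {}
    for keyword in keywords:
        hits = sum(
            1 for menu in menus
            if any(keyword in item.lower() for item in menu)
        )
        shares[keyword] = hits / len(menus) if menus else 0.0
    return shares


# Made-up modern menus for demonstration.
modern_menus = [
    ["Roast Chicken", "Coffee"],
    ["Turkey Sandwich"],
    ["Espresso", "Chicken Salad"],
    ["Iced Coffee"],
]
shares = prevalence(modern_menus, ["coffee", "chicken"])
```

Counting each menu at most once per keyword keeps a long menu with many variants of one dish from dominating the result.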
Credits & Acknowledgements
Special thanks to the New York Public Library for providing us with an incredible (and fun!) resource to work with over the past month. We would also like to thank scholar Trevor Munoz, whose work with the NYPL dataset helped to inspire our clustering methods. Finally, an extra helping of dessert for Victor Yee, whose guidance during our data crunching was invaluable.