MIDS Capstone Project Spring 2025

DeepBeauty AI

Team members

Problem & Motivation

In the cosmetic industry, consumers frequently encounter opaque pricing structures, marketing misinformation, and difficulty identifying product alternatives¹. Motivated by these user pain points, our team developed DeepBeauty AI, a Chrome extension that delivers transparent, ingredient-driven product recommendations directly within the online shopping experience. Our goal is to empower consumers to discover trustworthy, cost-effective alternatives without the burden of extensive research or brand bias.

Primary Data Sources

EWG Skin Deep Database² (85,000 products)

Ingredient Information: Offers comprehensive, reputable data on cosmetic ingredient formulations and their associated safety profiles.
Foundation for Ingredient Formulation Analysis: We based the search for alternative products on ingredients because this information is readily available on product labels and because products in the same category tend to share similar ingredient characteristics³,⁵,⁷.

Data Pipeline

Overall System Architecture

Model Development

Our approach combines unsupervised and supervised methods: a transformer-based model produces text embeddings for similarity search, and a logistic regression classifier filters the results to deliver accurate and transparent product recommendations. The following steps outline the key components of our data science pipeline:

Pre-Processing

Preprocessing & Tokenization: We standardized the ingredient lists of EWG products by removing non-informative tokens, punctuation, and extraneous formatting. Product metadata (brand, product name, category, and product image) was consolidated and stored in Amazon S3 for easy retrieval.
Transformer Embeddings: We used the E5-Large model to convert the preprocessed ingredient lists into dense vector embeddings that capture both local and contextual cues, which the ingredient-based similarity scores depend on. L2 normalization gives every embedding unit length, so the inner product of two embeddings equals their cosine similarity and scores are comparable across products.
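The cleaning and normalization steps can be sketched as follows. This is a minimal illustration rather than the team's actual code: the cleaning rules (dropping parenthetical notes, splitting on commas and semicolons, deduplicating) and the function names are assumptions, and short hand-written vectors stand in for E5-Large embeddings.

```python
import math
import re


def clean_ingredients(raw: str) -> list[str]:
    """Lowercase, drop parenthetical notes and punctuation, and deduplicate.

    The specific rules here are assumed for illustration only.
    """
    text = re.sub(r"\([^)]*\)", " ", raw.lower())    # drop "(aqua)"-style notes
    seen, cleaned = set(), []
    for part in re.split(r"[,;]", text):
        token = re.sub(r"[^a-z0-9\- ]", " ", part)   # remove stray punctuation
        token = " ".join(token.split())              # collapse whitespace
        if token and token not in seen:
            seen.add(token)
            cleaned.append(token)
    return cleaned


def l2_normalize(vec: list[float]) -> list[float]:
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity computed as the inner product of unit vectors."""
    ua, ub = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(ua, ub))
```

Because the vectors are normalized once up front, every later comparison reduces to a plain inner product, which is the operation a vector index can compute quickly.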

Unsupervised Similarity Search

FAISS Index: Product embeddings were indexed in FAISS (Facebook AI Similarity Search), enabling high-speed nearest-neighbor searches for product score retrieval.
Candidate Retrieval: When a user selects a product on Sephora’s website, the product’s ingredient list and associated metadata are sent in a POST request to our API Gateway. The request triggers a top-k retrieval of products based on ingredient similarity. Before the results reach our classifier, they are filtered to keep candidates in the same category and to favor brand-diverse alternatives, improving recommendation variety and robustness.
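A rough sketch of the retrieve-then-filter step is shown below. It uses a brute-force inner-product scan as a stand-in for the FAISS index, which returns the same ranking on a small catalog when embeddings are L2-normalized. The `Product` fields, the `max_per_brand` cap, and the exact filtering rules are assumptions made for illustration; the write-up does not specify them.

```python
from dataclasses import dataclass


@dataclass
class Product:
    name: str
    brand: str
    category: str
    embedding: list[float]  # assumed to be L2-normalized already


def retrieve_candidates(query: Product, catalog: list[Product], k: int = 3,
                        max_per_brand: int = 1) -> list[tuple[Product, float]]:
    """Return up to k same-category products, at most max_per_brand per brand."""
    # Score every same-category product by inner product (cosine similarity
    # for unit vectors). A FAISS index would replace this linear scan.
    scored = [
        (p, sum(a * b for a, b in zip(query.embedding, p.embedding)))
        for p in catalog
        if p.category == query.category and p.name != query.name
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)

    # Walk down the ranking, capping how many results each brand contributes.
    per_brand: dict[str, int] = {}
    results: list[tuple[Product, float]] = []
    for product, score in scored:
        if per_brand.get(product.brand, 0) >= max_per_brand:
            continue
        per_brand[product.brand] = per_brand.get(product.brand, 0) + 1
        results.append((product, score))
        if len(results) == k:
            break
    return results
```

Capping results per brand is one simple way to get brand diversity; it trades a little similarity at the bottom of the list for a wider spread of options.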

Supervised Refinement

Manual Labeling: To simulate user feedback, we created a labeled dataset of candidate dupe pairs, annotating each as “valid” (1) or “invalid” (0). This ground truth was pivotal for training a supervised classifier to weed out false positives, which we considered “bad matches.” The proxy for user feedback matters because a similar ingredient formulation is not the only thing people consider when deciding whether one product is a good alternative to another⁴,⁶. When annotating candidates, we treated product characteristics such as color and finish (e.g., “gloss” vs. “matte”) as deciding factors.
Classifier Comparisons: We evaluated Random Forest, Logistic Regression, XGBoost, and a Multilayer Perceptron (MLP). Input features included the ingredient similarity score, an additional cosine similarity score computed on product-name embeddings, the number of ingredients, and the ingredient differential.
Chosen Approach: Logistic Regression offered the best trade-off between performance and interpretability, serving as a final filter on top-k results to ensure high-quality “dupe” recommendations.
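The filtering step might look roughly like the sketch below, which builds a feature vector for each candidate pair and scores it with a logistic function. Several details are assumptions: "ingredient differential" is interpreted here as the number of ingredients the two products do not share, the weights and bias would come from a trained model rather than being set by hand, and the 0.5 cutoff is only a default.

```python
import math


def pair_features(ingredient_sim: float, name_sim: float,
                  query_ingredients: list[str],
                  candidate_ingredients: list[str]) -> list[float]:
    """Assemble the features described in the text for one candidate pair."""
    q, c = set(query_ingredients), set(candidate_ingredients)
    return [
        ingredient_sim,      # cosine similarity of ingredient embeddings
        name_sim,            # cosine similarity of product-name embeddings
        float(len(c)),       # number of ingredients in the candidate
        float(len(q ^ c)),   # assumed meaning of "ingredient differential"
    ]


def dupe_probability(features: list[float], weights: list[float],
                     bias: float) -> float:
    """Logistic regression score: sigmoid of a weighted sum of the features."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def filter_candidates(candidates: list[tuple[str, list[float]]],
                      weights: list[float], bias: float,
                      threshold: float = 0.5) -> list[tuple[str, float]]:
    """Keep candidates scoring at or above threshold, highest score first."""
    scored = [(name, dupe_probability(f, weights, bias)) for name, f in candidates]
    kept = [(name, p) for name, p in scored if p >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)
```

Part of the interpretability appeal is visible here: each weight states directly how much a feature pushes a pair toward or away from being called a dupe.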

Deployment & Integration

Amazon SageMaker, Lambda & API Gateway: We use Amazon SageMaker for model inference and AWS Lambda to connect our model endpoints and retrieve product metadata stored in S3. Our first model endpoint outputs the top “n” products and their similarity scores. These products and scores are then passed to a logistic regression filtering model, which ranks them on more than raw similarity. The ranked products are returned to the user along with metadata stored in S3, including the product image, name, and Google Shopping URL.
Chrome Extension: As users browse Sephora or similar sites, we query for similar items; the top matches and relevant purchase links are displayed in-line. The extension not only surfaces product recommendations but also includes associated metadata such as product thumbnails, names, brand information, ratings, and prices. Additionally, users can view similarity scores, ingredient details, and confidence levels, ensuring complete transparency. The user-friendly interface is designed to integrate seamlessly into the browsing experience, providing real-time access to actionable insights and facilitating informed purchasing decisions.
Scalability: This multi-stage pipeline (data wrangling → unsupervised retrieval → supervised refinement → deployment) delivers relevant recommendations while remaining flexible enough to absorb continual data updates and user feedback. We hope to extend it to other retail websites in the future.
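To make the request flow concrete, here is a hedged sketch of a Lambda handler behind an API Gateway proxy integration, where the request body arrives as a JSON string in `event["body"]` and the response is a dictionary with `statusCode` and `body`. The payload fields (`ingredients`, `category`, `top_k`) are hypothetical, and `find_alternatives` is a placeholder for the calls to the retrieval and filtering endpoints and the S3 metadata lookup; it returns no real results.

```python
import json


def find_alternatives(ingredients: str, category: str, top_k: int) -> list[dict]:
    """Hypothetical placeholder for the model endpoint calls and S3 lookup."""
    return []  # a real implementation would return ranked product records


def lambda_handler(event, context):
    """Parse the POST body, run the search, and return a JSON response."""
    try:
        payload = json.loads(event.get("body") or "{}")
        ingredients = payload["ingredients"]
        category = payload["category"]
        top_k = int(payload.get("top_k", 5))
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        # Malformed JSON or missing fields: report a client error.
        return {
            "statusCode": 400,
            "body": json.dumps({"error": "ingredients and category are required"}),
        }

    matches = find_alternatives(ingredients, category, top_k)
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"matches": matches}),
    }
```

Validating the payload before calling any model endpoint keeps malformed extension requests from consuming inference time.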

Evaluation

Cosine Similarity Thresholding: We employed a threshold-based heuristic to flag products that closely resembled each other in the embedding space. This initial pass isolated potential “dupes,” helping to streamline our manual labeling efforts.
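A threshold pass of this kind could be sketched as follows. The 0.9 cutoff and the function name are illustrative assumptions (the write-up does not state the threshold used), and the embeddings are assumed to be L2-normalized so that the inner product is the cosine similarity.

```python
from itertools import combinations


def flag_candidate_pairs(embeddings: dict[str, list[float]],
                         threshold: float = 0.9) -> list[tuple[str, str, float]]:
    """Return product pairs whose cosine similarity meets the threshold."""
    flagged = []
    for (name_a, vec_a), (name_b, vec_b) in combinations(embeddings.items(), 2):
        score = sum(x * y for x, y in zip(vec_a, vec_b))  # unit vectors assumed
        if score >= threshold:
            flagged.append((name_a, name_b, score))
    # Highest-similarity pairs first, so labelers see the likeliest dupes early.
    return sorted(flagged, key=lambda item: item[2], reverse=True)
```

Comparing every pair is quadratic in catalog size, so on a catalog of tens of thousands of products this check would more plausibly run over each product's nearest neighbors from the index than over all pairs.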

Model Classification Performance

Manual Labeling: Our team curated a labeled dataset of dupe/non-dupe pairs, providing a robust ground truth for training a supervised filter to refine the FAISS results.
64/16/20 Split: We split the labeled data 64/16/20 into training, validation, and test sets and tracked Accuracy, Precision, Recall, and F1 Score across multiple classifiers.
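A 64/16/20 ratio is what results from holding out 20% of the data for testing and then holding out 20% of the remainder for validation (0.8 × 0.8 = 0.64 and 0.8 × 0.2 = 0.16). The sketch below assumes the split was produced that way, which the write-up does not confirm; the seed and function name are illustrative.

```python
import random


def split_64_16_20(items, seed: int = 42):
    """Shuffle, hold out 20% for test, then 20% of the remainder for validation."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)   # seeded for reproducibility

    n_test = round(0.20 * len(shuffled))
    test, rest = shuffled[:n_test], shuffled[n_test:]

    n_val = round(0.20 * len(rest))
    val, train = rest[:n_val], rest[n_val:]
    return train, val, test
```

With 100 labeled pairs this yields 64 training, 16 validation, and 20 test examples. For a small hand-labeled dataset, stratifying by label would help keep the valid/invalid ratio similar across the three sets.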

Results Summary

Model	Training Runtime	Accuracy	Precision	Recall	F1 Score
Logistic Regression	1 min	0.58	0.59	0.58	0.58
XGBoost	2 min	0.53	0.53	0.53	0.53
Random Forest	3 min	0.51	0.51	0.51	0.51
MLP	2 min	0.53	0.54	0.53	0.53

Takeaway: Logistic Regression scored highest on every metric (0.58 accuracy and F1) and had the shortest training time, while remaining easy to interpret. It serves as the final post-filter that refines the FAISS results.
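In the table, recall equals accuracy for every model. That pattern is what support-weighted averaging over the two classes produces, since weighted recall reduces algebraically to overall accuracy. Whether weighted averaging was actually used is an inference from the numbers, not something the write-up states. The sketch below computes the four metrics under that assumption.

```python
from collections import Counter


def weighted_metrics(y_true: list[int], y_pred: list[int]) -> dict[str, float]:
    """Accuracy plus support-weighted precision, recall, and F1."""
    total = len(y_true)
    support = Counter(y_true)                      # examples per true class
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / total

    precision = recall = f1 = 0.0
    for label, n in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        p_c = tp / predicted if predicted else 0.0   # per-class precision
        r_c = tp / n                                 # per-class recall
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = n / total                           # class share of the data
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Since weighted recall carries no information beyond accuracy, per-class precision and recall for the "valid" label would say more about how well the filter removes bad matches.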

By combining unsupervised similarity detection with a targeted supervised layer, DeepBeauty AI delivers both scalable retrieval and high-quality final recommendations. This hybrid approach ensures that our system remains robust and adaptable as product catalogs evolve and user feedback accumulates.

Key Learnings & Impact

Our experience with DeepBeauty AI highlighted the value of prioritizing ingredient-focused recommendations, which let users discover product alternatives grounded in actual formulation similarities rather than superficial marketing claims. Manual labeling demanded additional effort, but it proved instrumental in improving precision: even a relatively small set of well-labeled examples clarified which products qualified as valid “dupes.” We also found that a hybrid approach, combining unsupervised similarity search via FAISS with a lightweight supervised classifier (Logistic Regression), offered both scalability and accuracy while reinforcing consumer empowerment through real-time, lower-cost, ingredient-conscious alternatives. FAISS indexing and AWS Lambda integration made the process scalable, creating a flexible, performance-oriented infrastructure that can extend to additional product categories.

Next Steps for DeepBeauty

Beyond the ten-week course timeline, we envision enriching DeepBeauty AI in several ways. First, we would integrate price sensitivity into our re-ranking logic, tailoring recommendations to user-specific budgets or desired price ranges. Additionally, we aim to incorporate more detailed reviews through advanced NLP, extracting granular sentiments on factors such as dryness, irritation, and scent to offer more personalized product suggestions. To ensure broader coverage, we would extend our dataset beyond the Environmental Working Group and incorporate additional retailers for a more expansive product catalog. Finally, implementing a user feedback loop in which users can upvote or downvote recommendations would allow the system to continuously refine its suggestions in real time, ensuring that DeepBeauty AI remains both adaptive and user-centered.

Acknowledgements

We extend our deepest gratitude to our instructors, Todd Holloway and Uri Schonfeld, whose invaluable guidance and thoughtful feedback shaped our project’s development. Special thanks to the Environmental Working Group (EWG) for providing essential ingredient data that underpinned our data-driven approach. We are also grateful to the broader teaching staff and our fellow students and beta testers, whose collective insights and perspectives enriched our understanding and helped refine DeepBeauty AI into a more robust and user-focused tool.

DeepBeauty AI stands as a testament to the potential of data science to tackle practical, everyday consumer problems. By merging publicly available datasets with advanced embedding and filtering methods, we offer a seamless, trustworthy, and cost-conscious shopping experience.

References

1. “Beauty Dupes: Friend or Foe?” (2023). Nielsen IQ. Retrieved from https://nielseniq.com/global/en/insights/infographic/2023/beauty-dupes-friend-or-foe-beauty-inner-circle/
2. Environmental Working Group Skin Deep Database. Retrieved from https://www.ewg.org/skindeep/
3. Kulak, Natalia. (2021, July). “The 1% Line Skincare Ingredients Rule.” Retrieved from https://www.pblmagazine.co.uk/news/the-1-line-skincare-ingredients-rule#:~:text=There%20is%20something%20in%20cosmetics,as%20Vitamin%20C%2C%20niacinamide%20etc.
4. Li, Joan. (2023). “74% of Beauty Consumers Agree That Makeup Products from Affordable Brands Work Just as Well as Products from Premium Brands.” Mintel. Retrieved from https://www.mintel.com/press-centre/74-of-beauty-consumers-agree-that-makeup-products-from-affordable-brands-work-just-as-well-as-products-from-premium-brands/
5. “Mindful Beauty Shopping: How to Decode a Beauty Product Label.” (2017, March). Retrieved from https://www.thelifestyle-files.com/mindful-beauty-shopping-how-to-decode-an-ingredient-list/
6. “Personal Care: Market Data and Analysis.” (2024, October). Statista. Retrieved from https://www.statista.com/study/48846/personal-care-market-data-and-analysis/
7. Panico, A. et al. (2019, March). “Skin Safety and Health Prevention: An Overview of Chemicals in Cosmetic Products.” Retrieved from https://pmc.ncbi.nlm.nih.gov/articles/PMC6477564/

Course

Data Science 210. Capstone, Spring 2025

Class Project Gallery

More Information

DeepBeauty AI Website

DeepBeautyAI Slides

Video

Last updated: April 21, 2025