MIDS Capstone Project Fall 2023

Satire Spotter

Team members

Problem & Motivation

With the rise of digital news consumption, there is a growing risk that readers will mistake satirical content for factual reporting, creating an urgent need for clarity in the media landscape. Satire Spotter addresses this challenge by labeling digital content at the level of the individual post, clarifying each post's intent and reducing confusion. It flags whether a post or image is satirical and provides an accuracy rating.

Data Source & Data Science Approach

Our project leverages diverse datasets such as the Satire News Detection Corpus, Crawled Current Onion Articles, a Princeton Senior Thesis Dataset, and the Common Crawl News Dataset. These collections include a mix of satirical and true news text articles, posts, and images, forming a solid foundation for training and evaluating our AI models.

We conducted extensive Exploratory Data Analysis (EDA) on both satirical and non-satirical text articles, using word clouds, sentiment analysis, and Named Entity Recognition (NER) to compare word usage, language patterns, and readability. Our analysis revealed no significant differences in sentence length, word count, or sentiment between satirical and non-satirical texts. This finding underscores the need for machine learning algorithms that can identify subtle patterns not immediately apparent to the average reader.
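The surface-feature part of this comparison can be illustrated with a minimal sketch. It is not the project's actual EDA code: the function names `surface_stats` and `compare_corpora`, the regex-based tokenization, and the toy articles are all assumptions made for illustration. Sentiment is left out because it would require a lexicon or model that the write-up does not name.

```python
import re
from statistics import mean


def surface_stats(text):
    """Return word count and mean sentence length (in words) for one article.

    Hypothetical helper: sentences are split naively on ., ! and ?,
    and words are runs of letters and apostrophes.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    return {
        "word_count": len(words),
        "mean_sentence_length": len(words) / len(sentences) if sentences else 0.0,
    }


def compare_corpora(satire_texts, factual_texts):
    """Average each surface feature per class so the two can be compared.

    Both lists must be non-empty.
    """
    summary = {}
    for label, texts in (("satire", satire_texts), ("factual", factual_texts)):
        stats = [surface_stats(t) for t in texts]
        summary[label] = {
            key: mean(s[key] for s in stats)
            for key in ("word_count", "mean_sentence_length")
        }
    return summary


if __name__ == "__main__":
    # Toy examples, invented for illustration only.
    satire = ["Local man wins argument. Nobody is surprised."]
    factual = ["The council approved the budget. Voting ended at noon."]
    print(compare_corpora(satire, factual))
```

When the per-class averages come out close, as the analysis above reports for the real datasets, simple surface features alone are unlikely to separate the two classes.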

Evaluation

In our initial tests, we applied Logistic Regression 2.0 and Random Forest 2.0 models, achieving accuracy and precision scores of around 86%. To improve performance, we integrated an LSTM deep learning model, which raised our accuracy to the mid-to-upper 90% range.
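For readers unfamiliar with the two metrics reported here, the following sketch shows how accuracy and precision are computed from true and predicted labels. The function name `accuracy_and_precision` and the sample labels are hypothetical; the project's own evaluation code and label encoding are not given in this write-up, so treating `1` as "satire" is an assumption.

```python
def accuracy_and_precision(y_true, y_pred, positive=1):
    """Compute accuracy and precision for a binary classifier.

    accuracy  = correct predictions / all predictions
    precision = true positives / (true positives + false positives)
    `positive` is the label treated as "satire" (assumed to be 1 here).
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    pairs = list(zip(y_true, y_pred))
    correct = sum(1 for t, p in pairs if t == p)
    true_pos = sum(1 for t, p in pairs if p == positive and t == positive)
    false_pos = sum(1 for t, p in pairs if p == positive and t != positive)

    accuracy = correct / len(pairs)
    # If the model never predicts the positive class, precision is undefined;
    # this sketch reports 0.0 in that case.
    predicted_pos = true_pos + false_pos
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    return accuracy, precision


if __name__ == "__main__":
    # Invented labels for illustration: 1 = satire, 0 = not satire.
    y_true = [1, 0, 1, 1, 0, 0, 1, 0]
    y_pred = [1, 0, 1, 0, 0, 1, 1, 0]
    print(accuracy_and_precision(y_true, y_pred))
```

Precision matters for a labeling tool like this one because a false positive means a factual article gets marked as satire.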

For image-based satire detection, after experimenting with CNN and SVM models, we adopted the MobileNet V2 model, which delivered consistently high accuracy in the mid-to-upper 80% range.

Key Learnings & Impact

Widespread Impact: Our findings indicate that this issue affects a broad audience, with 50% of Americans relying on social media for news. The challenge is even more pronounced for international audiences, who may not be familiar with the intricacies of Western satire. The confusion created by this blurred line underscores the critical need for a solution that can demystify media content.

Our project’s response to this need is the development of an AI-driven satire detection tool. This tool is not just a technological advancement; it represents a significant stride toward enhancing media literacy and fostering a more transparent information environment. It holds the potential to benefit a diverse range of users, from everyday social media consumers to international audiences grappling with cultural and linguistic barriers in understanding satire.

Acknowledgments

Special thanks to our team members and all who contributed to the development of Satire Spotter. We acknowledge the valuable insights from domain experts and the feedback from our target users, which have been instrumental in shaping the project.

Mission

Satire Spotter uses AI to detect satire in text and images, clarifying the media landscape and reducing misinformation to foster a better-informed, transparent digital society.

Future Roadmap

We're proud of our progress in illuminating digital content and moving toward a more informed online society and a transparent digital landscape. Our next steps involve expanding detection to audio and video, collaborating with social media platforms for direct model integration, and extending our reach beyond our browser extension. We aim to establish a robust feedback loop with our users, incorporating their ratings and recommendations to continuously refine our model.

Course

Data Science 210. Capstone, Fall 2023

Class Project Gallery

More Information

Satire Spotter Github Page

Satire Spotter Website

final_presentation_1.pdf

Satire Spotter Extension


Last updated: December 12, 2023