MIDS Capstone Project Fall 2022

TitleAI

Our Mission

TitleAI tackles the disconnect in relevance and interpretability between a Reddit post and its title, a pervasive issue across Reddit communities. Using data scraped from Reddit, our team fine-tuned two state-of-the-art summarization models, T5 and BART, to produce more accurate and relevant titles that meet the standards the average Reddit user expects. This improves every user's ability to find topical information and makes it easier to create a post.

Motivation

In 2021, Reddit had an estimated 430 million monthly users and 300 million posts. Based on online research and market analysis, we estimate that 45% of content-driven Reddit posts have titles of suboptimal length, irrelevant title content, and/or otherwise ineffective titles. We researched what drives content engagement and what a high-quality title should consist of, drawing on sources from newspaper headlines to blog posts. The consensus is that a good title stays within a certain length in words or characters, effectively grabs the reader's attention, and accurately summarizes the content.

How it Works

We fine-tuned two state-of-the-art headline generation models, BART and T5, on a large corpus of Reddit data that was cleaned and preprocessed to yield high-quality training examples. This enables our product to generate more effective and relevant titles while maintaining the flavor of the titles users see on the Reddit platform.
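
To make the preprocessing step more concrete, here is a minimal sketch of how scraped posts might be turned into (post body, title) training pairs. It is an illustration, not our actual pipeline. The word-count thresholds, the `Example` class, and the functions `clean_text`, `make_example`, and `build_dataset` are all hypothetical names and values chosen for this example. The `"summarize: "` prefix follows the usual T5 convention for summarization inputs; BART would typically be given the body without a prefix.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

URL_RE = re.compile(r"https?://\S+")
WS_RE = re.compile(r"\s+")

# Hypothetical thresholds, chosen only for illustration.
MIN_TITLE_WORDS = 4
MAX_TITLE_WORDS = 20
MIN_BODY_WORDS = 30


@dataclass
class Example:
    source: str  # model input: the post body (optionally prefixed)
    target: str  # model output: the post title


def clean_text(text: str) -> str:
    """Remove URLs and collapse runs of whitespace into single spaces."""
    text = URL_RE.sub("", text)
    return WS_RE.sub(" ", text).strip()


def make_example(title: str, body: str,
                 prefix: str = "summarize: ") -> Optional[Example]:
    """Return a training pair, or None if the post fails a quality filter."""
    title, body = clean_text(title), clean_text(body)
    # Skip posts whose body was removed or deleted.
    if body in ("[removed]", "[deleted]"):
        return None
    # Keep only titles within the assumed length range.
    if not MIN_TITLE_WORDS <= len(title.split()) <= MAX_TITLE_WORDS:
        return None
    # Very short bodies give the model too little to summarize.
    if len(body.split()) < MIN_BODY_WORDS:
        return None
    return Example(source=prefix + body, target=title)


def build_dataset(posts: Iterable[Tuple[str, str]]) -> List[Example]:
    """Convert (title, body) pairs into training examples, dropping rejects."""
    examples = []
    for title, body in posts:
        example = make_example(title, body)
        if example is not None:
            examples.append(example)
    return examples
```

Passing `prefix=""` to `make_example` would produce inputs suited to a model such as BART that does not expect a task prefix.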

More Information

Last updated:

December 8, 2022