MIDS Capstone Project Spring 2020

Humorbot: Can NLP understand humor?

With Humorbot, we explore state-of-the-art NLP and deep learning through humor. Our goals are to classify humor using jokes from Reddit and other public sources, to generate jokes, and to analyze the inner workings of GPT-2 and BERT to see how the models handle humor and jokes. We built a working text-generation pipeline: GPT-2 generates jokes, several BERT models classify the generated jokes and remove those that are redundant or toxic, and a final BERT classifier picks the jokes with the highest predicted likelihood of being humorous. We then posted our generated jokes to Reddit for some real-world feedback.
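The filter-then-rank stage of such a pipeline can be sketched as follows. This is a minimal illustration, not the project's actual code: the function `select_jokes`, its parameters, the threshold values, and the three toy scoring functions are all hypothetical stand-ins for the GPT-2 output and the BERT classifiers described above.

```python
from typing import Callable, Iterable, List, Tuple


def select_jokes(
    candidates: Iterable[str],
    is_redundant: Callable[[str, List[str]], bool],
    toxicity: Callable[[str], float],
    humor: Callable[[str], float],
    max_toxicity: float = 0.5,  # assumed cutoff, not from the project
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Drop redundant or toxic candidates, then rank the rest by humor score."""
    kept: List[str] = []
    for joke in candidates:
        if is_redundant(joke, kept):
            continue  # too similar to a joke we already kept
        if toxicity(joke) > max_toxicity:
            continue  # flagged as toxic
        kept.append(joke)
    scored = [(joke, humor(joke)) for joke in kept]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy stand-ins for the model-based classifiers (hypothetical).
def toy_is_redundant(joke: str, kept: List[str]) -> bool:
    norm = joke.lower().strip()
    return any(norm == other.lower().strip() for other in kept)


def toy_toxicity(joke: str) -> float:
    return 1.0 if "badword" in joke.lower() else 0.0


def toy_humor(joke: str) -> float:
    return 1.0 if "?" in joke else 0.1


# Stand-in for text that a generator would produce.
generated = [
    "Why did the chicken cross the road? To get to the other side.",
    "why did the chicken cross the road? to get to the other side.",
    "This one contains badword.",
    "I like turtles.",
]

selected = select_jokes(
    generated, toy_is_redundant, toy_toxicity, toy_humor, top_k=2
)
for joke, score in selected:
    print(f"{score:.1f}  {joke}")
```

Passing the classifiers in as callables keeps the selection logic independent of any particular model; in a real pipeline each callable would wrap a fine-tuned classifier instead of a string check.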

Lastly, we tested whether our joke classification results were due to a Clever Hans effect, in which a model relies on surface cues rather than the humor itself, by slightly modifying the jokes to remove the humor while keeping their meaning.
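One way to quantify such a check is to measure how often a classifier still labels the de-humored version as a joke. The sketch below assumes a hypothetical `still_funny_rate` helper, an assumed 0.5 decision threshold, and a deliberately shallow toy classifier; the example pairs are illustrative and are not taken from the project's data.

```python
from typing import Callable, List, Tuple


def still_funny_rate(
    pairs: List[Tuple[str, str]],
    humor: Callable[[str], float],
    threshold: float = 0.5,  # assumed decision threshold
) -> float:
    """Among originals classified as jokes, return the fraction whose
    de-humored rewrite is also classified as a joke."""
    flagged = [(orig, mod) for orig, mod in pairs if humor(orig) >= threshold]
    if not flagged:
        return 0.0
    fooled = sum(1 for _, mod in flagged if humor(mod) >= threshold)
    return fooled / len(flagged)


# Toy classifier that keys only on the question-answer format (hypothetical).
def toy_humor(text: str) -> float:
    return 0.9 if text.startswith("Why") and "?" in text else 0.1


# (original joke, rewrite with the punchline neutralized)
pairs = [
    (
        "Why did the scarecrow win an award? He was outstanding in his field.",
        "Why did the scarecrow win an award? He was very good at his job.",
    ),
    (
        "I told my wife her eyebrows were too high. She looked surprised.",
        "I told my wife her eyebrows were too high. She looked annoyed.",
    ),
]

rate = still_funny_rate(pairs, toy_humor)
print(f"Rewrites still classified as jokes: {rate:.0%}")
```

A rate near 1 would suggest the classifier responds to the joke's format rather than its humor, while a rate near 0 would suggest it is sensitive to the punchline itself.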

Our analysis suggests that state-of-the-art models perform well at classifying jokes, but the quality of generated jokes is uneven. Given this and what we think is a Clever Hans effect in our classification results, we think deep learning does not "understand" humor, but does pick up on the patterns of jokes and puns.

Last updated: April 15, 2020