MIMS Final Project 2025

Beyond the Benchmark: Generalization Limits of Deepfake Detectors in the Wild

Abstract

In light of the rapid evolution of image generation, distinguishing real images from fake ones has become increasingly challenging. This paper evaluates the potential of fine-tuning to generalize deepfake detection models across generative models. We investigated two types of generalization: adaptive in-domain generalization, a model's ability to learn a task it is trained on directly, and zero-shot generalization, its ability to perform on data from generative models it was never trained on. Across a range of experiments with a CLIP-based deepfake detector, we found that no fine-tuning method achieved meaningful zero-shot generalization, although the model showed strong adaptive in-domain generalization. During rehearsal learning, however, we observed a consistent pattern in how training on a specific generative model affected performance on the other generative models. These findings motivate our future work on parameter-efficient fine-tuning and model architecture.
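To make the setup concrete, below is a minimal sketch of one common way to fine-tune a CLIP-based deepfake detector: freeze the CLIP vision encoder and train a small binary real/fake classification head on top (a linear probe). The backbone checkpoint, head design, and training step are illustrative assumptions, not necessarily the exact configuration evaluated in this paper.

    import torch
    import torch.nn as nn
    from transformers import CLIPVisionModel

    class CLIPDeepfakeDetector(nn.Module):
        """Frozen CLIP vision backbone with a small binary real/fake head.

        Illustrative sketch; backbone choice and head design are assumptions.
        """

        def __init__(self, backbone: str = "openai/clip-vit-base-patch32"):
            super().__init__()
            self.encoder = CLIPVisionModel.from_pretrained(backbone)
            # Linear-probe setup: keep the CLIP encoder frozen.
            for p in self.encoder.parameters():
                p.requires_grad = False
            self.head = nn.Linear(self.encoder.config.hidden_size, 1)

        def forward(self, pixel_values: torch.Tensor) -> torch.Tensor:
            # pooler_output is the pooled CLS embedding of the vision encoder.
            feats = self.encoder(pixel_values=pixel_values).pooler_output
            return self.head(feats).squeeze(-1)  # one logit per image

    def train_step(model, batch, optimizer, loss_fn=nn.BCEWithLogitsLoss()):
        """One fine-tuning step on a batch of (images, real/fake labels)."""
        pixel_values, labels = batch  # labels: 1.0 = fake, 0.0 = real
        logits = model(pixel_values)
        loss = loss_fn(logits, labels.float())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        return loss.item()

In the rehearsal-learning setting described above, the same training step would simply be fed batches that mix samples from previously seen generative models with samples from the one currently being trained on.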

Last updated: May 15, 2025