A new study examining meme creation found that AI-generated meme captions on existing famous meme images scored higher on average for humor, creativity, and "shareability" than those made by people. Even so, people still created the most exceptional individual examples.
The research, which will be presented at the 2025 International Conference on Intelligent User Interfaces, reveals a nuanced picture of how AI and humans differ at humor-creation tasks. The results were surprising enough that one expert declared victory for the machines.
"I regret to announce that the meme Turing Test has been passed," wrote Wharton professor Ethan Mollick on Bluesky after reviewing the study results. Mollick studies AI academically, and he's referring to a famous test proposed by computing pioneer Alan Turing in 1950 that seeks to determine whether humans can distinguish between AI outputs and human-created content.
But maybe it's too soon to crown the robots. As the paper states, "While AI can boost productivity and create content that appeals to a broad audience, human creativity remains crucial for content that connects on a deeper level."
The international research team, drawn from KTH Royal Institute of Technology in Sweden and from LMU Munich and TU Darmstadt in Germany, compared meme-creation quality across three conditions: humans working alone, humans collaborating with a large language model (LLM), specifically OpenAI's GPT-4o, and memes generated entirely by GPT-4o without human input.
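For readers curious what the AI-only condition might look like in practice, here is a minimal sketch of prompting GPT-4o for a caption through the OpenAI Python SDK. The prompt wording, the temperature setting, and the caption_meme helper are illustrative assumptions, not the researchers' actual pipeline.

```python
# Illustrative sketch of an AI-only caption request (assumed prompts and
# parameters; not the study's actual setup). Requires the `openai` package
# and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

def caption_meme(template_name: str, topic: str) -> str:
    """Ask GPT-4o for one caption for a well-known meme template.

    The arguments mirror the study's framing: captioning existing famous
    meme images within a given category (work, food, or sports).
    """
    response = client.chat.completions.create(
        model="gpt-4o",
        temperature=1.0,  # assumed value; higher values favor variety
        messages=[
            {"role": "system",
             "content": "You write short, funny meme captions."},
            {"role": "user",
             "content": f"Write one caption for the '{template_name}' "
                        f"meme template, on the topic of {topic}."},
        ],
    )
    return response.choices[0].message.content.strip()

if __name__ == "__main__":
    print(caption_meme("Distracted Boyfriend", "work"))
```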

The researchers tested meme captions across three relatable categories (work, food, and sports) to explore how well AI and humans handled humor in familiar contexts. Performance varied notably by category: memes about work, for example, tended to be rated higher for humor and shareability than those about food or sports, showing that context shapes how well meme humor lands, whether the author is human or AI.