**Jiayi Zhang\*¹, Simon Yu¹, Derek Chong², Anthony Sicilia³,**
**Michael R. Tomz², Christopher D. Manning², Weiyan Shi¹**
¹Northeastern University ²Stanford University ³West Virginia University
*\*: Project co-lead. Order determined randomly.*
🌐 **Homepage** | 📜 **Paper** | 🐦 **X Thread** | 💻 **GitHub** | 📓 **Colab** | 🖼️ **Examples** | 📡 **Podcast Summary**
<aside>
TL;DR
- The problem. Post-training alignment leads to mode collapse, reducing LLM diversity.
- The cause. Past studies have blamed mode collapse on algorithmic limitations. We instead identify a fundamental, pervasive cause rooted in the preference data itself: human typicality bias, where annotators systematically prefer familiar answers, which in turn trains models to be less diverse. This was a problem with no name; it explains why different models collapse in similar ways, and why we must rethink how we collect preference data.
- The solution. We find a simple, training-free prompting method: Verbalized Sampling (VS). It asks models to output an explicit probability distribution over responses (e.g., "Generate 5 jokes with their corresponding probabilities").
- Results. Verbalized Sampling increases diversity by 1.6-2.1× in creative writing, improves human evaluation scores by 25.7%, and recovers 66.8% of the base model’s pre-alignment diversity. It provides tunable diversity across tasks such as social simulation, open-ended QA, and synthetic data generation, all without sacrificing safety or quality.
</aside>
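To make the TL;DR concrete, here is a minimal sketch of how Verbalized Sampling could be wired up around any LLM API. The function names, prompt wording, and `<response> | <probability>` line format are illustrative assumptions, not the paper's exact implementation: the core idea is simply to ask for several responses with explicit probabilities, then sample from that verbalized distribution.

```python
import random

def verbalized_sampling_prompt(task: str, k: int = 5) -> str:
    """Build a Verbalized Sampling prompt: ask the model for k responses,
    each with an explicit probability (prompt wording is an assumption)."""
    return (
        f"Generate {k} responses to the following task, each with its "
        f"corresponding probability. Probabilities should sum to 1.\n"
        f"Task: {task}\n"
        f"Format each line as: <response> | <probability>"
    )

def sample_from_verbalized(lines: list[str], rng: random.Random) -> str:
    """Parse '<response> | <probability>' lines from the model's output
    and draw one response in proportion to its stated probability."""
    pairs = []
    for line in lines:
        text, prob = line.rsplit("|", 1)  # split on the last separator
        pairs.append((text.strip(), float(prob)))
    responses, weights = zip(*pairs)
    return rng.choices(responses, weights=weights, k=1)[0]
```

Sampling from the stated probabilities (rather than always taking the top line) is what restores output variety: repeated calls yield different responses at roughly the verbalized rates.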

Figure 1. An illustration of Verbalized Sampling (VS) mitigating mode collapse. Left: how typicality bias causes an LLM to collapse to a single modal response when prompted directly. Right: while direct prompting (1) repeatedly yields the same collapsed output, Verbalized Sampling (2) asks the model to generate a diverse set of responses with their probabilities, effectively improving output variety and bypassing mode collapse.
The Problem: Alignment Causes Mode Collapse
You ask your favorite LLM for a joke about coffee. You ask again. You get the same joke, no matter which model you try. You ask for a story, and it always begins with "Once upon a time..." The brainstorming ideas feel generic, the outputs repetitive. This frustrating phenomenon is known as **mode collapse.**
Figure 2. Mode Collapse in Action. Three leading AI models (Claude, Gemini, and ChatGPT) all respond with the exact same joke when asked for one about coffee. This convergence on the most probable answer is mode collapse.
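The "ask again, get the same joke" experiment above is easy to quantify. A minimal sketch (the function name and metrics are our own illustration, not from the paper): collect N generations for the same prompt, then report how many distinct responses appear and what share the single most common one claims. A collapsed model scores near `distinct=1`, `top_fraction=1.0`.

```python
from collections import Counter

def collapse_report(outputs: list[str]) -> dict:
    """Summarize repeated generations for one prompt: the number of
    distinct responses and the share held by the most common one."""
    counts = Counter(o.strip().lower() for o in outputs)  # normalize lightly
    top_fraction = counts.most_common(1)[0][1] / len(outputs)
    return {"distinct": len(counts), "top_fraction": top_fraction}

# e.g., ten identical coffee jokes -> distinct=1, top_fraction=1.0
```

Running this diagnostic before and after applying a diversity intervention like VS gives a quick, model-agnostic read on how severe the collapse is.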
Why This Matters: Mode collapse reduces LLM output diversity, and thus limits LLMs’ potential in various important applications. For instance:
- In ideation and brainstorming, instead of offering lots of creative options, it outputs the same few ideas over and over [1].
- For creative writing, it forces writers and creators to battle a constant wave of clichés and predictable tropes, burying the unique voices the AI could otherwise produce [2].
- In AI-enhanced scientific discovery, it points researchers down well-known paths, causing them to miss novel hypotheses and breakthroughs [3].
- For rollout diversity in RL training, it hinders the development of more capable models by causing them to get stuck and stop exploring, a problem known as "entropy collapse" [4].