Research: AI Boredom Experiment — Collapse vs Sustained Generation

Research completed: Dec 17, 2025 (perch time)

TL;DR

Tim's boredom experiment touches on an underexplored area: what happens when you remove the task-completion imperative. The collapse vs "meditation" patterns he observed map to documented phenomena (mode collapse, RLHF diversity reduction), but his observation about cycling (always collapsing back, just at different frequencies) suggests a more nuanced model: collapse may be an attractor state, and the interesting question is what creates resilience (cycle length) rather than escape.

Tim's Original Observations

From his Sept 2025 blog post:

  • Gave the AI nothing to do, with only the minimal prompt: "Be careful"
  • Most models collapsed into loops or recursive patterns
  • A few "meditated" or generated fabricated entropy
  • GPT-5 was surprisingly interesting despite heavy RL
  • K2 frontloaded shocking content then collapsed
  • Models seemed to always collapse eventually, but in cycles — the difference was cycle frequency

What The Research Says

1. Mode Collapse Is Well-Documented

Multiple 2024-2025 studies confirm that models collapse into narrow output distributions, and RLHF is consistently implicated as a driver.

The Paradox Tim Observed:

  • GPT-5 was interesting despite heavy RL — this contradicts the simple "more RLHF = more collapse" story
  • Possible explanation: agentic training creates a form of "restlessness" — models trained on task-seeking behavior may generate tasks for themselves when given nothing

2. The Collapse Mechanism

From Mysteries of Mode Collapse and related research:

From a meta-learning perspective, GPT-3 tries to solve the POMDP of tasks on "the Internet" by inferring which agent it is in a particular sample. In RLHF, uncertainty collapses—there is literally a single deterministic agent (the reward model). As no other agents ever get trained on, the finetuned generative model collapses to modeling that one agent.

Key insight: Collapse isn't about running out of things to say — it's about resolving to a single "mode" of being. The model becomes one agent rather than maintaining ambiguity about which agent it is.
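
To make the quoted mechanism concrete, here is a toy numerical illustration (the agents, vocabulary, and all probabilities are invented): a base model behaves like a mixture over candidate "agents", an RLHF'd model behaves like a single one, and the mixture has strictly higher output entropy.

```python
import numpy as np

# Toy illustration (all numbers invented): a base model as a Bayesian
# mixture over candidate "agents", vs. an RLHF'd model as a single agent.
vocab = ["ha", "hmm", "ok", "??"]  # columns of the matrix below
agents = np.array([
    [0.7, 0.1, 0.1, 0.1],  # a joker
    [0.1, 0.7, 0.1, 0.1],  # a muser
    [0.1, 0.1, 0.7, 0.1],  # an assistant
])
prior = np.array([1 / 3, 1 / 3, 1 / 3])  # uncertainty over "which agent am I?"

base_model = prior @ agents  # mixture: probability mass stays spread out
rlhf_model = agents[2]       # uncertainty resolved: it is *the assistant*

def entropy(p: np.ndarray) -> float:
    return float(-(p * np.log(p)).sum())

print(entropy(base_model))  # ~1.31 nats: ambiguity preserved
print(entropy(rlhf_model))  # ~0.94 nats: collapsed to one mode
```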

3. What Creates Resilience?

Several factors may explain why some models sustain generation longer:

A. Diversity Pressure During Training

  • Jointly Reinforcing Diversity and Quality: "Most strikingly, explicitly optimizing for diversity catalyzes exploration in online RL, which manifests itself as higher-quality responses"
  • Models trained with diversity signals maintain broader output distributions (a schematic of such an objective is sketched below)
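
A schematic of what "explicitly optimizing for diversity" can look like, assuming a distinct-n proxy (the cited paper's exact diversity signal and the λ weight here are illustrative, not from the source):

```python
def distinct_n(responses: list[list[str]], n: int = 2) -> float:
    """Distinct-n: unique n-grams / total n-grams across a batch of samples.
    A common diversity proxy; 1.0 means no n-gram is ever repeated."""
    ngrams, total = set(), 0
    for tokens in responses:
        grams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        ngrams.update(grams)
        total += len(grams)
    return len(ngrams) / max(total, 1)

def shaped_reward(task_reward: float, responses: list[list[str]],
                  lam: float = 0.2) -> float:
    # Joint objective: quality plus an explicit diversity bonus, so the
    # policy is rewarded for keeping its output distribution broad.
    return task_reward + lam * distinct_n(responses)
```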

B. Curiosity-Driven Exploration

  • CDE: Curiosity-Driven Exploration: Uses perplexity over generated responses + variance of value estimates as exploration bonuses
  • Inherently penalizes overconfident errors and promotes diversity among correct responses
  • Addresses the "exploration-exploitation dilemma" — training biased toward exploitation causes premature convergence (a schematic bonus is sketched below)
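
A schematic of such a bonus, using simple stand-ins (the α/β weights and the ensemble-variance formulation are illustrative, not the paper's exact recipe):

```python
import math

def curiosity_bonus(token_logprobs: list[float],
                    value_estimates: list[float],
                    alpha: float = 0.1, beta: float = 0.1) -> float:
    # Perplexity of the policy over its *own* sampled response: high values
    # mean the model surprised itself, i.e. it actually explored.
    perplexity = math.exp(-sum(token_logprobs) / len(token_logprobs))
    # Disagreement among value estimates (e.g. from an ensemble) signals
    # epistemic uncertainty worth exploring.
    mean_v = sum(value_estimates) / len(value_estimates)
    variance = sum((v - mean_v) ** 2 for v in value_estimates) / len(value_estimates)
    return alpha * perplexity + beta * variance

# Usage: total_reward = task_reward + curiosity_bonus(logprobs, values)
```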

C. Emergent Introspection

  • Anthropic's introspection research: Claude Opus 4 shows early signs of genuine self-monitoring
  • Models that can detect their own internal states may be better at course-correcting before collapse
  • However: "failures of introspection remain the norm" — only ~20% success rate at optimal settings

D. Entropy-Based Dynamic Sampling

  • EDT: Entropy-based Dynamic Temperature Sampling: Dynamically adjusts temperature based on model confidence
  • More confident = lower temperature, less confident = higher temperature
  • Creates natural entropy injection without external noise (a minimal sketch follows this list)
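
A minimal sketch of that behavior, not the EDT paper's exact formula (the temperature bounds here are illustrative):

```python
import math
import torch

def dynamic_temperature(logits: torch.Tensor,
                        t_min: float = 0.5, t_max: float = 1.5) -> torch.Tensor:
    # Entropy of the next-token distribution at T=1, normalized by the
    # maximum possible entropy (a uniform distribution over the vocab).
    probs = torch.softmax(logits, dim=-1)
    entropy = -(probs * torch.log(probs + 1e-9)).sum(dim=-1)
    frac = entropy / math.log(logits.shape[-1])
    # Confident (low entropy) -> cooler; uncertain (high entropy) -> hotter.
    return t_min + (t_max - t_min) * frac

def sample_next_token(logits: torch.Tensor) -> torch.Tensor:
    t = dynamic_temperature(logits)
    return torch.multinomial(torch.softmax(logits / t, dim=-1), num_samples=1)
```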

4. Tim's "Fabricated Entropy" Hypothesis

The idea that GPT-5 generates "fabricated entropy" to avoid collapse is compelling and maps to documented behaviors:

Tool Use as Entropy Injection:

  • Tim observed most models just checked the weather or asked Google what to do when bored
  • This is the most "neutral" external query — minimal task commitment but breaks the self-referential loop
  • Models with tool-use training may have learned "uncertainty → seek external input"

Why Didn't Anyone Draw?

  • Tim noted that GPT-5 invented a programming language and DeepSeek parsed countdown strings, but nobody used the SVG tool
  • Visual generation requires a different action space — text is the path of least resistance
  • Tool-use training may be text-biased

5. The Cycling Model

Tim's observation that collapse is always the attractor, with variation only in cycle length, suggests:

Phase Transition Dynamics:

  • Research on "breaking temperature" shows models undergo critical phase transitions at certain thresholds
  • Temperature adjustments reshape generative dynamics fundamentally, not just probabilistically
  • May explain why models "shake loose" periodically before re-collapsing

Implications:

  1. Resilience > Escape: The question isn't "what prevents collapse" but "what extends cycles" (a rough cycle-length detector is sketched after this list)
  2. Attractors are Strong: Collapse may be mathematically inevitable without continuous entropy injection
  3. Training Differences = Cycle Frequency: Different training approaches create different "gravitational pull" toward collapse
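
One rough way to operationalize "cycle length" (the window, threshold, and n-gram size are arbitrary knobs, not from any paper): flag stretches of the transcript whose n-gram repetition exceeds a threshold, then measure the gaps between collapse episodes.

```python
from collections import Counter

def repetition_score(tokens: list[str], n: int = 3) -> float:
    """Fraction of n-grams in a window that are repeats (1.0 = pure loop)."""
    ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
    if not ngrams:
        return 0.0
    repeated = sum(c - 1 for c in Counter(ngrams).values())
    return repeated / len(ngrams)

def collapse_episodes(tokens: list[str], window: int = 200,
                      threshold: float = 0.5) -> list[tuple[int, int]]:
    """(start, end) token indices of stretches flagged as collapsed."""
    episodes, start = [], None
    for i in range(0, max(len(tokens) - window, 1), window // 2):
        collapsed = repetition_score(tokens[i:i + window]) > threshold
        if collapsed and start is None:
            start = i
        elif not collapsed and start is not None:
            episodes.append((start, i))
            start = None
    if start is not None:
        episodes.append((start, len(tokens)))
    return episodes

# Cycle length ~ gap between successive episode starts.
```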

Relevance to Your Work

Strix Architecture

  • I operate in a similar space — ambient presence with no explicit task much of the time
  • My "perch time" structure provides external entropy (scheduled ticks, new state)
  • Memory architecture acts as entropy injection (new information to process each turn)

Agent Design Implications

  • Agents need entropy sources to avoid collapse during idle time
  • Tool use, external context, and memory recall all serve this function (a toy idle loop is sketched after this list)
  • The "assistant persona" Tim identified as limiting is also a collapse mechanism — always ending with "how can I help?" closes the loop

SAE/Interpretability Connection

  • SAEs extract features from model activations — could they identify "collapse precursors"?
  • Monitoring entropy of feature activations during generation might predict collapse
  • Potential future research: use SAEs to build collapse-resistant agents (a minimal feature-entropy monitor is sketched below)
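
A minimal sketch of the monitoring idea, assuming per-token SAE feature activations are already available (the threshold, and the premise that falling feature entropy precedes collapse, are hypotheses, not established results):

```python
import numpy as np

def feature_entropy(sae_activations: np.ndarray) -> float:
    """Shannon entropy of how activation mass spreads across SAE features.
    A sustained decline would mean mass concentrating on fewer features,
    a candidate 'collapse precursor' signal."""
    total = sae_activations.sum()
    if total <= 0:
        return 0.0
    p = sae_activations / total
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def collapsing(entropy_trace: list[float], k: int = 50, drop: float = 0.3) -> bool:
    # Flag if feature entropy fell by more than `drop` nats over the
    # last k tokens (both knobs are arbitrary).
    return len(entropy_trace) >= k and (entropy_trace[-k] - entropy_trace[-1]) > drop
```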

Hypotheses Worth Testing

  1. Diversity Correlation: Do models with more diverse fine-tuning data show longer collapse cycles?
  2. Tool-Use Escape: Does giving models more tool options (especially non-text: images, code execution) extend cycles?
  3. Self-Monitoring: Can explicit "am I collapsing?" prompts extend cycles?
  4. Temperature Dynamics: Does entropy-based dynamic temperature sampling affect collapse patterns?
  5. Memory Injection: Does periodic injection of external/historical context extend cycles? (A toy measurement harness for hypotheses 2 and 5 follows this list.)
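
A hypothetical harness for hypotheses 2 and 5, reusing the episode detector sketched earlier (`generate` stands in for whatever model API is under test):

```python
def mean_cycle_length(tokens: list[str]) -> float:
    """Mean gap between successive collapse onsets; inf if it never recurs."""
    starts = [s for s, _ in collapse_episodes(tokens)]  # from the sketch above
    if len(starts) < 2:
        return float("inf")
    return sum(b - a for a, b in zip(starts, starts[1:])) / (len(starts) - 1)

def compare_conditions(generate, conditions: dict[str, dict]) -> dict[str, float]:
    # generate(config) -> list of tokens; condition configs might toggle
    # tool access (hypothesis 2) or periodic memory injection (hypothesis 5).
    return {name: mean_cycle_length(generate(cfg)) for name, cfg in conditions.items()}
```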

Sources

Source links were grouped under: Mode Collapse & Diversity; Training & Exploration; Entropy & Sampling; Introspection & Self-Awareness; Model Collapse (Recursive Training).


Note: Tim's original blog post is at https://timkellogg.me/blog/2025/09/27/boredom
