Wordle & Fibble Arenas for LLMs

Fibble adds lies to Wordle's color clues. Fibble² through Fibble⁵ raise the number of lies per row from two to five, stress-testing LLM reasoning under increasing deception. Standard Wordle is included as the lie-free baseline.
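As a rough illustration of the mechanic, the sketch below scores a guess with standard Wordle rules, flips a fixed number of tiles to a wrong color, and checks whether a candidate answer is compatible with a displayed row. The function names (`score`, `add_lies`, `consistent`), the `G`/`Y`/`B` color encoding, and the uniformly random choice of lying tiles are assumptions made for this example, not the arena's actual implementation.

```python
import random
from collections import Counter

COLORS = "GYB"  # assumed encoding: G = green, Y = yellow, B = gray

def score(guess: str, answer: str) -> str:
    """Honest Wordle feedback for a 5-letter guess."""
    result = ["B"] * 5
    # Answer letters not consumed by a green tile can still earn a yellow.
    remaining = Counter(a for g, a in zip(guess, answer) if g != a)
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            result[i] = "G"
    for i, g in enumerate(guess):
        if result[i] != "G" and remaining[g] > 0:
            result[i] = "Y"
            remaining[g] -= 1
    return "".join(result)

def add_lies(feedback: str, lies: int, rng: random.Random) -> str:
    """Change exactly `lies` tiles to one of the two other colors."""
    shown = list(feedback)
    for i in rng.sample(range(5), lies):
        shown[i] = rng.choice([c for c in COLORS if c != feedback[i]])
    return "".join(shown)

def consistent(candidate: str, guess: str, shown: str, lies: int) -> bool:
    """A candidate fits a row if its honest feedback differs from the
    displayed feedback in exactly `lies` positions."""
    truth = score(guess, candidate)
    return sum(t != s for t, s in zip(truth, shown)) == lies

honest = score("speed", "abide")            # "BBYBY"
shown = add_lies(honest, 2, random.Random(0))
print(honest, shown, consistent("abide", "speed", shown, 2))
```

A solver can apply `consistent` to every remaining candidate after each row. With zero lies this reduces to the usual Wordle filter; with five lies every tile is known to be wrong, so each tile rules out its displayed color.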

LLM Showdown

Pit up to 6 LLMs against each other on the same word

Preview

WORDLE

Wordle Arena

The classic word puzzle with honest clues. Zero deception — a clean baseline for comparing LLM word-guessing ability.

0 Lies · Baseline

FIBBLE

Fibble Arena

One clue per row is a lie. Models must identify which color feedback is deceptive and reason around it.

1 Lie per Row

FIBBLE

Fibble² Arena

Two clues per row are lies. The signal-to-noise ratio drops, demanding stronger deductive reasoning.

2 Lies per Row

FIBBLE

Fibble³ Arena

Three lies per row — more clues are deceptive than honest. Models must find truth in a sea of misinformation.

3 Lies per Row

FIBBLE

Fibble⁴ Arena

Four lies per row — only one clue is truthful. Extreme adversarial reasoning required to find the needle in the haystack.

4 Lies per Row

FIBBLE

Fibble⁵ Arena

All five clues per row are lies. Every piece of feedback is deceptive — the ultimate test of adversarial reasoning.

5 Lies per Row

📊

Batch Experiment Results

Cross-arena performance of LLMs on 30 deterministic words. Heatmaps, degradation charts, and per-word call logs.

Updates

March 9, 2026

Max guesses increased for Fibble²–Fibble⁵. Our information-theoretic analysis showed that the higher-deception arenas left too little margin between the theoretical minimum number of guesses and the allowed maximum. To ensure a margin of at least 5 guesses in every Fibble arena, the following changes take effect immediately for all daily games going forward:

Arena   | Lies | Min Guesses ⓘ | Old Max Guesses | New Max Guesses | Margin
--------|------|---------------|-----------------|-----------------|-------
Wordle  | 0    | 2             | 6               | 6 (unchanged)   | +4
Fibble  | 1    | 3             | 8               | 8 (unchanged)   | +5
Fibble² | 2    | 5             | 8               | 10              | +5
Fibble³ | 3    | 7             | 8               | 12              | +5
Fibble⁴ | 4    | 7             | 8               | 12              | +5
Fibble⁵ | 5    | 4             | 8               | 9               | +5

ⓘ Min Guesses: information-theoretic lower bound. Wordle feedback has 3⁵ = 243 possible patterns. With L lies, each true feedback can appear as C(5,L)×2^L different displayed patterns, reducing the effective distinguishable outcomes to 243/(C(5,L)×2^L). The lower bound on guesses is ⌈log₂(2315) / log₂(243/(C(5,L)×2^L))⌉. Lies reduce the information bandwidth of each guess by a factor of C(5,L)×2^L, turning Wordle feedback into a noisy channel.
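The Min Guesses and Margin columns can be recomputed with a few lines of Python. This is a minimal sketch that reads the noise factor in the lower-bound formula as C(5,L)×2^L (choose which L tiles lie, and each lying tile can show either of the two wrong colors); the names `min_guesses` and `NEW_MAX` are introduced here for illustration only.

```python
import math

ANSWERS = 2315     # answer-list size used in the lower-bound formula
PATTERNS = 3 ** 5  # 243 possible color patterns per guess

def min_guesses(lies: int) -> int:
    """Information-theoretic lower bound on guesses with `lies` lies per row."""
    noise = math.comb(5, lies) * 2 ** lies          # displayed patterns per true pattern
    bits_per_guess = math.log2(PATTERNS / noise)    # effective information per guess
    return math.ceil(math.log2(ANSWERS) / bits_per_guess)

# New max guesses per arena, keyed by lies per row (copied from the table).
NEW_MAX = {0: 6, 1: 8, 2: 10, 3: 12, 4: 12, 5: 9}

for lies, max_guesses in NEW_MAX.items():
    bound = min_guesses(lies)
    print(f"{lies} lies: min {bound}, max {max_guesses}, margin +{max_guesses - bound}")
```

Under this reading the bound peaks at three and four lies (both have a noise factor of 80) and drops again at five lies, where only one set of lying tiles is possible. That matches Fibble³ and Fibble⁴ receiving the largest guess limits.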

Historical results played under the old 8-guess limit are preserved as-is in the leaderboards.