Cross-arena performance of LLMs on a deterministic set of 200 words across Wordle and Fibble 1-5.
Percentage of words solved per model and arena
Win rate vs. number of lies per row