
Maximising the information gained using Shannon's entropy is a very good strategy (assuming the goal is to minimise the expected number of guesses), but it is not necessarily optimal!

I have a counterexample for a simplified version of the game with the following rule changes:

1. The player is only told which letters in the guess are correct (i.e. they are not told about letters that are present but in a different location).

2. If the player knows there is only one possible solution, the player wins immediately (without having to explicitly guess that word).

3. The set of words that the player is allowed to guess may be disjoint from the set of possible solutions.

Here is the list of possible solutions:

    aaaa
    aaab
    aaba
    babb
    abaa
    bbab
    bbba
    bbbb
(There are 8 words. The 2nd, 3rd and 4th letters are the binary patterns of length 3, and the 1st letter is a carefully chosen "red herring".)

Here is the dictionary of words the player is allowed to guess:

    axxx
    xaxx
    xxax
    xxxa
(Each guess effectively lets the player query a single letter of the solution.)

The information gain for each possible initial guess is identical (all guesses result in a 4-4 split), so a strategy based on information gain would have to make an arbitrary choice.

If the initial guess is axxx (the "red herring"), the expected number of guesses is 3.25.

But a better strategy is to guess xaxx (then guess xxax and xxxa). The expected number of guesses is then 3.

(In this example information gain was tied, but I have a larger example where the information gain for the "red herring" is greater than the information gain for the optimal first guess.)
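For anyone who wants to check those numbers, here is a rough Python sketch of the simplified game, assuming feedback is just the set of exactly-matching positions and that a known solution costs no further guess (rule 2). The word lists are the ones above; the two fixed orderings are the two strategies described:

    from functools import lru_cache

    # Simplified game: the solution is drawn uniformly from SOLUTIONS, a guess
    # only reveals which of its positions match the solution exactly, and the
    # game ends at no extra cost once a single candidate remains (rule 2).
    SOLUTIONS = ("aaaa", "aaab", "aaba", "babb", "abaa", "bbab", "bbba", "bbbb")
    GUESSES = ("axxx", "xaxx", "xxax", "xxxa")

    def feedback(guess, solution):
        # Which positions of the guess are exactly correct.
        return tuple(g == s for g, s in zip(guess, solution))

    def split(candidates, guess):
        # Partition the remaining candidates by the feedback the guess would give.
        parts = {}
        for sol in candidates:
            parts.setdefault(feedback(guess, sol), []).append(sol)
        return parts

    def expected_fixed(candidates, order):
        # Expected number of guesses when the guess order is fixed in advance.
        if len(candidates) == 1:
            return 0.0
        parts = split(candidates, order[0])
        return 1 + sum(len(p) / len(candidates) * expected_fixed(p, order[1:])
                       for p in parts.values())

    @lru_cache(maxsize=None)
    def expected_optimal(candidates):
        # Minimum expected number of guesses over all adaptive strategies.
        if len(candidates) == 1:
            return 0.0
        best = float("inf")
        for guess in GUESSES:
            parts = split(candidates, guess)
            if len(parts) == 1:
                continue  # this guess reveals nothing about these candidates
            cost = 1 + sum(len(p) / len(candidates) * expected_optimal(tuple(p))
                           for p in parts.values())
            best = min(best, cost)
        return best

    print(expected_fixed(SOLUTIONS, ("axxx", "xaxx", "xxax", "xxxa")))  # 3.25
    print(expected_fixed(SOLUTIONS, ("xaxx", "xxax", "xxxa")))          # 3.0
    print(expected_optimal(SOLUTIONS))                                  # 3.0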



Interesting. There can't be a proof of general optimality for Shannon entropy, because the words are irregularly distributed. However, unlike your lists, they're not distributed by an adversary trying to foil Wordle/Jotto strategies.

I suspect a law of large numbers / central limit theorem type result that Shannon entropy is asymptotically optimal for randomly chosen lists, even those generated by state machines like gibberish generators that nearly output English words. In other words, I conjecture that your configurations are rare for long lists.

Early in my career, I was naive enough to code up Gröbner bases with a friend, to tackle problems in algebraic geometry. I didn't yet know that computer scientists at MIT had tried random equations with horrid running times, and other computer scientists at MIT had established special cases with exponential-space-complete complexity. Our first theorem explained why algebraic geometers were lucky here. This is a trichotomy one often sees: "Good reason for asking" / "Monkeys at a keyboard" / "Troublemakers at a demo".

Languages evolve like coding theory, maintaining Hamming distance between words to enhance intelligibility. It could well be that the Wordle dictionary behaves quasirandomly, more uniformly spaced than a truly random dictionary, so Shannon entropy behaves better than expected.
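One crude way to probe that conjecture (purely a sketch; the handful of words below is illustrative, not the actual Wordle dictionary) is to compare the pairwise Hamming-distance histogram of a word sample against uniformly random strings of the same length:

    import random
    from collections import Counter

    def hamming(u, v):
        # Number of positions at which two equal-length words differ.
        return sum(a != b for a, b in zip(u, v))

    def distance_histogram(words):
        # Histogram of pairwise Hamming distances within a word list.
        counts = Counter()
        for i, u in enumerate(words):
            for v in words[i + 1:]:
                counts[hamming(u, v)] += 1
        return dict(sorted(counts.items()))

    # Illustrative sample only -- swap in the real Wordle dictionary to test
    # whether it is spaced more uniformly than random strings.
    sample = ["crane", "slate", "trace", "crate", "stale", "least", "react"]
    random.seed(0)
    alphabet = "abcdefghijklmnopqrstuvwxyz"
    random_sample = ["".join(random.choice(alphabet) for _ in range(5))
                     for _ in range(len(sample))]

    print("word sample:  ", distance_histogram(sample))
    print("random sample:", distance_histogram(random_sample))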


For the entropy strategy, the expected number of guesses on the full dictionary (allowing all 12972 words as possible solutions) is 4.233 (54910/12972)* according to my tests.

The best score on https://botfights.io/game/wordle is currently 4.044 (4044/1000).

*Spoiler: The score of the entropy strategy can be improved to 4.086 (53007/12972) by tweaking the entropy formula as follows: Let n be the number of possible solutions and let k_i be the size of the i-th partition after the current guess. The usual entropy formula is sum_i { p_i * -log(p_i) } where p_i = k_i / n. The improved formula is sum_i { p_i * -log((k_i + 1) / n) }. This formula aligns more closely with the expected number of guesses required to solve small partitions.
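To make the tweak concrete, here is a minimal sketch of the two scoring functions. The partition sizes would come from whatever solver you already have; the example splits at the bottom are made up, and the log base is irrelevant for ranking guesses:

    import math

    def entropy_score(partition_sizes):
        # Standard expected information gain: sum of p_i * -log2(p_i), p_i = k_i / n.
        n = sum(partition_sizes)
        return sum((k / n) * -math.log2(k / n) for k in partition_sizes)

    def tweaked_score(partition_sizes):
        # Tweak from above: use (k_i + 1) / n inside the log, which slightly
        # discounts the credit given to very small partitions.
        n = sum(partition_sizes)
        return sum((k / n) * -math.log2((k + 1) / n) for k in partition_sizes)

    # Made-up example: two ways a guess might split 10 remaining candidates.
    for sizes in ([5, 5], [1, 1, 8]):
        print(sizes, round(entropy_score(sizes), 3), round(tweaked_score(sizes), 3))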



