Codenames and Needless Pedantry

Posted in math on Sunday, October 30 2016

I was at a Halloween party last night and we played the game Codenames, which is a very fun party game. In the game, two teams compete to find their colored squares on a 5x5 grid. The grid starts with cards on it, each with a single word, and only the Spymasters have the map to which square corresponds to which color. The twist is that the Spymasters (there's one for each team) can only communicate to their team using a single word and a number, where the number is the number of cards that correspond to the word in some way. So, for example, the Spymaster might say "ocean, four" and their team now has to find four cards that somehow correspond to "ocean". If you pick the wrong card your turn ends, and you may end up revealing one of the opposing team's cards by mistake (which bonus to them).

Codenames

At one point in the game one team used the word "1945" (trying to link cards to WWII) and while it was allowed for that turn, it got me thinking about the limits of this game and encoding information.

There are only 25 spots so if you could give numbers you could just say any number between 0 and 33,554,432 and that would encode all of the positions (supposing your team got together beforehand and decided on a unique ordering for the grid). The number thirty-three million five hundred and fifty-four thousand four hundred and thirty-two being $2^{25}$ and if you turned any of those integers into their binary representation would give you a line of 25 digits, where the 1s corresponds to your cards.

It's actually even better than that, you know there are only 7 or 8 cards (8 if you go first) in the first place so this is something of an error detecting code, since if there are the wrong number of 1s you know something is up.

Of course, if you are allowed to use numbers as words you could also use a Hamming Code in the same fashion and have a perfect code that both detects and corrects any errors in transmission.

You could point out that "nineteen forty-five" is actually three words when written in English, so it fails the rules anyways. In fact, most numbers you would want to use are actually multiple words when written out, so they can't count as words.

So what is an allowable word? It can't be any arbitrary string or you could simply encode your binary digit with "a" as 0 and "b" as 1, making for a pronounceable single word that is also nonsense. More simply you could assign a letter to each card (there are 25 cards and 26 letters in the English alphabet) and then the Spymaster just has to impart a 7 or 8 letter string. Basically, if you allow any string of letters the game is won instantly by whoever goes first.

An alternative strategy that makes things dramatically un-fun for everyone at the table (which really is what we're all aiming for) is to use a lookup table. Instead of using binary to represent the 25 different positions on the board you could instead sit down with your team in advance and generate all $\binom{25}{7} = 480700 $ possible arrangements of 7 cards on 25 squares and assign each one a unique word from the dictionary. You can then also do this for the 1,081,575 possible combinations of 8 cards on 25 squares for when you go first, you can even use the same word list to start since knowledge of whether your team is going first is shared by everyone. As long as the team and the Spymaster have either spectacular working memories or are allowed to consult their code books they can guarantee themselves to win whenever they go first and whenever they go second against a team that isn't so horrifically nerdy.

But wait, how many words are there in the English language in the first place? According to the Global Language Monitor about 1,020,000 words depending on how you count. So not enough on its own, also are all these all unambiguously pronounceable? Many words in English sort of sound the same: e.g. their, there, they're, or hear, here, and so on. If there are relatively few such words you could deliberately mispronounce one of them to distinguish it, without much risk of a pronunciation collision. Regardless it seems like doing all 8 possible cards may not be practical, there just aren't enough legitimate words.

Especially so if, for the sake of fairness, you want a verifiable third party source of acceptable words. Say the OED for example, the OED only contains 226,132 words including obsolete words and derivative words. So now it becomes a real game! Presumably, you can no longer win in a single turn, but you can win in two!

It's actually much easier, $\binom{25}{4} = 12650 $ so you only need a code book of 12,650 distinct words to encode half the cards, so within two turns you can guarantee a win if you start first. Similarly, if the other team does not have a perfect strategy you can win in 2 if you start second, using the same codebook just picking a second map that includes a square that was already flipped. Combinatorics yeah!

Through all of this, I have been assuming two things: the words in the grid are irrelevant, and that there all possible maps available. Neither of these is true, for one the game only ships with so many possible board arrangements and it's not hundreds of thousands, so you could actually sit down with the box and come up with maybe 50 words to map them (I didn't count how many, but about a deck of cards worth). You would want to pick some esoteric words though since, TWIST! the clue you give cannot be one of the words on the table -- remember each square in the grid is randomly assigned a word from a bag of word cards.

This latter fact doesn't really change anything for 2 turn play, though superficially you might think it does. Can you still play the game given that there is a chance that the word you want to use from your codebook is on the board, and thus excluded from use?

Suppose your team starts first, you have 8 cards and you want to divide those into 2 groups of 4, then you can find a word for the first 4 and a word for the second 4 in our code book. How many ways are there to do that?

Well there are 8! ways to order the cards and if we plop a divider in the middle we recognize there are 2! ways of ordering those two groups and that order is irrelevant to us (it doesn't matter which cards are in the first group or the second), but there are also 4! ways of ordering the cards within each group, and we don't care about that ordering either. Dividing out these irrelevant orderings, we get:

$ { 8! \over { 2! 4! 4! } } = { 40320 \over { 2 \cdot 24 \cdot 24 } } = 35 $

For any given board, there are 35 possible ways of assigning two words (in any order) and keep in mind these two words must be different (by design). Since there are only 25 words excluded from use as a codeword in any given game, that means we can always find a way to use our codebook in the two turn play.