Binary Search or: How to Win at Guess Who?

guesswho

I love board games. It’s not every day that a game gets mentioned in the news, but last week a story about the game Guess Who? made the rounds on social media. A six-year-old girl sent an email to Hasbro, complaining about the uneven distribution of male and female characters in the game. Hasbro’s initial response failed to clear up the issue for the six year old — or for her 37-year-old mother (source). To make sense of the distribution, let’s explore the best strategy for the game in terms of a computer science algorithm known as a dichotomic search.

Guess Who? is a simple two-player game that teaches the process of elimination. There are twenty-four characters. Each player secretly selects one character. Players take turns asking a series of Yes/No questions about the other player’s selected character. (Here are some sample questions: “Does your character have black hair?” “Does your character have a beard?” “Is your character wearing earrings?”) Each question narrows down the list of possible characters. The winner is the player who correctly guesses the opponent’s character first.

The game itself can be understood as a dichotomic search. This search algorithm “operates by selecting between two distinct dichotomies at each step” (source), dichotomies like hat/hatless and male/female. To discover your opponent’s character in the fewest number of turns (on average over multiple games, of course), you want to make the dichotomic search as efficent as possible. You do this by asking questions such that either answer will eliminate half of the remaining characters. You want the answer to your first question to be “Yes” for twelve of the twenty-four characters and “No” for the other twelve.

If you follow this algorithm, you’ll always guess the player correctly after five or six questions. After the first question, you want to have twelve characters remaining. You then want your second question to divide the field in half again, down to six. At the end of your third turn, you want to have three left. Three does not halve evenly, so your fourth question may reduce the pool either to one or to two. If only one remains, you’ll guess it on your fifth turn. If two remain, you’ll have to ask a fifth question to reduce it from two to one; you’ll then guess it correctly with your sixth question.

Hasbro’s response to the six-year-old included this statement: “You will notice that there are five [characters] of any given characteristic” (source). Five characters have glasses, for example, and five characters have blonde hair. I think this is the key, though Hasbro’s response did not expand on it. The distribution of every dichotomy is intentionally uneven. If both players use an efficient dichotomic search algorithm, the game will end after five or six turns every time. (The player going first will have a substantial advantage.) We want to ask a question that eliminates exactly half of the characters; but if the game has no such questions, then the outcome of the game would have more variance. The goal of the unbalanced distribution seems to me to prevent players from using an efficient dichotomic search algorithm! The distribution seems aimed at removing all questions (including the question, “Are you a male?”) with an answer of “Yes” for exactly twelve of the characters.

I won’t speak to the societal impact of and gender inequality of the uneven distribution. (That aspect has been covered well by others.) But I want to speak to the game mechanics behind it. Even with the uneven distribution for each of the characteristics (gender, earrings, hats, etc.), questions that eliminate exactly twelve of the characters do still exist. They are not immediately obvious, but someone committed to efficient search algorithms and game theory can find them. Here are two of them:

  • Yes or No: Does your character’s name start with a letter between A and G (inclusive)?
  • Yes or No: Is your character named “Alfred”, “Anne”, “Bill”, “Claire”, “Eric”, “George”, “Joe”, “Max”, “Peter”, “Richard”, “Sam”, or “Tom”?

The answer to each of these questions is “Yes” for half of the characters. This first question uses a specific type of dichotomic search, a binary search. If you are searching for a particular item and you have all the possible items in some kind of ordered list, you can use a binary search. You start by looking at the middle of the list to see if the item is less than or greater than the middle; you can then eliminate the half of the items on the other side of the middle. You next look at the middle of the list that remains, eliminate half of those items, and then continue in this manner until you have reduced the number to one.

The common example of a binary search is looking for a key in an ordered array that matches a particular value. Suppose you have an array of items (like Guess Who? character names) ordered alphabetically. Suppose also you want to find the key in the array of one of the names (like Susan). Let me walk you through some JavaScript code that conducts this binary search. (My descriptions come after each code block.)

 var characters = ["Alex","Alfred","Anita","Anne","Bernard","Bill","Charles","Claire","David","Eric","Frans","George","Herman","Joe","Maria","Max","Paul","Peter","Philip","Richard","Robert","Sam","Susan","Tom"];
 var min = 0;
 var max = characters.length-1;
 var question = 0;

These lines of code create the array of items and set up the working variables we’ll need to track our progress through the binary search. The min variable tells us the lowest possible key the one we’re looking for could be; we’ll set it to zero to start. The max variable tells us the highest possible key it could be; we’ll set that to the last key in the array. We’ll also keep track of how many questions we’ve asked in the question variable; we should set that to zero to start.

 var secret = characters[Math.floor(Math.random()*(max+1))];

We’ll create a variable called secret that will contain a random name from the array. (In a real program, that value would be specified outside of our code.) The Math.random() method returns a number between 0 and 1 (exclusive). We multiply that by 24 to get a decimal number between 0 and 24 (exclusive), round it down to make it an integer between 0 and 23 (inclusive), and then load the value from the array of characters with that key into the  secret variable.

 while (min < max) {

This code creates a loop that we’ll use to narrow down the possible characters. As we go, we’ll decrease the difference betwen min and max until we only have one possible character left. At that point, min and max will be identical; they will both equal the key of the element we’re looking for. We’ll want this loop to execute as long as min is less than max.

 var middle = 0;
 while (min < max) {
     middle = min + Math.ceil(((max - min - 1) / 2) + 1);

At each step, we’ll need to figure out which item is the middle one in the remaining possible items. I’ll declare a working variable outside the loop, and then inside the loop I’ll determine what the middle item is between min and max.

    question++;

Each time through the loop we’ll increase the question count by one.

     if (secret < characters[middle]) {
         max = middle - 1;
     }

This conditional checks if the character’s name is less than the item in the middle. (Using the less than sign with a string in JavaScript will check if the string is earlier in the alphabet.) The first time through, middle will be 12. This conditional asks if the character’s name is less than the name in element 12. If the name is earlier in the alphabet than “Herman,” then we’ll know the key is between 0 and 11: we’ll leave min at 0 and change max to 11 (middle minus 1).

    } else {
        min = middle;
    }
 }

If instead the name is not earlier in the alphabet than “Herman,” we’ll leave max at 23 and change min to 12.

This ends the loop. The range starts at 0-23. After the first time through the loop, the range will be either 0–11 or 12–23, depending on the name selected. Since min is still less than max in either case, we’ll go through the loop again and narrow the range down from twelve to six items. We’ll keep doing this until min and max are the same.

You can take a look at the code in action here: view binary search. (Be sure to view the messages I added to the JavaScript console so you can walk through the search step by step: View > Developer > JavaScript Console in Chrome, then Refresh.) If your opponent has selected Susan, these are the six questions you’ll ask in this binary search.

  1. Does your character’s name come before Herman alphabetically? No
  2. Does your character’s name come before Philip alphabetically? No
  3. Does your character’s name come before Sam alphabetically? No
  4. Does your character’s name come before Susan alphabetically? No
  5. Does your character’s name come before Tom alphabetically? Yes
  6. Guess: Is your character Susan? Yes
Should Hasbro add more female characters? I think that’s a societal issue bigger than the game mechanics. It is worth mentioning that Guess Who? is a game for players ages 6 and up, and I doubt many of them (children and grown-ups alike) are worried about the efficiency of their algorithm. If the distribution were even, I suspect that more people would ask “Is your character female?” as their first question. That could make the game less variable and exciting, but it also might cause players to learn the optimal strategy and the basics of efficient dichotomic searches.

Free Workshops

Watch one of our expert, full-length teaching videos. Choose from either HTML, CSS or Wordpress.

Start learning

Randy Hoyt

Randy Hoyt is on the Teaching team at Treehouse. He's been building web sites and web applications for years, and you'll often find him speaking about his experience at local events and conferences. He spends whatever time he can find offline studying mythology and playing board games. Twitter: @randyhoyt

Comments

Comments are closed.