A curiosity journal of math, physics, programming, astronomy, and more.

Finding hide and seek probabilities analytically

In the previous post we calculated the expected probabilities of how soon or late an item would be found when several things were hidden in a set of locations searched in random order. For many items and locations, though, brute force computation of all possible combinations of hiding places is prohibitive, so in this post we'll find a much cheaper analytical solution.

Working some facts out manually:

  1. Given ten hiding spots, we can hide a single item ten different ways: in what later turns out to be the first place searched, in what will be the second place searched, the third place searched, etc.
  2. Hiding two items in ten locations, there are ten ways to hide the first item, but (since we don't allow hiding multiple items in the same place) only nine ways to hide the second item. While it seems like that would give us 90 ways to hide two items, the order we hide the items in doesn't matter. Hiding one item in the first location and the other in the second location is the same as hiding one item in the second location and the other in the first location. To account for that, we need to divide our total by two, giving us 45 total combinations.
  3. Hiding three items, there are ten ways to hide the first, nine to hide the second, and eight to hide the third. But like when hiding two items, some of those combinations are duplicates, so we need to divide by the number of ways three items can be ordered, which is 3 × 2 = 6.

From this, we can write a general formula. Let's call the total number of combinations for n hiding spots and m items hidden T(n, m):

StartLayout 1st Row 1st Column upper T left-parenthesis n comma m right-parenthesis 2nd Column equals StartFraction n factorial Over m factorial left-parenthesis n minus m right-parenthesis factorial EndFraction EndLayout

In fact, this is the choose function, usually written like this:

StartBinomialOrMatrix n Choose m EndBinomialOrMatrix

We can confirm our examples above:

StartLayout 1st Row 1st Column StartBinomialOrMatrix 10 Choose 1 EndBinomialOrMatrix 2nd Column equals StartFraction 10 factorial Over 1 factorial dot 9 factorial EndFraction equals 10 2nd Row 1st Column StartBinomialOrMatrix 10 Choose 2 EndBinomialOrMatrix 2nd Column equals StartFraction 10 factorial Over 2 factorial dot 8 factorial EndFraction equals StartFraction 10 dot 9 Over 2 EndFraction equals 45 3rd Row 1st Column StartBinomialOrMatrix 10 Choose 3 EndBinomialOrMatrix 2nd Column equals StartFraction 10 factorial Over 3 factorial dot 7 factorial EndFraction equals StartFraction 10 dot 9 dot 8 Over 6 EndFraction equals 120 EndLayout

We'll use this later to normalize a sum and produce an expected value, but first we need how many combinations have a lowest value of 1, how many have a lowest value of 2, and so on. (I'm only going to work out the expectation of how early an item is found, since how late an item is found is a symmetrical problem.) Let's call this count C(n, m, x), where x is the lowest value in the combination.

When hiding one item in ten locations, the lowest numbered location an item is hidden in is the location the item is hidden in, meaning each location only occurs in one "combination". Expressed mathematically,

StartLayout 1st Row 1st Column upper C left-parenthesis 10 comma 1 comma x right-parenthesis 2nd Column equals 1 EndLayout

Or, since the number of locations doesn't matter when hiding one item, we can generalize this slightly to

StartLayout 1st Row 1st Column upper C left-parenthesis n comma 1 comma x right-parenthesis 2nd Column equals 1 EndLayout

When hiding two items in ten locations, if the earliest item is found on the first search, there are nine other places the second item could be (where exactly doesn't matter). If the earliest item is found on the second search, there are eight other places the second item could be. Following that logic,

StartLayout 1st Row 1st Column upper C left-parenthesis n comma 2 comma x right-parenthesis 2nd Column equals n minus x EndLayout

Hiding three items in ten locations goes similarly: if the earliest item is found on the first search, there are nine other places the second item could be and eight the third could be (72 combinations); if the earliest item is found on the second search, there are eight other places the second item could be and seven the third could be (56 combinations). Again, the order of the second and third item don't matter, so we need to divide by the number of permutations of two items, 2, giving us 36 and 28 respectively:

StartLayout 1st Row 1st Column upper C left-parenthesis n comma 3 comma x right-parenthesis 2nd Column equals StartFraction left-parenthesis n minus x right-parenthesis left-parenthesis n minus x minus 1 right-parenthesis Over 2 EndFraction EndLayout

In general,

StartLayout 1st Row 1st Column upper C left-parenthesis n comma m comma x right-parenthesis 2nd Column equals StartFraction left-parenthesis n minus x right-parenthesis factorial Over left-parenthesis m minus 1 right-parenthesis factorial left-parenthesis n minus m minus x plus 1 right-parenthesis factorial EndFraction EndLayout

But this is the choose function again:

StartLayout 1st Row 1st Column StartFraction left-parenthesis n minus x right-parenthesis factorial Over left-parenthesis m minus 1 right-parenthesis factorial left-parenthesis left-parenthesis n minus x right-parenthesis minus left-parenthesis m minus 1 right-parenthesis right-parenthesis factorial EndFraction 2nd Column equals StartBinomialOrMatrix n minus x Choose m minus 1 EndBinomialOrMatrix EndLayout

That makes sense, since finding the earliest hidden item in one spot leaves smaller range of possibilities for the others to be in, and one fewer item to hide.

Our expected value for the earliest search to find an item, then, is:

StartLayout 1st Row 1st Column upper E left-parenthesis n comma m right-parenthesis 2nd Column equals StartFraction 1 Over StartBinomialOrMatrix n Choose m EndBinomialOrMatrix EndFraction sigma-summation Underscript x equals 1 Overscript n Endscripts x StartBinomialOrMatrix n minus x Choose m minus 1 EndBinomialOrMatrix EndLayout

Let's check this against the results in the previous post:

(defn factorial [n]
  (reduce * (range 1 (inc n))))
(defn choose [a b]
  (float
    (/ (transduce (map float) * (range a (- a b) -1))
       (factorial b))))
(defn expected-value [n m]
  (/ (transduce
       (map (fn [x]
              (* x (choose (- n x) (dec m)))))
       + (range 1 (inc n)))
     (choose n m)))
(expected-value 10 1)
5.5
(expected-value 10 2)
3.6666666666666665
(expected-value 10 3)
2.75
(expected-value 10 4)
2.2
(expected-value 10 5)
1.8333333333333333
(expected-value 10 6)
1.5714285714285714

Let's compute a larger scenario, hiding ten items in one hundred locations:

(expected-value 100 10)
9.181817923823846