Wiekvoet: Secret Santa

On reddit somebody asked For n individuals, what's the probability that the last person to pick during a round of Secret Santa name picking, will pick their own name.. "With each person picking in turn, and re-picking if they pick out their own name.' Two simulations were proposed. After some thinking I think it can be calculated rather easily.

Problem consideration

I think, it is possible to estimate the chance five people together draw any sequence. It is also possible to enumerate all sequences. Categorize the sequences, add their chances. Problem solved.

So, for example the sequence 2 5 1 3 4.

The first person has has 1/4 chance to select #2, as 2, 3, 4 and 5 can be selected with equal chances.

The second person has 1/4 chance to select #5, as 1, 3, 4 and 5 can be selected with equal chances.

The third person has 1/2 chance to select #1, as 1 or 4 can be selected.

The fourth person selects 3, as there is no alternative (4 results in a repick).

The fifth person selects 4, being the remaining number.

Chance to get this sequence: 1/32.

It is possible to recurse through all these trees, but why? After all we can easily enumerate all these sequences.

nn <-5

pp <- randtoolbox::permut(nn)

for(i in 1:(nn-1)) pp<- pp[!(pp[,i]==i),]

head(pp)

i i i

[1,] 5 4 1 3 2

[2,] 5 4 2 3 1

[3,] 5 4 1 2 3

[4,] 5 4 2 1 3

[5,] 5 3 4 1 2

[6,] 5 3 4 2 1

Then, the chances. For each column it is just 1/#possible_selections. That is one(!) statement.

la <- lapply(1:(nn-1),function(x) 1/rowSums(pp[,x:nn]!=x))

str(la)

List of 4

 $ : num [1:53] 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25 0.25 ...

 $ : num [1:53] 0.333 0.333 0.333 0.333 0.333 ...

 $ : num [1:53] 0.5 0.5 0.5 0.5 0.333 ...

 $ : num [1:53] 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 1 1 ...

Having these chances, they must be multiplied

prob <- Reduce('*',la)

sum(prob)

[1] 1

Luckily they add to 1. So, what is the chance the last person draws a 5? 0.13, exactly the same as a simulation number found on reddit.

sum(prob[pp[,nn]==nn])

[1] 0.1319444

To get a bit more results, we can put it in a function and get the answers up to 9 persons:

secsan1 <- function(nn) {
pp <- randtoolbox::permut(nn)
for(i in 1:(nn-1)) pp<- pp[!(pp[,i]==i),]
la <- lapply(1:(nn-1),function(x) 1/rowSums(pp[,x:nn]!=x))
prob <- Reduce('*',la)
sum(prob[pp[,nn]==nn])
}
data.frame(n=3:9,sapply(3:9,secsan1))

  n sapply.3.9..secsan1.

1 3           0.25000000

2 4           0.13888889

3 5           0.13194444

4 6           0.11277778

5 7           0.10053241

6 8           0.09049461

7 9           0.08238237

These numbers are close to one of the simulations.

The second simulation shows the chances the last person has to draw a person.

secsan2 <- function(nn) {

pp <- randtoolbox::permut(nn)

for(i in 1:(nn-1)) pp<- pp[!(pp[,i]==i),]

la <- lapply(1:(nn-1),function(x) 1/rowSums(pp[,x:nn]!=x))

prob <- Reduce('*',la)

xtabs(prob ~ pp[,nn])

}

as.data.frame(secsan2(8))

pp...nn. Freq

1 1 0.10485639

2 2 0.10729167

3 3 0.11072279

4 4 0.11591789

5 5 0.12471088

6 6 0.14283588

7 7 0.20316988

8 8 0.09049461

These numbers are not equal to simulation numbers on Reddit. This is because this simulation contains an error. It reads:

selsan <- function(who,persons) {
  if (length(persons)==1) return(persons)
  sel <- sample(persons[persons!=who],1)
  return(c(sel,selsan(who+1,persons[persons!=sel])))
}
#selsan(1,1:5)
finselsan <- function(n){
  selsan(1,1:n)[n]
}
nrep=1e4
sa <- sapply(1:nrep,function(x) finselsan(8))
table(sa)/nrep

What happens is that sample() is pulling a trick on us. Most of the time sample() gets some values and selects one of those. It is possible however, that the second last person has also himself to draw. At which point sample thinks only one number is available. To quote the manual: If x has length 1, is numeric (in the sense of is.numeric) and

x >= 
1

, sampling via sample takes place from 1:x. Note that this convenience feature may lead to undesired behaviour when x is of varying length in calls such as sample(x). See the examples. It is even documented. But who reads these all the time? A corrected simulation avoids this:

selsan <- function(who,persons) {

if (length(persons)==1) return(persons)

if (length(persons[persons!=who])==1) return(c(persons[persons!=who],who))

sel <- sample(persons[persons!=who],1)

return(c(sel,selsan(who+1,persons[persons!=sel])))

}

finselsan <- function(n){

selsan(1,1:n)[n]

}

nrep=1e4

sa <- sapply(1:nrep,function(x) finselsan(8))

as.data.frame(table(sa)/nrep)

sa Freq

1 1 0.1033

2 2 0.1024

3 3 0.1108

4 4 0.1123

5 5 0.1270

6 6 0.1498

7 7 0.2024

8 8 0.0920

This shows the exact calculation gives similar results as two simulations.

Extensions

What are the chances for any person to get anybody? It is simple now.

nn <- 6

pp <- randtoolbox::permut(nn)

for(i in 1:(nn-1)) pp<- pp[!(pp[,i]==i),]

la <- lapply(1:(nn-1),function(x) 1/rowSums(pp[,x:nn]!=x))

prob <- Reduce('*',la)

sapply(1:nn,function(x) xtabs(prob ~ factor(pp[,x],levels=1:nn)))

[,1] [,2] [,3] [,4] [,5] [,6]

1 0.0 0.24 0.2250000 0.2083333 0.1875000 0.1391667

2 0.2 0.00 0.2375000 0.2194444 0.1972222 0.1458333

3 0.2 0.19 0.0000000 0.2388889 0.2138889 0.1572222

4 0.2 0.19 0.1791667 0.0000000 0.2500000 0.1808333

5 0.2 0.19 0.1791667 0.1666667 0.0000000 0.2641667

6 0.2 0.19 0.1791667 0.1666667 0.1513889 0.1127778

The persons are columns, the rows are presents. Obviously 0 on the diagonal, except the last person. Closer to the last person, the chances get more unequal.

Conclusion

Don't do this. Get everybody to draw, then determine if a redraw is needed.

5 comments:

cellocgw12 November, 2012 17:50
I'm not sure this works, since most of the trees are discarded when someone in [1:(j-1)] picks his own name. Next: if person[j-1] draws his own name, then he re-draws, and person[j] is guaranteed NOT to draw his own name. So I'm not convinced your comment on the reddit sim is correct.
And finally, I'm assuming your "Conclusion" is a joke, as it clearly violates the problem definition.
Wingfeet13 November, 2012 21:09
I think that's what I meant. English is a bit too sloppy to use exactly. Besides, it is a foreign language for me, I might have missed a nuance.

Now I am wondering how the alternative might work out. If person x draws own name then everybody returns the names and it starts at person 1 again.

Saturday, November 10, 2012

Secret Santa

Problem consideration

Extensions

Conclusion

5 comments: