birthday {stats} | R Documentation |
Computes approximate answers to a generalised birthday paradox
problem. pbirthday
computes the probability of a coincidence
and qbirthday
computes the smallest number of observations
needed to have at least a specified probability of coincidence.
qbirthday(prob = 0.5, classes = 365, coincident = 2) pbirthday(n, classes = 365, coincident = 2)
classes |
How many distinct categories the people could fall into |
prob |
The desired probability of coincidence |
n |
The number of people |
coincident |
The number of people to fall in the same category |
The birthday paradox is that a very small number of people, 23, suffices to have a 50–50 chance that two or more of them have the same birthday. This function generalises the calculation to probabilities other than 0.5, numbers of coincident events other than 2, and numbers of classes other than 365.
This formula is approximate. For coincident = 2
the exact
computation is straightforward (see the examples) and may be
preferable. The approximation based on Diaconis and Mosteller (1989),
and was a mis-reading prior to R 2.13.1.
qbirthday |
Minimum number of people needed for a probability of at least
prob that k or more of them have the same one out of
classes equiprobable labels.
|
pbirthday |
Probability of the specified coincidence. |
Diaconis, P. and Mosteller F. (1989) Methods for studying coincidences. J. American Statistical Association, 84, 853–861.
require(graphics) ## the standard version qbirthday() ## same 4-digit PIN number qbirthday(classes = 10^4) ## 0.9 probability of three or more coincident birthdays qbirthday(coincident = 3, prob = 0.9) ## Chance of 4 or more coincident birthdays in 150 people pbirthday(150, coincident = 4) ## 100 or more coincident birthdays in 1000 people: very rare pbirthday(1000, coincident = 100) ## Accuracy compared to exact calculation x1 <- sapply(10:100, pbirthday) x2 <- 1-sapply(10:100, function(n) prod((365:(365-n+1))/rep(365,n))) par(mfrow = c(2, 2)) plot(x1, x2, xlab = "approximate", ylab = "exact") abline(0, 1) plot(x1, x1-x2, xlab = "approximate", ylab = "error") abline(h = 0) plot(x1, x2, log = "xy", xlab = "approximate", ylab = "exact") abline(0, 1) plot(1-x1, 1-x2, log = "xy", xlab = "approximate", ylab = "exact") abline(0, 1)