Archive-name: puzzles/archive/probability
Last-modified: 17 Aug 1993
Version: 4

==> probability/amoeba.p <==
A jar begins with one amoeba.  Every minute, every amoeba turns into 0, 1, 2,
or 3 amoebae with probability 25% for each case (dies, does nothing, splits
into 2, or splits into 3).  What is the probability that the amoeba
population eventually dies out?

==> probability/amoeba.s <==
If p is the probability that a single amoeba's descendants will die out
eventually, the probability that N amoebas' descendants will all die out
eventually must be p^N, since each amoeba is independent of every other
amoeba.  Also, the probability that a single amoeba's descendants will die
out must be independent of time when averaged over all the possibilities.
At t=0, the probability is p; at t=1 the probability is
0.25(p^0+p^1+p^2+p^3), and these probabilities must be equal.  So the
extinction probability p is a root of f(p)=p, where
f(x) = 0.25(1 + x + x^2 + x^3).  In this case, p = sqrt(2)-1.

The generating function for the sequence P(n,i), which gives the probability
of i amoebas after n minutes, is f^n(x), where f^n(x) == f^(n-1)(f(x)),
f^0(x) == x.  That is, f^n is the nth composition of f with itself.

Then f^n(0) gives the probability of 0 amoebas after n minutes, since
f^n(0) = P(n,0).  We then note that:

    f^(n+1)(x) = ( 1 + f^n(x) + (f^n(x))^2 + (f^n(x))^3 )/4

so that if f^(n+1)(0) -> f^n(0) we can solve the equation.

The generating function also gives an expression for the expectation value
of the number of amoebas after n minutes.  This is d/dx(f^n(x)) evaluated at
x=1.  Using the chain rule we get f'(f^(n-1)(x))*d/dx(f^(n-1)(x)), and since
f'(1) = 1.5 and f(1) = 1, we see that the result is just 1.5^n, as might be
expected.

==> probability/apriori.p <==
An urn contains one hundred white and black balls.  You sample one hundred
balls with replacement and they are all white.  What is the probability that
all the balls are white?

==> probability/apriori.s <==
This question cannot be answered with the information given.

In general, the following formula gives the conditional probability that all
the balls are white given you have sampled one hundred balls and they are
all white:

    P(100 white | 100 white samples) =

            P(100 white samples | 100 white) * P(100 white)
    -----------------------------------------------------------
    sum(i=0 to 100) P(100 white samples | i white) * P(i white)

The probabilities P(i white) are needed to compute this formula.  This does
not seem helpful, since one of these (P(100 white)) is just what we are
trying to compute.  However, the following argument can be made: before the
experiment, all possible numbers of white balls from zero to one hundred are
equally likely, so P(i white) = 1/101.  Therefore, the probability that all
100 balls are white given 100 white samples is:

    P(100 white | 100 white samples)
        = 1 / ( sum(i=0 to 100) (i/100)^100 )
        = 63.6%
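As a quick arithmetic check of this figure (an added illustration, not part
of the original posting), the sum in the denominator can be evaluated
directly in a few lines of C:

    #include <stdio.h>
    #include <math.h>

    /* Evaluate the uniform-prior answer quoted above:
       P(100 white | 100 white samples) = 1 / sum_{i=0..100} (i/100)^100,
       assuming every count of white balls from 0 to 100 is equally likely. */
    int main(void)
    {
        double sum = 0.0;
        int i;

        for (i = 0; i <= 100; i++)
            sum += pow(i / 100.0, 100.0);
        printf("P(all white | 100 white samples) = %f\n", 1.0 / sum);
        /* prints approximately 0.636, the 63.6% figure quoted above */
        return 0;
    }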
This argument is fallacious, however, since we cannot assume that the urn
was prepared so that all possible numbers of white balls from zero to one
hundred are equally likely.  In general, we need to know the P(i white) in
order to calculate the P(100 white | 100 white samples).  Without this
information, we cannot determine the answer.

This leads to a general "problem": our judgments about the relative
likelihood of things are based on past experience.  Each experience allows
us to adjust our likelihood judgment, based on prior probabilities.  This is
called Bayesian inference.  However, if the prior probabilities are not
known, then neither are the derived probabilities.  But how are the prior
probabilities determined?  For example, if we are brains in the vat of a
diabolical scientist, all of our prior experiences are illusions, and
therefore all of our prior probabilities are wrong.  All of our probability
judgments indeed depend upon the assumption that we are not brains in a vat.
If this assumption is wrong, all bets are off.

==> probability/bayes.p <==
One urn contains black marbles, and the other contains white or black
marbles with even odds.  You pick a marble from an urn; it is black; you put
it back; what are the odds that you will draw a black marble on the next
draw?  What are the odds after n black draws?

==> probability/bayes.s <==
Every time you draw a black marble, you throw out (from your probability
space) half of those possible urns that contain both colors.  So after n
draws, all black, you have 1/2^n times as many ways to have a white marble
in the urn as at the start.  But you have exactly the same number of ways to
have both marbles black.  The numbers (mixed cases vs. all-black cases) go
as 1:1, 1:2, 1:4, 1:8, ..., and the chance of having a white marble in the
urn goes as 1/2, 1/3, 1/5, 1/9, ..., 1/(1+2^(n-1)); hence the odds of
drawing a white marble on the nth try after n-1 consecutive drawings of
black are

    1/4          the first time
    1/6          the second time
    1/10         the third time
    ...
    1/(2+2^n)    the nth time

==> probability/birthday/line.p <==
At a movie theater, the manager announces that they will give a free ticket
to the first person in line whose birthday is the same as someone who has
already bought a ticket.  You have the option of getting in line at any
time.  Assuming that you don't know anyone else's birthday, that birthdays
are distributed randomly throughout the year, etc., what position in line
gives you the greatest chance of being the first duplicate birthday?

==> probability/birthday/line.s <==
Suppose you are the Kth person in line.  Then you win if and only if the K-1
people ahead all have distinct birthdays AND your birthday matches one of
theirs.

Let
    A = event that your birthday matches one of the K-1 people ahead
    B = event that those K-1 people all have different birthdays

Then

    Prob(you win) = Prob(B) * Prob(A | B)

(Prob(A | B) is the conditional probability of A given that B occurred.)

Now let P(K) be the probability that the K-th person in line wins, and Q(K)
the probability that the first K people all have distinct birthdays (which
occurs exactly when none of them wins).  Then

    P(1) + P(2) + ... + P(K-1) + P(K) = 1 - Q(K)
    P(1) + P(2) + ... + P(K-1)        = 1 - Q(K-1)

    P(K) = Q(K-1) - Q(K)   <--- this is what we want to maximize.

Now if the first K-1 all have distinct birthdays, then assuming a uniform
distribution of birthdays among D days of the year, the K-th person has K-1
chances out of D to match, and D-K+1 chances not to match (which would
produce K distinct birthdays).  So

    Q(K) = Q(K-1)*(D-K+1)/D = Q(K-1) - Q(K-1)*(K-1)/D
    Q(K-1) - Q(K) = Q(K-1)*(K-1)/D = Q(K)*(K-1)/(D-K+1)

Now we want to maximize P(K), which means we need the greatest K such that
P(K) - P(K-1) > 0.  (Actually, as just given, this only guarantees a local
maximum, but in fact if we investigate a bit further we'll find that P(K)
has only one maximum.)  For convenience in calculation let's set K = I + 1.
Then

    Q(I-1) - Q(I) = Q(I)*(I-1)/(D-I+1)
    Q(I) - Q(I+1) = Q(I)*I/D

    P(K) - P(K-1) = P(I+1) - P(I)
                  = (Q(I) - Q(I+1)) - (Q(I-1) - Q(I))
                  = Q(I)*( I/D - (I-1)/(D-I+1) )

To find out where this is last positive (and next goes negative), solve

    x/D - (x-1)/(D-x+1) = 0

Multiply both sides by D*(D+1-x):

    (D+1-x)*x - D*(x-1) = 0
    Dx + x - x^2 - Dx + D = 0
    x^2 - x - D = 0
    x = (1 +/- sqrt(1 + 4D))/2     ... take the positive square root
      = 0.5 + sqrt(D + 0.25)

Setting D=365 (finally deciding how many days in a year!), the desired I is

    x = 0.5 + sqrt(365.25) = 19.612 (approx).

The last integer I for which the new probability is greater than the old is
therefore I=19, and so K = I+1 = 20.  You should try to be the 20th person
in line.

Computing your chances of actually winning is slightly harder, unless you do
it numerically by computer.  The recursions you need have already been
given.

  -- David Karr (karr@cs.cornell.edu)

==> probability/birthday/same.day.p <==
How many people must be at a party before you have even odds or better of
two having the same birthday (not necessarily the same year, of course)?

==> probability/birthday/same.day.s <==
23.

See also:  archive entry "coupon"

==> probability/cab.p <==
A cab was involved in a hit and run accident at night.  Two cab companies,
the Green and the Blue, operate in the city.  Here is some data:

a)  Although the two companies are equal in size, 85% of cab accidents in
    the city involve Green cabs and 15% involve Blue cabs.

b)  A witness identified the cab in this particular accident as Blue.  The
    court tested the reliability of the witness under the same circumstances
    that existed on the night of the accident and concluded that the witness
    correctly identified each one of the two colors 80% of the time and
    failed 20% of the time.

What is the probability that the cab involved in the accident was Blue
rather than Green?

If it looks like an obvious problem in statistics, then consider the
following argument:

The probability that the color of the cab was Blue is 80%!  After all, the
witness is correct 80% of the time, and this time he said it was Blue!  What
else need be considered?  Nothing, right?

If we look at Bayes theorem (pretty basic statistical theorem) we should get
a much lower probability.  But why should we consider statistical theorems
when the problem appears so clear cut?  Should we just accept the 80% figure
as correct?

==> probability/cab.s <==
The police tests don't apply directly, because according to the wording, the
witness, given any mix of cabs, would get the right answer 80% of the time.
Thus given a mix of 85% Green and 15% Blue cabs, he will say 20% of the
Green cabs and 80% of the Blue cabs are Blue.  That's 20% of 85% plus 80% of
15%, or 17% + 12% = 29% of all the cabs that the witness will say are Blue.
Of those, only 12/29 are actually Blue.  Thus

    P(cab is Blue | witness claims Blue) = 12/29.

That's just a little over 40%.

Think of it this way... suppose you had a robot watching parts on a conveyor
belt to spot defective parts, and suppose the robot made a correct
determination only 50% of the time (I know, you should probably get rid of
the robot...).  If one out of a billion parts are defective, then to a very
good approximation you'd expect half your parts to be rejected by the robot.
That's 500 million per billion.  But you wouldn't expect more than one of
those to be genuinely defective.
So given the mix of parts, a lot more than 50% of the REJECTED parts will be
rejected by mistake (even though 50% of ALL the parts are correctly
identified, and in particular, 50% of the defective parts are rejected).
When the biases get so enormous, things start getting quite a bit more in
line with intuition.

For a related real-life example of probability in the courtroom see
People v. Collins, 68 Cal.2d 319 (1968).

==> probability/coupon.p <==
There is a free gift in my breakfast cereal.  The manufacturers say that the
gift comes in four different colors, and encourage one to collect all four
(& so eat lots of their cereal).  Assuming there is an equal chance of
getting any one of the colors, what is the expected number of boxes I must
consume to get all four?  Can you generalise to n colors and/or unequal
probabilities?

==> probability/coupon.s <==
The problem is well known under the name of "COUPON COLLECTOR PROBLEM".
The answer for n equally likely coupons is

    (1)  C(n) = n*H(n)    with H(n) = 1 + 1/2 + 1/3 + ... + 1/n.

In the unequal probabilities case, with p_i the probability of coupon i, it
becomes

    (2)  C(n) = int_0^infty [ 1 - prod_{i=1}^n (1 - exp(-p_i*t)) ] dt

which reduces to (1) when p_i = 1/n (an easy exercise).

In the equal probabilities case C(n) ~ n log(n).  For a Zipf law, from (2),
we have C(n) ~ n log^2(n).

A related problem is the "BIRTHDAY PARADOX" familiar to people interested in
hashing algorithms: with a party of 23 persons, you are likely (i.e., with
probability >50%) to find two with the same birthday.

The non-equiprobable case was solved by: M. Klamkin and D. Newman,
Extensions of the birthday surprise, J. Comb. Th. 3 (1967), 279-282.

==> probability/darts.p <==
Peter throws two darts at a dartboard, aiming for the center.  The second
dart lands farther from the center than the first.  If Peter now throws
another dart at the board, aiming for the center, what is the probability
that this third throw is also worse (i.e., farther from the center) than his
first?  Assume Peter's skilfulness is constant.

==> probability/darts.s <==
Since the three darts are thrown independently, they each have a 1/3 chance
of being the best throw.  As long as the third dart is not the best throw,
it will be worse than the first dart.  Therefore the answer is 2/3.

Ranking the three darts' results from A (best) to C (worst), there are, a
priori, six equiprobable outcomes:

    possibility #   1   2   3   4   5   6
    1st throw       A   A   B   B   C   C
    2nd throw       B   C   A   C   A   B
    3rd throw       C   B   C   A   B   A

The information from the first two throws shows us that the first throw will
not be the worst, nor the second throw the best.  Thus possibilities 3, 5
and 6 are eliminated, leaving three equiprobable cases, 1, 2 and 4.  Of
these, 1 and 2 have the third throw worse than the first; 4 does not.  Again
the answer is 2/3.

==> probability/derangement.p <==
12 men leave their hats with the hat check.  If the hats are randomly
returned, what is the probability that nobody gets the correct hat?

==> probability/derangement.s <==
Suppose we are handing out hats to n people.  First we start with all the
possible outcomes.  Then we subtract off those that assign the right hat to
a given person, for each of the n people.  But this double-counts each
outcome that assigned 2 hats correctly, so we have to add those outcomes
back in.  But now we've counted one net copy of each outcome with 3 correct
hats in our set, so we have to subtract those off again.  But now we've
taken away each 4-correct-hat outcome once too often, and have to put it
back in, and so forth ...
until we add or subtract the outcome that involves all n people getting the
correct hats.

Putting it all in probabilities, the measure of the original set is 1.  For
a given subset of k people, the probability that they all get their correct
hats is (n-k)!/n!, while there are (n choose k) such subsets of k people
altogether.  But then

    (n choose k)*(n-k)!/n! = (n!/((n-k)!*k!))*(n-k)!/n! = 1/k!

is the total probability measure we get by counting each subset of k people
once each.  So we end up generating the finite series

    1 - 1/1! + 1/2! - 1/3! + ... +/- 1/n!

which is of course just the first n+1 terms of the Taylor series expansion
for f(x) = e^x centered at 0 and evaluated at -1, which converges to 1/e
quite fast.

You can compute the exact probability for any n >= 1 simply by rounding n!/e
to the nearest whole number, then dividing again by n!.  Moreover I think
you will find you are always rounding down for odd n and rounding up for
even n.  For example, 12! / e = 176214840.95798... which is within 0.05
(absolute error, not relative) of the correct intermediate result,
176214841.

Another fact is that if you set the probability of no matching hats equal to
m/n!, then m is an integer divisible by (n-1).  That's because the number of
ways you can give hat #2 to person #1 is exactly the same as the number of
ways you can give that person hat #3, likewise for hat #4, and so forth.

  -- David Karr (karr@cs.cornell.edu)

==> probability/family.p <==
Suppose that it is equally likely for a pregnancy to deliver a baby boy as
it is to deliver a baby girl.  Suppose that for a large society of people,
every family continues to have children until they have a boy, then they
stop having children.  After 1,000 generations of families, what is the
ratio of males to females?

==> probability/family.s <==
The ratio will be 50-50.  We are not killing off any fetuses or babies, and
half of all conceptions will be male, half female.  When a family decides to
stop does not affect this fact.

==> probability/flips/once.in.run.p <==
What are the odds that a run of one H or T (i.e., THT or HTH) will occur in
n flips of a fair coin?

==> probability/flips/once.in.run.s <==
Reference: John P. Robinson, Transition Count and Syndrome are Uncorrelated,
IEEE Transactions on Information Theory, Jan 1988.

First we define a function or enumerator P(n,k) as the number of length "n"
sequences that generate "k" successes.  For example, P(4,1) = 4 (HHTH, HTHH,
TTHT, and THTT are the 4 possible length 4 sequences).  I derived two
generating functions g(x) and h(x) in order to enumerate P(n,k); they are
compactly represented by the following matrix generating function:

    [ g(x) ]   [ 1  1 ]^(n-3)  [  4   ]
    [      ] = [      ]        [      ]
    [ h(x) ]   [ 1  x ]        [ 2+2x ]

It can be shown that P(n,k) is the coefficient of x^k in the polynomial
g(x)+h(x).  For example, if n=4 we get g(x)+h(x) from the matrix generating
function as 10 + 4x + 2x^2.  Clearly, P(4,1) (the coefficient of x) is 4,
and P(4,2) = 2 (there are two such sequences: THTH and HTHT).

We can show that mean(k) = (n-2)/4 and sd = sqrt(5n-12)/4.

We need to generate "n" samples.  This can be done by using sequences of
length (n+2).  Then our new statistics would be

    mean = n/4
    sd   = sqrt(5n-2)/4

A similar approach can be followed for higher dimensional cases.

==> probability/flips/twice.in.run.p <==
What is the probability in n flips of a fair coin that there will be two
heads in a row?
==> probability/flips/twice.in.run.s <==
Well, the question is then: how many strings of n h's and t's contain hh?
I would guess right off hand that it's going to be easier to calculate the
number of strings that _don't_ contain hh and then subtract that from the
total number of strings.

So we want to count the strings of n h's and t's with no hh in them.  How
many h's can there be?  It is fairly clear that there must be from 0 to n/2
h's, inclusive.  (If there were (n/2)+1 of them, two would have to be
touching.)

How many strings are there with 0 h's?  1.

How many strings are there with 1 h?  Well, there are (n-1) t's, so there
are a total of n places to put the one h.  So there are nC1 such strings.

How many strings are there with 2 h's?  Well, there are (n-1) places to put
the two h's, so there are (n-1)C2 such strings.

Finally, with n/2 h's there are (n/2+1) places to put them, so there are
(n/2+1)C(n/2) such strings.

Therefore the total number of strings with no hh is

    Sum (from i=0 to n/2) of (n-i+1)C(i)

Now, here's where it gets interesting.  If we play around with Pascal's
triangle for a while, we see that this sum equals none other than the
(n+2)th Fibonacci number.  So the probability that n coin tosses will give
an hh is:

    2^n - f(n+2)
    ------------
         2^n

(where f(x) is the xth Fibonacci number, so that f(1) and f(2) are both 1).
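To make the formula concrete, here is a small C sketch (an added
illustration, not part of the original posting) that tabulates
(2^n - f(n+2)) / 2^n for the first few n:

    #include <stdio.h>

    /* Probability of seeing HH somewhere in n fair coin flips, computed as
       (2^n - F(n+2)) / 2^n with F the Fibonacci numbers (F(1) = F(2) = 1).
       Doubles are exact for these small values. */
    int main(void)
    {
        double fib_prev = 1.0, fib = 1.0;   /* F(1), F(2) */
        double pow2 = 1.0;                  /* will hold 2^n */
        int n;

        for (n = 1; n <= 10; n++) {
            double next = fib + fib_prev;   /* next == F(n+2) */
            fib_prev = fib;
            fib = next;
            pow2 *= 2.0;
            printf("n = %2d   P(HH appears) = %.4f\n", n, (pow2 - fib) / pow2);
        }
        return 0;
    }

For n=2 this prints 0.25 (only HH works) and for n=3 it prints 0.375
(HHH, HHT, THH), as a quick sanity check.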
==> probability/flips/unfair.p <==
Generate even odds from an unfair coin.  For example, if you thought a coin
was biased toward heads, how could you get the equivalent of a fair coin
with several tosses of the unfair coin?

==> probability/flips/unfair.s <==
Toss twice.  If both tosses give the same result, repeat this process (throw
out the two tosses and start again).  Otherwise, take the first of the two
results.

==> probability/flips/waiting.time.p <==
Compute the expected waiting time for a sequence of coin flips, or the
probability that one sequence of coin flips will occur before another.

==> probability/flips/waiting.time.s <==
Here's a C program I had lying around that is relevant to the current
discussion of coin-flipping.  The algorithm is N^3 (for N flips) but it
could certainly be improved.  Compile with

    cc -o flip flip.c -lm

-- Guy

_________________ Cut here ___________________
#include <stdio.h>
#include <math.h>

char *progname;                 /* Program name */

#define NOT(c) (('H' + 'T') - (c))

/* flip.c -- a program to compute the expected waiting time for a sequence
             of coin flips, or the probability that one sequence of coin
             flips will occur before another.

             Guy Jacobson, 11/1/90
*/

main (ac, av) int ac; char **av;
{
    char *f1, *f2, *parseflips ();
    double compute ();

    progname = av[0];

    if (ac == 2) {
        f1 = parseflips (av[1]);
        printf ("Expected number of flips until %s = %.1f\n",
                f1, compute (f1, NULL));
    }
    else if (ac == 3) {
        f1 = parseflips (av[1]);
        f2 = parseflips (av[2]);
        if (strcmp (f1, f2) == 0) {
            printf ("Can't use the same flip sequence.\n");
            exit (1);
        }
        printf ("Probability of flipping %s before %s = %.1f%%\n",
                av[1], av[2], compute (f1, f2) * 100.0);
    }
    else
        usage ();
}

char *parseflips (s) char *s;
{
    char *f = s;

    while (*s)
        if (*s == 'H' || *s == 'h')
            *s++ = 'H';
        else if (*s == 'T' || *s == 't')
            *s++ = 'T';
        else
            usage ();

    return f;
}

usage ()
{
    printf ("usage: %s {HT}^n\n", progname);
    printf ("\tto get the expected waiting time, or\n");
    printf ("usage: %s s1 s2\n\t(where s1, s2 in {HT}^n for some fixed n)\n",
            progname);
    printf ("\tto get the probability that s1 will occur before s2\n");
    exit (1);
}

/* compute -- if f2 is non-null, compute the probability that flip sequence
              f1 will occur before f2.  With null f2, compute the expected
              waiting time until f1 is flipped.

   technique:

        Build a DFA to recognize (H+T)*f1 [or (H+T)*(f1+f2) when f2 is
        non-null].  Randomly flipping coins is a Markov process on the graph
        of this DFA.  We can solve for the probability that f1 precedes f2
        or the expected waiting time for f1 by setting up a linear system of
        equations relating the values of these unknowns starting from each
        state of the DFA.  Solve this linear system by Gaussian Elimination.
*/

typedef struct state {
    char *s;                    /* pointer to substring string matched */
    int len;                    /* length of substring matched */
    int backup;                 /* number of one of the two next states */
} state;

double compute (f1, f2) char *f1, *f2;
{
    double solvex0 ();
    int i, j, n1, n;
    state *dfa;
    int nstates;
    char *malloc ();

    n = n1 = strlen (f1);
    if (f2)
        n += strlen (f2);       /* n + 1 states in the DFA */

    dfa = (state *) malloc ((unsigned) ((n + 1) * sizeof (state)));
    if (!dfa) {
        printf ("Ouch, out of memory!\n");
        exit (1);
    }

    /* set up the backbone of the DFA */
    for (i = 0; i <= n; i++) {
        dfa[i].s = (i <= n1) ? f1 : f2;
        dfa[i].len = (i <= n1) ? i : i - n1;
    }

    /* for i not a final state, one next state of i is simply i+1 (this
       corresponds to another matching character of dfa[i].s).  The other
       next state (the backup state) is now computed.  It is the state whose
       substring matches the longest suffix with the last character changed. */
    for (i = 0; i <= n; i++) {
        dfa[i].backup = 0;
        for (j = 1; j <= n; j++)
            if ((dfa[j].len > dfa[dfa[i].backup].len)
                && dfa[i].s[dfa[i].len] == NOT (dfa[j].s[dfa[j].len - 1])
                && strncmp (dfa[j].s, dfa[i].s + dfa[i].len - dfa[j].len + 1,
                            dfa[j].len - 1) == 0)
                dfa[i].backup = j;
    }

    /* our dfa has n + 1 states, so build a system of n + 1 equations
       in n + 1 unknowns */
    eqsystem (n + 1);

    for (i = 0; i < n; i++)
        if (i == n1)
            equation (1.0, n1, 0.0, 0, 0.0, 0, -1.0);
        else
            equation (1.0, i, -0.5, i + 1, -0.5, dfa[i].backup,
                      f2 ? 0.0 : -1.0);
    equation (1.0, n, 0.0, 0, 0.0, 0, 0.0);

    free (dfa);

    return solvex0 ();
}

/* a simple gaussian elimination equation solver */

double *m, **M;
int rank;
int neq = 0;
/* create an n by n system of linear equations.  Allocate space for the
   matrix m, filled with zeroes, and the dope vector M */

eqsystem (n) int n;
{
    char *calloc ();
    int i;

    m = (double *) calloc (n * (n + 1), sizeof (double));
    M = (double **) calloc (n, sizeof (double *));

    if (!m || !M) {
        printf ("Ouch, out of memory!\n");
        exit (1);
    }

    for (i = 0; i < n; i++)
        M[i] = &m[i * (n + 1)];
    rank = n;
    neq = 0;
}

/* add a new equation  a * x_na + b * x_nb + c * x_nc + d = 0.0
   (note that na, nb, and nc are not necessarily all distinct.) */

equation (a, na, b, nb, c, nc, d) double a, b, c, d; int na, nb, nc;
{
    double *eq = M[neq++];      /* each row is an equation */

    eq[na + 1] += a;
    eq[nb + 1] += b;
    eq[nc + 1] += c;
    eq[0] = d;                  /* column zero holds the constant term */
}

/* solve for the value of variable x_0.  This will go nuts if there are
   errors (for example, if m is singular) */

double solvex0 ()
{
    register i, j, jmax, k;
    register double max, val;
    register double *maxrow, *row;

    for (i = rank; i > 0; --i) {        /* for each variable */

        /* find pivot element--largest value in ith column */
        max = 0.0;
        for (j = 0; j < i; j++)
            if (fabs (M[j][i]) > fabs (max)) {
                max = M[j][i];
                jmax = j;
            }

        /* swap pivot row with last row using dope vectors */
        maxrow = M[jmax];
        M[jmax] = M[i - 1];
        M[i - 1] = maxrow;

        /* normalize pivot row */
        max = 1.0 / max;
        for (k = 0; k <= i; k++)
            maxrow[k] *= max;

        /* now eliminate variable i by subtracting multiples of pivot row */
        for (j = 0; j < i - 1; j++) {
            row = M[j];
            if (val = row[i])           /* if variable i is in this eq */
                for (k = 0; k <= i; k++)
                    row[k] -= maxrow[k] * val;
        }
    }

    /* the value of x0 is now in the constant column of the first row;
       we only need x0, so no need to back-substitute */
    val = -M[0][0];

    free (M);
    free (m);

    return val;
}
_________________________________________________________________

Guy Jacobson   (201) 582-6558          AT&T Bell Laboratories
uucp: {att,ucbvax}!ulysses!guy         600 Mountain Avenue
internet: guy@ulysses.att.com          Murray Hill NJ, 07974

==> probability/flush.p <==
Which set contains proportionately more flushes than the set of all possible
poker hands?
(1) Hands whose first card is an ace
(2) Hands whose first card is the ace of spades
(3) Hands with at least one ace
(4) Hands with the ace of spades

==> probability/flush.s <==
An arbitrary hand can have two aces but a flush hand can't.  The average
number of aces that appear in flush hands is the same as the average number
of aces in arbitrary hands, but the aces are spread out more evenly for the
flush hands, so set #3 contains a higher fraction of flushes.

Aces of spades, on the other hand, are spread out the same way over possible
hands as they are over flush hands, since there is only one of them in the
deck.  Whether or not a hand is a flush is based solely on a comparison
between different cards in the hand, so looking at just one card is
necessarily uninformative.  So the other sets contain the same fraction of
flushes as the set of all possible hands.

==> probability/hospital.p <==
A town has two hospitals, one big and one small.  Every day the big hospital
delivers 1000 babies and the small hospital delivers 100 babies.  There's a
50/50 chance of male or female on each birth.  Which hospital has a better
chance of having the same number of boys as girls?

==> probability/hospital.s <==
The small one.  If there are 2N babies born, then the probability of an even
split is

    (2N choose N) / (2 ** 2N),

where (2N choose N) = (2N)! / (N! * N!).  This is a DECREASING function of N.

If there are two babies born, then the probability of a split is 1/2 (just
have the second baby be different from the first).  With 2N babies, if there
is an N,(N-1) split in the first 2N-1, then there is a 1/2 chance of the
last baby making it an even split.  Otherwise there can be no even split.
Therefore the probability is less than 1/2 overall for an even split.

As N goes to infinity the probability of an even split approaches zero
(although it is still the most likely event).
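As a quick numerical illustration (added here, not part of the original
answer), the even-split probability can be evaluated for both hospitals
using lgamma to keep the binomial coefficient from overflowing; compile with
-lm as with the flip program above:

    #include <stdio.h>
    #include <math.h>

    /* Probability of an exactly even split among 2N babies, computed as
       C(2N,N) / 2^(2N) via log-gamma to avoid overflow.  Illustrative only. */
    static double even_split(int n)       /* n = N, i.e. 2N babies in all */
    {
        double logp = lgamma(2.0 * n + 1.0) - 2.0 * lgamma(n + 1.0)
                      - 2.0 * n * log(2.0);
        return exp(logp);
    }

    int main(void)
    {
        printf("small hospital (100 births):  %f\n", even_split(50));   /* ~0.080 */
        printf("big hospital  (1000 births):  %f\n", even_split(500));  /* ~0.025 */
        return 0;
    }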
==> probability/icos.p <==
The "house" rolls two 20-sided dice and the "player" rolls one 20-sided die.
If the player rolls a number on his die between the two numbers the house
rolled, then the player wins.  Otherwise, the house wins (including ties).
What are the probabilities of the player winning?

==> probability/icos.s <==
It is easily seen that if any two of the three dice agree, the house wins.
The probability that this does not happen is 19*18/(20*20).  If the three
numbers are all different, the probability that the player's number is the
middle one is 1/3.  So the chance of the player winning is
19*18/(20*20*3) = 3*19/200 = 57/200.

==> probability/intervals.p <==
Given two random points x and y on the interval 0..1, what is the average
size of the smallest of the three resulting intervals?

==> probability/intervals.s <==
Plot the size of the smallest interval as a function of the positions (x,y)
of the two points.  On the half of the unit square where x < y this is
min(x, y-x, 1-y); it vanishes on the boundary of that triangular half and
rises to a maximum of 1/3 at (x,y) = (1/3, 2/3), and symmetrically on the
other half.  In between these positions the surface forms a series of
planes.  Thus the volume under it (which, since the square has area 1, is
the desired average) consists of 2 pyramids, each with an altitude of 1/3
and an (isosceles triangular) base of area 1/2, yielding a total volume of
1/9.

==> probability/killers.and.pacifists.p <==
You enter a town that has K killers and P pacifists.  When a pacifist meets
a pacifist, nothing happens.  When a pacifist meets a killer, the pacifist
is killed.  When two killers meet, both die.  Assume meetings always occur
between exactly two persons and the pairs involved are completely random.
What are your odds of survival?

==> probability/killers.and.pacifists.s <==
Regardless of whether you are a pacifist or a killer, you may disregard all
events in which a pacifist other than yourself is involved and consider only
events in which you are killed or a pair of killers other than yourself is
killed.  Thus we may assume that there are N killers and yourself.  If N is
odd, your odds of surviving are zero.  If N is even, it doesn't matter to
you whether you are a killer or not, so WLOG assume you are.  Then your
probability of survival is 1/(N+1).

==> probability/leading.digit.p <==
What is the probability that the ratio of two random reals starts with a 1?
What about 9?

==> probability/leading.digit.s <==
What is the distribution of y/x for (x,y) chosen with uniform distribution
from the unit square?

First you want y/x in one of the intervals ... [0.01,0.02), [0.1,0.2),
[1,2), [10,20), ... .  This corresponds to (x,y) lying in one of several
triangles with height 1 and bases on either the right or top edges of the
square.  The bases along the right edge have lengths 0.1 (for
0.1 <= y/x < 0.2), 0.01, 0.001, ... ; the sum of this series is 1/9.  The
bases along the top edge have lengths 0.5 (for 0.5 < x/y <= 1), 0.05,
0.005, ... ; the sum of this series is 5/9.  So you have a total base length
of 6/9 = 2/3 and height 1, so the area is 1/3.  The total area of the square
is 1, so the probability that y/x starts with a 1 is 1/3 ~= 0.333333.

In the second case you have the same lengths (but in different places) on
the right edge, total 1/9.  But on the top edge, 9 <= y/x < 10, i.e.
1/10 < x/y <= 1/9, gives you a base of length 1/9 - 1/10 = 1/90, and the
series proceeds 1/900, 1/9000, ... ; the sum is 1/81.  Total base length
then is 9/81 + 1/81 = 10/81, height 1, so the total area (and hence the
probability of a leading 9) is 5/81 ~= 0.061728.
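A quick Monte Carlo check of these two figures (an added illustration, not
part of the original answer):

    #include <stdio.h>
    #include <stdlib.h>

    /* Estimate P(leading digit of y/x is d) for x, y uniform on (0,1).
       Expected results: about 1/3 for d = 1 and about 5/81 = 0.0617 for d = 9. */
    int main(void)
    {
        long trials = 1000000, count1 = 0, count9 = 0, i;

        srand(12345);
        for (i = 0; i < trials; i++) {
            double x = (rand() + 1.0) / (RAND_MAX + 2.0);   /* avoid 0 */
            double y = (rand() + 1.0) / (RAND_MAX + 2.0);
            double r = y / x;

            while (r < 1.0)   r *= 10.0;    /* shift into [1,10) */
            while (r >= 10.0) r /= 10.0;
            if (r < 2.0)  count1++;
            if (r >= 9.0) count9++;
        }
        printf("P(leading 1) ~ %f    P(leading 9) ~ %f\n",
               (double) count1 / trials, (double) count9 / trials);
        return 0;
    }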
==> probability/lights.p <==
Waldo and Basil are exactly m blocks west and n blocks north from Central
Park, and always go with the green light until they run out of options.
Assuming that the probability of the light being green is 1/2 in each
direction, that if the light is green in one direction it is red in the
other, and that the lights are not synchronized, find the expected number
of red lights that Waldo and Basil will encounter.

==> probability/lights.s <==
Let E(m,n) be this number, and let (x)C(y) = x!/(y! (x-y)!).  A model for
this problem is the following n x m grid (north is up, west is to the left;
B is the starting corner and E is Central Park):

    B---+---+---+ ... +---+---+---+   (m,0)
    |   |   |   |     |   |   |   |
    +---+---+---+ ... +---+---+---+   (m,1)
    :   :   :   :     :   :   :   :
    +---+---+---+ ... +---+---+---+   (m,n-1)
    |   |   |   |     |   |   |   |
    +---+---+---+ ... +---+---+---E   (m,n)

where each + represents a traffic light.  We can consider each traffic light
to be a direction pointer, with an equal chance of pointing either east or
south.

IMHO, the best way to approach this problem is to ask: what is the
probability that edge-light (x,y) will be the first red edge-light that the
pedestrian encounters?  This is easy to answer; since the only way to reach
(x,y) is by going south x times and east y times, in any order, we see that
there are (x+y)C(x) possible paths from (0,0) to (x,y).  Since each of these
has probability (1/2)^(x+y+1) of occurring, the probability we are looking
for is (1/2)^(x+y+1)*(x+y)C(x).  Multiplying this by the expected number of
red lights that will be encountered from that point, (n-k+1)/2, we see that

    E(m,n) = sum(k=0..m-1) (1/2)^(n+k+1) * (n+k)C(n) * (m-k+1)/2
           + sum(k=0..n-1) (1/2)^(m+k+1) * (m+k)C(m) * (n-k+1)/2 .

Are we done?  No!  Putting on our Captain Clever Cap, we define

    f(m,n) = sum(k=0..n-1) (1/2)^k * (m+k)C(m) * k
    g(m,n) = sum(k=0..n-1) (1/2)^k * (m+k)C(m) .

Now, we know that

    f(m,n)/2 = sum(k=1..n) (1/2)^k * (m+k-1)C(m) * (k-1)

and since f(m,n)/2 = f(m,n) - f(m,n)/2, we get that

    f(m,n)/2 = sum(k=1..n-1) (1/2)^k * ( (m+k)C(m)*k - (m+k-1)C(m)*(k-1) )
               - (1/2)^n * (m+n-1)C(m) * (n-1)

             = sum(k=0..n-2) (1/2)^(k+1) * (m+k)C(m) * (m+1)
               - (1/2)^n * (m+n-1)C(m) * (n-1)

             = (m+1)/2 * ( g(m,n) - (1/2)^(n-1) * (m+n-1)C(m) )
               - (1/2)^n * (m+n-1)C(m) * (n-1) ,

therefore

    f(m,n) = (m+1) * g(m,n) - (n+m) * (1/2)^(n-1) * (m+n-1)C(m) .

Now,

    E(m,n) = (n+1) * (1/2)^(m+2) * g(m,n) - (1/2)^(m+2) * f(m,n)
           + (m+1) * (1/2)^(n+2) * g(n,m) - (1/2)^(n+2) * f(n,m)

           = (m+n) * (1/2)^(n+m+1) * (m+n)C(m)
           + (m-n) * (1/2)^(n+2) * g(n,m)
           + (n-m) * (1/2)^(m+2) * g(m,n) .

Setting m=n in this formula, we see that E(n,n) = n * (1/2)^(2n) * (2n)C(n),
and applying Stirling's theorem we get the beautiful asymptotic formula

    E(n,n) ~ sqrt(n/pi).

==> probability/lottery.p <==
There are n tickets in the lottery, k winners and m allowing you to pick
another ticket.  The problem is to determine the probability of winning the
lottery when you start by picking 1 (one) ticket.

A lottery has N balls in all, and you as a player can choose m numbers on
each card, and the lottery authorities then choose n balls.  Define
L(N,n,m,k) as the minimum number of cards you must purchase to ensure that
at least one of your cards will have at least k numbers in common with the
balls chosen in the lottery.
==> probability/lottery.s <==
This relates to the problem of rolling a random number from 1 to 17 given a
20-sided die.  You simply mark the numbers 18, 19, and 20 as "roll again".
The probability of winning is, of course, k/(n-m), as for every k cases in
which you get x "draw again"s before winning, you get n-m-k similar cases
where you get x "draw again"s before losing.  The example using dice makes
it obvious, however.

For the second problem,

    L(N,k,n,k) >= Ceiling( (N-choose-k)/(n-choose-k)
                           * (n-1-choose-k-1)/(N-1-choose-k-1)
                           * L(N-1,k-1,n-1,k-1) )
                = Ceiling( N/n * L(N-1,k-1,n-1,k-1) )

The correct way to see this is as follows: suppose you have an (N,k,n,k)
system of cards.  Look at all of the cards that contain the number 1.  They
cover all ball sets that contain 1, and therefore these cards become an
(N-1,k-1,n-1,k-1) covering upon deletion of the number 1.  Therefore the
number 1 must appear on at least L(N-1,k-1,n-1,k-1) cards.  The same is true
of all of the other numbers.  There are N of them, but only n appear on each
card.  Thus we obtain the bound.

==> probability/oldest.girl.p <==
You meet a stranger on the street, and ask how many children he has.  He
truthfully says two.  You ask "Is the older one a girl?"  He truthfully says
yes.  What is the probability that both children are girls?  What would the
probability be if your second question had been "Is at least one of them a
girl?", with the other conditions unchanged?

==> probability/oldest.girl.s <==
There are four possibilities:

          Oldest child    Youngest child
    1.    Girl            Girl
    2.    Girl            Boy
    3.    Boy             Girl
    4.    Boy             Boy

If the stranger says "My oldest child is a girl," he has eliminated cases 3
and 4, and in the remaining cases both are girls 1/2 of the time.  If he
says "At least one of my children is a girl," he has eliminated case 4 only,
and in the remaining cases both are girls 1/3 of the time.

==> probability/particle.in.box.p <==
A particle is bouncing randomly in a two-dimensional box.  How far does it
travel between bounces, on average?

Suppose the particle is initially at some random position in the box and is
traveling in a straight line in a random direction and rebounds normally at
the edges.

==> probability/particle.in.box.s <==
Let theta be the angle of the point's initial vector.  After traveling a
distance r, the point has moved r*cos(theta) horizontally and r*sin(theta)
vertically, and thus has struck r*(sin(theta)+cos(theta)) + O(1) walls (for
a unit box).  Hence the average distance between walls will be
1/(sin(theta)+cos(theta)).  We now average this over all angles theta:

    2/pi * intg from theta=0 to pi/2 of (1/(sin(theta)+cos(theta))) dtheta

which (in a computation which is left as an exercise) reduces to

    2*sqrt(2)*ln(1+sqrt(2))/pi = 0.793515.
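For those who would rather not do the exercise by hand, here is a short C
sketch (an added illustration, not part of the original answer; compile with
-lm) that evaluates the integral numerically:

    #include <stdio.h>
    #include <math.h>

    /* Approximate (2/pi) * integral_0^{pi/2} dtheta / (sin t + cos t) with a
       midpoint rule; should print roughly 0.793515, matching the closed form
       2*sqrt(2)*ln(1+sqrt(2))/pi. */
    int main(void)
    {
        const double pi = 3.14159265358979323846;
        const int steps = 1000000;
        double h = (pi / 2.0) / steps, sum = 0.0;
        int i;

        for (i = 0; i < steps; i++) {
            double t = (i + 0.5) * h;
            sum += h / (sin(t) + cos(t));
        }
        printf("numerical integral = %.6f\n", 2.0 / pi * sum);
        printf("closed form        = %.6f\n",
               2.0 * sqrt(2.0) * log(1.0 + sqrt(2.0)) / pi);
        return 0;
    }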
==> probability/pi.p <==
Are the digits of pi random (i.e., can you make money betting on them)?

==> probability/pi.s <==
No, the digits of pi are not truly random, therefore you can win money
playing against a supercomputer that can calculate the digits of pi far
beyond what we are currently capable of doing.

This computer selects a position in the decimal expansion of pi -- say, at
10^100.  Your job is to guess what the next digit or digit sequence is.
Specifically, you have one dollar to bet.  A bet on the next digit, if
correct, returns 10 times the amount bet; a bet on the next two digits
returns 100 times the amount bet, and so on.  (The dollar may be divided in
any fashion, so we could bet 1/3 or 1/10000 of a dollar.)  You may place
bets in any combination.  The computer will tell you the position number,
let you examine the digits up to that point, and calculate statistics for
you.

It is easy to set up strategies that might win, if the supercomputer doesn't
know your strategy.  For example, "Always bet on 7" might win, if you are
lucky.  Also, it is easy to set up bets that will always return a dollar.
For example, if you bet a penny on every two-digit sequence, you are sure to
win back your dollar.  Also, there are strategies that might be winning, but
we can't prove it.  For example, it may be that a certain sequence of digits
never occurs in pi, but we have no way of proving this.

The problem is to find a strategy that you can prove will always win back
more than a dollar.  The assumption that the position is beyond the reach of
calculation means that we must rely on general facts we know about the
sequence of digits of pi, which is practically nil.  But it is not
completely nil, and the challenge is to find a strategy that will always win
money.

A theorem of Mahler (1953) states that for all integers p, q > 1,

    |pi - p/q| > q^(-42)

This says that pi cannot have a rational approximation that is extremely
tight.

Now suppose that the computer picks position N.  I know that the next 41*N
digits cannot all be zero.  For if they were, then the first N digits,
treated as a fraction p with denominator 10^N, would satisfy:

    |pi - p/10^N| < 10^(-42 N)

which contradicts Mahler's theorem.

So, I split my dollar into 10^(41N) - 1 equal parts, and bet on each of the
sequences of 41N digits, except the one with all zeroes.  One of the bets is
sure to win, so my total profit is about 10^(-41N) of a dollar!

This strategy can be improved a number of ways, such as looking for other
repeating patterns, or improvements to the bound of 42 -- but the earnings
are so pathetic, it hardly seems worth the effort.

Are there other winning strategies, not based on Mahler's theorem?  I
believe there are algorithms that generate 2N binary digits of pi, where the
computations are separate for each block of N digits.  Maybe from something
like this, we can find a simple subsequence of the binary digits of pi which
is always zero, or which has some simple pattern.

==> probability/random.walk.p <==
Waldo has lost his car keys!  He's not using a very efficient search; in
fact, he's doing a random walk.  He starts at 0, and moves 1 unit to the
left or right, with equal probability.  On the next step, he moves 2 units
to the left or right, again with equal probability.  For subsequent turns he
follows the pattern 1, 2, 1, etc.

His keys, in truth, were right under his nose at point 0.  Assuming that
he'll spot them the next time he sees them, what is the probability that
poor Waldo will eventually return to 0?

==> probability/random.walk.s <==
I can show the probability that Waldo returns to 0 is 1.

Waldo's wanderings map to an integer grid in the plane as follows.  Let
(X_t,Y_t) be the cumulative sums of the length 1 and length 2 steps
respectively taken by Waldo through time t; his position on the line is then
X_t + 2*Y_t.  By looking only at even t, we get the ordinary random walk in
the plane, which returns to the origin (0,0) with probability 1.  In fact,
landing at (2n,-n) for any n will land Waldo on top of his keys too.

There's no need to look at odd t.  Similar considerations apply for step
sizes of arbitrary (fixed) size.
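A small simulation sketch (an added illustration, not part of the original
answer): since the walk is recurrent but has infinite expected return time,
the fraction of simulated walks that revisit 0 within any finite cutoff is
below 1 and only creeps upward slowly as the cutoff grows.

    #include <stdio.h>
    #include <stdlib.h>

    /* Simulate Waldo's walk (step sizes 1,2,1,2,...) and report the fraction
       of walks that revisit 0 within max_steps steps.  The true probability
       of an eventual return is 1, approached slowly as max_steps grows. */
    int main(void)
    {
        const int walks = 10000;
        const long max_steps = 10000;
        int returned = 0, w;

        srand(42);
        for (w = 0; w < walks; w++) {
            long pos = 0, t;
            for (t = 0; t < max_steps; t++) {
                int size = (t % 2 == 0) ? 1 : 2;        /* 1,2,1,2,... */
                pos += (rand() & 1) ? size : -size;
                if (pos == 0) { returned++; break; }
            }
        }
        printf("returned to 0 within %ld steps: %.3f\n",
               max_steps, (double) returned / walks);
        return 0;
    }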
==> probability/reactor.p <==
There is a reactor in which a reaction is to take place.  This reaction
stops if an electron is present in the reactor.  The reaction is started
with 18 positrons; the idea being that one of these positrons would combine
with any incoming electron (thus destroying both).  Every second, exactly
one particle enters the reactor.  The probability that this particle is an
electron is 0.49 and that it is a positron is 0.51.

What is the probability that the reaction would go on for ever?

Note: Once the reaction stops, it cannot restart.

==> probability/reactor.s <==
Let P(n) be the probability that, starting with n positrons, the reaction
goes on forever, and let P'(n) = 1 - P(n) be the probability that it
eventually stops.  Clearly P'(n+1) = P'(0)*P'(n): to run out of positrons
starting from n+1 of them, the process must first get down to n, and the
chance of ever dropping one level is the same from every state, namely
P'(0).  In particular P'(1) = (P'(0))^2.  Also, considering the first
particle to arrive when there are no spare positrons,
P'(0) = .49 + .51*P'(1).  Hence P'(0) = .49 + .51*(P'(0))^2, so P'(0) equals
1 or 49/51.  We thus get that either P'(18) = 1 or P'(18) = (49/51)^19,
i.e., P(18) = 0 or 1 - (49/51)^19.  The answer is indeed the latter.

A standard result in random walks (which can be easily derived using Markov
chains) yields that if p < 1/2 then the probability of reaching the
absorbing state at +infinity, as opposed to the absorbing state at -1, is
1 - r^i, where r = p/(1-p) (p is the probability of moving from state n to
state n-1, in our case .49) and i equals the starting location + 1.
Therefore we have that P(18) = 1 - (.49/.51)^19.

==> probability/roulette.p <==
You are in a game of Russian roulette, but this time the gun (a 6-shooter
revolver) has three bullets _in_a_row_ in three of the chambers.  The barrel
is spun only once.  Each player then points the gun at his (her) head and
pulls the trigger.  If he (she) is still alive, the gun is passed to the
other player who then points it at his (her) own head and pulls the trigger.
The game stops when one player dies.

Now to the point:  would you rather be first or second to shoot?

==> probability/roulette.s <==
All you need to consider are the six possible bullet configurations:

    B B B E E E   -> player 1 dies
    E B B B E E   -> player 2 dies
    E E B B B E   -> player 1 dies
    E E E B B B   -> player 2 dies
    B E E E B B   -> player 1 dies
    B B E E E B   -> player 1 dies

One therefore has a 2/3 probability of winning (and a 1/3 probability of
dying) by shooting second.  I for one would prefer this option.

==> probability/transitivity.p <==
Can you number dice so that die A beats die B beats die C beats die A?  What
is the largest probability p with which each event can occur?

==> probability/transitivity.s <==
Yes.  The actual values on the dice faces don't matter, only their ordering.
WLOG we may assume that no two faces of the same or different dice are
equal.  We can assume "generalised dice", where the faces need not be
equally probable.  These can be approximated by dice with equi-probable
faces by having enough faces and marking some of them the same.

Take the case of three dice, called A, B, and C.  Picture the different
values on the faces of the A die.  Suppose there are three:

    A   A   A

The values on the B die must lie in between those of the A die:

    B   A   B   A   B   A   B

With three different A values, we need only four different B values.
Similarly, the C values must lie in between these:

    C B C A C B C A C B C A C B C

Assume we want A to beat B, B to beat C, and C to beat A.  Then the above
scheme for the ordering of values can be simplified to:

    B C A B C A B C A B C

since, for example, the first C in the previous arrangement can be moved to
the second with the effect that the probability that B beats C is increased,
and the probabilities that C beats A or A beats B are unchanged.  Similarly
for the other omitted faces.
In general we obtain for n dice A...Z the arrangement

    B ... Z A B ... Z ...... A B ... Z

where there are k complete cycles of B...ZA followed by B...Z.  k must be at
least 1.

CONJECTURE:  The optimum can be obtained for k=1.

So the arrangement of face values is B ... Z A B ... Z.  For three dice it
is BCABC.  Thus one die has just one face, and all the other dice have two
(with in general different probabilities).

CONJECTURE:  At the optimum, the probabilities that each die beats the next
can be equal.

Now put probabilities into the BCABC arrangement:

    B   C   A   B   C
    x   y   1   x'  y'

Clearly x+x' = y+y' = 1.

    Prob. that A beats B  =  x'
               B beats C  =  x + x'y'
               C beats A  =  y

Therefore x' = y = x + x'y'.  Solving for these gives x = y' = 1-y, and

    x' = y = (-1 + sqrt(5))/2 = prob. of each die beating the next
           = 0.618...

For four dice one obtains the probabilities:

    B   C   D   A   B   C   D
    x   y   z   1   x'  y'  z'

    A beats B:  x'
    B beats C:  x + x'y'
    C beats D:  y + y'z'
    D beats A:  z

CONJECTURE: for any number of dice, at the optimum, the sequence of
probabilities x y z ... 1 x' y' z' ... is palindromic.

We thus have the equalities:

    x+x' = 1
    y+y' = 1
    z+z' = 1
    x' = z = x + x'y' = y + y'z'
    y = y'  (hence both = 1/2)

Solving this gives x = 1/3, z = 2/3 = prob. of each die beating the next.

Since all the numbers are rational, the limit is attainable with finitely
many equiprobable faces.  E.g. A has one face, marked 0.  C has two faces,
marked 2 and -2.  B has three faces, marked 3, -1, -1.  D has three faces,
marked 1, 1, -3.  Or all four dice can be given six faces, marked with
numbers in the range 0 to 6.

Finding the solution for 5, 6, or n dice is left as an exercise.

-- Richard Kennaway
   School of Information Systems        Internet: jrk@sys.uea.ac.uk
   University of East Anglia            uucp: ...mcsun!ukc!uea-sys!jrk
   Norwich NR4 7TJ, U.K.

Martin Gardner (of course!) wrote about nontransitive dice; see the Oct '74
issue of Scientific American, or his book "Wheels, Life and Other
Mathematical Amusements", ISBN 0-7167-1588-0 or ISBN 0-7167-1589-9
(paperback).

In the book, Gardner cites Bradley Efron of Stanford U. as stating that the
maximum number for three dice is approx .618, requiring dice with more than
six sides.  He also mentions that .75 is the limit approached as the number
of dice increases.
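As a concrete illustration (added here, not from the original postings), the
following C sketch exhaustively checks one standard textbook triple of
six-sided nontransitive dice -- not the particular sets shown in Gardner's
book -- where each die beats the next with probability 5/9:

    #include <stdio.h>

    /* Exhaustive check of a classic nontransitive triple of six-sided dice:
       A beats B, B beats C, and C beats A, each with probability 20/36 = 5/9. */
    static const int A[6] = {2, 2, 4, 4, 9, 9};
    static const int B[6] = {1, 1, 6, 6, 8, 8};
    static const int C[6] = {3, 3, 5, 5, 7, 7};

    static double beats(const int x[6], const int y[6])
    {
        int i, j, wins = 0;

        for (i = 0; i < 6; i++)
            for (j = 0; j < 6; j++)
                if (x[i] > y[j])
                    wins++;
        return wins / 36.0;
    }

    int main(void)
    {
        printf("P(A beats B) = %f\n", beats(A, B));   /* 5/9 */
        printf("P(B beats C) = %f\n", beats(B, C));   /* 5/9 */
        printf("P(C beats A) = %f\n", beats(C, A));   /* 5/9 */
        return 0;
    }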