Here is an example from genetics. Probabilities and frequencies have been rounded, and some genetic details of the system are simplified for convenience; however, this should serve as a reasonable first-pass approximation. The mouse t-haplotype is an example of meiotic drive. (Meiotic drive is a fascinating example of potentially powerful positive selection that does not necessarily result in Darwinian adaptation of the organism and in fact often lowers fitness.) Heterozygous ($t$/+) males, which carry one copy of each allele (a $t$ allele and a wildtype “+” allele), do not have an equal (1/2, 1/2) Mendelian probability of passing each allele on to their offspring. Rather, there is a 90% chance of passing on the $t$ allele and a 10% chance of passing on the “+” allele. Incidentally, the $t$/$t$ homozygous genotype is immediately lethal, so $t$/$t$ mice do not exist in the population. The t-haplotype has an easily observable dominant phenotype: heterozygous mice have much shorter tails than normal. In wild populations, 5% of mice carry the $t$ allele (i.e., are $t$/+ heterozygotes).

You are doing a project on quantifying the reproductive success of mice with the t-haplotype in the wild. However, you can only directly observe the offspring of individual females, without knowing the identity of the male parents. (Here, assume each batch of offspring has only one male parent and belongs to the female it is found with in the nest.) You want to calculate the probability that the male parent was a $t$/+ heterozygote given the observed number of +/+ offspring. First off, before observing any more data, we can use the 5% frequency of heterozygotes in the population, known from earlier studies. Our “prior” probability of a heterozygous father is $P(M_1) = 0.05$. Here we are using $P()$ to represent probability and $M_1$ to represent one of our models. A model is a hypothesis, and our first hypothesis in this example is that the father is heterozygous. Our second model, $M_2$, is that the father is a wildtype homozygote, with prior probability $P(M_2) = 1 - 0.05 = 0.95$.

Let's say that we observe a single +/+ offspring. Now we need to calculate the probability of our data, $P(D)$. This is summed (integrated) over all models. Either the father is a heterozygote, which has a prior probability of 5% and gives a 10% probability of a +/+ offspring, or the father is a +/+ homozygote, which has a prior probability of 95% and gives a 100% probability of a +/+ offspring. $$P(D) = 0.05 \times 0.1 + 0.95 \times 1 = 0.955$$ In general, $$P(D) = P(M_1) P(D|M_1) + P(M_2) P(D|M_2)\mbox{,}$$ where the | symbol means “given”, so $P(D|M_1)$ is the probability of the data given that model one is true. (In this case there are only two discrete models, but we could sum over more models or integrate over a range of parameter values for a model.)
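The sum over models can be checked numerically. The following is a minimal Python sketch using only the rounded values given in the text; the variable names are illustrative and not part of the original example.

```python
# Prior probabilities of the two models, from the 5% heterozygote frequency.
prior_het = 0.05             # P(M_1): father is a t/+ heterozygote
prior_wt = 1.0 - prior_het   # P(M_2): father is a +/+ homozygote

# Probability of observing one +/+ offspring under each model.
lik_het = 0.10   # P(D|M_1): a t/+ father passes on "+" 10% of the time
lik_wt = 1.00    # P(D|M_2): a +/+ father always passes on "+"

# Probability of the data, summed over both models.
p_data = prior_het * lik_het + prior_wt * lik_wt

print(f"P(D) = {p_data:.3f}")  # prints "P(D) = 0.955"
```

Most of $P(D)$ comes from the wildtype-father term, because that model is both far more common a priori and certain to produce a +/+ offspring.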

We are interested in the probability of the model given the data, and it turns out that $$P(M|D) P(D) = P(D|M) P(M)\mbox{.}$$ The probability of the model given the data times the probability of the data is equal to the probability of the data given the model times the probability of the model. Both sides are two ways of writing the same quantity: the joint probability that the model is true and the data are observed. This may not be immediately intuitive, but the relationship can easily be rearranged to the classical Bayesian equation $$P(M|D) = \frac{P(D|M) P(M)}{P(D)}\mbox{.}$$
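Applying this equation to the mouse example gives the probability that the father was a heterozygote after a single +/+ offspring is observed. The sketch below is illustrative only: the helper `posterior_two_models` is a hypothetical name introduced here, and it assumes exactly two mutually exclusive models, as in the example.

```python
def posterior_two_models(prior_1, lik_1, lik_2):
    """Return P(M_1|D) when M_1 and M_2 are the only two possible models.

    prior_1 -- P(M_1), the prior probability of model one
    lik_1   -- P(D|M_1), the probability of the data under model one
    lik_2   -- P(D|M_2), the probability of the data under model two
    """
    prior_2 = 1.0 - prior_1                        # P(M_2)
    p_data = prior_1 * lik_1 + prior_2 * lik_2     # P(D), summed over models
    return lik_1 * prior_1 / p_data                # Bayes' equation


# Values from the text: 5% prior, 10% vs. 100% chance of a +/+ offspring.
post_het = posterior_two_models(prior_1=0.05, lik_1=0.10, lik_2=1.00)

print(f"P(M_1|D) = {post_het:.4f}")  # prints "P(M_1|D) = 0.0052"
```

With these rounded inputs, observing one +/+ offspring lowers the probability of a heterozygous father from the prior of 5% to roughly 0.5%.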

bayesian_statistics.1570001319.txt.gz · Last modified: 2019/10/02 07:28 by floyd