Hardy 1908

From Genetics Wiki
Revision as of 10:53, 2 September 2018 by Floyd (talk | contribs) (Fourth Paragraph)


Citation

Hardy, G. H. (1908) Mendelian Proportions in a mixed population. Science 28(706): 49-50.

Links

https://www.jstor.org/stable/1636004

Notes

First Paragraph

The tone of the first paragraph is a bit amusing, at least to a modern reader. It is hard to know how much of this is overt criticism versus the norms of formal language in 1908. However, this sentence is unambiguous, "I should have expected the very simple point which I wish to make to have been familiar to biologists". Hardy definitely felt that the question of expected genotype proportions was far too trivial for him to waste time upon; yet, he was forced to do so by the blatant misunderstandings of others. The irony is that this is probably the result for which he is best known today. Elsewhere he keeps going with "a little mathematics of the multiplication-table type is enough to show ...", "it is easy to see ...", and "there is not the slightest foundation for the idea ...".

Second Paragraph

At the time there was a debate about the general validity of Mendelian genetics as an explanation of biological heredity. This was part of the Biometric-Mendelian Debate (or mutationists versus selectionists), which was not resolved until the Modern Synthesis of biology later in the 20th century. One objection to Mendelian genetics, brought up by Yule, is that phenotypes in natural populations do not follow Mendelian proportions. The example used is brachydactyly in humans (shortened fingers and/or toes, one form of which is a dominant trait), with the observation that the ratio of brachydactylous to unaffected individuals is not three to one.

        A     a
A       AA    Aa
a       Aa    aa

The Punnett square above for an F2 cross shows the expected offspring of two heterozygous parents. Each parent's alleles head the columns and the rows, and they combine to give the child's genotype. Three of the four outcomes are brachydactylous (genotypes AA and Aa) for every one unaffected outcome (aa).

An F2 cross is a very artificial situation. It begins by crossing pure-breeding "parental" lines (AA and aa in this example) to generate heterozygous offspring (Aa), which are then crossed together to generate the F2s. It is unlikely that Yule was thinking of only this scenario. In a natural population, offspring would be generated from all possible crosses (AA x AA, AA x Aa, AA x aa, Aa x Aa, Aa x aa, and aa x aa), only one of which is the F2 scenario. If both allele frequencies were precisely 1/2 and mating were random, the offspring pooled over all possible crosses would also show a 3:1 ratio of phenotypes. In the table below, the parental genotypes are weighted by their relative frequencies (AA : 2 Aa : aa).

          AA                  2 Aa                        aa
AA        AA                  1/2 AA, 1/2 Aa              Aa
2 Aa      1/2 AA, 1/2 Aa      1/4 AA, 1/2 Aa, 1/4 aa      1/2 Aa, 1/2 aa
aa        Aa                  1/2 Aa, 1/2 aa              aa

Multiplying out the twos from the heterozygous parents gives:

        AA          Aa                aa
AA      AA          AA, Aa            Aa
Aa      AA, Aa      AA, 2 Aa, aa      Aa, aa
aa      Aa          Aa, aa            aa

The total proportions of offspring with all three genotypes are 4 AA, 8 Aa, and 4 aa, or a 1:2:1 ratio of genotypes and a 3:1 ratio of phenotypes (AA and Aa to aa).
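The tally above can also be checked by brute force. The following is a minimal sketch, not part of Hardy's paper: it assumes random mating and fair Mendelian segregation, and the function names (offspring, next_generation) are made up for this illustration.

```python
from fractions import Fraction
from itertools import product


def offspring(mother, father):
    """Offspring genotype probabilities for one cross, assuming fair segregation."""
    dist = {}
    for a, b in product(mother, father):      # one allele drawn from each parent
        genotype = "".join(sorted((a, b)))    # 'A' sorts before 'a', so we get 'Aa'
        dist[genotype] = dist.get(genotype, 0) + Fraction(1, 4)
    return dist


def next_generation(freqs):
    """Pool offspring over all ordered crosses, weighted by random-mating frequency."""
    total = {"AA": Fraction(0), "Aa": Fraction(0), "aa": Fraction(0)}
    for (m, fm), (f, ff) in product(freqs.items(), repeat=2):
        for genotype, prob in offspring(m, f).items():
            total[genotype] += fm * ff * prob
    return total


# Parental genotypes in 1:2:1 proportions, i.e. both allele frequencies are 1/2.
parents = {"AA": Fraction(1, 4), "Aa": Fraction(1, 2), "aa": Fraction(1, 4)}
pooled = next_generation(parents)

for genotype, freq in pooled.items():
    print(genotype, freq)
print("dominant phenotype:", pooled["AA"] + pooled["Aa"])
```

Exact fractions are used instead of floats so that the 1:2:1 result can be compared with equality rather than with a tolerance.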

In modern population genetic terms we tend to think about the random union of gametes and work with allele frequencies directly rather than keeping track of each genotype pair (usually this works well but it can be problematic in certain cases where the parental genotypes matter such as Medea systems). If p is the frequency of an allele (A) it is fairly easy to show that the expected genotype frequencies (f) are:

  • [math]f_{AA}=p^2[/math]
  • [math]f_{Aa}=2p(1-p)[/math]
  • [math]f_{aa}=(1-p)^2[/math],

and that p can take any value from zero to one.

It is also fairly easy to show that the expected allele frequency in the next generation (p') is equal to the allele frequency in the current generation (p).

[math]p' = f_{AA}+f_{Aa}/2 = p^2 + 2 p (1-p) / 2 = p^2 + p (1-p) = p^2 + p - p^2 = p[/math]

Thus, in the absence of additional forces, neither the allele nor the genotype frequencies are expected to change over time in a deterministic fashion.
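As a quick numerical check of the algebra above, the sketch below computes the expected genotype frequencies for a few values of p and confirms that the allele frequency recovered from them is unchanged. The helper names (hardy_weinberg, allele_frequency) are hypothetical and chosen only for this example.

```python
from fractions import Fraction


def hardy_weinberg(p):
    """Expected genotype frequencies under random union of gametes."""
    q = 1 - p
    return {"AA": p * p, "Aa": 2 * p * q, "aa": q * q}


def allele_frequency(freqs):
    """Frequency of allele A: all of the AA class plus half of the Aa class."""
    return freqs["AA"] + freqs["Aa"] / 2


test_values = [Fraction(0), Fraction(1, 10), Fraction(1, 2), Fraction(9, 10), Fraction(1)]
for p in test_values:
    freqs = hardy_weinberg(p)
    p_next = allele_frequency(freqs)
    print(p, freqs["AA"], freqs["Aa"], freqs["aa"], p_next)
    assert sum(freqs.values()) == 1   # the three genotype frequencies sum to one
    assert p_next == p                # p' = p for every value tried
```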

Third Paragraph

However, this is not how Hardy framed the discussion beginning in the third paragraph. First, he used p, q, and r to describe the AA, Aa, and aa genotype frequencies, which quickly gets confusing when you are used to these variables standing for allele frequencies. I like to either replace them with x, y, and z or write the frequencies out more completely (e.g., [math]f_{AA}[/math]) to keep things clearer. Second, he wrote the genotype proportions as p : 2q : r, so his q is half of the heterozygote frequency, [math]f_{Aa}=2\overset{\scriptscriptstyle H}{q}[/math] (I am indicating Hardy's definitions by placing an H over the symbol). Note that in this third paragraph he is already anticipating the effects of genetic drift ("numbers are fairly large"), non-random mating ("mating may be regarded as random"), unequal allele frequencies between the sexes ("the sexes are evenly distributed"), and selection ("all are equally fertile") on deviations from Hardy-Weinberg genotype predictions.

So the frequency of genotypes in one generation is expected to be:

  • [math]f_{AA}=\overset{\scriptscriptstyle H}{p}[/math]
  • [math]f_{Aa}=2\overset{\scriptscriptstyle H}{q}[/math]
  • [math]f_{aa}=\overset{\scriptscriptstyle H}{r}[/math].

He then states that the frequency of genotypes in the next generation is expected to be:

  • [math]f_{AA}=(\overset{\scriptscriptstyle H}{p}+\overset{\scriptscriptstyle H}{q})^2[/math]
  • [math]f_{Aa}=2(\overset{\scriptscriptstyle H}{p}+\overset{\scriptscriptstyle H}{q})(\overset{\scriptscriptstyle H}{q}+\overset{\scriptscriptstyle H}{r})[/math]
  • [math]f_{aa}=(\overset{\scriptscriptstyle H}{q}+\overset{\scriptscriptstyle H}{r})^2[/math].

And, this is equivalent to [math]\overset{\scriptscriptstyle H}{p}_1 : 2\overset{\scriptscriptstyle H}{q}_1 : \overset{\scriptscriptstyle H}{r}_1[/math], where the subscripted one indicates the next generation.

Let's start with the first part [math]f_{AA}=(\overset{\scriptscriptstyle H}{p}+\overset{\scriptscriptstyle H}{q})^2[/math]. This is similar to the upper left of the table we constructed above.

        AA          Aa
AA      AA          AA, Aa
Aa      AA, Aa      AA, 2 Aa, aa

However, [math]\overset{\scriptscriptstyle H}{q} = f_{Aa}/2[/math] so let's redraw it and multiply out the halves for the heterozygote frequencies.

              f_AA              f_Aa/2
f_AA          f_AA^2            f_AA f_Aa/2
f_Aa/2        f_AA f_Aa/2       f_Aa^2/4

All of the offspring of AA x AA crosses are AA. Half of the offspring of AA x Aa crosses are AA, and a quarter of the offspring of Aa x Aa crosses are AA. No other cross can produce AA offspring. So the frequency of AA individuals in the next generation ([math]f_{AA}'[/math]) is the sum, over these crosses, of the expected frequency of the cross multiplied by the fraction of its offspring that are AA. The AA x Aa cross can occur in two orders, which is why it appears twice in the table.

[math]f_{AA}' = f_{AA}^2 + f_{AA} f_{Aa} + f_{Aa}^2/4= f_{AA}^2 + 2f_{AA} f_{Aa}/2 + f_{Aa}^2/4 = (f_{AA} + f_{Aa}/2)^2 = (\overset{\scriptscriptstyle H}{p}+\overset{\scriptscriptstyle H}{q})^2[/math]

This logic can be applied to the other two genotypes as well in the full table of crosses.
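To see that the same bookkeeping works for all three genotypes, the sketch below sums over the full table of crosses and compares the result with Hardy's three expressions. It is an illustration only: the starting frequencies are arbitrary, random mating is assumed, and the names (CROSS, by_cross_table, by_hardy) are invented for this example.

```python
from fractions import Fraction
from itertools import product

HALF, QUARTER = Fraction(1, 2), Fraction(1, 4)

# Mendelian offspring fractions (AA, Aa, aa) for each cross, stored in one order only.
CROSS = {
    ("AA", "AA"): (1, 0, 0),
    ("AA", "Aa"): (HALF, HALF, 0),
    ("AA", "aa"): (0, 1, 0),
    ("Aa", "Aa"): (QUARTER, HALF, QUARTER),
    ("Aa", "aa"): (0, HALF, HALF),
    ("aa", "aa"): (0, 0, 1),
}


def cross(mother, father):
    """Look up a cross regardless of the order in which the parents are given."""
    return CROSS.get((mother, father)) or CROSS[(father, mother)]


def by_cross_table(freqs):
    """Next-generation (AA, Aa, aa) frequencies summed over every ordered cross."""
    out = [Fraction(0)] * 3
    for mother, father in product(freqs, repeat=2):
        weight = freqs[mother] * freqs[father]   # frequency of this cross
        for i, fraction in enumerate(cross(mother, father)):
            out[i] += weight * fraction
    return tuple(out)


def by_hardy(freqs):
    """Hardy's expressions, with his q equal to half the heterozygote frequency."""
    p, q, r = freqs["AA"], freqs["Aa"] / 2, freqs["aa"]
    return ((p + q) ** 2, 2 * (p + q) * (q + r), (q + r) ** 2)


# Arbitrary starting frequencies that are not in Hardy-Weinberg proportions.
start = {"AA": Fraction(1, 2), "Aa": Fraction(1, 5), "aa": Fraction(3, 10)}
print(by_cross_table(start))
print(by_hardy(start))
assert by_cross_table(start) == by_hardy(start)
```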

Fourth Paragraph

Hardy discusses the situation when the genotype proportions are equal between generations. He claims that it is "easy to see" that this is [math]\overset{\scriptscriptstyle H}{q}^2 = \overset{\scriptscriptstyle H}{p}\overset{\scriptscriptstyle H}{r}[/math].

Let's rewrite this as

[math]\left(\frac{f_{Aa}}{2}\right)^2=\frac{f_{Aa}^2}{4}=f_{AA} f_{aa}[/math].

It is not clear to me what is obvious about this. Let's rewrite it in allele frequency terms.

[math]\left(\frac{2p(1-p)}{2}\right)^2 = p^2 (1-p)^2[/math]

Simplifying the heterozygote side of the equation we get

[math]p^2 (1-p)^2 = p^2 (1-p)^2[/math],

which is true. So squaring the frequency of half of the heterozygotes is the same as multiplying the frequencies of the two homozygotes when at equilibrium. This is still not clear to me but I think it is indirectly saying that this is the case where the genotype frequencies arise from multiplying together the underlying allele frequencies, i.e., random union of gametes.
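One possible route to the condition, offered as a sketch and not necessarily the way Hardy reasoned, starts from the next-generation frequencies of the third paragraph together with the fact that the proportions sum to one, [math]\overset{\scriptscriptstyle H}{p}+2\overset{\scriptscriptstyle H}{q}+\overset{\scriptscriptstyle H}{r}=1[/math]. Requiring the AA frequency to be unchanged between generations gives

[math](\overset{\scriptscriptstyle H}{p}+\overset{\scriptscriptstyle H}{q})^2 = \overset{\scriptscriptstyle H}{p} = \overset{\scriptscriptstyle H}{p}(\overset{\scriptscriptstyle H}{p}+2\overset{\scriptscriptstyle H}{q}+\overset{\scriptscriptstyle H}{r})[/math]

Expanding both sides,

[math]\overset{\scriptscriptstyle H}{p}^2 + 2\overset{\scriptscriptstyle H}{p}\overset{\scriptscriptstyle H}{q} + \overset{\scriptscriptstyle H}{q}^2 = \overset{\scriptscriptstyle H}{p}^2 + 2\overset{\scriptscriptstyle H}{p}\overset{\scriptscriptstyle H}{q} + \overset{\scriptscriptstyle H}{p}\overset{\scriptscriptstyle H}{r}[/math]

and cancelling the common terms leaves

[math]\overset{\scriptscriptstyle H}{q}^2 = \overset{\scriptscriptstyle H}{p}\overset{\scriptscriptstyle H}{r}[/math].

The same steps run in reverse: if the condition holds, the AA frequency is unchanged, and the same argument with p and r swapped covers aa.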


... to be continued.