Kimura 1968

From Genetics Wiki

Citation

Kimura, M. (1968). Evolutionary rate at the molecular level. Nature, 217(5129), 624-626.

Links

Published Abstract

Calculating the rate of evolution in terms of nucleotide substitutions seems to give a value so high that many of the mutations involved must be neutral ones.

Notes

Before the neutral theory, it was generally thought that most variation and evolutionary change was due to selection. Kimura's proposal that many substitutions must be neutral sparked the neutralist-selectionist debate. At this time there was also little to no DNA sequence information; Kimura was working from the handful of protein sequences then available.

See also King and Jukes 1969.

Paragraph Four

Kimura constructs an estimate of the rate of genome-wide nucleotide substitutions based on observed amino acid differences between species.

  • Average years between amino acid substitutions in 100 amino acids [math]=28\times10^6[/math] yr (today this would be "a", Latin annus, symbolizing year).
  • Genome size estimate [math]=4\times10^9[/math] bp (base pairs).
  • Gene size corresponding to 100 amino acids [math]=300[/math] bp.
  • Adjustment to also include an estimated additional 20% synonymous mutations that do not change the amino acid: [math]1 + 0.2 = 1.2[/math].

[math]28\times10^6 \div \left( \frac{4\times10^9}{300} \right) \div 1{\cdot}2 \doteqdot 1{\cdot}8 \mbox{ yr}[/math]
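The arithmetic above can be checked with a minimal Python sketch. The function and variable names are illustrative assumptions, not anything from the paper; the numbers are the ones listed in the bullets.

```python
# Values as quoted in the bullets above (Kimura 1968, paragraph four).
YEARS_PER_SUBSTITUTION_100_AA = 28e6  # years between substitutions in a 100-residue protein
GENOME_SIZE_BP = 4e9                  # Kimura's genome size estimate
GENE_SIZE_BP = 300                    # 100 codons x 3 bp
SYNONYMOUS_FACTOR = 1.2               # +20% for synonymous changes


def years_per_genome_substitution(years_per_gene_sub, genome_bp, gene_bp, syn_factor):
    """Scale a per-gene substitution interval up to the whole genome."""
    # Number of 300-bp "gene equivalents" that fit in the genome.
    gene_equivalents = genome_bp / gene_bp
    # More targets and more (synonymous) changes both shorten the interval.
    return years_per_gene_sub / gene_equivalents / syn_factor


kimura_interval = years_per_genome_substitution(
    YEARS_PER_SUBSTITUTION_100_AA, GENOME_SIZE_BP, GENE_SIZE_BP, SYNONYMOUS_FACTOR
)
print(f"{kimura_interval:.2f} years per substitution")
```

The unrounded result is 1.75 years, which the paper rounds to 1.8.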

The publication uses an obscure symbol, [math]\doteqdot[/math], for "approximately equal to", [math]\approx[/math]. I don't know if this comes from Kimura or the typesetter; it might be, or have been, more common in French typography. It also uses a British-style interpunct (raised dot) as the decimal point, which looks like multiplication; this has faded from use today.

Kimura argues that a fixation event every 1.8 years is too high a rate to be explained by selection alone. Haldane (1957), referenced here, is a useful publication for understanding this argument. Essentially, this rate requires many overlapping simultaneous selective sweeps across the genome, because the time from the occurrence of a new mutation to its fixation in the population is many generations. The action of selection is limited by an organism's fecundity. If all of the offspring with the most fit genotype survive and reproduce (as the most extreme example), the action of selection is seen in the reduced survival and/or reproduction of organisms with other genotypes. How much can the number of offspring be reduced before the species becomes extinct? If half of the offspring are removed to select for a single trait, and half of the remaining offspring are removed to select for another independently inherited trait, and so on, then simultaneous selection for ten traits results in only one out of 1,024 [math]\left(1/2^{10}\right)[/math] offspring remaining. For many species of mammals this is beyond the number of offspring possible, and the species should rapidly decrease in number over time. Relaxing selection so that more offspring survive doesn't help as much as it first seems, because the fixation of selected alleles then takes longer and more genes under selection overlap in time, which also reduces the number of offspring. Only when the force of selection is approximately equal to or less than the force of drift (in terms of changing allele frequencies), in other words "nearly neutral", is the problem resolved.
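The halving example can be written as a short Python sketch. It assumes, as the example does, that each trait is selected independently and removes the same fraction of offspring; the function names and the `cull_fraction` parameter are hypothetical and are not taken from Kimura or Haldane.

```python
def surviving_fraction(n_traits, cull_fraction=0.5):
    """Fraction of offspring left after independent selection on n_traits traits.

    Each trait removes cull_fraction of whatever offspring remain.
    """
    return (1.0 - cull_fraction) ** n_traits


def offspring_needed_per_survivor(n_traits, cull_fraction=0.5):
    """Offspring that must be produced, on average, for one to survive selection."""
    return 1.0 / surviving_fraction(n_traits, cull_fraction)


for n in (1, 2, 5, 10):
    print(n, surviving_fraction(n), offspring_needed_per_survivor(n))
```

With ten traits and a cull fraction of one half, this reproduces the 1 in 1,024 figure. Lowering `cull_fraction` raises the surviving fraction per trait, but it does not capture the longer fixation times discussed in the text.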

This appears to assume that the entire genome is protein coding DNA sequence.

Let's update this calculation using human parameter values.

  • A haploid genome size of 3.2 Gbp (Morton 1991).
  • The proportion of fixed single base pair differences between humans and chimpanzees 0.0106 and an additional [math]5.09\times10^{6}[/math] indels (The Chimpanzee Sequencing and Analysis Consortium 2005).
  • An approximate lineage divergence between human and chimpanzee sequences of five million years (Kumar et al. 2005). There is a lot of uncertainty surrounding this value. If we take it to be six million years ago, in the common ancestor of humans and chimpanzees (and bonobos), and subtract a million years to account for the time until coalescence within the modern species (because we are only considering fixed differences), then we get five million years.

[math]\frac{5\times10^6\mbox{ a}}{0.0106\times3.2\times10^{9}\mbox{ bp events}+5.09\times10^6 \mbox{ indel events}}[/math]

[math]\approx\frac{5\times10^6\mbox{ a}}{33.9\times10^6\mbox{ events} + 5.09\times10^6 \mbox{ events}}[/math]

[math]\approx\frac{5\times10^6\mbox{ a}}{39\times10^6\mbox{ events}}\approx\frac{5\mbox{ a}}{39\mbox{ events}}\approx 0.128 \mbox{ a/event}[/math]

So approximately one and one-half months (47 days) go by between fixed evolutionary differences in the lineage leading to humans. This rate is higher than Kimura's original estimate and would require an implausibly large amount of selection if all of the changes were due to selection. If we assume an average generation time of 25 years, this is almost two hundred fixation events per generation (25/0.128 ≈ 195), which is clearly impossible.
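The updated calculation can be reproduced with a short Python sketch. It follows the page's arithmetic as written, dividing the five-million-year span by all human-chimpanzee differences; the variable names and the 365.25-day year are assumptions made for illustration.

```python
# Parameter values as quoted in the list above.
DIVERGENCE_YEARS = 5e6          # approximate lineage divergence time
GENOME_BP = 3.2e9               # haploid human genome size
SUBSTITUTION_FRACTION = 0.0106  # fixed single base pair differences per bp
INDEL_EVENTS = 5.09e6           # fixed indel differences
GENERATION_YEARS = 25           # assumed average generation time
DAYS_PER_YEAR = 365.25          # assumption used only for the conversion to days

substitution_events = SUBSTITUTION_FRACTION * GENOME_BP
total_events = substitution_events + INDEL_EVENTS

years_per_event = DIVERGENCE_YEARS / total_events
days_per_event = years_per_event * DAYS_PER_YEAR
events_per_generation = GENERATION_YEARS / years_per_event

print(f"substitution events:   {substitution_events:.3g}")
print(f"total events:          {total_events:.3g}")
print(f"years per event:       {years_per_event:.3f}")
print(f"days per event:        {days_per_event:.0f}")
print(f"events per generation: {events_per_generation:.0f}")
```

Keeping the parameters as named constants makes it easy to see how sensitive the result is to the divergence time, which the list above flags as uncertain.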

If we strike a rough first-level balance between neutrality and selection and assume that only 5% of the genome is under functional constraint (Mouse Genome Sequencing Consortium 2002), then the rate of selective change is 20 times (1/0.05) lower. This predicts a species-wide fixation event every 0.128×20=2.56 years, or approximately 10 per generation, which is closer to Kimura's original estimate but still not biologically reasonable. So, even if we limit ourselves to the fraction of the genome that is clearly under purifying selection, the majority of differences that occur must be selectively neutral or nearly neutral.
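A minimal Python sketch of this rescaling, assuming (as the paragraph does) that events are spread evenly across the genome, so that the constrained fraction receives a proportional share. The function name and its parameters are hypothetical.

```python
def selected_event_rates(years_per_event, constrained_fraction, generation_years):
    """Rescale a genome-wide fixation interval to the constrained fraction only.

    Returns (years between events in constrained sequence, events per generation).
    """
    # Only constrained_fraction of all events fall in constrained sequence,
    # so the waiting time between them is longer by 1 / constrained_fraction.
    years_per_selected_event = years_per_event / constrained_fraction
    selected_per_generation = generation_years / years_per_selected_event
    return years_per_selected_event, selected_per_generation


years_between, per_generation = selected_event_rates(
    years_per_event=0.128,      # genome-wide estimate derived above
    constrained_fraction=0.05,  # 5% of the genome under functional constraint
    generation_years=25,        # assumed average generation time
)
print(f"{years_between:.2f} years per event, {per_generation:.1f} events per generation")
```

With the values quoted in the text this gives 2.56 years per event and just under ten events per generation.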

This argument touches on several ideas, including a species' capacity for selection, the concept of hard selection versus soft selection, and the question of whether selection acts predominantly on single mutations independently of each other or on groups of mutations simultaneously.

Counter arguments include the observation that adaptive changes do unambiguously occur (the question is how widespread these are across the genome). For example, consider how well species are adapted to their environments; as a thought experiment, consider exchanging species into different environments. There are a large number of changes that do not work out well: a tiger in the deep ocean, a grizzly bear in the Sahara, a camel or macaque in the Arctic, a sperm whale on land in a tropical jungle, etc. Yet they all have a not-too-distant common ancestor (e.g., Springer et al. 2004). Despite these ecological differences, gene sequences from very diverse species can, in some cases, be exchanged and have apparently equivalent functions; PAX6 is a good example [add references], suggesting that the sequence differences are effectively neutral. Also consider rapid selection resulting from human influence: the wide range of phenotypes selected for in breeding and domestication, the rapid occurrence and spread of antibiotic resistance among microbes, and insecticide resistance among insect species.

Paragraph Seven

Kimura gives the calculation for the genetic load of a single gene substitution (equation 1), which is a mess. He doesn't explain it and says the derivation "will be published elsewhere".

The probability of fixation of an allele under selection is discussed in Kimura 1962, and Kimura 1957 is referenced. This is an important equation and I am going to dedicate a separate page to it, probability of fixation.
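As a placeholder until that page exists, here is a hedged Python sketch of the diffusion-approximation form usually attributed to Kimura 1962 for additive (genic) selection, [math]u(p) = \frac{1 - e^{-4 N_e s p}}{1 - e^{-4 N_e s}}[/math]. The symbols are assumptions introduced here, not taken from this page: s is the selection coefficient, N_e the effective population size, and p the initial allele frequency. This is not necessarily the exact expression used in the 1968 paper.

```python
import math


def fixation_probability(s, n_e, p):
    """Fixation probability under the diffusion approximation (genic selection).

    s   : selection coefficient (assumed additive; s = 0 means neutral)
    n_e : effective population size (diploid)
    p   : initial frequency of the allele
    """
    if s == 0:
        # Neutral limit: the fixation probability equals the initial frequency.
        return p
    x = 4.0 * n_e * s
    # expm1(-y) = exp(-y) - 1; the two minus signs cancel in the ratio,
    # giving (1 - exp(-x*p)) / (1 - exp(-x)) with better accuracy for small x.
    # Note: strongly negative x (large |4*N_e*s| with s < 0) can overflow.
    return math.expm1(-x * p) / math.expm1(-x)


N = 1000
new_mutation = 1.0 / (2 * N)  # a single new copy in a diploid population
for s in (-0.01, 0.0, 0.001, 0.01):
    print(s, fixation_probability(s, N, new_mutation))
```

For a new beneficial mutation in a large population with N_e equal to the census size, the result is close to 2s, and for s = 0 it reduces to the initial frequency, which is the neutral case central to Kimura's argument.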