Probability of fixation
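The probability of fixation, u(p), of an allele at initial frequency p, with selective advantage s in a population of effective size Ne, was derived in Kimura 1962.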
[math]u(p)=\frac{1-e^{-4N_esp}}{1-e^{-4N_es}}[/math]
For a single new mutation in a diploid population the initial frequency is p=1/(2Ne) (taking the census size to equal Ne), so
[math]u(p)_1=\frac{1-e^{-4N_es\frac{1}{2N_e}}}{1-e^{-4N_es}}=\frac{1-e^{-2s}}{1-e^{-4N_es}}[/math].
And if 4Nes is large, the denominator is approximately one:
[math]u(p)_2\approx\frac{1-e^{-2s}}{1}=1-e^{-2s}[/math].
For small s,
[math]e^{-2s}\approx 1-2s[/math]
[math]u(p)_2 \approx 1-e^{-2s} \approx 1-(1-2s) = 2s[/math].
This agrees with the results of Fisher 1930 and Wright 1931.
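The chain of approximations above can be checked numerically. The following is a minimal sketch, not part of the original derivation: the function name `fixation_probability` and the population size `n_e = 10_000` are assumptions chosen for illustration, and the census size is taken to equal Ne.

```python
import math

def fixation_probability(p, s, n_e):
    """Kimura's u(p) = (1 - exp(-4*Ne*s*p)) / (1 - exp(-4*Ne*s))."""
    if s == 0:
        return p  # neutral limit of the formula
    # expm1(x) = exp(x) - 1, which stays accurate for small exponents;
    # the minus signs in numerator and denominator cancel.
    return math.expm1(-4 * n_e * s * p) / math.expm1(-4 * n_e * s)

n_e = 10_000            # hypothetical effective population size
s = 0.03                # 3% selective advantage, as in the text
p = 1 / (2 * n_e)       # single new mutation in a diploid population

exact = fixation_probability(p, s, n_e)   # full formula
step1 = 1 - math.exp(-2 * s)              # after dropping the denominator
step2 = 2 * s                             # after linearising exp(-2s)

print(f"full formula : {exact:.5f}")
print(f"1 - exp(-2s) : {step1:.5f}")
print(f"2s           : {step2:.5f}")
```

With these values 4Nes = 1200, so the denominator is indistinguishable from one, and the full formula gives roughly 0.058 against the linear approximation of 0.06.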
It may be surprising at first that the probability of fixation of a new allele that confers a fitness advantage is only approximately 2s. If it gives a 3% fitness advantage, the probability of fixation is only about 6%; in other words, there is a 94% chance the new adaptive allele will be lost due to genetic drift. This implies that adaptive evolution of a species is very inefficient: adaptive alleles typically have to arise repeatedly by mutation, and be lost by drift, before one eventually fixes.
Why is this process so inefficient? When an allele is rare, such as a single copy of a new mutation, the effect of drift is typically much larger than the effect of selection. As an example, work out the probability that an allele present as a single copy leaves zero copies in the next generation, assuming the number of copies follows a Poisson distribution with a mean of λ = 1+s.
[math]P(k)=\frac{\lambda^k e^{-\lambda}}{k!}[/math]
If s = 0.03 there is a 0.357 probability of loss in the next generation. The probability of one copy in the next generation is 0.368, of two copies 0.189, three copies 0.065, four copies 0.0167, five copies 0.00345, etc. Even with selection that is strong in an evolutionary sense there is a large chance (>1/3) of immediate loss, and the allele is slow to increase away from a count of one. If the allele is present as 1 through 5 copies in the second generation, the summed probability of loss in the third generation is greater than 0.15. The total probability of loss of the adaptive allele in either the second or third generation after it appears is greater than one half (approximately 0.52). The expected count in the fourth generation is still close to one, [math](1+s)^3 \approx 1.09[/math], and so on.
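The figures above can be reproduced with a few lines of Python. This is a sketch under one stated assumption: an allele present as k copies leaves a Poisson-distributed number of copies with mean k(1+s), so each copy reproduces independently. The helper name `poisson_pmf` is illustrative.

```python
import math

def poisson_pmf(k, lam):
    """P(k) = lam^k * exp(-lam) / k! for a Poisson distribution."""
    return lam**k * math.exp(-lam) / math.factorial(k)

s = 0.03
lam = 1 + s  # mean number of copies left by a single copy

# Distribution of counts in the second generation, starting from one copy.
next_gen = [poisson_pmf(k, lam) for k in range(6)]
for k, prob in enumerate(next_gen):
    print(f"P({k} copies) = {prob:.5f}")

# Loss in the second generation: zero copies sampled straight away.
loss_gen2 = next_gen[0]

# Loss in the third generation: the allele survives as k = 1..5 copies,
# then all k copies leave nothing (ASSUMPTION: mean k*lam, so P(0) = exp(-k*lam)).
loss_gen3 = sum(next_gen[k] * math.exp(-k * lam) for k in range(1, 6))

total_loss = loss_gen2 + loss_gen3
print(f"loss in generation 2       = {loss_gen2:.4f}")
print(f"loss in generation 3       = {loss_gen3:.4f}")
print(f"loss by generation 3 total = {total_loss:.4f}")
```

Counts above five are ignored here, as in the text; they contribute very little to the probability of loss in the third generation.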
The calculation gets very complicated from the fourth generation onward because of all the possible paths the allele can take. This is probably best calculated using matrix algebra with a transition matrix between the possible states (but it would be quite large if N is not very small). As just one example, the probability of the path with counts of one, one, one and then zero across four generations is 0.368 × 0.368 × 0.357 ≈ 0.0483. Adding this single path to the probability of loss by the previous generation gives approximately 0.0483 + 0.5157 = 0.564. According to this theory the probability should accumulate towards a limit of approximately 0.94 when all the possible paths that end in zero are explored.
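The transition-matrix idea can be sketched as follows. This is an illustration under stated assumptions rather than a definitive implementation: a count of i copies is assumed to give a Poisson count with mean i(1+s) in the next generation, and the matrix is truncated at a hypothetical `max_count` of 100 copies, with all larger counts lumped into one absorbing "escaped" state that is treated as never lost. All function names are illustrative.

```python
import math

def poisson_pmf(k, lam):
    """Poisson probability computed in log space to avoid overflow."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

def transition_matrix(s, max_count):
    """Row i gives the distribution of next-generation counts from i copies."""
    lam = 1 + s
    rows = []
    for i in range(max_count + 1):
        if i == 0:
            row = [1.0] + [0.0] * max_count      # loss is absorbing
        elif i == max_count:
            row = [0.0] * max_count + [1.0]      # "escaped" state is absorbing
        else:
            # ASSUMPTION: i copies leave Poisson(i * lam) copies.
            row = [poisson_pmf(j, i * lam) for j in range(max_count)]
            row.append(max(0.0, 1.0 - sum(row)))  # lump counts >= max_count
        rows.append(row)
    return rows

def step(dist, matrix):
    """One generation: multiply the state vector by the transition matrix."""
    size = len(dist)
    return [sum(dist[i] * matrix[i][j] for i in range(size)) for j in range(size)]

def cumulative_loss(s, generations, max_count=100):
    """Probability that the allele has been lost after each generation."""
    matrix = transition_matrix(s, max_count)
    dist = [0.0] * (max_count + 1)
    dist[1] = 1.0                                 # start from a single copy
    losses = []
    for _ in range(generations):
        dist = step(dist, matrix)
        losses.append(dist[0])
    return losses

losses = cumulative_loss(0.03, 200)
for t in (1, 2, 3, 10, 50, 200):
    print(f"after {t:3d} transitions: P(lost) = {losses[t - 1]:.4f}")
```

Under these assumptions the cumulative probability of loss is about 0.357 after one transition, 0.516 after two and 0.607 after three, and it levels off near 0.94 after a few hundred generations. The truncation introduces a small error because an allele that has reached 100 copies can still occasionally be lost.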
Notes
Kimura's derivation
This is derived from
[math]u(p) = \frac{\int_0^p G(x)\, \mbox{d} x}{\int_0^1 G(x)\, \mbox{d} x}[/math],
equation 3 of Kimura 1962.
[math]u(p,t)[/math] is the probability of fixation of an allele at frequency p within t generations.
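Considering the change in allele frequency ([math]\delta p[/math]) over a short period of time ([math]\delta t[/math]),
[math]u(p, t+\delta t) = \int f(p, p+\delta p; \delta t) u(p+ \delta p, t) \, \mbox{d} (\delta p)[/math],
where [math]f(p, p+\delta p; \delta t)[/math] is the probability density of the frequency changing from p to [math]p+\delta p[/math] during [math]\delta t[/math], and the integral is over all values of the change in allele frequency ([math]\delta p[/math]).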
The mean and variance of the change in allele frequency per generation are defined as
[math]M_{\delta p}=\lim_{\delta t \to 0} \frac{1}{\delta t} \int (\delta p) f(p, p+\delta p; \delta t) \, \mbox{d} (\delta p)[/math]
[math]V_{\delta p}=\lim_{\delta t \to 0} \frac{1}{\delta t} \int (\delta p)^2 f(p, p+\delta p; \delta t) \, \mbox{d} (\delta p)[/math]
The probability of fixation given sufficient time for fixation to occur is
[math]u(p)=\lim_{t \to \infty} u(p,t)[/math]
and the function G(x) in equation 3 is
[math]G(x) = e^{-\int \frac{2M_{\delta x}}{V_{\delta x}} \, \mbox{d} x}[/math]
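As a sketch of how G(x) leads to the closed form at the top of this page, assume genic selection in a diploid population, so that [math]M_{\delta x}=sx(1-x)[/math] and [math]V_{\delta x}=x(1-x)/(2N_e)[/math]. These two expressions are assumptions brought in for this sketch; they are not stated above. Then
[math]\frac{2M_{\delta x}}{V_{\delta x}} = \frac{2sx(1-x)}{x(1-x)/(2N_e)} = 4N_es[/math]
[math]G(x) = e^{-\int 4N_es \, \mbox{d} x} = e^{-4N_esx}[/math]
(any constant of integration cancels in the ratio), and
[math]u(p) = \frac{\int_0^p e^{-4N_esx}\, \mbox{d} x}{\int_0^1 e^{-4N_esx}\, \mbox{d} x} = \frac{\frac{1}{4N_es}\left(1-e^{-4N_esp}\right)}{\frac{1}{4N_es}\left(1-e^{-4N_es}\right)} = \frac{1-e^{-4N_esp}}{1-e^{-4N_es}}[/math]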
(to be continued ... I need to work through this and my calculus is rusty.)
A different approach
[math]u(p)=\frac{1-e^{-4N_esp}}{1-e^{-4N_es}}[/math]
The numerator is the probability of a count greater than zero (that is, not lost) in a Poisson distribution with a mean of 4Nesp. 2Np is the number of copies of the allele in the population (taking N = Ne), and this is multiplied by 2s: [math]2Np \times 2s = 4Nsp[/math]
Why is s multiplied by two here?
The denominator is the same thing but with p=1 (the largest value possible). Dividing by it rescales the numerator so that u(1) = 1, that is, an allele that is already fixed has a fixation probability of one.
I suspect there is a more intuitive approach to understanding this by exploring this line of reasoning but I am not quite seeing it yet.
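If 4Nes is small:
[math]1-e^{-4N_esp}\approx 1- (1 - 4N_esp) = 4N_esp[/math]
[math]u(p)=\frac{1-e^{-4N_esp}}{1-e^{-4N_es}}\approx\frac{4N_esp}{4N_es}=p[/math]
This is the neutral expectation: the fixation probability equals the initial frequency.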
[math]1-e^{-4N_esp}\approx 1- (1 - 2s)^{2Np}[/math] ?
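The small-exponent approximations in this section can be checked numerically. The sketch below uses illustrative values (`n = 1000`, `p = 0.1`) that are assumptions and not from the notes above, and takes N = Ne. One possible reading of the last line, offered as an assumption and not as a conclusion of these notes: if each of the 2Np copies independently escaped loss with probability of about 2s, the chance that all of them are lost would be [math](1-2s)^{2Np}[/math].

```python
import math

def fixation_probability(p, s, n_e):
    """Kimura's u(p) = (1 - exp(-4*Ne*s*p)) / (1 - exp(-4*Ne*s))."""
    if s == 0:
        return p  # neutral limit of the formula
    return math.expm1(-4 * n_e * s * p) / math.expm1(-4 * n_e * s)

n = 1000   # hypothetical population size, with N = Ne assumed
p = 0.1    # hypothetical initial allele frequency

# As 4*N*s shrinks, u(p) should approach the initial frequency p.
u_by_s = {s: fixation_probability(p, s, n) for s in (1e-3, 1e-5, 1e-7)}
for s, u in u_by_s.items():
    print(f"s = {s:g}: 4Ns = {4 * n * s:g}, u(p) = {u:.6f}")

# Compare exp(-4*N*s*p) with (1 - 2s)^(2*N*p) for a small s.
s = 1e-3
poisson_zero = math.exp(-4 * n * s * p)
all_copies_lost = (1 - 2 * s) ** (2 * n * p)
print(f"exp(-4Nsp)     = {poisson_zero:.6f}")
print(f"(1-2s)^(2Np)   = {all_copies_lost:.6f}")
```

The two expressions in the final comparison agree closely when s is small and drift apart as s grows, since [math](1-2s)^{2Np}=e^{2Np\ln(1-2s)}[/math] and [math]\ln(1-2s)\approx -2s[/math] only for small s.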