Population Division

From Genetics Wiki

FST

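A difference in allele frequencies between populations can be quantified with FST as "missing heterozygosity": the proportional shortfall of the average heterozygosity within populations (H_S) relative to the heterozygosity of the total, pooled population (H_T).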

[math]F_{ST} = \frac{H_T-H_S}{H_T} = 1-\frac{H_S}{H_T}[/math]
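As a rough illustration of the formula, the sketch below computes FST from allele frequencies. It rests on assumptions that are not stated on this page: a single biallelic locus, Hardy-Weinberg heterozygosity 2p(1-p) within each population, and equally weighted populations. The function names are hypothetical.

```python
def heterozygosity(p):
    """Expected heterozygosity 2p(1-p) at a biallelic locus (assumes Hardy-Weinberg)."""
    return 2.0 * p * (1.0 - p)


def fst_from_frequencies(freqs):
    """F_ST = (H_T - H_S) / H_T for equally weighted populations (an assumption).

    freqs: frequency of one allele in each population.
    Raises ZeroDivisionError if the pooled population is monomorphic (H_T = 0).
    """
    # H_S: average of the within-population heterozygosities
    h_s = sum(heterozygosity(p) for p in freqs) / len(freqs)
    # H_T: heterozygosity of the pooled population, from the mean allele frequency
    p_bar = sum(freqs) / len(freqs)
    h_t = heterozygosity(p_bar)
    return (h_t - h_s) / h_t


# Two populations with frequencies 0.2 and 0.8: H_S = 0.32, H_T = 0.5
print(fst_from_frequencies([0.2, 0.8]))
# Identical frequencies leave no missing heterozygosity
print(fst_from_frequencies([0.5, 0.5]))
```

With frequencies 0.2 and 0.8 the result is about 0.36; with identical frequencies it is 0.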

Migration Models

Consider an "infinite migration model" in which each migrant comes from a new population (analogous to the Infinite Alleles Model, in which each mutation results in a new allele). At migration-drift equilibrium,

[math]F_{ST} = \frac{1/(2N)}{1/(2N)+2m}[/math]

This is the rate of drift, which drives allele frequency differences between populations, divided by the total rate of drift and migration. Here 1/(2N) is the rate of pairwise coalescence within a population each generation, and 2m is the rate of migration when gene copies are considered in pairs, which puts it on the same scale as pairwise drift.

We can multiply the numerator and denominator by 2N.

[math]F_{ST} = \frac{1/(2N)}{1/(2N)+2m}\times\frac{2N}{2N} = \frac{1}{1+4Nm}[/math]

This can be rearranged to solve for Nm.

[math]\frac{1}{4}\left(\frac{1}{F_{ST}}-1\right) = Nm[/math]

Nm, the population size times the migration rate, is the actual number of migrant individuals each generation.
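The two forms of the migration-drift result can be checked numerically. The following is a minimal sketch with hypothetical function names; the values N = 1000 and m = 0.001 are illustrative only.

```python
import math


def fst_from_n_and_m(n, m):
    """Unscaled form: drift rate 1/(2N) over the total rate of drift plus migration 2m."""
    drift = 1.0 / (2.0 * n)
    return drift / (drift + 2.0 * m)


def fst_from_nm(nm):
    """Scaled form: F_ST = 1 / (1 + 4Nm)."""
    return 1.0 / (1.0 + 4.0 * nm)


def nm_from_fst(fst):
    """Rearranged form: Nm = (1/4) * (1/F_ST - 1), defined for 0 < F_ST <= 1."""
    if not 0.0 < fst <= 1.0:
        raise ValueError("F_ST must be in (0, 1]")
    return 0.25 * (1.0 / fst - 1.0)


n, m = 1000, 0.001  # illustrative values, so Nm = 1
print(fst_from_n_and_m(n, m), fst_from_nm(n * m))
# Both forms should agree, and inverting recovers Nm
print(math.isclose(fst_from_n_and_m(n, m), fst_from_nm(n * m)))
print(nm_from_fst(fst_from_nm(n * m)))
```

One migrant per generation (Nm = 1) corresponds to FST = 0.2 under this model.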

(add finite numbers of populations and isolation by distance, Rousset 1997)

Common Ancestry Models

FST can also be related to the time since isolation of two populations. In contrast to the migration models above, this is not an equilibrium model.

Imagine that an ancestral population of size Na split, g generations ago, into two daughter populations that have been isolated from each other ever since. For mathematical convenience all population sizes are equal, Na=N1=N2=N. (We can change this assumption later.)

The heterozygosity within a daughter population is expected to be

[math]H_S = 4N\mu[/math],

(see the Infinite Sites Model for an explanation).

If we combined samples from both populations and made pairwise comparisons, half of the time the two alleles would be from within a daughter population (1 and 1, or 2 and 2) and half of the time they would be from different populations (1 and 2, or 2 and 1). So half of the time the heterozygosity is

[math]4N\mu[/math]

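and half of the time the two lineages must first be traced back g generations to the common ancestral population (accumulating 2gμ differences along the two branches) and then coalesce there (adding 4Nμ), for a total expected number of mutational differences of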

[math]2g\mu + 4N\mu[/math].

Combining these we get

[math]H_T = \frac{4N\mu}{2} + \frac{2g\mu}{2} + \frac{4N\mu}{2}[/math].

This simplifies to

[math]H_T = 4N\mu + g\mu[/math].

So

[math]F_{ST} = 1-\frac{H_S}{H_T} = 1- \frac{4N\mu}{4N\mu + g\mu} = 1 - \frac{4N}{4N + g}[/math]
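The bookkeeping above can be followed step by step in a few lines of code. This is only a sketch under the page's own assumptions (equal population sizes, complete isolation, the 4Nμ expectation for heterozygosity); the function names and the parameter values are illustrative.

```python
import math


def isolation_heterozygosities(n, mu, g):
    """Return (H_S, H_T) for two equal-sized populations isolated for g generations."""
    # Within a daughter population: expected pairwise differences 4*N*mu
    h_s = 4.0 * n * mu
    # Between populations: 2*g*mu along the two branches, then 4*N*mu in the ancestor
    between = 2.0 * g * mu + 4.0 * n * mu
    # Half of all pairwise comparisons are within, half are between populations
    h_t = 0.5 * h_s + 0.5 * between
    return h_s, h_t


def fst_isolation(n, mu, g):
    """F_ST = 1 - H_S / H_T under the simple isolation model."""
    h_s, h_t = isolation_heterozygosities(n, mu, g)
    return 1.0 - h_s / h_t


n, g = 1000, 1000  # illustrative values with g = N
# The mutation rate cancels, so two different rates give the same F_ST
print(fst_isolation(n, 1e-8, g), fst_isolation(n, 1e-6, g))
# Agreement with the closed form 1 - 4N / (4N + g)
print(math.isclose(fst_isolation(n, 1e-8, g), 1.0 - 4.0 * n / (4.0 * n + g)))
```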

Multiply the fraction by 1 = (1/N)/(1/N) to get

[math]F_{ST} = 1 - \frac{4N}{4N + g}\times\frac{1/N}{1/N} = 1 - \frac{4}{4+g/N}[/math].

So now the time since isolation is expressed in units of N generations.

This can be rearranged to solve for g/N.

[math]F_{ST} - 1 = - \frac{4}{4+g/N}[/math]

[math]1 - F_{ST} = \frac{4}{4+g/N}[/math]

[math]\frac{1}{1 - F_{ST}} = \frac{4+g/N}{4}[/math]

[math]\frac{4}{1 - F_{ST}} = 4+g/N[/math]

[math]\frac{4}{1 - F_{ST}}-4 = g/N[/math]

[math]\frac{4}{1}\left(\frac{1}{1 - F_{ST}}-1 \right)= g/N[/math]
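The rearrangement can be sanity-checked with a round trip between FST and the scaled time g/N. The sketch below uses hypothetical function names and is not tied to any particular software package.

```python
import math


def fst_from_scaled_time(t):
    """F_ST = 1 - 4 / (4 + g/N), where t = g/N is time in units of N generations."""
    return 1.0 - 4.0 / (4.0 + t)


def scaled_time_from_fst(fst):
    """g/N = 4 * (1 / (1 - F_ST) - 1), defined for 0 <= F_ST < 1."""
    if not 0.0 <= fst < 1.0:
        raise ValueError("F_ST must be in [0, 1)")
    return 4.0 * (1.0 / (1.0 - fst) - 1.0)


# Round trip: scaled time -> F_ST -> scaled time
for t in (0.1, 1.0, 4.0):
    fst = fst_from_scaled_time(t)
    print(t, fst, math.isclose(scaled_time_from_fst(fst), t))
```

For example, an isolation time of g = N generations (g/N = 1) gives FST = 0.2, and g = 4N gives FST = 0.5.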

... to be continued.

Mathematical Symmetry

Compare the simple migration model and the simple isolation model.

[math]\frac{4}{1}\left(\frac{1}{1 - F_{ST}}-1 \right)= g/N = \frac{4}{1}\left(\frac{F_{ST}}{1 - F_{ST}} \right)[/math]

[math]\frac{1}{4}\left(\frac{1}{F_{ST}}-1\right) = Nm= \frac{1}{4}\left(\frac{1-F_{ST}}{F_{ST}}\right)[/math]

Although the models themselves are very different, these are two ways of looking at the same thing. In terms of missing heterozygosity,

[math]\frac{g}{N} = \frac{1}{Nm}[/math]

[math]g = \frac{1}{m}[/math] and [math]m = \frac{1}{g}[/math].

(show plot of both and inverse around one at Fst = 0.2)

Both the migration model and the common ancestry model can equivalently explain a given FST. In other words, FST alone cannot discriminate between these models.
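The reciprocal relationship can be tabulated directly. The sketch below (hypothetical function names, arbitrary FST values) computes the Nm implied by the migration model and the g/N implied by the isolation model for the same FST, and shows that their product is always one.

```python
def nm_from_fst(fst):
    """Migration model: Nm = (1/4) * (1 - F_ST) / F_ST."""
    return 0.25 * (1.0 - fst) / fst


def scaled_time_from_fst(fst):
    """Isolation model: g/N = 4 * F_ST / (1 - F_ST)."""
    return 4.0 * fst / (1.0 - fst)


print("FST    Nm      g/N     Nm * g/N")
for fst in (0.05, 0.1, 0.2, 0.5, 0.8):
    nm = nm_from_fst(fst)
    t = scaled_time_from_fst(fst)
    # The product stays at 1 for every F_ST strictly between 0 and 1
    print(f"{fst:<6.2f} {nm:<7.3f} {t:<7.3f} {nm * t:.3f}")
```

At FST = 0.2 both quantities equal one; below that value Nm is larger than g/N, and above it g/N is larger.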