Statistics

From Genetics Wiki
Revision as of 07:52, 13 August 2018 by Floyd (talk | contribs) (Benjamini–Hochberg Controll Procedure)

Multiple Testing

Bonferroni Correction

The total number of independent tests is m. The significance threshold alpha is adjusted by dividing it by m. Only tests with p-values lower than alpha/m are considered significant. For example, with alpha=0.05 and m=20,000, the adjusted threshold is 0.05/20,000 = 0.0000025.

The Bonferroni procedure is generally viewed as overly conservative, and alternative approaches such as false discovery rate control are often favored.
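
The adjustment described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name bonferroni_significant, the default alpha=0.05, and the example p-values are all assumptions made for the example.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Return one boolean per test: True where p < alpha / m.

    The name and the default alpha are hypothetical choices for this sketch.
    """
    m = len(p_values)          # total number of tests
    threshold = alpha / m      # Bonferroni-adjusted threshold
    return [p < threshold for p in p_values]


# Made-up p-values; with m = 4 the adjusted threshold is 0.05 / 4 = 0.0125.
example = [0.001, 0.012, 0.03, 0.2]
print(bonferroni_significant(example))  # [True, True, False, False]
```

Note that 0.03 would pass an unadjusted threshold of 0.05 but fails the adjusted one, which is the sense in which the correction is conservative.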

False Discovery Rate

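False discovery rate control is a procedure that is less stringent than a classical Bonferroni correction. Instead of limiting the chance of even one false positive among all tests, it limits the expected fraction of false positives among the tests that are called significant.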

Benjamini–Hochberg Control Procedure

Rank the p-values from smallest to largest, starting at 1: 1, 2, 3, 4, 5, ... The rank of a p-value is denoted k.

Choose a false discovery rate q, for example q=0.1.

The total number of independent tests is m. If the level of gene expression for 20,000 genes is measured and tested then m=20,000.

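Find the largest rank k for which the k-th smallest p-value is less than or equal to qk/m. Accept that test and every test with a smaller p-value, even if some of them exceed the qk/m threshold for their own rank.

The expected fraction of false positives among the accepted tests, due to random chance, is then at most q; the remaining accepted tests are expected to be true positives.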
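
The procedure can be sketched in Python as follows. This is an illustrative sketch that assumes the standard step-up form from Benjamini and Hochberg (1995): find the largest rank k whose p-value is at most qk/m, then accept every test up to that rank. The function name benjamini_hochberg and the example p-values are assumptions made for the example.

```python
def benjamini_hochberg(p_values, q=0.1):
    """Return one boolean per test: True where the test is accepted.

    Hypothetical helper for illustration; assumes the step-up rule
    "largest k with p_(k) <= q * k / m".
    """
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k whose p-value is at most q * k / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            k_max = rank

    # Accept every test ranked at or below k_max.
    accepted = [False] * m
    for idx in order[:k_max]:
        accepted[idx] = True
    return accepted


# Made-up p-values in arbitrary order; m = 5 and q = 0.1 give the
# rank thresholds 0.02, 0.04, 0.06, 0.08, 0.10.
example = [0.3, 0.005, 0.05, 0.7, 0.045]
print(benjamini_hochberg(example))  # [False, True, True, False, True]
```

In this example the p-value 0.045 has rank 2 and exceeds its own threshold of 0.04, but it is still accepted because the rank 3 p-value (0.05) falls below its threshold of 0.06.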

Benjamini, Y., Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B. 57 (1): 289–300.

Normal Distribution

https://www.maa.org/sites/default/files/pdf/upload_library/22/Allendoerfer/stahl96.pdf

https://www.embedded.com/print/4413095

Derivation of the normal distribution from the binomial:

http://people.bath.ac.uk/pam28/Paul_Milewski,_Professor_of_Mathematics,_University_of_Bath/Past_Teaching_files/stirling.pdf

http://www.m-hikari.com/imf/imf-2017/9-12-2017/p/baguiIMF9-12-2017.pdf

http://mathforum.org/library/drmath/view/56600.html