Difference between revisions of "Statistics"

From Genetics Wiki
Revision as of 07:41, 13 August 2018

Multiple Testing

Bonferroni Correction

The total number of independent tests is m. The significance level alpha is adjusted by dividing it by m. Only tests whose p-values are lower than this adjusted threshold, alpha/m, are considered significant.
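The adjustment described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `bonferroni_significant`, the default alpha of 0.05, and the example p-values are all assumptions made for the example.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Return one boolean per test: True where p < alpha / m.

    The name and the default alpha are illustrative assumptions.
    """
    m = len(p_values)
    if m == 0:
        return []
    threshold = alpha / m  # the adjusted significance level
    return [p < threshold for p in p_values]


# Hypothetical p-values from m = 5 tests; the adjusted threshold is 0.05 / 5 = 0.01.
example_p_values = [0.001, 0.008, 0.02, 0.04, 0.30]
print(bonferroni_significant(example_p_values))
# → [True, True, False, False, False]
```

With five tests, only the two p-values below 0.01 remain significant, although four of the five are below the unadjusted alpha of 0.05.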

The Bonferroni procedure is generally viewed as overly conservative, and alternative approaches such as false discovery rate control are often favored.

False Discovery Rate

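False discovery rate control is a procedure that is less stringent than a classical Bonferroni correction. Instead of limiting the chance of any false positive at all, it limits the expected fraction of false positives among the tests that are called significant.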

Benjamini–Hochberg Control Procedure

Rank the p-values from smallest to largest: the smallest p-value has rank 1, the next has rank 2, and so on. The rank of a given p-value is k.

Choose a false discovery rate q, for example q=0.1.

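The total number of independent tests is m. If the expression levels of 20,000 genes are measured and tested, then m=20,000.

Find the largest rank k whose p-value is no greater than qk/m. Accept the tests at ranks 1 through k as significant, including any lower-ranked test whose p-value exceeds its own threshold.

On average, at most a fraction q of the accepted tests are expected to be false positives that arose by random chance; the remaining ones are expected to be true positives.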
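The ranking and thresholding steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `benjamini_hochberg` and the example p-values are hypothetical, and the sketch uses the step-up rule from Benjamini and Hochberg (1995), which finds the largest rank k with p-value at most qk/m and accepts every test up to that rank.

```python
def benjamini_hochberg(p_values, q=0.1):
    """Return one boolean per test, in the original order.

    True marks a test accepted as significant at false discovery rate q.
    The function name is an illustrative assumption.
    """
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest rank k whose p-value is at most q * k / m.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            largest_k = rank

    # Accept every test at ranks 1..largest_k.
    significant = [False] * m
    for idx in order[:largest_k]:
        significant[idx] = True
    return significant


# Hypothetical p-values from m = 10 tests, with q = 0.1.
example_p_values = [0.001, 0.008, 0.039, 0.041, 0.042,
                    0.061, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(example_p_values, q=0.1))
# → [True, True, True, True, True, False, False, False, False, False]
```

In this example the p-values at ranks 3 and 4 (0.039 and 0.041) exceed their own thresholds of 0.03 and 0.04, but they are still accepted because the p-value at rank 5 (0.042) is below its threshold of 0.05.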

Benjamini, Y., Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society, Series B. 57 (1): 289–300.

Normal Distribution

https://www.maa.org/sites/default/files/pdf/upload_library/22/Allendoerfer/stahl96.pdf

https://www.embedded.com/print/4413095

Derivation of the normal distribution from the binomial:

http://people.bath.ac.uk/pam28/Paul_Milewski,_Professor_of_Mathematics,_University_of_Bath/Past_Teaching_files/stirling.pdf

http://www.m-hikari.com/imf/imf-2017/9-12-2017/p/baguiIMF9-12-2017.pdf

http://mathforum.org/library/drmath/view/56600.html