A Monte Carlo permutation method for multiple fisher.test correction in case/control association studies

Description

A Monte Carlo permutation method for multiple fisher.test correction in case/control association studies.

Usage

fisher.MCPerm(case_11, case_12, case_22, 
    control_11, control_12, control_22, repeatNum = 1000)

Arguments

case_11

a non-negative integer, the frequency of genotype "allele1/allele1" in case samples.

case_12

a non-negative integer, the frequency of genotype "allele1/allele2" in case samples.

case_22

a non-negative integer, the frequency of genotype "allele2/allele2" in case samples.

control_11

a non-negative integer, the frequency of genotype "allele1/allele1" in control samples.

control_12

a non-negative integer, the frequency of genotype "allele1/allele2" in control samples.

control_22

a non-negative integer, the frequency of genotype "allele2/allele2" in control samples.

repeatNum

an integer (default 1000) specifying the number of replicates used in the Monte Carlo permutation.

Details

Permutation tests exist for any test statistic, regardless of whether or not its distribution is known. Thus the permutation test is widely considered the gold standard for accurate multiple testing correction.

For a case/control association study of SNPs, the permutation test proceeds as follows:

1) Combine the observations from all the samples;

2) Shuffle the labels (case/control) across the observed data;

3) Record the genotype frequency of case and control samples, respectively;

4) Calculate the statistic of interest;

5) Repeat many times (at least 1000) to obtain the distribution of the statistic;

6) Determine how often the resampled statistic of interest is at least as extreme as the observed value of the same statistic.
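The steps above can be sketched in base R as follows. This is a minimal illustration, not code from the MCPerm package: `perm_fisher()` and its arguments are hypothetical names, and the convention used in step 6 (the fraction of permuted p values less than or equal to the observed one) is an assumption about what "as extreme" means when the statistic is a Fisher p value.

```r
# Sketch of a label-permutation test around fisher.test (base R only).
# perm_fisher() is a hypothetical helper, not part of MCPerm.
# case, control: genotype counts in the order (11, 12, 22).
perm_fisher <- function(case, control, n_perm = 1000) {
  genotypes <- c("11", "12", "22")

  # 1) Combine the observations from all samples, one entry per individual.
  geno  <- factor(rep(rep(genotypes, 2), times = c(case, control)),
                  levels = genotypes)
  label <- rep(c("case", "control"), times = c(sum(case), sum(control)))

  # 3)-4) Tabulate genotype by label and compute the Fisher p value.
  # Genotypes that never occur are dropped before testing.
  p_of <- function(lab) {
    tab <- table(geno, lab)
    tab <- tab[rowSums(tab) > 0, , drop = FALSE]
    fisher.test(tab)$p.value
  }

  obs_p <- p_of(label)

  # 2) and 5) Shuffle the labels and repeat n_perm times.
  perm_p <- replicate(n_perm, p_of(sample(label)))

  # 6) Fraction of permuted p values at least as extreme as the observed one
  # (assumed convention).
  list(obsP = obs_p, permP = perm_p, pValue = mean(perm_p <= obs_p))
}
```

Calling `perm_fisher(c(34, 0, 16), c(14, 0, 13))` runs the sketch on the counts used in the Examples section. Every replicate re-tabulates all individuals, which is what makes this approach expensive when repeated across many SNPs.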

For multiple test correction in a case/control association study of millions of SNPs, the traditional permutation test is computationally impractical. MCPerm is therefore proposed as an accurate, rapid and efficient method for multiple testing correction in genome-wide association studies.

MCPerm generates the genotype frequencies of the rearranged case and control data by drawing random numbers from the hypergeometric distribution twice, based on the genotype counts of the original data. This replaces steps 2) and 3) of the traditional method. The genotype frequency distribution generated by MCPerm is almost the same as that of the permutation test, so the simplification greatly improves efficiency, and MCPerm can serve as an alternative to the traditional permutation test.
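One way to picture the two hypergeometric draws is the sketch below. It is an illustration under stated assumptions rather than the package's actual code: `mc_genotype_counts()` is a hypothetical name, and the order of the draws (first the "11" count among cases, then the "12" count among the remaining cases) is an assumed parameterization. It relies only on `rhyper` from base R.

```r
# Sketch: simulate case/control genotype counts under random label shuffling
# with two rhyper() draws per replicate, without shuffling individuals.
# mc_genotype_counts() is a hypothetical helper, not part of MCPerm.
# case, control: genotype counts in the order (11, 12, 22).
mc_genotype_counts <- function(case, control, n_rep = 1000) {
  total  <- case + control   # pooled genotype counts, fixed across replicates
  n_case <- sum(case)        # number of case samples, also fixed

  # Draw 1: how many of the n_case case labels land on genotype 11.
  c11 <- rhyper(n_rep, m = total[1], n = total[2] + total[3], k = n_case)

  # Draw 2: of the remaining case labels, how many land on genotype 12
  # (the rest must land on genotype 22).
  c12 <- rhyper(n_rep, m = total[2], n = total[3], k = n_case - c11)
  c22 <- n_case - c11 - c12

  case_sim    <- cbind(c11, c12, c22)
  # Control counts are whatever is left of the pooled totals.
  control_sim <- matrix(total, nrow = n_rep, ncol = 3, byrow = TRUE) - case_sim

  list(case = case_sim, control = control_sim)
}
```

Each row of `case` and the matching row of `control` form one simulated genotype table, which can then be passed to `fisher.test` in place of a table built from shuffled labels. Because only two random numbers are drawn per replicate, the cost does not grow with the number of individuals the way explicit shuffling does.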

Value

pValue

the p value for the test.

obsP

the p value of the fisher.test for the true data.

permP

a matrix with one row and 'repeatNum' columns, containing the fisher.test p values for the simulated data.

Author(s)

Lanying Zhang and Yongshuai Jiang <jiangyongshuai@gmail.com>

References

Noble, W.S. (2009): How does multiple testing correction work? Nat Biotechnol.

Edgington, E.S. (1995): Randomization tests, 3rd ed.

See Also

OR, OR.TradPerm, OR.MCPerm, Armitage, Armitage.TradPerm, Armitage.MCPerm, chisq.test, chisq.TradPerm, chisq.MCPerm, fisher.test, fisher.TradPerm, meta, meta.TradPerm, meta.MCPerm, permuteGenotype, rhyper, permuteGenotypeCount, genotypeStat

Examples

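# Requires the MCPerm package to be installed.
library(MCPerm)

# Genotype counts for case and control samples
case_11 <- 34
case_12 <- 0
case_22 <- 16
control_11 <- 14
control_12 <- 0
control_22 <- 13

result <- fisher.MCPerm(case_11, case_12, case_22,
    control_11, control_12, control_22, repeatNum = 1000)
p <- result$pValue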
