gc.fun: function that does genomic control correction to single SNP...

Description

When a high genomic control (GC) parameter (lambda) estimate is observed, gc.fun applies GC correction to SNPs whose minor allele count (MAC) is below a user-specified threshold. Such SNPs may have an inflated type I error rate, for survival traits in particular. The function then adjusts the RData output accordingly and recomputes the sum of squares statistic.

Usage

gc.fun(path, phen, snpinfoRdata, snp.cor, mac, aggregateBy="SKATgene",
       maf.file, mafRange, ssq.beta.wts=c(1,25))

Arguments

path

path to the directory that contains all 23 tab-delimited single SNP analysis result files

phen

a character string for the phenotype name of a trait of interest

snpinfoRdata

a character string naming the RData file containing the SNP info to be loaded; it should include at least the columns 'Name' (SNP name), 'Chr', and the column named by aggregateBy (default 'SKATgene')

snp.cor

a character string naming the RData file containing a list of SNP correlation matrices, one for each 'SKATgene'

mac

user-specified MAC threshold; GC correction is applied to SNPs with MAC below it

aggregateBy

the column of SNP info on which single SNPs are to be aggregated for burden tests, default is 'SKATgene'

maf.file

a character string naming the comma-delimited file containing the columns 'snp.names' (SNP name) and 'maf' (MAF)

mafRange

range of MAF to include SNPs for gene-based burden tests, default is c(0,0.05)

ssq.beta.wts

a vector of the two beta weight parameters used in the proposed sum of squares test, default=c(1,25) as in SKAT
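
The roles of maf.file, mafRange and ssq.beta.wts can be illustrated with a small sketch. It is not taken from the RVFam source: the MAF table below is made-up data with the two documented columns, treating both ends of mafRange as inclusive is an assumption, and the weights are computed as a beta density evaluated at the MAF, which is the SKAT convention the default c(1,25) refers to. How gc.fun combines these weights inside the sum of squares statistic is not shown here.

```r
# Hypothetical MAF table with the two columns maf.file is documented to contain.
maf_tab <- data.frame(
  snp.names = c("snp1", "snp2", "snp3", "snp4"),
  maf       = c(0.001, 0.010, 0.040, 0.200)
)

mafRange     <- c(0, 0.05)   # range used for this illustration
ssq.beta.wts <- c(1, 25)     # documented default, as in SKAT

# Assumption: both ends of mafRange are treated as inclusive.
keep <- maf_tab$maf >= mafRange[1] & maf_tab$maf <= mafRange[2]
used <- maf_tab[keep, ]

# Beta-density weights: rarer variants receive larger weights.
used$weight <- dbeta(used$maf, ssq.beta.wts[1], ssq.beta.wts[2])
print(used)
```

With parameters c(1,25) the weight decreases steadily as MAF grows, so the rarest SNPs that pass the mafRange filter contribute most.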

Details

When a high lambda is observed in the single SNP analysis of a survival trait, gc.fun applies GC correction to SNPs with MAC below the user-defined threshold, adjusts the RData output based on the GC-corrected single SNP results, and recomputes the sum of squares (SSQ) statistic. It then outputs the corrected single SNP analysis results, the SSQ analysis results, and the RData. The initial single SNP analysis result files are required, and the input arguments should be identical to the ones used in the initial analysis (except for path).
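
As a rough illustration of the correction step described above, the following sketch applies the textbook genomic control adjustment to simulated data. It is not the RVFam implementation. The data frame, the threshold value of 20, estimating lambda from the low-MAC SNPs only, and working with Z squared as a 1-df chi-square are all assumptions made for the example (the documented p-values come from a likelihood ratio test).

```r
set.seed(1)

# Hypothetical single SNP results: Z statistics and minor allele counts.
res <- data.frame(
  Z   = rnorm(1000, sd = 1.2),              # mildly inflated for illustration
  MAC = sample(1:200, 1000, replace = TRUE)
)
mac <- 20                                   # hypothetical MAC threshold

low   <- res$MAC < mac                      # SNPs eligible for correction
chisq <- res$Z^2                            # treated as 1-df chi-square statistics

# Lambda: median observed statistic over the median of a 1-df chi-square.
# Assumption: lambda is estimated from the low-MAC SNPs only.
lambda <- median(chisq[low]) / qchisq(0.5, df = 1)

# Deflate only the low-MAC SNPs, and only if inflation is present.
res$chisq.gc <- chisq
if (lambda > 1) res$chisq.gc[low] <- chisq[low] / lambda
res$p.gc <- pchisq(res$chisq.gc, df = 1, lower.tail = FALSE)
```

SNPs with MAC at or above the threshold keep their original statistics, which mirrors the documented behaviour of correcting only the low-count SNPs.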

Value

No value is returned. Instead, tab-delimited result files and an RData file are generated. The single SNP result file, named with phen and singleSNP, contains the columns gene, Name, maf, ntotal, nmiss, maf_ntotal, beta, se, Z, remark, p (p-value from LRT), MAC, n0, n1, and n2. The SSQ test result file, named with phen and SSQ, contains the columns gene, SSQ, cmafTotal, cmafUsed, nsnpsTotal, nsnpsUsed, nmiss, df, and p. The generated RData file holds a list, named by gene, that contains scores, cov, n, maf and sey for each gene. Note that maf in the RData is the MAF based on ntotal.
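
One way to inspect an RData file of this shape is sketched below. Because the actual output file name and the name of the stored object are not given here, the example writes a mock list to a temporary file first. The gene name, the values, and the types of the components are invented for illustration; only the component names follow the description above.

```r
# Mock list mirroring the documented per-gene structure (values are made up).
mock <- list(
  geneA = list(scores = c(snp1 = 0.8, snp2 = -1.1),
               cov    = diag(2),
               n      = 500,
               maf    = c(snp1 = 0.004, snp2 = 0.010),
               sey    = 1.0)
)
rdata <- tempfile(fileext = ".RData")
save(mock, file = rdata)

# Load into a separate environment so the stored object's name
# does not need to be known in advance.
env       <- new.env()
obj_names <- load(rdata, envir = env)
out       <- get(obj_names[1], envir = env)

names(out)               # gene names
names(out[["geneA"]])    # components stored for one gene
```

Loading into a fresh environment also avoids overwriting objects of the same name in the current workspace.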

Author(s)

Ming-Huei Chen <mhchen@bu.edu> and Qiong Yang <qyang@bu.edu>

Examples

## Not run: 
gc.fun(path="/home/mhchen/",phen="trait1",mafRange=c(0,0.01),
snpinfoRdata="SNPinfo_EC.RData",aggregateBy="SKATgene",
maf.file="EC_MAF.csv",snp.cor="EC_SNPcor.RData",ssq.beta.wts=c(1,25))

## End(Not run)

RVFam documentation built on May 2, 2019, 8:26 a.m.