analysisPheWAS | R Documentation |
Implement three commonly used statistical methods to analyze data for Phenome Wide Association Study (PheWAS)
analysisPheWAS( method = c("firth", "glm", "lr"), adjust = c("PS", "demo", "PS.demo", "none"), Exposure, PS, demographics, phenotypes, data )
method |
define the statistical analysis method from 'firth', 'glm', and 'lr'. 'firth': Firth's penalized-likelihood logistic regression; 'glm': logistic regression with Wald test; 'lr': logistic regression with likelihood ratio test. |
adjust |
define the adjustment method from 'PS','demo','PS.demo', and 'none'. 'PS': adjustment of PS only; 'demo': adjustment of demographics only; 'PS.demo': adjustment of PS and demographics; 'none': no adjustment. |
Exposure |
define the variable name of the exposure variable. |
PS |
define the variable name of the propensity score. |
demographics |
define the list of demographic variables. |
phenotypes |
define the list of phenotypes that need to be analyzed. |
data |
define the data. |
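The 'adjust' options can be read as choosing which covariates enter each per-phenotype model alongside the exposure. The sketch below illustrates that reading only; it is not the package's internal code. `build_formula` is a hypothetical helper, `pheno1` is a made-up phenotype name, and the other variable names are borrowed from the example on this page.

```r
# Hypothetical helper (NOT part of the package): shows which covariates each
# 'adjust' option would add to a per-phenotype logistic model.
build_formula <- function(phenotype, Exposure, PS, demographics,
                          adjust = c("PS", "demo", "PS.demo", "none")) {
  adjust <- match.arg(adjust)
  covars <- switch(adjust,
    PS      = PS,                    # propensity score only
    demo    = demographics,          # demographics only
    PS.demo = c(PS, demographics),   # both
    none    = character(0)           # exposure alone
  )
  # reformulate() builds `phenotype ~ Exposure + covariates`
  reformulate(c(Exposure, covars), response = phenotype)
}

demo_vars <- c("age", "race", "gender")
f_ps   <- build_formula("pheno1", "exposure", "PS", demo_vars, "PS")
f_both <- build_formula("pheno1", "exposure", "PS", demo_vars, "PS.demo")
f_none <- build_formula("pheno1", "exposure", "PS", demo_vars, "none")
```

Printing `f_ps`, `f_both`, and `f_none` shows the right-hand side growing or shrinking with the chosen option while the exposure term is always present.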
Implements three commonly used statistical methods to analyze the associations between exposure (e.g., drug exposure, genotypes) and various phenotypes in PheWAS. Firth's penalized-likelihood logistic regression is the default method because it avoids the problem of separation in logistic regression, which often arises when analyzing sparse binary outcomes and exposure. Logistic regression with a likelihood ratio test and conventional logistic regression with a Wald test can also be performed.
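To make the difference between the 'glm' (Wald) and 'lr' (likelihood ratio) options concrete, here is a minimal base R sketch on simulated data. It is an illustration under stated assumptions, not the package's implementation: the data, variable names, and effect sizes are invented. Firth's method is not shown because it needs an add-on package (for example `logistf`).

```r
# Simulated data; all names and parameter values are illustrative only.
set.seed(1)
n <- 500
exposure <- rbinom(n, 1, 0.3)
PS <- rnorm(n)
pheno <- rbinom(n, 1, plogis(-1 + 0.8 * exposure + 0.3 * PS))
dat <- data.frame(pheno, exposure, PS)

# Models with and without the exposure term, both adjusted for PS
full    <- glm(pheno ~ exposure + PS, family = binomial, data = dat)
reduced <- glm(pheno ~ PS, family = binomial, data = dat)

# Wald test: z = estimate / standard error, compared to a standard normal
wald   <- coef(summary(full))["exposure", ]
wald_p <- unname(wald["Pr(>|z|)"])

# Likelihood ratio test: drop in deviance when exposure is added (1 df)
lr_stat <- deviance(reduced) - deviance(full)
lr_p    <- pchisq(lr_stat, df = 1, lower.tail = FALSE)
```

With moderately common outcomes the two p-values are usually close. They can diverge when the outcome or exposure is sparse, which is the setting where the text above recommends Firth's method.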
estimate |
the estimate of log odds ratio. |
stdError |
the standard error. |
statistic |
the test statistic. |
pvalue |
the p-value. |
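Because the estimate is reported on the log odds ratio scale, it is often exponentiated for reporting. The sketch below assumes the results have been collected into a data frame with columns named after the fields above. That layout, the phenotype names, and the numbers are all hypothetical. The interval is a Wald-type approximation.

```r
# Illustrative values only; NOT output from analysisPheWAS().
res <- data.frame(
  phenotype = c("phenoA", "phenoB"),   # hypothetical phenotype names
  estimate  = c(0.50, -0.20),          # log odds ratios
  stdError  = c(0.20, 0.10)
)

z <- qnorm(0.975)                       # two-sided 95% normal quantile
res$OR    <- exp(res$estimate)          # odds ratio
res$lower <- exp(res$estimate - z * res$stdError)
res$upper <- exp(res$estimate + z * res$stdError)
```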
Leena Choi leena.choi@vanderbilt.edu and Cole Beck cole.beck@vumc.org
## use small datasets to run this example
data(dataPheWASsmall)

## make dd.base with subset of covariates from baseline data (dd.baseline.small)
## or select covariates with upper code as shown below
upper.code.list <- unique(sub("[.][^.]*(.).*", "", colnames(dd.baseline.small)) )
upper.code.list <- intersect(upper.code.list, colnames(dd.baseline.small))
dd.base <- dd.baseline.small[, upper.code.list]

## perform regularized logistic regression to obtain propensity score (PS)
## to adjust for potential confounders at baseline
phenos <- setdiff(colnames(dd.base), c('id', 'exposure'))
data.x <- as.matrix(dd.base[, phenos])
glmnet.fit <- glmnet::cv.glmnet(x=data.x, y=dd.base[,'exposure'],
                                family="binomial", standardize=TRUE,
                                alpha=0.1)
dd.base$PS <- c(predict(glmnet.fit, data.x, s='lambda.min'))
data.ps <- dd.base[,c('id', 'PS')]
dd.all.ps <- merge(data.ps, dd.small, by='id')
demographics <- c('age', 'race', 'gender')
phenotypeList <- setdiff(colnames(dd.small), c('id','exposure','age','race','gender'))

## run with a subset of phenotypeList to get quicker results
phenotypeList.sub <- sample(phenotypeList, 5)
results.sub <- analysisPheWAS(method='firth', adjust='PS', Exposure='exposure',
                              PS='PS', demographics=demographics,
                              phenotypes=phenotypeList.sub, data=dd.all.ps)

## run with the full list of phenotype outcomes (i.e., phenotypeList)
results <- analysisPheWAS(method='firth', adjust='PS', Exposure='exposure',
                          PS='PS', demographics=demographics,
                          phenotypes=phenotypeList, data=dd.all.ps)