knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
install.packages("devtools") devtools::install_github("fumi-github/omicwas")
If you encounter dependency error for sva
package, install it from Bioconductor:
if (!requireNamespace("BiocManager", quietly = TRUE)) install.packages("BiocManager") BiocManager::install("sva")
omicwas
is a package for cell-type-specific disease association testing,
using bulk tissue data as input.
The package accepts DNA methylation data for epigenome-wide association studies (EWAS),
as well as gene expression data for differential gene expression analyses.
The main function is ctassoc
library(omicwas)
See description.
?ctassoc
Let's load a sample data.
data(GSE42861small) X = GSE42861small$X W = GSE42861small$W Y = GSE42861small$Y Y = Y[seq(1, 20), ] # for brevity C = GSE42861small$C
See description.
?GSE42861small
The conventional way is to use ordinary linear regression.
result = ctassoc(X, W, Y, C = C, test = "full") result$coefficients
We recommend nonlinear regression with ridge regularization.
For DNA methylation, we use the logit function for normalization,
and the test option is nls.logit
result = ctassoc(X, W, Y, C = C, test = "nls.logit", regularize = TRUE) print(result$coefficients, n = 20)
The first 19 lines show the result for CpG site cg10543797. Line 1 shows that the basal methylation level in CD4. (actually CD4+ T cells) is plogis(1.498) = 0.817, so this cell type is 81% methylated. Line 8 shows that the CD4.-specific effect of the disease is 7.10e-04 (in logit scale). Since the p.value is 0.64, this is not significant. Line 15 shows that the effect of sexF (female compared to male) is -4.14e-02 with a small p.value 9.15e-05. Since sexF is a covariate that has uniform effect across cell types, the celltype column is NA.
Let's load a sample data.
data(GTExsmall) X = GTExsmall$X W = GTExsmall$W Y = GTExsmall$Y + 1 Y = Y[seq(1, 20), ] # for brevity C = GTExsmall$C
See description.
?GTExsmall
The conventional way is to use ordinary linear regression.
result = ctassoc(X, W, Y, C = C, test = "full") result$coefficients
We recommend nonlinear regression with ridge regularization.
For DNA methylation, we use the log function for normalization,
and the test option is nls.log
result = ctassoc(X, W, Y, C = C, test = "nls.log", regularize = TRUE) print(result$coefficients, n = 15)
The first 13 lines show the result for transcript ENSG00000059804. Line 1 shows that the basal expression level in Granulocytes is exp(8.847) = 6955. Line 7 shows that the Granulocytes-specific effect of age is 4.62e-04 (in log scale). Since the p.value is 0.82, this is not significant. Line 13 shows that the effect of sexF (female compared to male) is -2.57e-03 with p.value 0.97 Since sexF is a covariate that has uniform effect across cell types, the celltype column is NA.
For QTL analyses, we use ctcisQTL
function instead of ctassoc
.
To speed up computation, we perform linear ridge regression,
thus the statistical test is almost identical to ctassoc(test = "nls.identity", regularize = TRUE)
.
We analyze only in the linear scale.
Association analysis is performed between each row of Y and each row of X.
See description.
?ctcisQTL
Let's load a sample data.
data(GSE79262small) X = GSE79262small$X Xpos = GSE79262small$Xpos W = GSE79262small$W Y = GSE79262small$Y Ypos = GSE79262small$Ypos C = GSE79262small$C X = X[seq(1, 3001, 100), ] # for brevity Xpos = Xpos[seq(1, 3001, 100)] Y = Y[seq(1, 501, 100), ] Ypos = Ypos[seq(1, 501, 100)]
See description.
?GSE79262small
Analyze mQTL.
ctcisQTL(X, Xpos, W, Y, Ypos, C = C)
The result is stored in a file.
head( read.table(file.path(tempdir(), "ctcisQTL.out.txt"), header = TRUE, sep ="\t"))
The first 3 lines show the result for the association of SNP rs6678176 with CpG site cg19251656. Line 1 shows that the CD4T-specific effect of the SNP is -0.003. Since the p.value is 0.75, this is not significant.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.