Association analysis of CNVs and imputed SNPs incorporating uncertainty
CNVassoc is an R package that carries out analysis of common Copy Number Variants (CNVs) and imputed Single Nucleotide Polymorphisms (SNPs) in population-based studies.
It includes tools for estimating association under a range of study designs (case-control, cohort, etc.), using several types of response variable (class status, censored data, counts), adjusting for covariates and considering various inheritance models.
Moreover, it is possible to perform epistasis studies with pairs of CNVs or imputed SNPs.
It has been optimized to make feasible the analysis of genome-wide association studies (GWAS) with hundreds of thousands of genetic variants (CNVs or imputed SNPs).
Also, it incorporates functions for inferring copy number (CNV genotype calling). Various classes and methods for generic functions (print, summary, plot, anova, ...) have been created to facilitate the analysis.
An extensive manual describing all CNVassoc capabilities with real examples is available in the package vignette.
Install the CNVassoc package from its GitHub repository by typing:
library(devtools)
devtools::install_github(repo = "isglobal-brge/CNVassoc")
library(CNVassoc)
data(dataMLPA)
CNV <- cnv(x = dataMLPA$Gene2, threshold.0 = 0.01, mix.method = "mixdist")
CNV
Inferred copy number variant by a quantitative signal
Method: function mix {package: mixdist}
-. Number of individuals: 651
-. Copies 0, 1, 2
-. Estimated means: 0, 0.2435, 0.4469
-. Estimated variances: 0, 0.0041, 0.0095
-. Estimated proportions: 0.1306, 0.4187, 0.4507
-. Goodness-of-fit test: p-value= 0.4887659
-. Note: number of classes has been selected using the best BIC
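The estimates printed above describe the fitted mixture, and they can be used to quantify how uncertain each individual call is. As a rough illustration only (base R, not the package's internal code), the sketch below assumes the non-zero signals follow a two-component normal mixture with the printed means, variances and proportions, and that the 0-copy class is assigned by the threshold rather than by the mixture. The helper `posterior_copies` and the example signal values are hypothetical.

```r
# Estimates copied from the printed output for copies 1 and 2.
# The 0-copy class (variance 0) is left out: this sketch assumes it is
# assigned by the threshold rather than by the mixture.
mu    <- c(0.2435, 0.4469)
sigma <- sqrt(c(0.0041, 0.0095))
prop  <- c(0.4187, 0.4507)

# Hypothetical helper: posterior probability of 1 or 2 copies given a signal
posterior_copies <- function(x) {
  # weighted normal density of each signal under each component
  dens <- sapply(seq_along(mu), function(k) prop[k] * dnorm(x, mu[k], sigma[k]))
  dens <- matrix(dens, nrow = length(x))   # one row per signal
  post <- dens / rowSums(dens)             # normalise so each row sums to 1
  colnames(post) <- c("copies1", "copies2")
  post
}

signals <- c(0.20, 0.32, 0.50)  # made-up signal intensities
round(posterior_copies(signals), 3)
```

Signals that fall between the two component means get posterior probabilities away from 0 and 1, which is the kind of calling uncertainty the association model is meant to take into account.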
plot(CNV, case.control = factor(dataMLPA$casco, labels=c("controls", "cases")))
getQualityScore(CNV)
--Probability of good classification: 0.9081028
modadd <- CNVassoc(casco ~ CNV + cov, data = dataMLPA, model = "add")
summary(modadd)
Call:
CNVassoc(formula = casco ~ CNV + cov, data = dataMLPA, model = "add")
Deviance: 874.6909
Number of parameters: 3
Number of individuals: 651
Coefficients:
           OR lower.lim upper.lim      SE     stat pvalue
trend 0.58634   0.45457   0.75631 0.12987 -4.11060  0.000
cov   0.88435   0.75597   1.03454 0.08003 -1.53566  0.125
(Dispersion parameter for binomial family taken to be 1 )
Covariance between coefficients:
          intercept  CNVadd     cov
intercept    0.6825 -0.0222 -0.0643
CNVadd               0.0169 -0.0001
cov                          0.0064
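To read the coefficient table, it can help to rebuild one row by hand. The base R sketch below assumes that `stat` is a Wald z-statistic (coefficient divided by its standard error) and that the limits are 95% Wald intervals on the log-odds scale. Under those assumptions it reproduces the printed `trend` row to within rounding. The variable names are illustrative.

```r
# Values copied from the "trend" row of the summary table
se   <- 0.12987
stat <- -4.11060

beta <- stat * se                              # log-odds ratio per extra copy
or   <- exp(beta)                              # odds ratio
ci   <- exp(beta + c(-1, 1) * qnorm(0.975) * se)  # 95% Wald interval
pval <- 2 * pnorm(-abs(stat))                  # two-sided p-value

round(c(OR = or, lower.lim = ci[1], upper.lim = ci[2]), 5)
signif(pval, 3)
```

Under the additive model, the odds ratio is interpreted per additional copy of the CNV.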
fileprobs <- system.file("exdata/SNPTEST.probs", package = "CNVassoc")
resp <- rep(0:1, each = 500)
results <- fastCNVassoc(fileprobs, resp ~ 1, family = "binomial")
Reading .probs data...
Done! Took 0.31 seconds
results$pvalue <- p.adjust(results$pvalue)
head(results[order(results$pvalue),])
  variant       beta         se    zscore pvalue iter
1       1 0.09876262 0.09356259 1.0555781      1    4
2       2 0.03171118 0.12907790 0.2456747      1    4
3       3 0.14015608 0.09325326 1.5029617      1    4
4       4 0.05239490 0.10868035 0.4821010      1    4
5       5 0.16669960 0.09632611 1.7305754      1    4
6       6 0.12066259 0.09179185 1.3145239      1    4
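When `p.adjust()` is called without a `method` argument it applies the Holm correction, which is conservative when many variants are tested and likely explains why every adjusted p-value shown is 1. The base R sketch below uses only the six z-scores displayed above, not the full result set, and assumes they are Wald statistics with two-sided normal p-values. It compares Holm with the Benjamini-Hochberg false discovery rate adjustment.

```r
# z-scores copied from the six rows printed above
z <- c(1.0555781, 0.2456747, 1.5029617, 0.4821010, 1.7305754, 1.3145239)

# Assumed two-sided p-values from a standard normal reference
p_raw <- 2 * pnorm(-abs(z))

adj <- data.frame(
  variant = seq_along(z),
  p_raw   = p_raw,
  holm    = p.adjust(p_raw),                 # default method is "holm"
  BH      = p.adjust(p_raw, method = "BH")   # Benjamini-Hochberg FDR
)
adj[order(adj$p_raw), ]
```

With only six tests the correction is much milder than over a whole file of variants, so the adjusted values here will not match the printed table.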
Subirana I, Diaz-Uriarte R, Lucas G, Gonzalez JR. CNVassoc: Association analysis of CNV data using R. BMC Med Genomics. 2011 May 24;4:47. doi: 10.1186/1755-8794-4-47. PubMed PMID: 21609482; PubMed Central PMCID: PMC3121578.
Subirana I, González JR. Genetic association analysis and meta-analysis of imputed SNPs in longitudinal studies. Genet Epidemiol. 2013 Jul;37(5):465-77. doi: 10.1002/gepi.21719. Epub 2013 Apr 17. PubMed PMID: 23595425; PubMed Central PMCID: PMC4273087.
Subirana I, González JR. Interaction association analysis of imputed SNPs in case-control and follow-up studies. Genet Epidemiol. 2015 Mar;39(3):185-96. doi: 10.1002/gepi.21883. Epub 2015 Jan 22. PubMed PMID: 25613387.