This is an R package that implements a Multivariate Copula-Based Rare-Variants (CBMRV) region-based association test in the presence of familial data.
CBMRV is a statistical tool for testing the association between a multivariate response vector and a genomic region. It is a copula-based method that allows flexible modelling of the dependence between two test statistics and provides a unified p-value. It implements four test statistics:
optimal.p (default);
minimum.p, the minimum p-value;
Qscore, the p-value based on the Q score statistic;
QFisher, the p-value based on the Fisher combination of the p-values derived from Q1 and Q2.
CBMRV allows for dependence modelling using four copulas:
Gaussian (default);
Student, the Student copula with 10 degrees of freedom;
Clayton, the Clayton copula;
Frank, the Frank copula.
For this first version, we provide the score test and p-value calculation in the bivariate case.
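To illustrate how a copula can turn two dependent p-values into a single one, here is a minimal base-R sketch for the minimum p-value under a Clayton copula. It is not taken from the CBMRV source: the function names `clayton_cdf` and `min_p_clayton`, the value `theta = 2`, and the choice to place the copula directly on the pair of p-values are all assumptions made for illustration.

```r
# Illustrative sketch only; not the CBMRV implementation.
# Clayton copula: C(u, v) = (u^-theta + v^-theta - 1)^(-1/theta), theta > 0.
clayton_cdf <- function(u, v, theta) {
  stopifnot(theta > 0, u > 0, u <= 1, v > 0, v <= 1)
  (u^(-theta) + v^(-theta) - 1)^(-1 / theta)
}

# If P1 and P2 are uniform under the null and joined by the copula C, then
# P(min(P1, P2) <= m) = 2 * m - C(m, m).
min_p_clayton <- function(p1, p2, theta) {
  m <- min(p1, p2)
  2 * m - clayton_cdf(m, m, theta)
}

# Hypothetical p-values from two test statistics and an assumed theta.
p_combined <- min_p_clayton(p1 = 0.05, p2 = 0.20, theta = 2)
print(p_combined)
```

As theta approaches 0 the two p-values become independent and the result approaches 1 - (1 - m)^2; as theta grows the result approaches m itself, so stronger positive dependence means a smaller correction for taking the minimum.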
The CBMRV package is under construction; however, all the R source functions needed to calculate the CBMRV score test statistics and p-values can be obtained from the GitHub repository CBMRV. To run the CBMRV test, download the CBMRV R source files into a personal/home directory and load all R functions using the source() function in R. See the toy example below for more details on how to run the CBMRV main functions and calculate p-values.
The main functions are CBM.RV and CBM.RV.Null, and most users will only need these two functions. See the functions' documentation for more information about their parameters and for some examples.
All the required R functions, including CBM.RV and CBM.RV.Null, can be loaded with the source() function in R, as follows:
# Directory that holds the downloaded CBMRV source files
path = "path_to_R_file_sources"
source(paste(path,"/functions.R", sep=""))
source(paste(path,"/inputChecksCBMRV.R", sep=""))
source(paste(path,"/mainCBM-RV.R", sep=""))
source(paste(path,"/CBM-RV-Null.R", sep=""))
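The same loading step can be wrapped in a small helper that fails early when a file is missing. This is only a sketch: `source_cbmrv_files` is a hypothetical name, not part of CBMRV, and the directory in the commented call is a placeholder.

```r
# Hypothetical helper (not part of CBMRV): source several R files from one
# directory, stopping with a clear message if any of them is missing.
source_cbmrv_files <- function(path, files) {
  full <- file.path(path, files)
  absent <- full[!file.exists(full)]
  if (length(absent) > 0) {
    stop("Missing source file(s): ", paste(absent, collapse = ", "))
  }
  for (f in full) source(f)  # definitions end up in the global environment
  invisible(full)
}

# File names as listed in this README; the directory is a placeholder.
cbmrv_files <- c("functions.R", "inputChecksCBMRV.R",
                 "mainCBM-RV.R", "CBM-RV-Null.R")
# source_cbmrv_files("path_to_R_file_sources", cbmrv_files)
```

Using file.path() avoids having to add the "/" separator by hand for each file.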
Before running an association test with one (or more) of the methods, the following data need to be present:
Phenotype data; phenotype data should be present in the form of an R matrix (one row for each individual). It is up to you (as the user) to create the matrix, for example by reading it from a CSV file using R's read.csv() function, or like this:
source(paste(path,"/phenotypes.dat", sep=""))
head(phenotypes)
Covariate data; this data should be present in the form of a matrix. As with the phenotype data, it is up to you to load this data. Here we will load the data from the CBM-RV package:
source(paste(path,"/covariates.dat", sep=""))
dim(covariates)
Genotype data; this data should also be present in the form of a matrix.
source(paste(path,"/genotypes.dat", sep=""))
head(genotypes)
Relationship matrix; each of the methods requires a (symmetric) relationship matrix (2 $\times$ the kinship matrix). In this example we will load this matrix from the CBM-RV package.
load(paste(path,"/kin1.Rdata", sep=""))
kin <- 2 * kin1
dim(kin)
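Since the four inputs must describe the same individuals, it can help to check their dimensions before running the test. The sketch below uses a hypothetical function, `check_cbmrv_inputs`, and simulated toy data. It is not the check performed by inputChecksCBMRV.R, and it assumes that rows appear in the same order in every matrix.

```r
# Hypothetical consistency check (not part of CBMRV): all inputs are matrices
# with one row per individual, and the relationship matrix is square and
# symmetric.
check_cbmrv_inputs <- function(Y, G, X, K) {
  stopifnot(is.matrix(Y), is.matrix(G), is.matrix(X), is.matrix(K))
  n <- nrow(Y)
  stopifnot(
    nrow(G) == n,
    nrow(X) == n,
    nrow(K) == n, ncol(K) == n,
    isSymmetric(unname(K))
  )
  invisible(TRUE)
}

# Simulated toy data standing in for phenotypes, genotypes, covariates, kin.
set.seed(1)
n <- 6
Y <- matrix(rnorm(n * 2), n, 2)            # two phenotypes (bivariate case)
G <- matrix(rbinom(n * 4, 2, 0.05), n, 4)  # four rare variants
X <- cbind(1, rnorm(n))                    # intercept plus one covariate
K <- diag(n)                               # unrelated individuals
check_cbmrv_inputs(Y, G, X, K)
```

With the README objects the call would be check_cbmrv_inputs(phenotypes, genotypes, covariates, kin).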
CBMRV depends on several R packages, which users need to load before running the CBMRV main functions.
library(CompQuadForm); library(MASS); library(Matrix); library(mvtnorm); library(mnormt); library(numDeriv); library(kinship2); library(VineCopula); library(copula); library(pracma); library(matrixcalc); library(rARPACK)
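Before calling library(), one may want to find out which of these packages are not yet installed. The following sketch uses only base R; the variable names are illustrative.

```r
# Packages listed in this README as dependencies of CBMRV.
required_pkgs <- c("CompQuadForm", "MASS", "Matrix", "mvtnorm", "mnormt",
                   "numDeriv", "kinship2", "VineCopula", "copula", "pracma",
                   "matrixcalc", "rARPACK")

# requireNamespace() returns FALSE (instead of failing) for a missing package.
is_installed <- vapply(required_pkgs, requireNamespace, logical(1),
                       quietly = TRUE)
not_installed <- required_pkgs[!is_installed]

if (length(not_installed) > 0) {
  message("Install these packages first: ",
          paste(not_installed, collapse = ", "))
} else {
  message("All required packages are available.")
}
```

Any missing packages can then be installed with install.packages(not_installed).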
Before running the main function to obtain p-value estimates, one needs to estimate the phenotype variance-covariance matrix under the null model.
null.obj = CBM.RV.Null(Y=phenotypes, covariates = covariates, kin, nb.running.guess = 3)
Now one can run the main function to calculate the copula-based p-value:
CBM.RV(Y=phenotypes, G=genotypes, covariates=covariates, null.obj=null.obj, copfit="Gaussian", method="optimal.p")
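To compare results across copulas and test statistics, the call above can be repeated over a grid of settings. The sketch below is hedged: `run_all_settings` is a hypothetical helper, and it assumes that the copula and method names listed earlier in this README ("Student", "Clayton", "Frank", "minimum.p", "Qscore", "QFisher") are accepted verbatim as values of `copfit` and `method`, which only the "Gaussian" and "optimal.p" call above confirms.

```r
# Hypothetical helper (not part of CBMRV): call a test function once for each
# combination of copula and method, passing all other arguments through.
run_all_settings <- function(test_fun, ...) {
  settings <- expand.grid(
    copfit = c("Gaussian", "Student", "Clayton", "Frank"),
    method = c("optimal.p", "minimum.p", "Qscore", "QFisher"),
    stringsAsFactors = FALSE
  )
  results <- lapply(seq_len(nrow(settings)), function(i) {
    test_fun(..., copfit = settings$copfit[i], method = settings$method[i])
  })
  names(results) <- paste(settings$copfit, settings$method, sep = "/")
  results
}

# Intended use with the objects created above (left commented out because it
# needs the CBMRV sources and data to be loaded):
# res <- run_all_settings(CBM.RV, Y = phenotypes, G = genotypes,
#                         covariates = covariates, null.obj = null.obj)
# res[["Clayton/minimum.p"]]
```

The null model does not depend on these two settings in the call shown above, so null.obj is estimated once and reused for every combination.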