CBMRV

This is an R package which implements a Multivariate Copula-Based Rare-Varaints (CBMRV) region-based association test, in presence of familial data.

CBMRV is a statistical tool for testing the association between a mutivariate response vector and a genomic region. It is a copula-based method that allows for flexible modelling of the dependence between two test statitics and provides a unified p.value. It implements four test statistics:

CBMRV allows for dependence modelling using four copulas

For this first version, we provide score test and p.value calculation in the bivariate case.

Installation

The CBMRV package is under construction, however, all the R source functions that are needed to calculate the CBMRV score test statistics and p.vlaues can be obtained from the GitHub repository CBMRV. To run the CBMRV test, one can download the CBMRV R source files into a personal/home directory and call all R functions using source() function in R. See below a toy example for more details on how to run the CBMRV main functions and calculate p.values.

The main functions are CMB.RV and CMB.RV.null, and indeed most users will only need these two functions. See the functions documentation for more information about their parameters and for some examples.

Toy example

All the required R functions, including CMB.RV and CMB.RV.null, can be downloaded via source() funtion in R, as follow:

path = "path_to_R_file_sources"

source(paste(path,"/functions.R", sep=""))
source(paste(path,"/inputChecksCBMRV.R", sep=""))
source(paste(path,"/mainCBM-RV.R", sep=""))
source(paste(path,"/CBM-RV-Null.R", sep=""))

Before running an association test with one (or more) of the methods, the following data needs to be present:

Phenotype data; phenotype data should be present in the form of an R matrix (one raw for each individual). It is up to you (as user) to create the matrix, for example by reading it from a CSV file using R's |read.csv()| function or like this:

source(paste(path,"/phenotypes.dat", sep=""))
head(phenotypes)

Covariate data; this data should be present in the form of a matrix. Like the phenotype data it is up to you to load this data. Here we will load the data from the CBM-RV package:

source(paste(path,"/covariates.dat", sep=""))
dim(covariates)

Genotype data; genotype data this data should be present in the form of a matrix.

source(paste(path,"/genotypes.dat", sep=""))
head(genotypes)

Relationship matrix; each of the methods requires a (symmetric) relationship matrix (2 $\times$ the kinship matrix). In this example we will load this matrix from the CBM-RV package.

load(paste(path,"/kin1.Rdata", sep=""))
kin <- 2 *kin1
dim(kin)

The CBMRV depends on some R packages and users need to load such packages before running the CBMRV main functions.

library(CompQuadForm); library(MASS); library(Matrix); library(mvtnorm); library(mnormt); library(numDeriv);
library(kinship2); library(VineCopula); library(copula); library(pracma); library(matrixcalc); library(rARPACK)

Before running the main functoins To run the main function and get p.value estimates, one needs to estimate the phenotypes variances-covariances matrix under the null model.

null.obj = CBM.RV.Null(Y=phenotypes, 
                       covariates = covariates, 
                       kin, 
                       nb.running.guess = 3)

Now one can run the main function in order to calculate copula-based p.value:

CBM.RV(Y=phenotypes, 
       G=genotypes, 
       covariates=covariates, 
       null.obj=null.obj, 
       copfit="Gaussian", 
       method="optimal.p")

References



KarimOualkacha/CBMRV4 documentation built on May 25, 2019, 12:22 p.m.