dat: simulated data for demonstrating the features of marble.

dat    R Documentation

simulated data for demonstrating the features of marble.

Description

Simulated gene expression data for demonstrating the features of marble.

Usage

data("dat")

Format

dat consists of four components: X, Y, E, clin.

Details

The data model for generating Y

Use subscript i to denote the ith subject. Let (Y_{i}, X_{i}, E_{i}, clin_{i}) (i=1,\ldots,n) be independent and identically distributed random vectors. Y_{i} is a continuous response variable representing the phenotype. X_{i} is the p-dimensional vector of genetic factors. The environmental factors and clinical factors are denoted as the q-dimensional vector E_{i} and the m-dimensional vector clin_{i}, respectively. The random error \epsilon_{i} follows some heavy-tailed distribution. For X_{ij} (j = 1,\ldots,p), the measurement of the jth genetic factor on the ith subject, consider the following model:

Y_{i} = \alpha_{0} + \sum_{k=1}^{q}\alpha_{k}E_{ik}+\sum_{t=1}^{m}\gamma_{t}clin_{it}+\beta_{j}X_{ij}+\sum_{k=1}^{q}\eta_{jk}X_{ij}E_{ik}+\epsilon_{i},

where \alpha_{0} is the intercept, and the \alpha_{k}'s and \gamma_{t}'s are the regression coefficients for the environmental and clinical factors, respectively. The \beta_{j}'s and \eta_{jk}'s are the regression coefficients of the genetic factors and the G\times E interactions, respectively. The G\times E interactions are defined through W_{j} = (X_{j}E_{1},\ldots,X_{j}E_{q}). With a slight abuse of notation, denote \tilde{W} = W_{j}. Denote \alpha=(\alpha_{1}, \ldots, \alpha_{q})^{T}, \gamma=(\gamma_{1}, \ldots, \gamma_{m})^{T}, \beta=(\beta_{1}, \ldots, \beta_{p})^{T}, \eta_{j}=(\eta_{j1}, \ldots, \eta_{jq})^{T}, \eta=(\eta_{1}^{T}, \ldots, \eta_{p}^{T})^{T}, \tilde{W} = (\tilde{W}_{1}, \dots, \tilde{W}_{p}). Then the model can be written as

Y_{i} = \alpha_{0} + E_{i}\alpha + clin_{i}\gamma + X_{ij}\beta_{j} + \tilde{W}_{i}\eta_{j} + \epsilon_{i}.
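
As an illustration of the model above, the following base R sketch draws one response vector for a single genetic factor j. It is not the code used to generate dat: the sample sizes, the coefficient values, the normal covariates, and the choice of a t distribution with 2 degrees of freedom as the heavy-tailed error are all assumptions made for this example, and the `_sim` object names are hypothetical.

```r
set.seed(1)

# Hypothetical dimensions (not the dimensions of dat)
n <- 100; p <- 5; q <- 2; m <- 2

# Hypothetical covariates, drawn as independent standard normals
X_sim    <- matrix(rnorm(n * p), n, p)   # genetic factors
E_sim    <- matrix(rnorm(n * q), n, q)   # environmental factors
clin_sim <- matrix(rnorm(n * m), n, m)   # clinical factors

# Hypothetical coefficients
alpha0 <- 1               # intercept
alpha  <- c(0.5, -0.5)    # environmental effects, length q
gamma  <- c(0.3, 0.2)     # clinical effects, length m
j      <- 1               # genetic factor entering the model
beta_j <- 0.8             # main effect of the jth genetic factor
eta_j  <- c(0.6, 0)       # G x E interaction effects, length q

# W_j = (X_j E_1, ..., X_j E_q): column j of X times each column of E
W_j <- X_sim[, j] * E_sim

# Heavy-tailed error; t with 2 df is one possible choice (assumption)
eps <- rt(n, df = 2)

# Response from the model for the jth genetic factor
Y_sim <- drop(alpha0 + E_sim %*% alpha + clin_sim %*% gamma +
              X_sim[, j] * beta_j + W_j %*% eta_j + eps)
```

Here `W_j` is an n by q matrix whose kth column is the elementwise product of the jth genetic factor and the kth environmental factor, so `W_j %*% eta_j` gives the sum of the interaction terms in the model.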

See Also

marble

Examples

data(dat)
dim(X)

marble documentation built on May 29, 2024, 6:44 a.m.