Generate a matrix of gene expressions in the presence of tag genes

Description

Generate a matrix of gene expressions in the presence of tag genes (Scenario 1 of Emura et al. (2012)).

Usage

X.tag(n, p, q, s = 1)

Arguments

n

the number of individuals (sample size)

p

the number of genes

q

the number of non-null genes

s

the number of null genes correlated with a non-null gene (tag)

Details

An n by p matrix of gene expressions is generated. Correlation between columns is introduced to reflect the presence of tag genes. The distribution of each column is standardized to have mean 0 and SD 1. If two genes are correlated, their correlation is 0.5; otherwise, it is 0. For details, see p.4 of Emura et al. (2012). This data generation scheme is also used in the simulations of Emura and Chen (2016).
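
The correlation of 0.5 with standardized columns can be illustrated with a shared latent factor. The sketch below is not the implementation of X.tag: the helper name sim_block, the use of normal variates, and the choice to make all 1 + s columns of a block mutually correlated are assumptions made only for illustration.

```r
# Hypothetical helper (not part of the package): one block of 1 + s columns
# that share a common latent factor. Normal variates are an assumption.
sim_block <- function(n, s) {
  shared <- rnorm(n)                            # common factor, variance 1
  own <- matrix(rnorm(n * (1 + s)), n, 1 + s)   # column-specific noise, variance 1
  # shared is recycled down each column, so row i of every column gets shared[i].
  # Each column then has mean 0 and variance (1 + 1) / 2 = 1,
  # and any two columns have covariance 1 / 2, i.e. correlation 0.5.
  (shared + own) / sqrt(2)
}

set.seed(1)
B <- sim_block(n = 5000, s = 4)
round(cor(B), 2)              # off-diagonal entries should be close to 0.5
round(apply(B, 2, sd), 2)     # column SDs should be close to 1
```

Columns from independently generated blocks are uncorrelated, which gives the "otherwise 0" part of the structure described above.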

Value

X

n by p matrix of gene expressions

Author(s)

Takeshi Emura & Yi-Hau Chen

References

Emura T, Chen YH, Chen HY (2012). Survival Prediction Based on Compound Covariate under Cox Proportional Hazard Models. PLoS ONE 7(10): e47627. doi:10.1371/journal.pone.0047627

Emura T, Chen YH (2016). Gene Selection for Survival Data Under Dependent Censoring: a Copula-based Approach. Stat Methods Med Res 25(6): 2840-2857.

Examples

library(compound.Cox)  # package that provides X.tag
X.mat <- X.tag(n = 200, p = 100, q = 10, s = 4)
round(colMeans(X.mat), 3)                     ## mean ~ 0 ##
round(apply(X.mat, MARGIN = 2, FUN = sd), 3)  ## SD ~ 1 ##