simulate: Simulating data
In MAclinical: Class prediction based on microarray data and clinical parameters


These functions simulate a list of data sets as described in Boulesteix et al. (2008), Section 3.1.

simuldata_list(niter=50,n=500,p=1000,psig=50,q=5,muX=0,muZ=0)
simuldatacluster_list(niter=50,n=500,p=1000,psig=50,q=5,muX=0,muZ=0)

`niter`	The number of data sets to be simulated.
`n`	The number of observations.
`p`	The number of microarray variables (genes).
`psig`	The number of significant microarray variables (must be less than `p`).
`q`	The number of clinical variables.
`muX`	The class mean difference for the `psig` relevant genes.
`muZ`	The class mean difference for the `q` clinical variables.

With the function `simuldatacluster_list`, observations with y=1 are assumed to come from two different subgroups, 1a and 1b, each with probability 0.5. Relevant genes are generated such that they separate subgroup 1a from the rest, whereas clinical variables separate subgroup 1b from the rest.
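
The subgroup mechanism described above can be illustrated with a minimal base-R sketch. This is not the MAclinical implementation: the function name `simulate_cluster_sketch`, the extra `subgroup` element, the standard normal noise, the equal class probabilities, and the choice of the first `psig` columns as the relevant genes are all assumptions made for illustration. The exact simulation design is given in Boulesteix et al. (2008), Section 3.1.

```r
# Hypothetical sketch, NOT the package code.
# Assumptions: unit-variance Gaussian noise, P(y = 1) = 0.5,
# and the first psig columns of x are the relevant genes.
simulate_cluster_sketch <- function(n = 100, p = 150, psig = 10, q = 5,
                                    muX = 2, muZ = 1) {
  stopifnot(psig < p)

  # Class labels coded as 0/1.
  y <- rbinom(n, size = 1, prob = 0.5)

  # Split class 1 into subgroups 1a and 1b, each with probability 0.5.
  is_1a <- y == 1 & rbinom(n, size = 1, prob = 0.5) == 1
  is_1b <- y == 1 & !is_1a

  # Start from pure noise for genes (x) and clinical variables (z).
  x <- matrix(rnorm(n * p), nrow = n, ncol = p)
  z <- matrix(rnorm(n * q), nrow = n, ncol = q)

  # Relevant genes are shifted only in subgroup 1a.
  x[is_1a, seq_len(psig)] <- x[is_1a, seq_len(psig)] + muX

  # Clinical variables are shifted only in subgroup 1b.
  z[is_1b, ] <- z[is_1b, ] + muZ

  subgroup <- ifelse(is_1a, "1a", ifelse(is_1b, "1b", "0"))
  list(y = y, x = x, z = z, subgroup = subgroup)
}

set.seed(1)
d <- simulate_cluster_sketch()
table(d$subgroup)
```

Under these assumptions, a classifier that uses only the genes can at best identify subgroup 1a, and one that uses only the clinical variables can at best identify subgroup 1b, which is why combining both sources is informative in this setting.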

A list of length `niter` containing the simulated data sets. Each data set is given as a list with three elements:

`y`	The `n`-vector of class memberships, coded as 0 or 1.
`x`	The `n x p` matrix of gene expression levels. Each row corresponds to an observation, each column to a variable (gene).
`z`	The `n x q` matrix of clinical variables. Each row corresponds to an observation, each column to a clinical variable.
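
As a rough illustration of working with this return structure, the sketch below builds a small mock object with the same shape (a list of lists with elements `y`, `x` and `z`) and summarises it. The object `mock_data` and its dimensions are hypothetical stand-ins so that the example runs without the package; in practice the list would come from `simuldata_list` or `simuldatacluster_list`.

```r
# Mock object with the documented structure; a stand-in for the
# output of simuldata_list(), filled with pure noise.
set.seed(1)
mock_data <- lapply(1:3, function(i) {
  list(y = rbinom(20, size = 1, prob = 0.5),
       x = matrix(rnorm(20 * 8), nrow = 20),
       z = matrix(rnorm(20 * 2), nrow = 20))
})

# Proportion of class-1 observations in each simulated data set.
class1_share <- vapply(mock_data, function(d) mean(d$y), numeric(1))

# Combine genes and clinical variables into one n x (p + q) matrix
# per data set, e.g. as input for a classifier that uses both.
combined <- lapply(mock_data, function(d) cbind(d$x, d$z))
dim(combined[[1]])
```

Using `vapply` rather than `sapply` fixes the type and length of each per-data-set result, which helps catch malformed elements early when looping over many iterations.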

Anne-Laure Boulesteix (http://www.ibe.med.uni-muenchen.de/organisation/mitarbeiter/020_professuren/boulesteix/eng.html)

Boulesteix AL, Porzelius C, Daumer M, 2008. Microarray-based classification and clinical predictors: On combined classifiers and additional predictive value. Bioinformatics 24:1698-1706.

testclass, testclass_simul, plsrf_x_pv, plsrf_xz_pv, plsrf_x, plsrf_xz, logistic_z, rf_z, svm_x.

# load MAclinical library
# library(MAclinical)

# Generating 3 simulated data sets
my.data<-simuldata_list(niter=3,n=100,p=150,psig=10,q=5,muX=2,muZ=1)
length(my.data)
dim(my.data[[1]]$x)
dim(my.data[[1]]$z)
length(my.data[[1]]$y)