simulate: Simulating data


Description

These functions simulate a list of data sets as described in Boulesteix et al. (2008), Section 3.1.

Usage

simuldata_list(niter=50,n=500,p=1000,psig=50,q=5,muX=0,muZ=0)
simuldatacluster_list(niter=50,n=500,p=1000,psig=50,q=5,muX=0,muZ=0)

Arguments

niter

The number of data sets to be simulated.

n

The number of observations.

p

The number of microarray variables (genes).

psig

The number of significant microarray variables (must be less than p).

q

The number of clinical variables.

muX

The class mean difference for the psig relevant genes.

muZ

The class mean difference for the q clinical variables.

Details

With the function simuldatacluster_list, observations with y=1 are assumed to come from one of two subgroups, 1a and 1b, each with probability 0.5. The psig relevant genes are generated so that they separate subgroup 1a from all other observations, whereas the clinical variables separate subgroup 1b from all other observations.
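The subgroup structure described above can be illustrated with a minimal sketch. This is not the package's implementation: the helper name sketch_cluster_dataset, the standard normal noise, the equal class probabilities, and the choice of the first psig columns as the relevant genes are all assumptions made for illustration. See Boulesteix et al. (2008), Section 3.1, for the actual simulation design.

```r
# Hypothetical helper, NOT part of MAclinical: simulates one data set with
# subgroups 1a and 1b under simplifying assumptions (standard normal noise,
# equal class probabilities, relevant genes placed in the first psig columns).
sketch_cluster_dataset <- function(n = 100, p = 150, psig = 10, q = 5,
                                   muX = 2, muZ = 1) {
  stopifnot(psig < p)

  # Class labels coded as 0/1
  y <- rbinom(n, size = 1, prob = 0.5)

  # Each class-1 observation falls into subgroup 1a or 1b with probability 0.5
  in_1a <- y == 1 & rbinom(n, size = 1, prob = 0.5) == 1
  in_1b <- y == 1 & !in_1a

  # Baseline noise for genes (x) and clinical variables (z)
  x <- matrix(rnorm(n * p), nrow = n, ncol = p)
  z <- matrix(rnorm(n * q), nrow = n, ncol = q)

  # Relevant genes separate subgroup 1a from the rest
  x[in_1a, seq_len(psig)] <- x[in_1a, seq_len(psig)] + muX

  # Clinical variables separate subgroup 1b from the rest
  z[in_1b, ] <- z[in_1b, ] + muZ

  list(y = y, x = x, z = z)
}

set.seed(1)
sketch_data <- lapply(seq_len(3), function(i) sketch_cluster_dataset())
```

In this sketch neither block of predictors alone identifies all class-1 observations, which is the situation in which combining gene expression and clinical variables may be expected to help.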

Value

A list of length niter containing the simulated data sets. Each data set is itself a list with three elements:

y

the n-vector of class memberships, coded as 0,1.

x

the n x p matrix of gene expression levels. Each row corresponds to an observation, each column to a variable (gene).

z

the n x q matrix of clinical variables. Each row corresponds to an observation, each column to a clinical variable.
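As a rough illustration of this structure, the following sketch checks that every element of such a list has the documented shape. The helper check_simulated_list is hypothetical and not part of MAclinical, and the toy list is a hand-built stand-in so that the code runs without the package installed.

```r
# Hypothetical checker, NOT part of MAclinical: verifies that each data set
# has a 0/1 vector y of length n, an n x p matrix x and an n x q matrix z.
check_simulated_list <- function(data_list, n, p, q) {
  all(vapply(data_list, function(d) {
    length(d$y) == n &&
      all(d$y %in% c(0, 1)) &&
      all(dim(d$x) == c(n, p)) &&
      all(dim(d$z) == c(n, q))
  }, logical(1)))
}

# Hand-built stand-in with the same layout as the documented return value
toy <- replicate(
  2,
  list(y = rep(c(0, 1), 5), x = matrix(0, 10, 4), z = matrix(0, 10, 2)),
  simplify = FALSE
)

toy_ok <- check_simulated_list(toy, n = 10, p = 4, q = 2)
```

The same check could be applied to the output of simuldata_list by passing the values of n, p and q used in the call.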

Author(s)

Anne-Laure Boulesteix (http://www.ibe.med.uni-muenchen.de/organisation/mitarbeiter/020_professuren/boulesteix/eng.html)

References

Boulesteix AL, Porzelius C, Daumer M, 2008. Microarray-based classification and clinical predictors: On combined classifiers and additional predictive value. Bioinformatics 24:1698-1706.

See Also

testclass, testclass_simul, plsrf_x_pv, plsrf_xz_pv, plsrf_x, plsrf_xz, logistic_z, rf_z, svm_x.

Examples

# load MAclinical library
# library(MAclinical)

# Generating 3 simulated data sets
my.data<-simuldata_list(niter=3,n=100,p=150,psig=10,q=5,muX=2,muZ=1)
length(my.data)
dim(my.data[[1]]$x)
dim(my.data[[1]]$z)
length(my.data[[1]]$y)

MAclinical documentation built on May 2, 2019, 9:30 a.m.