simDataSet: simDataSet - simulation of exemplary dataset
In bootfs: Derive Robust Feature Sets for Two or Multiclass Classification Problems


A very simple interface to simulate a dataset containing ngen genes and nsam samples. Two groups are defined, drawn from normal distributions with different parameters.

simDataSet(nsam = 30, ngen = 100, mu1a = 1.2, mu1b = -0.5, mu2a = -1.2, mu2b = -1.4, sigma = 1, plot = FALSE)

`nsam`	Integer. Number of samples.
`ngen`	Integer. Number of genes.
`mu1a`	Double. Mean value of first subgroup of genes in the first sample group.
`mu1b`	Double. Mean value of second subgroup of genes in the first sample group.
`mu2a`	Double. Mean value of first subgroup of genes in the second sample group.
`mu2b`	Double. Mean value of second subgroup of genes in the second sample group.
`sigma`	Positive double. Common standard deviation for the informative genes.
`plot`	Logical. If TRUE, show a heatmap of the sampled data.

Defines two sample groups to be classified. One third of the genes contains the information to classify sample group 1, another third the information to classify sample group 2. Within each of these gene groups, two subgroups with differing intensity profiles are defined; these complementary subgroups together characterize the respective sample group.
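The design described above can be illustrated with a short base R sketch. This is not the code used by simDataSet; the helper name sketchSimData, the even split of samples between the two groups, the contiguous layout of the gene thirds, and the standard normal background for all other values are assumptions made only for illustration.

```r
## Hypothetical illustration of the described design; NOT the bootfs code.
sketchSimData <- function(nsam = 30, ngen = 100,
                          mu1a = 1.2, mu1b = -0.5,
                          mu2a = -1.2, mu2b = -1.4, sigma = 1) {
  stopifnot(nsam >= 2, ngen >= 6, sigma > 0)

  ## ASSUMPTION: samples are split evenly between the two groups
  n1  <- nsam %/% 2
  grx <- rep(c(1, 2), times = c(n1, nsam - n1))
  g1  <- which(grx == 1)
  g2  <- which(grx == 2)

  ## ASSUMPTION: the informative thirds are contiguous gene blocks,
  ## each split into two subgroups (a and b)
  third <- ngen %/% 3
  half  <- third %/% 2
  b1a <- seq_len(half)                   # block for group 1, subgroup a
  b1b <- (half + 1):third                # block for group 1, subgroup b
  b2a <- third + seq_len(half)           # block for group 2, subgroup a
  b2b <- (third + half + 1):(2 * third)  # block for group 2, subgroup b

  ## ASSUMPTION: all other values are standard normal background noise
  logX <- matrix(rnorm(nsam * ngen), nrow = nsam, ncol = ngen)

  ## draw one block of values with the given mean and the common sigma
  draw <- function(rows, cols, mu) {
    rnorm(length(rows) * length(cols), mean = mu, sd = sigma)
  }
  logX[g1, b1a] <- draw(g1, b1a, mu1a)
  logX[g1, b1b] <- draw(g1, b1b, mu1b)
  logX[g2, b2a] <- draw(g2, b2a, mu2a)
  logX[g2, b2b] <- draw(g2, b2b, mu2b)

  ## same element names as the documented return value
  list(logX = logX, groupings = list(grx = grx))
}

set.seed(1234)
sim <- sketchSimData()
dim(sim$logX)  # 30 100: samples in rows, genes in columns
```

Under these assumptions, each informative block differs from the background only in the samples of the group it characterizes, which is one plausible reading of "complementary subgroups".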

A list with two elements:

`logX`	Log intensity values, samples in rows, features in columns.
`groupings`	List containing one element named grx, which holds the sample group assignment.
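As a rough illustration of how a result with this structure might be used, the snippet below builds a small stand-in list by hand, so that it runs without bootfs, and computes per-group feature means. Only the element names logX, groupings and grx come from the description above; the coding of the group labels as 1 and 2, the dimension names, and the variable names are assumptions.

```r
## Stand-in for a simDataSet() result, built by hand (hypothetical values).
set.seed(42)
res <- list(
  logX = matrix(rnorm(6 * 4), nrow = 6,
                dimnames = list(paste0("s", 1:6), paste0("g", 1:4))),
  groupings = list(grx = rep(c(1, 2), each = 3))  # ASSUMPTION: labels 1/2
)

X   <- res$logX            # samples in rows, features in columns
grp <- res$groupings$grx   # one group label per row of X

## sanity check: exactly one label per sample
stopifnot(length(grp) == nrow(X))

## per-group mean of every feature (rows = groups, columns = features)
groupMeans <- apply(X, 2, function(g) tapply(g, grp, mean))

## features ordered by absolute difference between the two group means
ranking <- order(abs(groupMeans[1, ] - groupMeans[2, ]), decreasing = TRUE)
colnames(X)[ranking]
```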

Christian Bender.

	## Not run: 
		my.seed <- 1234
		set.seed(my.seed) ## fix the RNG state so the simulated data are reproducible
		data <- simDataSet(ngen=100, nsam=30, plot=TRUE)

		## alternative way to sample data	
		my.seed <- runif(n=1, min=1, max=99999999)
		nsam <- 30 ## number of samples
		ngen <- 100 ## number of features
		nsig <- floor(ngen * .33)

		## use sim.data from the penalizedSVM package
		library(penalizedSVM)
		# 4. add 6 blocks of 5 genes each and only one significant gene 
		# per block. all genes in the block are correlated with constant
		# correlation factor corr.factor=0.8 		
		#train <- sim.data(n = nsam, ng = ngen, nsg = nsig, corr=TRUE, 
		#corr.factor=0.8, blocks=TRUE, n.blocks=6, nsg.block=1, ng.block=5, seed=my.seed )

		train <- sim.data(n = nsam, ng = ngen, nsg = nsig, corr=FALSE,  
						seed=my.seed, p.n.ratio=0.8) 

		logX <- t(train$x)
		groupings <- list(grx=train$y)

		drawheat(logX, groups=groupings[[1]])

	
## End(Not run)