makeFabiaDataPos: Generation of Bicluster Data
In fabia: FABIA: Factor Analysis for Bicluster Acquisition

makeFabiaDataPos: generates simulated bicluster data in which the sign of the bicluster means is fixed. Implemented in R.

makeFabiaDataPos(n, l, p, f1, f2, of1, of2, sd_noise, sd_z_noise, mean_z, sd_z, sd_l_noise, mean_l, sd_l)

`n`	number of observations.
`l`	number of samples.
`p`	number of biclusters.
`f1`	l/f1 is the maximum number of additional samples that are active in a bicluster.
`f2`	n/f2 is the maximum number of additional observations that form a pattern in a bicluster.
`of1`	minimum number of active samples in a bicluster.
`of2`	minimum number of observations that form a pattern in a bicluster.
`sd_noise`	Gaussian zero mean noise std on data matrix.
`sd_z_noise`	Gaussian zero mean noise std for deactivated hidden factors.
`mean_z`	Gaussian mean for activated factors.
`sd_z`	Gaussian std for activated factors.
`sd_l_noise`	Gaussian zero mean noise std if no observation patterns are present.
`mean_l`	Gaussian mean for observation patterns.
`sd_l`	Gaussian std for observation patterns.

Essentially the data generation model is the sum of outer products of sparse vectors:

X = ∑_{i=1}^{p} λ_i z_i^T + U

where the number of summands p is the number of biclusters. The matrix factorization is

X = L Z + U

and the noise-free data are

Y = L Z

Here λ_i are from R^n, z_i from R^l, L from R^{n \times p}, Z from R^{p \times l}, and X, U, Y from R^{n \times l}.
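
The equivalence between the sum of outer products and the matrix product L Z can be checked numerically. The following is a minimal base-R sketch on toy dimensions; the dense random L and Z are placeholders chosen only for illustration and are not how makeFabiaDataPos draws its sparse vectors.

```r
set.seed(1)
n <- 6; l <- 4; p <- 2   # toy sizes, chosen only for illustration

L <- matrix(rnorm(n * p), nrow = n, ncol = p)            # columns are lambda_i
Z <- matrix(rnorm(p * l), nrow = p, ncol = l)            # rows are z_i^T
U <- matrix(rnorm(n * l, sd = 0.1), nrow = n, ncol = l)  # additive Gaussian noise

# Sum of the p outer products lambda_i z_i^T
Y_sum <- matrix(0, nrow = n, ncol = l)
for (i in seq_len(p)) {
  Y_sum <- Y_sum + outer(L[, i], Z[i, ])
}

Y <- L %*% Z   # noise-free data
X <- Y + U     # noisy data

# Both constructions of Y agree up to floating-point tolerance
same <- isTRUE(all.equal(Y_sum, Y))
```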

The columns λ_i of L are generated sequentially using n, f2, of2, sd_l_noise, mean_l, and sd_l. of2 gives the minimum number of observations participating in a bicluster; between 0 and n/f2 further observations are added, with the number chosen uniformly. sd_l_noise gives the noise standard deviation for observations not participating in the bicluster. mean_l and sd_l determine the Gaussian from which the values are drawn for the observations that participate in the bicluster. "POS": the sign of the mean is fixed.

The rows z_i of Z are generated sequentially using l, f1, of1, sd_z_noise, mean_z, and sd_z. of1 gives the minimum number of samples participating in a bicluster; between 0 and l/f1 further samples are added, with the number chosen uniformly. sd_z_noise gives the noise standard deviation for samples not participating in the bicluster. mean_z and sd_z determine the Gaussian from which the values are drawn for the samples that participate in the bicluster.
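
The generation of one sparse vector, as described above, can be sketched in base R as follows. The helper `sparse_vector` and its argument names are hypothetical and not part of fabia, and rounding len/f down with `floor` is an assumption; the package's own code may differ in such details.

```r
# Hypothetical helper (not part of fabia): draws one sparse vector of length `len`.
sparse_vector <- function(len, f, of, sd_noise, mean_active, sd_active) {
  # Active entries: the minimum `of` plus a uniformly chosen extra count
  # between 0 and floor(len / f). The rounding is an assumption.
  extra <- sample.int(floor(len / f) + 1, 1) - 1
  k <- min(of + extra, len)
  active <- sort(sample.int(len, k))        # positions that join the bicluster
  v <- rnorm(len, mean = 0, sd = sd_noise)  # zero-mean noise for inactive entries
  # Active entries come from a Gaussian whose mean has a fixed sign ("POS").
  v[active] <- rnorm(k, mean = mean_active, sd = sd_active)
  list(values = v, active = active)
}

set.seed(42)
# Parameter values taken from the TEST example: n = 100, l = 50.
lambda <- sparse_vector(100, f = 5, of = 10, sd_noise = 0.2,
                        mean_active = 3.0, sd_active = 1.0)
z <- sparse_vector(50, f = 5, of = 5, sd_noise = 0.2,
                   mean_active = 2.0, sd_active = 1.0)

length(lambda$active)  # between 10 and 30 observations
length(z$active)       # between 5 and 15 samples
```

Under these assumptions, n = 100, f2 = 5, and of2 = 10 give each bicluster between 10 and 30 observations, and l = 50, f1 = 5, and of1 = 5 give it between 5 and 15 samples.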

U is the overall zero-mean Gaussian noise with standard deviation sd_noise.

Implementation in R.

`X`	the noisy data from R^{n \times l}.
`Y`	the noise free data from R^{n \times l}.
`ZC`	list where i-th element gives samples belonging to i-th bicluster.
`LC`	list where i-th element gives observations belonging to i-th bicluster.
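
The membership lists can be used to pull the block belonging to one bicluster out of the data matrix. The sketch below runs on a small mock list rather than on real output; it assumes that the list elements carry the names X, Y, ZC, and LC as listed above and that the memberships are 1-based R indices, neither of which is confirmed here.

```r
set.seed(7)
# Mock stand-in for the returned list (hypothetical values, not produced by fabia).
n <- 8; l <- 6
LC <- list(c(1, 2, 3), c(5, 6, 7, 8))  # observations per bicluster (assumed 1-based)
ZC <- list(c(1, 2), c(4, 5, 6))        # samples per bicluster (assumed 1-based)
Y <- matrix(0, nrow = n, ncol = l)
for (i in seq_along(LC)) {
  Y[LC[[i]], ZC[[i]]] <- 6             # constant signal inside each bicluster
}
X <- Y + matrix(rnorm(n * l, sd = 3), nrow = n, ncol = l)
dat <- list(X = X, Y = Y, ZC = ZC, LC = LC)  # element names are an assumption

# Extract the block of the noise-free matrix that belongs to bicluster 1.
i <- 1
block <- dat$Y[dat$LC[[i]], dat$ZC[[i]], drop = FALSE]
dim(block)   # rows = observations, columns = samples of the bicluster
```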

Sepp Hochreiter

fabia, fabias, fabiap, fabi, fabiasp, mfsc, nmfdiv, nmfeu, nmfsc, extractPlot, extractBic, plotBicluster, Factorization, projFuncPos, projFunc, estimateMode, makeFabiaData, makeFabiaDataBlocks, makeFabiaDataPos, makeFabiaDataBlocksPos, matrixImagePlot, fabiaDemo, fabiaVersion

#---------------
# TEST
#---------------

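# Small data set: 100 observations, 50 samples, 3 biclusters
dat <- makeFabiaDataPos(n = 100, l = 50, p = 3, f1 = 5, f2 = 5,
  of1 = 5, of2 = 10, sd_noise = 3.0, sd_z_noise = 0.2, mean_z = 2.0,
  sd_z = 1.0, sd_l_noise = 0.2, mean_l = 3.0, sd_l = 1.0)

X <- dat[[1]]  # noisy data
Y <- dat[[2]]  # noise-free data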

matrixImagePlot(Y)
dev.new()
matrixImagePlot(X)


## Not run: 
#---------------
# DEMO
#---------------

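# Larger data set: 1000 observations, 100 samples, 10 biclusters
dat <- makeFabiaDataPos(n = 1000, l = 100, p = 10, f1 = 5, f2 = 5,
  of1 = 5, of2 = 10, sd_noise = 3.0, sd_z_noise = 0.2, mean_z = 2.0,
  sd_z = 1.0, sd_l_noise = 0.2, mean_l = 3.0, sd_l = 1.0)

X <- dat[[1]]  # noisy data
Y <- dat[[2]]  # noise-free data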

matrixImagePlot(Y)
dev.new()
matrixImagePlot(X)


## End(Not run)