genSimData: Simulate an omics dataset with biclusters

Description

Most models of omics datasets can be built by combining the effects that this function provides. Effects are applied in this order:

  1. Constant background value

  2. Base row values (overrides constant background value)

  3. Gaussian noise

  4. Constant value across all biclusters (overrides all previous effects)

  5. Additive bicluster-specific values

  6. Additive bicluster-and-column-specific values

  7. Multiplicative bicluster-and-column-specific values

  8. Additive bicluster-and-row-specific values

  9. Multiplicative bicluster-and-row-specific values

  10. Shuffling of rows and columns
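The ordering above can be illustrated with a minimal base-R sketch for a single bicluster. This is not the mfBiclust implementation: the variable names, the small 8x8 size, the standard deviations, and the use of `sweep()` are assumptions chosen for illustration only.

```r
# Hypothetical sketch of the effect order; NOT the package's actual code.
set.seed(1)
dimx <- 8; dimy <- 8        # dataset size (rows, columns)
rows <- 1:3; cols <- 1:3    # one bicluster in the top-left corner

# 1. Constant background value
mat <- matrix(2, nrow = dimx, ncol = dimy)
# 2. Base row values replace the constant background (one value per row)
mat <- matrix(rnorm(dimx, sd = 1), nrow = dimx, ncol = dimy)
# 3. Gaussian noise on every element
mat <- mat + rnorm(dimx * dimy, sd = 0.1)
# 4. Constant bicluster value overrides everything applied so far
mat[rows, cols] <- 5
# 5. One additive shift for the whole bicluster
mat[rows, cols] <- mat[rows, cols] + rnorm(1, sd = 1)
# 6. and 7. Column-specific additive, then multiplicative, effects
mat[rows, cols] <- sweep(mat[rows, cols], 2, rnorm(length(cols)), FUN = "+")
mat[rows, cols] <- sweep(mat[rows, cols], 2, rnorm(length(cols)), FUN = "*")
# 8. and 9. Row-specific additive, then multiplicative, effects
mat[rows, cols] <- sweep(mat[rows, cols], 1, rnorm(length(rows)), FUN = "+")
mat[rows, cols] <- sweep(mat[rows, cols], 1, rnorm(length(rows)), FUN = "*")
# 10. Shuffle rows and columns
mat <- mat[sample(dimx), sample(dimy)]
```

Because steps 2 and 4 overwrite rather than add, any effect applied before them is lost in the affected elements; the remaining steps accumulate.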

Usage

genSimData(n = 1, clusterHeight = 20, clusterWidth = 20, dimx = 80,
  dimy = 80, overlapRows = 0, overlapCols = 0, bgConst = 0,
  bgNorm = 0, biclusterConstant = NULL, biclusterShift = 0,
  rowBase = 0, rowShift = 0, rowScale = NULL, colShift = 0,
  colScale = NULL, shuffle = TRUE, file = "")

Arguments

n

the number of biclusters

clusterHeight

the number of rows in each bicluster

clusterWidth

the number of columns in each bicluster

dimx

the number of rows in the dataset

dimy

the number of columns in the dataset

overlapRows

the number of rows that biclusters should share

overlapCols

the number of columns that biclusters should share

bgConst

initial value of the entire matrix

bgNorm

standard deviation of Gaussian noise

biclusterConstant

if not NULL, sets the value of every element in every bicluster, after creating the background

biclusterShift

standard deviation of bicluster-specific additive values

rowBase

standard deviation of base row values

rowShift

standard deviation of bicluster-and-row-specific shifts

rowScale

standard deviation of bicluster-and-row-specific scaling coefficients

colShift

standard deviation of bicluster-and-column-specific shifts

colScale

standard deviation of bicluster-and-column-specific scaling coefficients

shuffle

if FALSE, biclusters will be deterministically arranged along the diagonal of the matrix

file

an optional string giving a prefix for .png and .csv files where the generated matrix should be written
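To see how clusterHeight, clusterWidth, overlapRows and overlapCols could interact when shuffle = FALSE, the sketch below computes where each bicluster would start if consecutive biclusters are laid along the diagonal and share the requested number of rows or columns. The helper `blockStarts` is hypothetical and not part of mfBiclust; the exact layout used by genSimData is an assumption here.

```r
# Hypothetical helper, not part of mfBiclust: first index of each of n blocks
# placed along the diagonal when consecutive blocks share `overlap` indices.
blockStarts <- function(n, size, overlap = 0) {
  1 + (seq_len(n) - 1) * (size - overlap)
}

# Three 5x5 biclusters sharing 4 rows and 2 columns with their neighbour
rowStarts <- blockStarts(n = 3, size = 5, overlap = 4)  # 1 2 3
colStarts <- blockStarts(n = 3, size = 5, overlap = 2)  # 1 4 7

# Rows and columns shared by the first and second bicluster
sharedRows <- intersect(rowStarts[1] + 0:4, rowStarts[2] + 0:4)
sharedCols <- intersect(colStarts[1] + 0:4, colStarts[2] + 0:4)
length(sharedRows)  # 4
length(sharedCols)  # 2

# Sixteen non-overlapping blocks of size 5 end exactly at index 80
max(blockStarts(n = 16, size = 5)) + 5 - 1  # 80
```

Under this assumed layout, sixteen non-overlapping 5x5 biclusters exactly fill the diagonal of the default 80x80 matrix.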

Details

Except for the constant values (bgConst and biclusterConstant), every effect is sampled from a normal distribution with mean 0 and standard deviation given by the respective parameter.
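One consequence is that a standard-deviation parameter left at 0 contributes nothing. The sketch below shows this with base R's `rnorm()`, assuming the effects are drawn in a comparable way; the variable names are illustrative.

```r
# With sd = 0 the normal distribution collapses to a point mass at the mean,
# so every sampled shift is exactly 0.
noShift <- rnorm(5, mean = 0, sd = 0)
all(noShift == 0)  # TRUE

# With sd = 1 the shifts vary around 0.
set.seed(42)
someShift <- rnorm(5, mean = 0, sd = 1)
```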

Value

A numeric matrix containing n biclusters created by the given parameters.

Examples

# No background, sixteen non-overlapping 5x5 biclusters. Elements in
# biclusters have value 5; all other elements are 0.
mat <- genSimData(n = 16, biclusterConstant = 5,
 clusterHeight = 5, clusterWidth = 5, shuffle = FALSE)

# Random base row values and no noise; two 5x5 biclusters with row-shift effects
mat <- genSimData(n = 2, rowBase = 1, rowShift = 1,
 clusterHeight = 5, clusterWidth = 5, shuffle = FALSE)
matrixHeatmap(mat)

# Three 5x5 biclusters, where the first and second biclusters overlap and the
# second and third biclusters overlap. Both overlap regions are 4 rows by 2
# columns.
mat <- genSimData(n = 3, rowBase = 1, rowShift = 1, overlapRows = 4,
 overlapCols = 2, clusterHeight = 5, clusterWidth = 5, shuffle = FALSE)
matrixHeatmap(mat)

# Three 10x10 plaid biclusters
mat <- genSimData(n = 3, biclusterShift = 1, rowShift = 1, colShift = 1,
 clusterHeight = 10, clusterWidth = 10, shuffle = FALSE)
matrixHeatmap(mat)

jonalim/mfBiclust documentation built on May 4, 2019, 4:13 a.m.