generateCountData: Generate Count Data


Description

This function generates count data, e.g., RNA-Sequencing data, for both classification and clustering purposes.

Usage

generateCountData(n, p, K, param, sdsignal = 1, DE = 0.3,
  allZero.rm = TRUE, tag.samples = FALSE)

Arguments

n

number of samples.

p

number of variables/features.

K

number of classes.

param

overdispersion parameter. This parameter is passed as the size argument of the rnbinom function. Hence, the Negative Binomial distribution approaches the Poisson distribution as param increases.
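The effect of param can be illustrated with base R alone. The sketch below is not taken from the package; it simply draws from rnbinom with a fixed mean (mu = 10, an arbitrary choice for illustration) and increasing size, and compares the empirical variance with the theoretical value mu + mu^2 / size. As size grows, the variance shrinks toward the mean, which is the Poisson case.

```r
# Illustrative only: mu and the grid of sizes are arbitrary assumptions.
set.seed(1)
mu <- 10
sizes <- c(1, 10, 1000)

# Empirical variance of Negative Binomial draws for each size
emp_var <- sapply(sizes, function(s) var(rnbinom(1e5, size = s, mu = mu)))

# Theoretical variance under the (size, mu) parameterization
theo_var <- mu + mu^2 / sizes

print(data.frame(size = sizes,
                 empirical_var = round(emp_var, 2),
                 theoretical_var = theo_var))
```

Small values of param therefore yield strongly overdispersed counts, while large values yield counts that behave almost like Poisson draws.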

sdsignal

a nonzero numeric value. As sdsignal increases, the observed counts differ more strongly among the K classes.

DE

a numeric value within the interval [0, 1]. This is the proportion of variables that differ significantly among the K classes. The remaining variables are assumed to make no contribution to the discrimination function.

allZero.rm

a logical. If TRUE, columns in which all cells are zero are dropped.

tag.samples

a logical. If TRUE, row names are generated automatically, giving each sample a tag such as "S1", "S2", etc.

Value

x, xte

count data matrices for the training and test sets.

y, yte

class labels for the training and test sets.

truesf, truesfte

true size factors for the training and test sets. See Witten (2011) for more information on estimating size factors.
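As a rough illustration of what a size factor is, the sketch below computes total-count size factors, one of the estimators discussed by Witten (2011): each sample's total count divided by the grand total. The toy matrix is made up for this example (rows are samples, columns are features), and this is not necessarily how truesf and truesfte are generated inside the function.

```r
# Hypothetical toy count matrix: 3 samples (rows) by 3 features (columns)
x <- matrix(c(10, 0,  5,
              20, 0, 10,
              30, 3, 12), nrow = 3, byrow = TRUE)

# Total-count size factor: sample total divided by the grand total
sf <- rowSums(x) / sum(x)
print(sf)
```

Samples with a larger sequencing depth receive a larger size factor, and the factors sum to one under this estimator.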

Author(s)

Dincer Goksuluk

Examples

set.seed(2128)
counts <- generateCountData(n = 20, p = 10, K = 2, param = 1, sdsignal = 0.5, DE = 0.8,
                            allZero.rm = FALSE, tag.samples = TRUE)
head(counts$x)

NBLDA documentation built on May 2, 2019, 12:21 p.m.