balance: Balances the number of samples
In Pigengene: Infers biological signatures from gene expression data


Oversamples data by repeating rows such that each condition has roughly the same number of samples.

balance(Data, Labels, amplification = 5, verbose = 0, naTolerance = 0.05)

`Data`	A matrix or data frame containing the expression data, with rows corresponding to samples and columns corresponding to genes. Rows and columns must be named.
`Labels`	A (preferably named) vector containing the labels (condition types) for the samples in `Data`. Names must agree with the row names of `Data`.
`amplification`	An integer that controls the number of repeats for each condition. After oversampling, the total number of samples is roughly the original number multiplied by this factor.
`verbose`	The integer level of verbosity. 0 means silent and higher values produce more details of the computation.
`naTolerance`	Upper threshold on the fraction of entries per gene that can be missing. Genes with a larger fraction of missing entries are ignored. For genes with a smaller fraction of NA entries, the missing values are imputed from their average expression in the other samples. See `check.pigengene.input`.
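
The `naTolerance` rule can be illustrated with a minimal base-R sketch. The helper `impute_sketch` and the toy matrix are hypothetical and written only for illustration; this is not the code used by `balance` or `check.pigengene.input`, and it assumes that "average expression in the other samples" means the per-gene mean over the non-missing samples.

```r
# Hypothetical helper, NOT part of Pigengene: drop genes with too many NAs
# and impute the remaining NAs with the per-gene mean (an assumed reading
# of the naTolerance description).
impute_sketch <- function(Data, naTolerance = 0.05) {
  Data <- as.matrix(Data)
  # Fraction of missing entries in each gene (column).
  naFraction <- colMeans(is.na(Data))
  # Ignore genes whose missing fraction exceeds the threshold.
  kept <- Data[, naFraction <= naTolerance, drop = FALSE]
  # Replace each remaining NA with the mean of that gene's observed values.
  for (j in seq_len(ncol(kept))) {
    missing <- is.na(kept[, j])
    kept[missing, j] <- mean(kept[, j], na.rm = TRUE)
  }
  kept
}

# Toy data: 20 samples, 2 genes.
toy <- matrix(1:40 + 0, nrow = 20,
              dimnames = list(paste0("s", 1:20), c("g1", "g2")))
toy[1, "g1"] <- NA      # 5% missing: kept and imputed
toy[1:5, "g2"] <- NA    # 25% missing: dropped
cleaned <- impute_sketch(toy, naTolerance = 0.1)
colnames(cleaned)
```

With a tolerance of 0.1, only `g1` survives in this toy example, and its single missing entry is filled with the mean of the other 19 samples.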

A list with the following components:

`balanced`	The matrix of oversampled data.
`Reptimes`	A vector of integers, named by condition, reporting the number of repeats for each condition.
`origSampleInds`	The indices of rows in `balanced` that correspond to the original samples before oversampling.
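
How the three components relate to each other can be shown with a rough base-R sketch of oversampling by row repetition. The function `oversample_sketch` is hypothetical, and the rule it uses for the repeat counts (an equal per-condition target chosen so that the total is about `amplification` times the original sample count) is an assumption consistent with the description above, not necessarily the formula that `balance` uses.

```r
# Hypothetical helper, NOT the Pigengene implementation.
oversample_sketch <- function(Data, Labels, amplification = 5) {
  stopifnot(nrow(Data) == length(Labels))
  counts <- table(Labels)
  # Assumed rule: every condition aims for the same target size, chosen so
  # the total is roughly amplification * nrow(Data).
  target <- amplification * nrow(Data) / length(counts)
  reptimes <- pmax(1L, as.integer(round(target / counts)))
  names(reptimes) <- names(counts)
  # Repeat each row index as many times as its condition requires.
  inds <- rep(seq_len(nrow(Data)), times = reptimes[as.character(Labels)])
  list(balanced = Data[inds, , drop = FALSE],
       Reptimes = reptimes,
       # First occurrence of each original row in the oversampled matrix.
       origSampleInds = which(!duplicated(inds)))
}

# Toy data: 9 samples of condition "A" and 3 of condition "B".
toy <- matrix(rnorm(12 * 3), nrow = 12,
              dimnames = list(paste0("s", 1:12), paste0("g", 1:3)))
toyLabels <- c(rep("A", 9), rep("B", 3))
res <- oversample_sketch(toy, toyLabels, amplification = 5)
res$Reptimes
nrow(res$balanced)
```

In this toy case each "A" sample is repeated 3 times and each "B" sample 10 times, so the two conditions end up with 27 and 30 rows, and subsetting `res$balanced` by `res$origSampleInds` recovers the original 12 rows.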

Habil Zare

Pigengene-package, one.step.pigengene, wgcna.one.step, compute.pigengene

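     library(Pigengene)
     ## Load the example expression data shipped with the package
     data(aml)
     data(mds)
     ## Stack the two conditions; rows are samples
     d1 <- rbind(aml,mds)
     Labels <- c(rep("AML",nrow(aml)),rep("MDS",nrow(mds)))
     names(Labels) <- rownames(d1)
     ## Oversample so that both conditions have roughly the same size
     b1 <- balance(Data=d1, Labels=Labels)
     d2 <- b1$balanced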