balance: Balances the number of samples


View source: R/balance.R

Description

Oversamples data by repeating rows such that each condition has roughly the same number of samples.
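The idea of oversampling by repetition can be illustrated with a minimal base R sketch. This is not the Pigengene implementation: the helper name oversample_by_repeat and the rule used to pick the number of repeats (aim for the same target size in every condition, chosen so the total grows by roughly the amplification factor) are assumptions made for illustration only.

```r
# Hypothetical helper, NOT part of Pigengene: oversample by repeating rows.
oversample_by_repeat <- function(Data, Labels, amplification = 5) {
  counts <- table(Labels)
  # Assumed rule: every condition aims for the same target size, chosen so
  # that the total number of rows grows by roughly `amplification`.
  target <- amplification * length(Labels) / length(counts)
  # Integer number of repeats per condition, at least 1.
  reps <- pmax(1L, as.integer(round(target / counts)))
  names(reps) <- names(counts)
  # Repeat each row index according to the repeat count of its condition.
  idx <- rep(seq_len(nrow(Data)), times = reps[as.character(Labels)])
  list(balanced = Data[idx, , drop = FALSE], Reptimes = reps)
}

# Toy data: 2 samples of condition "A" and 6 of condition "B".
toyData <- matrix(rnorm(16), nrow = 8,
                  dimnames = list(paste0("s", 1:8), c("g1", "g2")))
toyLabels <- c(rep("A", 2), rep("B", 6))
res <- oversample_by_repeat(toyData, toyLabels)
res$Reptimes
nrow(res$balanced)
```

Under this assumed rule, each "A" row is repeated 10 times and each "B" row 3 times, so the two conditions end up with 20 and 18 rows respectively.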

Usage

     balance(Data, Labels, amplification = 5, verbose = 0, naTolerance = 0.05)

Arguments

Data

A matrix or data frame containing the expression data, with genes corresponding to columns and samples corresponding to rows. Rows and columns must be named.

Labels

A (preferably named) vector containing the labels (condition types) of the samples in Data. Names must agree with the row names of Data.

amplification

An integer that controls the number of repeats for each condition. After oversampling, the total number of samples will be multiplied by roughly this factor.

verbose

The integer level of verbosity. 0 means silent; higher values print more details of the computation.

naTolerance

Upper threshold on the fraction of entries per gene that can be missing. Genes with a larger fraction of missing entries are ignored. For genes with a smaller fraction of NA entries, the missing values are imputed from the gene's average expression in the other samples. See check.pigengene.input.
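The filtering and imputation rule described for naTolerance can be sketched in base R as follows. This is an illustration of the rule as stated here, not the code of check.pigengene.input; the helper name filter_and_impute is hypothetical, and treating a gene whose missing fraction exactly equals the threshold as kept is an assumption.

```r
# Hypothetical helper, NOT part of Pigengene: drop genes with too many NAs
# and impute the remaining NAs with the per-gene mean.
filter_and_impute <- function(Data, naTolerance = 0.05) {
  Data <- as.matrix(Data)
  # Fraction of missing entries in each gene (column).
  naFrac <- colMeans(is.na(Data))
  # Keep genes whose missing fraction does not exceed the threshold
  # (assumption: a fraction equal to the threshold is still kept).
  kept <- Data[, naFrac <= naTolerance, drop = FALSE]
  # Replace each remaining NA with the gene's mean over the observed samples.
  for (j in seq_len(ncol(kept))) {
    missing <- is.na(kept[, j])
    if (any(missing)) kept[missing, j] <- mean(kept[!missing, j])
  }
  kept
}

# Toy data: g2 has 1 of 4 entries missing, g3 has 3 of 4 missing.
toy <- matrix(c(1, 2, 3, 4,
                2, NA, 4, 6,
                NA, NA, NA, 1),
              nrow = 4,
              dimnames = list(paste0("s", 1:4), c("g1", "g2", "g3")))
filter_and_impute(toy, naTolerance = 0.3)
```

With a tolerance of 0.3 in this toy example, g3 is dropped and the missing entry of g2 is filled with the mean of its three observed values.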

Value

A list of:

balanced

The matrix of oversampled data.

Reptimes

A vector of integers named by conditions reporting the number of repeats for each condition.

origSampleInds

The indices of rows in balanced that correspond to the original samples before oversampling.

Author(s)

Habil Zare

See Also

Pigengene-package, one.step.pigengene, wgcna.one.step, compute.pigengene

Examples

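     library(Pigengene)
     ## Load the example AML and MDS expression data.
     data(aml)
     data(mds)
     ## Stack the two data sets and label each sample by its condition.
     d1 <- rbind(aml,mds)
     Labels <- c(rep("AML",nrow(aml)),rep("MDS",nrow(mds)))
     names(Labels) <- rownames(d1)
     ## Oversample so that both conditions have roughly the same number of samples.
     b1 <- balance(Data=d1, Labels=Labels)
     d2 <- b1$balanced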

Pigengene documentation built on Nov. 8, 2020, 6:47 p.m.