data.discretize: Function to discretize data based on user-specified cutoffs

Description

This function enables discretization of data based on user-specified cutoffs.

Usage

data.discretize(data, cuts)

Arguments

data

matrix of continuous or categorical values (gene expressions for example); observations in rows, features in columns.

cuts

list of cutoffs, with one element (a vector of cutoff values) for each variable.

Details

This function discretizes the continuous values in data using the cutoffs specified in cuts, creating categories represented by increasing integers 1, 2, ..., n, where n is the maximum number of categories in the dataset.
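
The mapping from cutoffs to integer categories can be illustrated with a minimal base R sketch. This is not the predictionet implementation: discretize_sketch is a hypothetical helper, the toy matrix and cutoffs are made up, and the boundary rule (a value equal to a cutoff falls into the upper category) is an assumption that may differ from data.discretize.

```r
## Hypothetical helper for illustration only; not part of predictionet.
discretize_sketch <- function(data, cuts) {
  stopifnot(is.matrix(data), length(cuts) == ncol(data))
  res <- matrix(NA_integer_, nrow = nrow(data), ncol = ncol(data),
                dimnames = dimnames(data))
  for (j in seq_len(ncol(data))) {
    ## findInterval() returns 0 for values below the first cutoff,
    ## so adding 1 yields categories 1, 2, ..., length(cuts[[j]]) + 1.
    ## Assumption: values equal to a cutoff go to the upper category.
    res[, j] <- findInterval(data[, j], sort(cuts[[j]])) + 1L
  }
  res
}

## Toy data: 4 observations (rows), 2 features (columns)
toy <- matrix(c(0.1, 0.5, 0.9, 1.5,
                10, 20, 30, 40),
              ncol = 2, dimnames = list(NULL, c("g1", "g2")))
## Two cutoffs for g1 (3 categories), one cutoff for g2 (2 categories)
toy.cuts <- list(g1 = c(0.4, 1.0), g2 = 25)
toy.bin <- discretize_sketch(toy, toy.cuts)
```

In this sketch a feature with k cutoffs yields at most k + 1 categories, so features with different numbers of cutoffs can have different numbers of categories.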

Value

A matrix of categorical values with categories in {1, 2, ..., n}, determined by the list of cutoffs specified in cuts; observations in rows, features in columns.

Author(s)

Benjamin Haibe-Kains

See Also

discretize

Examples

library(predictionet)
## load colon cancer gene expression data, the list of genes related to the RAS signaling pathway, and the corresponding priors
data(expO.colon.ras)
## discretize the data into 3 categories
categories <- rep(3, ncol(data.ras))
## estimate the cutoffs (tertiles) for each gene
cuts.discr <- lapply(
  apply(rbind("nbcat"=categories, data.ras), 2, function(x) {
    y <- x[1]
    x <- x[-1]
    return(list(quantile(x=x, probs=seq(0, 1, length.out=y+1), na.rm=TRUE)[-c(1, y+1)]))
  }),
  function(x) { return(x[[1]]) })
data.ras.bin <- data.discretize(data=data.ras, cuts=cuts.discr)
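
The cutoff estimation step can also be tried without the package data. The following sketch uses a synthetic matrix (sim.data, generated with rnorm, is an assumption standing in for data.ras) and computes the interior quantiles of each column, dropping the 0% and 100% quantiles so that nbcat categories need nbcat - 1 cutoffs.

```r
## Synthetic stand-in for an expression matrix: 20 observations, 3 features
set.seed(1)
sim.data <- matrix(rnorm(60), ncol = 3,
                   dimnames = list(NULL, c("geneA", "geneB", "geneC")))
## Number of categories per feature (tertiles)
nbcat <- 3
## Interior quantiles only: drop the minimum (0%) and maximum (100%)
sim.cuts <- lapply(seq_len(ncol(sim.data)), function(j) {
  q <- quantile(sim.data[, j], probs = seq(0, 1, length.out = nbcat + 1),
                na.rm = TRUE)
  q[-c(1, nbcat + 1)]
})
names(sim.cuts) <- colnames(sim.data)
```

The resulting sim.cuts is a named list with one vector of two cutoffs per feature, which matches the shape of the cuts argument described above.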

predictionet documentation built on Nov. 8, 2020, 7:48 p.m.