PreparingTheIndexes: Define an expression index for each gene
In michmich76/ctsGE: Clustering of Time Series Gene Expression data


Reads the gene expression table and returns an expression index for each gene.

PreparingTheIndexes(x, min_cutoff = 0.5, max_cutoff = 0.7, mad.scale = TRUE)

`x`	A list of expression data created by readTSGE.
`min_cutoff`	A numeric value: the lower limit of the range searched for the optimal cutoff. Defaults to 0.5. See Details.
`max_cutoff`	A numeric value: the upper limit of the range searched for the optimal cutoff. Defaults to 0.7. See Details.
`mad.scale`	A logical value selecting the scaling method. TRUE (the default) uses median-based scaling; FALSE uses mean-based scaling.

1. First, the expression matrix is standardized. The function's default standardization method is median-based scaling; alternatively, mean-based scaling can be used. The scaled values represent the distance of each gene at a given time point from its center (median or mean), in median absolute deviation (MAD) units or standard deviation (SD) units, respectively.

2. The function then computes the cutoff value. Because clustering will be performed on small gene groups, the optimal cutoff is the one that minimizes the number of genes in each group, i.e., generates index groups of equal size. A chi-squared value is calculated for each candidate cutoff (from min_cutoff to max_cutoff in increments of 0.05), and the cutoff that yields the lowest chi-squared value is chosen.

3. Next, the standardized values are converted to index values that indicate whether gene expression is above, below or within the limits around the center of the time series, i.e., **1 / -1 / 0**, respectively. The cutoff parameter determines the limits around the gene-expression center. Then the function calculates the index value at each time point according to:

0: standardized value is within the limits (+/- cutoff)
1: standardized value exceeds the upper limit (+ cutoff)
-1: standardized value falls below the lower limit (- cutoff)
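
The three steps above can be sketched in base R on a toy matrix. This is only an illustration under stated assumptions, not the ctsGE implementation: the helper names (standardize, to_index, group_chisq, pick_cutoff) are hypothetical, the MAD is taken without a consistency constant, and the chi-squared statistic is assumed to compare the sizes of the observed index groups against equal sizes.

```r
# Toy matrix: rows are genes, columns are time points (made-up values).
expr <- matrix(c(1, 2, 3, 4, 10,
                 5, 3, 6, 7, 4,
                 9, 8, 2, 1, 5,
                 2, 6, 4, 3, 9),
               nrow = 4, byrow = TRUE,
               dimnames = list(paste0("gene", 1:4), paste0("t", 1:5)))

# Step 1: scale each gene around its center.
# Assumption: raw MAD (constant = 1); the package may use another constant.
standardize <- function(m, mad.scale = TRUE) {
  t(apply(m, 1, function(g) {
    if (mad.scale) (g - median(g)) / mad(g, constant = 1)
    else (g - mean(g)) / sd(g)
  }))
}

# Step 3: map scaled values to 1 / -1 / 0 around +/- cutoff.
to_index <- function(scaled, cutoff) {
  idx <- matrix(0L, nrow(scaled), ncol(scaled), dimnames = dimnames(scaled))
  idx[scaled > cutoff] <- 1L
  idx[scaled < -cutoff] <- -1L
  idx
}

# Step 2: chi-squared statistic of the index-group sizes against equal sizes.
# Assumption: a group is the set of genes sharing the same index profile.
group_chisq <- function(scaled, cutoff) {
  profiles <- apply(to_index(scaled, cutoff), 1, paste, collapse = ",")
  observed <- as.vector(table(profiles))
  expected <- mean(observed)
  sum((observed - expected)^2 / expected)
}

# Try each cutoff in steps of 0.05 and keep the one with the lowest statistic.
pick_cutoff <- function(scaled, min_cutoff = 0.5, max_cutoff = 0.7) {
  cutoffs <- seq(min_cutoff, max_cutoff, by = 0.05)
  stats <- vapply(cutoffs, function(k) group_chisq(scaled, k), numeric(1))
  cutoffs[which.min(stats)]
}

scaled <- standardize(expr)
cutoff <- pick_cutoff(scaled)
index  <- to_index(scaled, cutoff)
index["gene1", ]
```

For gene1 the median is 3 and the raw MAD is 1, so the scaled values are -2, -1, 0, 1, 7 and the index is -1, -1, 0, 1, 1 for any cutoff in the default range.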

A list object is returned with the standardized (scaled) table in object$scaled and the index table in object$index. The example below also reads the selected cutoff from object$cutoff.

scale index

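library(ctsGE)
# Locate the example expression files shipped with the package
data_dir <- system.file("extdata", package = "ctsGE")
files <- dir(path = data_dir, pattern = "\\.xls$")
# Read the time-series tables, labelling the six time points
rts <- readTSGE(files, path = data_dir,
                labels = c("0h", "6h", "12h", "24h", "48h", "72h"), skip = 10625)
# Standardize the data, choose the cutoff and build the indexes
prts <- PreparingTheIndexes(rts)
prts$cutoff # the optimal cutoff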