ROKU: detect tissue-specific (or tissue-selective) patterns from...
In TCC: TCC: Differential expression analysis for tag count data with robust normalization strategies

Description Usage Arguments Details Value References Examples

ROKU is a method for detecting tissue-specific (or tissue-selective) patterns from gene expression data for many tissues (or samples). ROKU (i) ranks genes according to their overall tissue-specificity using Shannon entropy after data processing and (ii) detects tissues specific to each gene if any exist using an Akaike's information criterion (AIC) procedure.

ROKU(data, upper.limit = 0.25, sort = FALSE)

`data`	numeric matrix or data frame containing microarray data (on log2 scale), where each row corresponds to a gene or probeset ID, each column to a tissue, and each cell holds the (log2-transformed) expression value of that gene in that tissue. A numeric vector is also accepted for a single gene expression vector.
`upper.limit`	numeric value (between 0 and 1) specifying the maximum proportion of tissues (or samples) that can be treated as outliers for each gene.
`sort`	logical. If `TRUE`, results are sorted in descending order of the entropy scores.

As shown in Figure 1 of the original ROKU study (Kadota et al., 2006), the Shannon entropy H of a gene expression vector (x_{1}, x_{2}, ..., x_{N}) for N tissues ranges from zero to log_{2}N, taking the value 0 for genes expressed in a single tissue and log_{2}N for genes expressed uniformly in all tissues. Researchers therefore rely on low entropy scores to identify tissue-specific patterns. However, entropy calculated directly from the raw gene expression vector works well only for patterns that are over-expressed in a small number of tissues and unexpressed or only slightly expressed in the others. The H scores of tissue-specific patterns such as (8,8,2,8,8,8,8,8,8,8), which is down-regulated specifically in the 3rd tissue (see Figure 1e), are close to the maximum value (log_{2}N = 3.32 when N = 10), so such patterns cannot be identified as tissue-specific. To detect various kinds of tissue-specific patterns by a low entropy score, ROKU processes the original gene expression vector into a new vector (x_{1'}, x_{2'}, ..., x_{N'}) by subtracting the one-step Tukey biweight and taking the absolute value. For the above example, ROKU calculates the H score from the processed vector (0,0,6,0,0,0,0,0,0,0), which gives a very low score (from H = 3.26 before processing to H' = 0 after processing). A major characteristic of ROKU is therefore its ability to rank various tissue-specific patterns by the modified entropy score.
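The entropy values quoted above can be checked with a few lines of base R. This is a minimal sketch rather than the code used inside `ROKU`: the helper name `shannon_entropy` is made up for illustration, and it assumes the expression vector is first scaled to relative values p_i = x_i / sum(x), with 0 * log2(0) treated as 0.

```r
# Hypothetical helper (not part of TCC): Shannon entropy in bits.
# Assumes x is non-negative and is scaled to relative values first.
shannon_entropy <- function(x) {
  p <- x / sum(x)      # relative expression per tissue
  p <- p[p > 0]        # drop zeros, i.e. treat 0 * log2(0) as 0
  -sum(p * log2(p))
}

raw       <- c(8, 8, 2, 8, 8, 8, 8, 8, 8, 8)   # down-regulated in tissue 3
processed <- c(0, 0, 6, 0, 0, 0, 0, 0, 0, 0)   # processed vector quoted in the text

H_raw <- shannon_entropy(raw)          # raw pattern, close to the maximum
H_mod <- shannon_entropy(processed)    # only one non-zero entry remains
H_max <- shannon_entropy(rep(1, 10))   # uniform expression in all 10 tissues

print(round(c(H_raw = H_raw, H_mod = H_mod, H_max = H_max), 2))
```

Comparing `H_raw` with `H_max` shows why the raw score cannot flag the down-regulated tissue, while `H_mod` drops to the minimum once the vector has been processed.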

Note that the modified entropy does not indicate which tissue a gene is specific to; it only measures the degree of overall tissue specificity of the gene. To identify the specific tissues, ROKU employs an AIC-based outlier detection method (Ueda, 1996). Consider, for example, a hypothetical mixed-type tissue-selective expression pattern (1.2, 5.1, 5.2, 5.4, 5.7, 5.9, 6.0, 6.3, 8.5, 8.8) in which three tissues are assumed to be specific (down-regulated in tissue 1; up-regulated in tissues 9 and 10). The method first normalizes the expression values by subtracting the mean and dividing by the standard deviation (i.e., z-score transformation) and then sorts them in increasing order, giving (-2.221, -0.342, -0.294, -0.198, -0.053, 0.043, 0.092, 0.236, 1.296, 1.441). It then evaluates various combinations of outlier candidates starting from both ends of the sorted values: model1 with no outliers, model2 with one outlier on the high side, model3 with two outliers on the high side, ..., modelx with one outlier on the low side, ..., modely with two outliers on both the high and low sides, and so on. For each model it calculates an AIC-like statistic (called U) and searches for the combination that achieves the lowest U value, which is termed the minimum AIC estimate (MAICE). Because the upper.limit value corresponds to the maximum number of outlier candidates, it determines the number of combinations evaluated. The AIC-based method outputs a vector (1 for up-regulated outliers, -1 for down-regulated outliers, and 0 for non-outliers) that corresponds to the input vector. For example, the method outputs (-1, 0, 0, 0, 0, 0, 0, 0, 1, 1) when using upper.limit = 0.5 and (-1, 0, 0, 0, 0, 0, 0, 0, 0, 0) when using upper.limit = 0.25 (the default). See Kadota et al. (2007) for a detailed discussion of the effect of different parameter settings.
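The standardization step in this example can be reproduced in base R. The sketch below covers only the z-score transformation and sorting; it does not implement the U statistic or the model search, and the variable names are illustrative.

```r
# Mixed-type expression pattern from the example above
x <- c(1.2, 5.1, 5.2, 5.4, 5.7, 5.9, 6.0, 6.3, 8.5, 8.8)

# z-score transformation: subtract the mean and divide by the sample
# standard deviation (sd() uses the n - 1 denominator)
z <- (x - mean(x)) / sd(x)

# Sort in increasing order; the outlier candidates are taken from both ends
z_sorted <- sort(z)
print(round(z_sorted, 3))
```

With the sample standard deviation, the rounded values should agree with the sorted vector quoted in the text.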

A list containing the following fields:

`outlier`	A numeric matrix when the input `data` is a data frame or matrix; a numeric vector when the input `data` is a numeric vector. In either case the values are 1, -1, or 0: 1 for over-expressed outliers, -1 for under-expressed outliers, and 0 for non-outliers.
`H`	A numeric vector when the input `data` is a data frame or matrix; a numeric scalar when the input `data` is a numeric vector. In either case it holds the original entropy (H) score(s) calculated from the original gene expression vector(s).
`modH`	A numeric vector when the input `data` is a data frame or matrix; a numeric scalar when the input `data` is a numeric vector. In either case it holds the modified entropy (H') score(s) calculated from the processed gene expression vector(s).
`rank`	A numeric vector or scalar consisting of the rank(s) of `modH`.
`Tbw`	A numeric vector or scalar consisting of the one-step Tukey's biweight, an iteratively reweighted measure of central tendency. This value is generally similar to the median and is the same as the output of `tukey.biweight` with default parameter settings in the `affy` package. The data processing is done by subtracting this value from each gene expression vector and taking the absolute value.
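To make the `Tbw`-based processing concrete, here is a rough base-R sketch of a one-step Tukey biweight followed by the subtract-and-take-absolute-value step. It is an illustration, not the code used by `ROKU` or `affy`: the function name `one_step_tbw` is hypothetical, and the tuning constants `k = 5` and `epsilon = 1e-4` are assumed values.

```r
# Hypothetical helper: one-step Tukey biweight.
# The constants k and epsilon are assumptions made for this sketch.
one_step_tbw <- function(x, k = 5, epsilon = 1e-4) {
  m <- median(x)                            # starting estimate of the center
  s <- median(abs(x - m))                   # unscaled median absolute deviation
  u <- (x - m) / (k * s + epsilon)          # scaled distance from the center
  w <- ifelse(abs(u) <= 1, (1 - u^2)^2, 0)  # biweight; distant points get weight 0
  sum(w * x) / sum(w)                       # weighted mean of the retained points
}

x <- c(8, 8, 2, 8, 8, 8, 8, 8, 8, 8)        # down-regulated in tissue 3
tbw <- one_step_tbw(x)
processed <- abs(x - tbw)                   # subtract Tbw, take the absolute value

print(tbw)
print(processed)
```

For this vector the tissue at 2 receives zero weight, so the weighted mean is taken over the nine tissues at 8 and the processed vector matches the (0,0,6,0,0,0,0,0,0,0) example in the Details.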

Kadota K, Konishi T, Shimizu K: Evaluation of two outlier-detection-based methods for detecting tissue-selective genes from microarray data. Gene Regulation and Systems Biology 2007, 1: 9-15.

Kadota K, Ye J, Nakai Y, Terada T, Shimizu K: ROKU: a novel method for identification of tissue-specific genes. BMC Bioinformatics 2006, 7: 294.

Kadota K, Nishimura SI, Bono H, Nakamura S, Hayashizaki Y, Okazaki Y, Takahashi K: Detection of genes with tissue-specific expression patterns using Akaike's Information Criterion (AIC) procedure. Physiol Genomics 2003, 12: 251-259.

Ueda T: Simple method for the detection of outliers. Japanese J Appl Stat 1996, 25: 17-26.

library(TCC)
data(hypoData_ts)

result <- ROKU(hypoData_ts)

TCC documentation built on Nov. 8, 2020, 8:20 p.m.
