scoring: Compute (regularized) t-scores for gene expression data

View source: R/scoring.R

Description

This function computes (regularized) t-scores (statistics) for all genes in an expression matrix, using the given class labels and, optionally, a number of permutations of these labels. Each gene is also assigned a p-value, derived either empirically from the permutation scores or from a t-distribution.

Usage

scoring(data, labels, method = "SAM", pcompute = "tdist", 
        nperms = 1000, memory.limit = TRUE, verbose = TRUE)

Arguments

data

Expression matrix with rows = genes and columns = samples

labels

Vector or factor of class labels; Scoring works only with two classes!

method

Either "SAM" to compute regularized t-scores, or "t.test" to compute Student's t-statistic

pcompute

Method used to compute a p-value for each gene: either "empirical" to permute the labels and derive p-values from the permutation scores, or "tdist" to compute p-values from the corresponding t-distribution

nperms

Number of label permutations to evaluate; used only if 'pcompute="empirical"'

memory.limit

Logical; if your machine has enough memory (>2GB RAM), setting this to FALSE will speed up the computations

verbose

Logical; whether progress should be reported to STDOUT
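
To illustrate the difference between the two 'method' settings, the following is a minimal sketch and not macat's internal code. It assumes the general SAM form, in which a positive constant s0 is added to the standard error in the denominator of the t-statistic. The choice of s0 used here (the median of the gene-wise standard errors) and all variable names are assumptions made for illustration only; the macat vignette describes the actual definition.

```r
# Sketch: Student's t versus a SAM-style regularized t-score on toy data.
set.seed(3)
expr   <- matrix(rnorm(60), nrow = 6)      # 6 genes (rows) x 10 samples (columns)
labels <- rep(c(0, 1), each = 5)           # two classes

x1 <- expr[, labels == 1, drop = FALSE]
x0 <- expr[, labels == 0, drop = FALSE]
n1 <- ncol(x1)
n0 <- ncol(x0)

diff_means <- rowMeans(x1) - rowMeans(x0)

# Pooled standard error per gene (denominator of Student's t-statistic)
ss <- rowSums((x1 - rowMeans(x1))^2) + rowSums((x0 - rowMeans(x0))^2)
se <- sqrt(ss / (n1 + n0 - 2) * (1 / n1 + 1 / n0))

t_student <- diff_means / se

# ASSUMPTION: s0 chosen as the median standard error, for illustration only
s0 <- median(se)
t_regularized <- diff_means / (se + s0)

cbind(t_student, t_regularized)
```

Because s0 is positive, each regularized score is smaller in absolute value than the corresponding Student's t, and the shrinkage is strongest for genes with a small standard error.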

Details

If 'pcompute="empirical"', the statistic is first computed for the given class labels and then for 'nperms' permutations of the labels. The p-value for each gene is the proportion of permutation statistics that are greater than or equal to the statistic obtained from the real labels. For each gene, the 2.5% and 97.5% quantiles of the permutation statistics are also returned as lower and upper 'significance thresholds'.
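
The permutation procedure described above can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: 'gene_t' is a hypothetical helper, the toy data are simulated, and the p-value follows the one-sided definition given in the text literally.

```r
# Sketch: empirical p-values and quantile thresholds from label permutations.
set.seed(1)
expr   <- matrix(rnorm(50), nrow = 5)      # 5 genes (rows) x 10 samples (columns)
labels <- rep(c(0, 1), each = 5)           # two classes

# Hypothetical helper: Student's t-statistic per gene (class 1 minus class 0)
gene_t <- function(x, lab) {
  apply(x, 1, function(g) {
    t.test(g[lab == 1], g[lab == 0], var.equal = TRUE)$statistic
  })
}

observed <- gene_t(expr, labels)

# One column of gene-wise statistics per permutation of the labels
nperms <- 200
perm_stats <- replicate(nperms, gene_t(expr, sample(labels)))

# Proportion of permutation statistics >= the observed statistic, per gene
pvalues <- rowMeans(perm_stats >= observed)

# Gene-wise 2.5% and 97.5% quantiles as lower and upper thresholds
expected.lower <- apply(perm_stats, 1, quantile, probs = 0.025)
expected.upper <- apply(perm_stats, 1, quantile, probs = 0.975)
```

Storing all permutation statistics in one genes-by-permutations matrix is convenient but memory-hungry for large data sets, which is presumably the trade-off that 'memory.limit' controls.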

If 'pcompute="tdist"', the statistic is computed only for the given class labels, and the p-value is computed from the t-distribution with (number of samples - 2) degrees of freedom.
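
As a rough illustration of the t-distribution route, the sketch below computes p-values from Student's t-statistics with (number of samples - 2) degrees of freedom using base R's 'pt'. Whether macat reports one-sided or two-sided p-values is not stated here, so the two-sided form is an assumption, as are the simulated data and variable names.

```r
# Sketch: p-values from the t-distribution with (number of samples - 2) df.
set.seed(2)
expr   <- matrix(rnorm(40), nrow = 4)      # 4 genes (rows) x 10 samples (columns)
labels <- rep(c(0, 1), each = 5)

# Student's t-statistic per gene (equal variances assumed)
tstat <- apply(expr, 1, function(g) {
  t.test(g[labels == 1], g[labels == 0], var.equal = TRUE)$statistic
})

df <- ncol(expr) - 2                       # number of samples - 2

# ASSUMPTION: two-sided p-value
p_two_sided <- 2 * pt(-abs(tstat), df = df)

# Cross-check against the p-value reported by t.test itself
p_ref <- apply(expr, 1, function(g) {
  t.test(g[labels == 1], g[labels == 0], var.equal = TRUE)$p.value
})
all.equal(unname(p_two_sided), unname(p_ref))
```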

Value

A list, with four components:

observed

(Regularized) t-scores for all genes based on the given labels

pvalues

P-values for all genes, either from permutations or t-distribution

expected.lower

2.5% quantile of the permutation test statistics, intended as a lower 'significance border' for the gene; NULL if p-values were computed from the t-distribution

expected.upper

97.5% quantile of the permutation test statistics, intended as an upper 'significance border' for the gene; NULL if p-values were computed from the t-distribution
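
One possible way to work with the returned list is sketched below. The object 'res' is a hand-made mock with the four documented component names; its values and gene names are invented purely for illustration and are not output of scoring.

```r
# Sketch: flag genes whose observed score lies outside the permutation borders.
# 'res' is a MOCK result; all values are made up for illustration.
res <- list(
  observed       = c(g1 = 4.2,   g2 = -0.3, g3 = -3.9),
  pvalues        = c(g1 = 0.001, g2 = 0.62, g3 = 0.998),
  expected.lower = c(g1 = -2.1,  g2 = -2.0, g3 = -2.2),
  expected.upper = c(g1 = 2.0,   g2 = 2.1,  g3 = 2.2)
)

# The borders are NULL when p-values come from the t-distribution
if (!is.null(res$expected.lower) && !is.null(res$expected.upper)) {
  outside <- res$observed < res$expected.lower |
             res$observed > res$expected.upper
  print(names(which(outside)))             # prints "g1" "g3"
}
```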

Note

In package macat, this function is only called internally by the function evalScoring

Author(s)

MACAT development team

References

Regarding the regularized t-score please see the macat vignette.

See Also

evalScoring

Examples

  data(stjd)
  # compute gene-wise regularized t-statistics for
  #  T- vs. B-lymphocyte ALL:
  isT <- as.numeric(stjd$labels=="T")
  TvsB <- scoring(stjd$expr,isT,method="SAM",pcompute="none")
  summary(TvsB$observed)

Example output

Compute observed test statistics...
Compute quantiles of empirical distributions...Done.
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
-5.837462 -0.382940  0.021554  0.005255  0.450870  5.425687 

macat documentation built on Nov. 8, 2020, 5:44 p.m.