tranest: Glog transformation parameter estimation function

Description

Estimates parameters for the glog transformation, by maximum likelihood or by minimizing the stability score.

Usage

tranest(eS, ngenes = -1, starting = FALSE, lambda = 1000, alpha = 0,
    gradtol = 1e-3, lowessnorm = FALSE, method = 1, mult = FALSE, model = NULL,
    SD = FALSE, rank = TRUE, model.based = TRUE, rep.arrays = NULL)

Arguments

eS

An ExpressionSet object

ngenes

Number of genes to be used in parameter estimation. The default (-1) uses all genes unless there are more than 100,000, in which case a subset of 50,000 genes is selected at random.

starting

If TRUE, user-specified starting values for lambda and alpha are input to the optimization routine

lambda

Starting value for parameter lambda. Ignored unless starting = TRUE

alpha

Starting value for parameter alpha. Ignored unless starting = TRUE

gradtol

A positive scalar giving the tolerance at which the scaled gradient is considered close enough to zero to terminate the algorithm

lowessnorm

If TRUE, lowess normalization (using lnorm) is used in calculating the likelihood.

method

Determines the optimization method. The default is 1, which corresponds to a Newton-type method (see nlm and Details).

mult

If TRUE, tranest uses a vector alpha with one (possibly different) entry per sample. The default is to use the same alpha for every sample. SD and mult may not both be TRUE.

model

Specifies the model to be used. The default is to use all variables from eS without interactions. See Details.

SD

If TRUE, transformation parameters are estimated by minimizing the stability score rather than by maximum likelihood. See Details.

rank

If TRUE, the stability score is calculated by regressing the replicate standard deviations on the ranks of the gene/row means (rather than on the means themselves). Ignored unless SD = TRUE

model.based

If TRUE, the stability score is calculated using the standard deviations of residuals from the linear model in model. Ignored unless SD = TRUE

rep.arrays

List of sets of replicate arrays. Each element of rep.arrays should be a vector with entries corresponding to arrays (columns) in exprs(eS) conducted under the same experimental conditions, i.e., with identical rows in pData(eS). Ignored unless SD = TRUE and model.based = FALSE
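
To illustrate the structure that rep.arrays expects, the following sketch groups array indices by identical design rows. The data frame pd is a hypothetical stand-in for pData(eS), and its column names are invented for the example; only base R is used.

```r
# Hypothetical design table standing in for pData(eS): one row per array.
pd <- data.frame(patient = c(1, 1, 2, 2, 3, 3),
                 dose    = c(0, 0, 1, 1, 0, 0))

# Group the array (column) indices whose design rows are identical.
# interaction() builds one factor level per combination of values;
# drop = TRUE discards combinations that never occur.
rep.arrays <- unname(split(seq_len(nrow(pd)), interaction(pd, drop = TRUE)))

# Each list element is a vector of array indices sharing one condition.
rep.arrays
```

A list built this way could then be supplied as tranest(eS, SD = TRUE, model.based = FALSE, rep.arrays = rep.arrays).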

Details

If you have data in a matrix and information about experimental design factors, then you can use neweS to convert the data into an ExpressionSet object. Please see neweS for more detail.

The model argument is an optional character string, constructed like the right-hand side of a formula for lm. It specifies which of the variables in the ExpressionSet will be used in the model and whether interaction terms will be included. If model = NULL, all variables from the ExpressionSet are used without interactions. Be careful when using interaction terms with factors; this often leads to overfitting, which will yield an error.
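
The sketch below shows how a right-hand-side string of this kind can be turned into an lm formula, and why interaction terms can exhaust the residual degrees of freedom. It uses base R only; the design table, the response values, and the variable names patient and dose are invented for illustration, and how tranest itself processes the string is not shown here.

```r
# Hypothetical design factors and one gene's (already transformed) values.
design <- data.frame(patient = factor(c(1, 1, 2, 2, 3, 3)),
                     dose    = factor(c(0, 1, 0, 1, 0, 1)))
y <- c(5.1, 5.9, 4.8, 6.2, 5.0, 6.1)

# Right-hand-side strings written in the style of the model argument.
additive    <- "patient + dose"   # main effects only
interacting <- "patient * dose"   # main effects plus their interaction

# Paste the string onto a response to obtain a full formula for lm().
fit_add <- lm(as.formula(paste("y ~", additive)), data = design)
fit_int <- lm(as.formula(paste("y ~", interacting)), data = design)

# With one observation per patient-by-dose cell, the interaction model
# uses every degree of freedom and leaves none for the residuals.
df.residual(fit_add)
df.residual(fit_int)
```

With no residual degrees of freedom there is no residual variance to estimate, which is the kind of overfitting the paragraph above warns about.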

The default estimation method is maximum likelihood. The likelihood is derived by assuming that there exist values for lambda and alpha such that the residuals from the linear model in model, fit to glog-transformed data using those values for lambda and alpha, follow a normal distribution. See Durbin and Rocke (2003) for details.

If SD = TRUE, lambda and alpha are estimated by minimizing the stability score rather than by maximum likelihood. The stability score is defined as the absolute value of the slope coefficient from the regression of the replicate/residual standard deviation on the gene/row means, or on the rank of the gene/row means. If model.based = TRUE, the stability score is calculated using the standard deviation of residuals from the linear model in model. Otherwise, the stability score is calculated using the pooled standard deviation over sets of replicates in rep.arrays. See Wu and Rocke (2009) for details.
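
As a rough illustration of the stability score, the sketch below computes it for a single set of replicate columns in base R. The glog form used, log(z + sqrt(z^2 + lambda)) with z = y - alpha, is an assumption made for this example, and the function names glog_sketch and stability_score are hypothetical. The simulated data are invented, and the pooling over several replicate sets described above is not implemented.

```r
# Assumed glog form for illustration: log(z + sqrt(z^2 + lambda)), z = y - alpha.
glog_sketch <- function(y, lambda, alpha = 0) {
  z <- y - alpha
  log(z + sqrt(z^2 + lambda))
}

# Stability score for one set of replicate columns: the absolute slope
# from regressing row SDs on row means, or on the ranks of the row means.
stability_score <- function(mat, lambda, alpha = 0, use.rank = TRUE) {
  tmat     <- glog_sketch(mat, lambda, alpha)
  row_mean <- rowMeans(tmat)
  row_sd   <- apply(tmat, 1, sd)
  x <- if (use.rank) rank(row_mean) else row_mean
  abs(unname(coef(lm(row_sd ~ x))[2]))
}

# Simulated intensities with multiplicative and additive error (4 replicates).
set.seed(1)
mu  <- rexp(500, rate = 1 / 2000)
mat <- sapply(1:4, function(i) {
  mu * exp(rnorm(500, sd = 0.2)) + rnorm(500, sd = 30) + 50
})

# Compare the score at two candidate values of lambda; smaller is more stable.
stability_score(mat, lambda = 1)
stability_score(mat, lambda = 22500)
```

An optimizer would search over lambda and alpha for the pair that drives this score toward zero.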

Optimization methods in method are as follows:

1 = Newton-type method, using nlm

2 = Nelder-Mead, using optim

3 = BFGS, using optim

4 = Conjugate gradients, using optim

5 = Simulated annealing, using optim (may only be used when mult = TRUE)
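
The sketch below shows one way a numeric method code could be dispatched to nlm and optim in base R. The function minimize and the toy objective are hypothetical, and the mapping of codes 2 to 5 onto the optim method strings "Nelder-Mead", "BFGS", "CG" and "SANN" is an assumption based on the names listed above, not a copy of the tranest source.

```r
# Toy objective with a known minimum at (3, -1); it stands in for the
# negative log-likelihood or the stability score.
objective <- function(p) (p[1] - 3)^2 + (p[2] + 1)^2

# Hypothetical dispatcher: method 1 uses nlm, methods 2 to 5 use optim.
minimize <- function(fn, start, method = 1, gradtol = 1e-3) {
  if (method == 1) {
    fit <- nlm(fn, start, gradtol = gradtol)
    return(list(par = fit$estimate, value = fit$minimum))
  }
  optim_method <- c("Nelder-Mead", "BFGS", "CG", "SANN")[method - 1]
  fit <- optim(start, fn, method = optim_method)
  list(par = fit$par, value = fit$value)
}

set.seed(1)  # only matters for SANN, which is stochastic
results <- lapply(1:5, function(m) minimize(objective, c(0, 0), method = m))

# Estimated minimizers, one row per method code.
do.call(rbind, lapply(results, function(r) r$par))
```

Simulated annealing is a stochastic search, so its result varies with the random seed and is typically less precise than the other four methods on a smooth objective.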

Value

A list with components:

lambda

Estimate of transformation parameter lambda

alpha

Estimate of transformation parameter alpha

Author(s)

David Rocke, Geun-Cheol Lee, John Tillinghast, Blythe Durbin-Johnson, and Shiquan Wu

References

Durbin, B.P. and Rocke, D.M. (2003) Estimation of Transformation Parameters for Microarray Data, Bioinformatics, 19, 1360–1367.

Wu, S. and Rocke, D.M. (2009) Analysis of Illumina BeadArray data using variance stabilizing transformations.

http://dmrocke.ucdavis.edu

See Also

tranestAffyProbeLevel, lnorm, glog

Examples

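library(Biobase)
library(LMGene)

# Load the sample ExpressionSet provided with LMGene
data(sample.eS)

# Estimate lambda and a single alpha using 100 genes
tranpar <- tranest(sample.eS, ngenes = 100)
tranpar

# Estimate lambda and one alpha per sample
tranpar <- tranest(sample.eS, mult = TRUE)
tranpar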

Example output

$lambda
[1] 765.1725

$alpha
[1] 59.49636

$lambda
[1] 689.2819

$alpha
 [1]  69.67146  37.02711  54.13904  69.35728  60.33270  60.75301  71.72965
 [8]  64.55506  58.63427  65.73625  48.40173  59.43778  76.34568  78.81046
[15]  82.20326  96.19938  77.60070  79.48089  73.63257  73.41650  33.86029
[22]  69.26448  55.75460  54.29840 139.89493  91.36521  46.46158  59.02056
[29]  73.60255  89.48728  57.13887  64.98866

LMGene documentation built on April 28, 2020, 8:01 p.m.