plotRLDF: Plot of regularized linear discriminant functions for...
In limma: Linear Models for Microarray Data


Plot regularized linear discriminant functions for classifying samples based on expression data.

plotRLDF(y, design = NULL, z = NULL, nprobes = 100, plot = TRUE,
         labels.y = NULL, labels.z = NULL, pch.y = NULL, pch.z = NULL,
         col.y = "black", col.z = "black",
         show.dimensions = c(1,2), ndim = max(show.dimensions),
         var.prior = NULL, df.prior = NULL, trend = FALSE, robust = FALSE, ...)

`y`	the training dataset. Can be any data object which can be coerced to a matrix, such as `ExpressionSet` or `EList`.
`design`	design matrix defining the training groups to be distinguished. The first column is assumed to represent the intercept. Defaults to `model.matrix(~factor(labels.y))`.
`z`	the dataset to be classified. Can be any data object which can be coerced to a matrix, such as `ExpressionSet` or `EList`. Rows must correspond to rows of `y`.
`nprobes`	number of probes to be used for the calculations. The probes will be selected by moderated F statistic.
`plot`	logical, should a plot be created?
`labels.y`	character vector of sample names or labels in `y`. Defaults to `colnames(y)` or failing that to `1:n`.
`labels.z`	character vector of sample names or labels in `z`. Defaults to `colnames(z)` or failing that to `letters[1:n]`.
`pch.y`	plotting symbol or symbols for `y`. See `points` for possible values. Takes precedence over `labels.y` if both are specified.
`pch.z`	plotting symbol or symbols for `z`. See `points` for possible values. Takes precedence over `labels.z` if both are specified.
`col.y`	colors for plotting `labels.y`.
`col.z`	colors for plotting `labels.z`.
`show.dimensions`	integer vector of length two indicating which two discriminant functions to plot. Functions are in decreasing order of discriminatory power.
`ndim`	number of discriminant functions to compute.
`var.prior`	prior variances, for regularizing the within-group covariance matrix. By default, estimated by `squeezeVar`.
`df.prior`	prior degrees of freedom for regularizing the within-group covariance matrix. By default, estimated by `squeezeVar`.
`trend`	logical, should a trend be estimated for `var.prior`? See `eBayes` for details. Only used if `var.prior` or `df.prior` are `NULL`.
`robust`	logical, should `var.prior` and `df.prior` be estimated robustly? See `eBayes` for details. Only used if `var.prior` or `df.prior` are `NULL`.
`...`	any other arguments are passed to `plot`.

The function builds discriminant functions from the training data (y) and applies them to the test data (z). The method is a variation on classical linear discriminant functions (LDFs), in that the within-group covariance matrix is regularized to ensure that it is invertible, with eigenvalues bounded away from zero. The within-group covariance matrix is squeezed towards a diagonal matrix with empirical Bayes posterior variances as diagonal elements.
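
The effect of this kind of regularization can be illustrated in base R. The following is a minimal sketch, not the code used inside `plotRLDF`: the weighted-average formula and the hand-picked values of `df.prior` and `var.prior` are assumptions made for illustration (the function itself estimates the prior values with `squeezeVar` by default).

```r
set.seed(1)
p <- 10                              # number of probes
n <- 6                               # number of samples
group <- factor(rep(1:2, each = 3))  # two training groups
x <- matrix(rnorm(p * n), p, n)

# Residuals after removing each probe's group means
res <- x - t(apply(x, 1, function(r) ave(r, group)))
df.residual <- n - nlevels(group)

# Ordinary within-group covariance: its rank is at most df.residual,
# so with more probes than residual df it is singular
W <- tcrossprod(res) / df.residual

# Assumed prior values, chosen by hand for illustration only
df.prior <- 4
var.prior <- 0.5

# Weighted average of W and a diagonal prior matrix
Wreg <- (df.residual * W + df.prior * diag(var.prior, p)) /
        (df.residual + df.prior)

min.eig.W    <- min(eigen(W,    symmetric = TRUE)$values)
min.eig.Wreg <- min(eigen(Wreg, symmetric = TRUE)$values)
```

In this sketch the smallest eigenvalue of `W` is numerically zero, whereas every eigenvalue of `Wreg` is at least `df.prior * var.prior / (df.prior + df.residual)`, so `Wreg` can be inverted safely. The diagonal of `Wreg` has the same form as the posterior variances produced by empirical Bayes shrinkage.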

The calculations are based on a filtered list of probes. The nprobes probes with largest moderated F statistics are used to discriminate.
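
A similar probe filter can be sketched with `lmFit` and `eBayes`. This is an illustration under assumptions rather than the internal code: it uses the fact that, for a two-group design, the moderated F statistic for the single group contrast equals the squared moderated t statistic. The indices it produces may not match the `top` component returned by `plotRLDF` exactly.

```r
library(limma)

set.seed(2)
# Simulated data in the style of the Examples: 1000 probes, 6 arrays,
# first 50 probes shifted in the second group
sd <- 0.3 * sqrt(4 / rchisq(1000, df = 4))
y <- matrix(rnorm(1000 * 6, sd = sd), 1000, 6)
y[1:50, 4:6] <- y[1:50, 4:6] + 2
design <- cbind(Grp1 = 1, Grp2vs1 = c(0, 0, 0, 1, 1, 1))

# Moderated statistics for the group effect (second coefficient)
fit <- eBayes(lmFit(y, design))
modF <- fit$t[, 2]^2   # with one contrast, moderated F = moderated t squared

# Keep the nprobes probes with the largest moderated F
nprobes <- 100
top <- order(modF, decreasing = TRUE)[seq_len(nprobes)]
```

Ranking by a moderated rather than an ordinary F statistic makes the selection less sensitive to probes whose sample variance happens to be very small.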

The ndim argument allows all required LDFs to be computed even though only two are plotted.

If plot=TRUE a plot is created on the current graphics device. A list containing the following components is (invisibly) returned:

`training`	numeric matrix with `ncol(y)` rows and `ndim` columns containing discriminant functions evaluated for the training data.
`predicting`	numeric matrix with `ncol(z)` rows and `ndim` columns containing discriminant functions evaluated on the classification data.
`top`	integer vector of length `nprobes` giving indices of probes used.
`metagenes`	numeric matrix with `nprobes` rows and `ndim` columns containing probe weights defining each discriminant function.
`singular.values`	singular values showing the predictive power of each discriminant function.
`rank`	maximum number of discriminant functions with singular values greater than zero.
`var.prior`	numeric vector of prior variances.
`df.prior`	numeric vector of prior degrees of freedom.
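
Because the list is returned invisibly, it has to be assigned to be inspected. The sketch below simulates its own data (an assumption made so the example is self-contained) and calls `plotRLDF` with `plot=FALSE` to obtain the components described above without drawing anything.

```r
library(limma)

set.seed(3)
sd <- 0.3 * sqrt(4 / rchisq(1000, df = 4))
y <- matrix(rnorm(1000 * 6, sd = sd), 1000, 6)   # training data
y[1:50, 4:6] <- y[1:50, 4:6] + 2
z <- matrix(rnorm(1000 * 6, sd = sd), 1000, 6)   # data to be classified
z[1:50, 4:6] <- z[1:50, 4:6] + 2
design <- cbind(Grp1 = 1, Grp2vs1 = c(0, 0, 0, 1, 1, 1))

# Compute the discriminant functions without plotting
out <- plotRLDF(y, design, z = z, nprobes = 100, plot = FALSE)

names(out)            # components of the returned list
dim(out$training)     # one row per training sample
dim(out$predicting)   # one row per sample in z
head(out$top)         # indices of the probes that were used
out$singular.values   # discriminatory power of each function
```

The `training` and `predicting` matrices can then be passed to other plotting or classification code, for example to compare the first discriminant function between the two datasets.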

The default values for df.prior and var.prior were changed in limma 3.27.10. Previously these were preset values. Now the default is to estimate them using squeezeVar.

Gordon Smyth, Di Wu and Yifang Hu

lda in package MASS

# Simulate gene expression data for 1000 probes and 6 microarrays.
# Samples are in two groups
# First 50 probes are differentially expressed in second group
sd <- 0.3*sqrt(4/rchisq(1000,df=4))
y <- matrix(rnorm(1000*6,sd=sd),1000,6)
rownames(y) <- paste("Gene",1:1000)
y[1:50,4:6] <- y[1:50,4:6] + 2

z <- matrix(rnorm(1000*6,sd=sd),1000,6)
rownames(z) <- paste("Gene",1:1000)
z[1:50,4:6] <- z[1:50,4:6] + 1.8
z[1:50,1:3] <- z[1:50,1:3] - 0.2

design <- cbind(Grp1=1,Grp2vs1=c(0,0,0,1,1,1))
options(digits=3)

# Samples 1-6 are training set, samples a-f are test set:
plotRLDF(y, design, z=z, col.y="black", col.z="red")
legend("top", pch=16, col=c("black","red"), legend=c("Training","Predicted"))