plotLoadings: Plot of Loading vectors
In mixOmics: Omics Data Integration Project


This function provides a horizontal bar plot to visualise loading vectors. For discriminant analysis, it provides visualisation of highest or lowest mean/median value of the variables with color code corresponding to the outcome of interest.

## S3 method for class 'pls'
plotLoadings(object, block, comp = 1, col = NULL, ndisplay = NULL,
size.name = 0.7, name.var = NULL, name.var.complete = FALSE, title = NULL, subtitle,
size.title = rel(2), size.subtitle = rel(1.5), layout = NULL, border = NA,
xlim = NULL, ... )

## S3 method for class 'mint.pls'
plotLoadings(object, study = "global", comp = 1, col = NULL, ndisplay = NULL,
size.name = 0.7, name.var = NULL, name.var.complete = FALSE, title = NULL, subtitle,
size.title = rel(1.8), size.subtitle = rel(1.4), layout = NULL, border = NA,
xlim = NULL, ... )

## S3 method for class 'plsda'
plotLoadings(object, contrib, method = "mean", block, comp = 1,
plot = TRUE, show.ties = TRUE, col.ties="white", ndisplay = NULL, size.name = 0.7,
size.legend = 0.8, name.var=NULL, name.var.complete=FALSE, title = NULL,
subtitle, size.title = rel(1.8), size.subtitle = rel(1.4),
legend = TRUE, legend.color = NULL, legend.title = 'Outcome',
layout = NULL, border = NA, xlim = NULL, ... )

## S3 method for class 'mint.plsda'
plotLoadings(object, contrib = NULL, method = "mean",
study = "global", comp = 1, plot = TRUE, show.ties = TRUE, col.ties = "white",
ndisplay = NULL, size.name = 0.7, size.legend = 0.8, name.var = NULL,
name.var.complete = FALSE, title = NULL, subtitle, size.title = rel(1.8),
size.subtitle = rel(1.4), legend = TRUE, legend.color = NULL,
legend.title = 'Outcome', layout = NULL, border = NA, xlim = NULL, ... )

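`object`	A fitted model object of one of the supported classes, for example the output of `pls`, `spls`, `plsda`, `splsda`, or one of the `mint.*` or `block.*` functions.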
`contrib`	a character set to 'max' or 'min' indicating if the color of the bar should correspond to the group with the maximal or minimal expression levels / abundance.
`method`	a character set to 'mean' or 'median' indicating the criterion to assess the contribution. We recommend using median in the case of count or skewed data.
`study`	Indicates which studies are to be plotted. Expected values are a character vector containing some levels of `object$study`, "all.partial" to plot all studies, or "global".
`block`	A single value indicating which block to consider in a `sgccda` object.
`comp`	integer value indicating the component of interest from the object.
`col`	Color used in the barplot, only for objects from non-discriminant analyses.
`plot`	Boolean indicating if the plot should be output. If set to FALSE, the user can extract the contribution matrix, see example. Default value is TRUE.
`show.ties`	Boolean. If TRUE then tie groups appear in the color set by `col.ties`, which will appear in the legend. Ties can happen when dealing with count data. By default set to TRUE.
`col.ties`	Color corresponding to ties, only used if `show.ties=TRUE` and ties are present.
`ndisplay`	integer indicating how many of the most important variables are to be plotted (ranked by decreasing weights in each PLS-component). Useful to lighten a graph.
`size.name`	A numerical value giving the amount by which plotting the variable name text should be magnified or reduced relative to the default.
`size.legend`	A numerical value giving the amount by which plotting the legend text should be magnified or reduced relative to the default.
`name.var`	A character vector indicating the names of the variables. The names of the vector should match the names of the input data, see example.
`name.var.complete`	Boolean. If `name.var` is supplied with some empty names, `name.var.complete` allows you to use the initial variable names (from `colnames(X)`) to complete the graph. Defaults to FALSE.
`title`	A set of characters to indicate the title of the plot. Default value is NULL.
`subtitle`	subtitle for each plot, only used when several `block` or `study` are plotted.
`size.title`	size of the title
`size.subtitle`	size of the subtitle
`legend`	Boolean indicating if the legend indicating the group outcomes should be added to the plot. Default value is TRUE.
`legend.color`	A color vector of length the number of group outcomes. See examples.
`legend.title`	A set of characters to indicate the title of the legend. Default value is 'Outcome'.
`layout`	Vector of two values (rows, cols) that indicates the layout of the plot. If `layout` is provided, the remaining empty subplots are still active.
`border`	Argument from `barplot`: indicates whether to draw a border on the barplot.
`xlim`	Argument from `barplot`: limit of the x-axis. When plotting several `block`, a matrix is expected where each row is the `xlim` used for each of the blocks.
`...`	not used.
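The naming requirement for `name.var` can be illustrated with a minimal base R sketch. The helper `complete_names` is hypothetical and only mimics what `name.var.complete = TRUE` is described as doing above; it is not the mixOmics implementation, and the toy data are made up.

```r
# Toy data: 3 samples, 4 variables; the column names are the initial variable names.
set.seed(1)
X <- matrix(rnorm(12), nrow = 3,
            dimnames = list(NULL, c("probe1", "probe2", "probe3", "probe4")))

# Display names, with two entries missing (one NA, one empty string).
name.var <- c("GeneA", NA, "", "GeneD")
names(name.var) <- colnames(X)  # the names of the vector must match the input data

# Hypothetical helper (an assumption, not part of mixOmics):
# fill missing display names with the original variable names.
complete_names <- function(name.var) {
  missing <- is.na(name.var) | name.var == ""
  name.var[missing] <- names(name.var)[missing]
  name.var
}

completed <- complete_names(name.var)
print(completed)
```

Setting `names(name.var)` explicitly is what allows the display names to be matched to the selected variables.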

The contribution of each variable for each component (depending on the object) is represented in a barplot where each bar length corresponds to the loading weight (importance) of the feature. The loading weight can be positive or negative.
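As a rough illustration of the idea, the sketch below ranks a made-up loading vector by absolute weight, keeps the top `ndisplay` entries and draws them as a horizontal bar plot with base R. The values, the variable names and the bar ordering are assumptions chosen for this example, not output from a fitted model or the function's own drawing code.

```r
# Hypothetical loading vector for one component (values are made up).
loadings <- c(geneA = 0.62, geneB = -0.45, geneC = 0.05,
              geneD = -0.71, geneE = 0.20)

# Keep the `ndisplay` variables with the largest absolute weight.
ndisplay <- 3
top <- loadings[order(abs(loadings), decreasing = TRUE)][seq_len(ndisplay)]

# barplot() draws the first element at the bottom; reversing puts the
# largest weight at the top (a choice made for this sketch only).
pdf(NULL)  # discard graphics output; remove this line to see the plot
barplot(rev(top), horiz = TRUE, las = 1, border = NA,
        xlim = c(-1, 1), main = "Loadings on comp 1")
abline(v = 0)  # bars extend left (negative) or right (positive) of zero
invisible(dev.off())

print(names(top))
```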

For discriminant analysis, the color corresponds to the group in which the feature is most 'abundant' (contrib = 'max') or least abundant (contrib = 'min'). This type of graphical output is particularly insightful for microbial count data, in which case method = 'median' is advised. Note also that if the parameter contrib is not provided, the bars are drawn in white.
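The coloring rule can be sketched in base R as follows. The helper `group_contrib` is hypothetical: it summarises each variable per group with the mean or median, then reports the group with the largest or smallest summary, labelling ties. It is a simplified stand-in under these assumptions, not the code used by the function.

```r
# Toy data: 6 samples in 2 groups, 3 variables (values are made up).
X <- matrix(c(1, 2, 3, 10, 11, 12,   # var1: higher in group B
              9, 8, 7,  1,  2,  3,   # var2: higher in group A
              5, 5, 5,  5,  5,  5),  # var3: identical in both groups (tie)
            ncol = 3,
            dimnames = list(NULL, c("var1", "var2", "var3")))
Y <- factor(rep(c("A", "B"), each = 3))

# Hypothetical helper (an assumption, not the mixOmics implementation).
group_contrib <- function(X, Y, method = c("mean", "median"),
                          contrib = c("max", "min")) {
  method <- match.arg(method)
  contrib <- match.arg(contrib)
  stat <- if (method == "mean") mean else median
  pick <- if (contrib == "max") max else min
  # Rows are variables, columns are groups.
  summ <- sapply(levels(Y), function(g) {
    apply(X[Y == g, , drop = FALSE], 2, stat)
  })
  apply(summ, 1, function(s) {
    best <- colnames(summ)[s == pick(s)]
    if (length(best) > 1) "tie" else best
  })
}

res.max <- group_contrib(X, Y, method = "median", contrib = "max")
res.min <- group_contrib(X, Y, method = "median", contrib = "min")
print(res.max)
print(res.min)
```

In the plot, each returned group label would map to one legend color, with ties shown in a separate color when `show.ties = TRUE`.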

For MINT analysis, study = "global" plots the global loadings, while partial loadings are plotted when study is a level of object$study. Since variable selection in MINT is performed at the global level, only the selected variables are plotted for the partial loadings, even if the partial loadings are not sparse. See references. Importantly, when several plots are combined, the legend accounts for one subplot in the layout design.

Florian Rohart, Kim-Anh Lê Cao, Benoit Gautier

Rohart F. et al (2016, submitted). MINT: A multivariate integrative approach to identify a reproducible biomarker signature across multiple experiments and platforms.

Eslami, A., Qannari, E. M., Kohler, A., and Bougeard, S. (2013). Multi-group PLS Regression: Application to Epidemiology. In New Perspectives in Partial Least Squares and Related Methods, pages 243-255. Springer.

Singh A., Gautier B., Shannon C., Vacher M., Rohart F., Tebbutt S. and Lê Cao K.A. (2016). DIABLO - multi omics integration for biomarker discovery.

Lê Cao, K.-A., Martin, P.G.P., Robert-Granie, C. and Besse, P. (2009). Sparse canonical methods for biological data integration: application to a cross-platform study. BMC Bioinformatics 10:34.

Tenenhaus, M. (1998). La regression PLS: theorie et pratique. Paris: Editions Technic.

Wold H. (1966). Estimation of principal components and related models by iterative least squares. In: Krishnaiah, P. R. (editors), Multivariate Analysis. Academic Press, N.Y., 391-420.

pls, spls, plsda, splsda, mint.pls, mint.spls, mint.plsda, mint.splsda, block.pls, block.spls, block.plsda, block.splsda, mint.block.pls, mint.block.spls, mint.block.plsda, mint.block.splsda

## object of class 'spls'
# --------------------------
data(liver.toxicity)
X = liver.toxicity$gene
Y = liver.toxicity$clinic

toxicity.spls = spls(X, Y, ncomp = 2, keepX = c(50, 50),
keepY = c(10, 10))

plotLoadings(toxicity.spls)

# with xlim
xlim = matrix(c(-0.1,0.3, -0.4,0.6), nrow = 2, byrow = TRUE)
plotLoadings(toxicity.spls, xlim = xlim)


## object of class 'splsda'
# --------------------------
data(liver.toxicity)
X = as.matrix(liver.toxicity$gene)
Y = as.factor(liver.toxicity$treatment[, 4])

splsda.liver = splsda(X, Y, ncomp = 2, keepX = c(20, 20))

# contribution on comp 1, based on the median. 
# Colors indicate the group in which the median expression is maximal
plotLoadings(splsda.liver, comp = 1, method = 'median')
plotLoadings(splsda.liver, comp = 1, method = 'median', contrib = "max")

# contribution on comp 2, based on median. 
# Colors indicate the group in which the median expression is maximal
plotLoadings(splsda.liver, comp = 2, method = 'median', contrib = "max")

# contribution on comp 2, based on median. 
# Colors indicate the group in which the median expression is minimal
plotLoadings(splsda.liver, comp = 2, method = 'median', contrib = 'min')

# changing the names to gene names
# if name.var is supplied but names(name.var) is NULL, a warning is issued
# and colnames(X) are assigned as the names of name.var,
# so that the selected variables can be matched to the contribution plot.
name.var = liver.toxicity$gene.ID[, 'geneBank']
length(name.var)
plotLoadings(splsda.liver, comp = 2, method = 'median', name.var = name.var,
title = "Liver data", contrib = "max")

# if names are provided: ok, even when NAs
name.var = liver.toxicity$gene.ID[, 'geneBank']
names(name.var) = rownames(liver.toxicity$gene.ID)
plotLoadings(splsda.liver, comp = 2, method = 'median',
name.var = name.var, size.name = 0.5, contrib = "max")

# missing names of some genes? complete with the original names
plotLoadings(splsda.liver, comp = 2, method = 'median',
name.var = name.var, size.name = 0.5, name.var.complete = TRUE, contrib = "max")

# look at the contribution (median) for each variable
plot.contrib = plotLoadings(splsda.liver, comp = 2, method = 'median', plot = FALSE,
contrib = "max")
head(plot.contrib$contrib)
# change the title of the legend and title name
plotLoadings(splsda.liver, comp = 2, method = 'median', legend.title = 'Time',
title = 'Contribution plot', contrib = "max")

# no legend
plotLoadings(splsda.liver, comp = 2, method = 'median', legend = FALSE, contrib = "max")

# change the color of the legend
plotLoadings(splsda.liver, comp = 2, method = 'median', legend.color = c(1:4), contrib = "max")



# object 'splsda multilevel'
# -----------------
## Not run: 
data(vac18)
X = vac18$genes
Y = vac18$stimulation
# sample indicates the repeated measurements
sample = vac18$sample
stimul = vac18$stimulation

# multilevel sPLS-DA model
res.1level = splsda(X, Y = stimul, ncomp = 3, multilevel = sample,
keepX = c(30, 137, 123))


name.var = vac18$tab.prob.gene[, 'Gene']
names(name.var) = colnames(X)

plotLoadings(res.1level, comp = 2, method = 'median', legend.title = 'Stimu',
name.var = name.var, size.name = 0.2, contrib = "max")

# too many transcripts? only output the top ones
plotLoadings(res.1level, comp = 2, method = 'median', legend.title = 'Stimu',
name.var = name.var, size.name = 0.5, ndisplay = 60, contrib = "max")


## End(Not run)

# object 'plsda'
# ----------------
## Not run: 
# breast tumors
# ---
data(breast.tumors)
X = breast.tumors$gene.exp
Y = breast.tumors$sample$treatment

plsda.breast = plsda(X, Y, ncomp = 2)

name.var = as.character(breast.tumors$genes$name)
names(name.var) = colnames(X)

# with gene IDs, showing the top 60
plotLoadings(plsda.breast, contrib = 'max', comp = 1, method = 'median', 
            ndisplay = 60, 
            name.var = name.var,
            size.name = 0.6,
            legend.color = color.mixo(1:2))

## End(Not run)

# liver toxicity
# ---
## Not run: 
data(liver.toxicity)
X = liver.toxicity$gene
Y = liver.toxicity$treatment[, 4]

plsda.liver = plsda(X, Y, ncomp = 2)
plotIndiv(plsda.liver, ind.names = Y, ellipse = TRUE)


name.var = liver.toxicity$gene.ID[, 'geneBank']
names(name.var) = rownames(liver.toxicity$gene.ID)

plotLoadings(plsda.liver, contrib = 'max', comp = 1, method = 'median', ndisplay = 100, 
            name.var = name.var, size.name = 0.4,
            legend.color = color.mixo(1:4))

## End(Not run)

# object 'sgccda'
# ----------------
## Not run: 
data(nutrimouse)
Y = nutrimouse$diet
data = list(gene = nutrimouse$gene, lipid = nutrimouse$lipid)
design = matrix(c(0,1,1,1,0,1,1,1,0), ncol = 3, nrow = 3, byrow = TRUE)

nutrimouse.sgccda = wrapper.sgccda(X = data,
Y = Y,
design = design,
keepX = list(gene = c(10,10), lipid = c(15,15)),
ncomp = 2,
scheme = "centroid")

plotLoadings(nutrimouse.sgccda,block=2)
plotLoadings(nutrimouse.sgccda,block="gene")

## End(Not run)


# object 'mint.splsda'
# ----------------
data(stemcells)
data = stemcells$gene
type.id = stemcells$celltype
exp = stemcells$study

res = mint.splsda(X = data, Y = type.id, ncomp = 3, keepX = c(10,5,15), study = exp)

plotLoadings(res)
plotLoadings(res, contrib = "max")
plotLoadings(res, contrib = "min", study = 1:4,comp=2)

# combining different plots by setting a layout of 2 rows and 4 columns.
# Note that the legend accounts for a subplot, so 4 columns instead of 2.
plotLoadings(res, contrib = "min", study = c(1,2,3), comp = 2, layout = c(2,4))
plotLoadings(res, contrib = "min", study = "global", comp = 2)