plotVar: Plot of Variables

View source: R/plotVar.R

Description

This function plots the variables, as a correlation circle, for (regularized) CCA, (sparse) PLS regression, PCA and (sparse) regularized generalised CCA.

Usage

plotVar(object,
comp = NULL,
comp.select = comp,
plot=TRUE,
var.names = NULL,
blocks = NULL, # to choose which block data to plot, when using GCCA module
X.label = NULL,
Y.label = NULL,
Z.label = NULL,
abline = TRUE,
col,
cex,
pch,
font,
cutoff = 0,
rad.in = 0.5,
title="Correlation Circle Plots",
legend = FALSE,
style="ggplot2", # can choose between graphics,3d, lattice or ggplot2,
overlap = TRUE,
axes.box = "all",
label.axes.box = "both")

Arguments

object

object of class inheriting from "rcc", "pls", "plsda", "spls", "splsda", "pca", "spca", "rgcca" or "sgcca".

comp

integer vector of length two (length three when style = '3d'). The components used on the horizontal and the vertical axis, respectively, to project the variables. By default, comp = c(1,2), except when style = '3d', where comp = c(1:3).

comp.select

for the sparse versions, an input vector indicating the components on which the variables were selected. Only those selected variables are displayed. By default, comp.select=comp

plot

if TRUE (the default) then a plot is produced. If not, the summaries which the plots are based on are returned.

var.names

either a character vector of names for the variables to be plotted, or FALSE for no names. If TRUE, the column names of the first (or second) data matrix are used as names.

blocks

for an object of class "rgcca" or "sgcca", a numerical vector indicating the block variables to display.

X.label

x axis titles.

Y.label

y axis titles.

Z.label

z axis titles (when style = '3d').

abline

should the vertical and horizontal lines through the center be plotted? Default is TRUE (see Usage).

col

character or integer vector of colors for plotted character and symbols, can be of length 2 (one for each data set) or of length (p+q) (i.e. the total number of variables). See Details.

cex

numeric vector of character expansion sizes for the plotted character and symbols, can be of length 2 (one for each data set) or of length (p+q) (i.e. the total number of variables).

pch

plot character. A vector of single characters or integers, can be of length 2 (one for each data set) or of length (p+q) (i.e. the total number of variables). See points for all alternatives.

font

numeric vector of font to be used, can be of length 2 (one for each data set) or of length (p+q) (i.e. the total number of variables). See par for details.

cutoff

numeric between 0 and 1. Variables with correlations below this cutoff in absolute value are not plotted (see Details).

rad.in

numeric between 0 and 1, the radius of the inner circle. Defaults to 0.5.

title

character string giving the title of the plot.

legend

boolean. Whether the legend should be added. Default is FALSE (see Usage).

style

argument to be set to either 'graphics', 'lattice', 'ggplot2' or '3d' for a style of plotting.

overlap

boolean. Whether the variables should be plotted in one single figure. Default is TRUE.

axes.box

for style '3d', argument to be set to either 'axes', 'box', 'bbox' or 'all', defining the shape of the box.

label.axes.box

for style '3d', argument to be set to either 'axes', 'box', 'both', indicating which labels to print.
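
The cutoff argument described above can be illustrated with a small base R sketch. It does not call plotVar and is not its source code: the simulated data, the names X, comps, coords and keep, and the rule that a variable is kept when at least one of its correlations reaches the cutoff in absolute value are all assumptions made for illustration.

```r
# Illustrative only: simulated data, principal components standing in for variates.
set.seed(1)
X <- matrix(rnorm(40 * 6), nrow = 40,
            dimnames = list(NULL, paste0("V", 1:6)))
comps <- prcomp(X, scale. = TRUE)$x[, 1:2]

# Correlation of each variable with each of the two components.
coords <- cor(X, comps)

# Assumed rule: keep a variable if at least one correlation reaches the cutoff.
cutoff <- 0.5
keep <- apply(abs(coords), 1, max) >= cutoff

# Variables that would remain on the plot under this assumed rule.
kept <- coords[keep, , drop = FALSE]
print(kept)
```

With cutoff = 0 (the default) every variable passes this filter, which matches the default behaviour of plotting all variables.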

Details

plotVar produces a "correlation circle": the correlations between each variable and the selected components are plotted as a scatter plot, with concentric circles of radius one and of radius rad.in. Each point corresponds to a variable. For (regularized) CCA, the components correspond to the equiangular vector between the X- and Y-variates. For (sparse) PLS in regression mode, the components correspond to the X-variates. If mode is canonical, the components for the X and Y variables correspond to the X- and Y-variates respectively.
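
As a rough illustration of the correlation circle described above, the following base R sketch computes variable-to-component correlations and draws the two concentric circles. It is a minimal sketch under stated assumptions, not the plotVar implementation: the data are simulated, and the first two principal components from prcomp stand in for the variates of a fitted model.

```r
# Illustrative only: simulated data, principal components standing in for variates.
set.seed(1)
X <- scale(matrix(rnorm(40 * 6), nrow = 40,
                  dimnames = list(NULL, paste0("V", 1:6))))
comps <- prcomp(X)$x[, 1:2]

# Each variable's coordinates are its correlations with the two components.
coords <- cor(X, comps)

rad.in <- 0.5                              # radius of the inner circle
theta <- seq(0, 2 * pi, length.out = 200)  # angles used to trace the circles

plot(coords, xlim = c(-1, 1), ylim = c(-1, 1), asp = 1, type = "n",
     xlab = "Component 1", ylab = "Component 2")
lines(cos(theta), sin(theta))                    # outer circle, radius 1
lines(rad.in * cos(theta), rad.in * sin(theta))  # inner circle, radius rad.in
abline(h = 0, v = 0, lty = 2)                    # axes through the center
text(coords, labels = rownames(coords))          # one label per variable
```

Because the two components are uncorrelated, the squared correlations of a variable with them sum to at most one, so every point falls inside the outer circle.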

For plsda and splsda objects, only the X variables are represented.

For spls and splsda objects, only the X and Y variables selected on dimensions comp are represented.

The arguments col, pch, cex and font can be either vectors of length two or lists with two vector components of length p and q respectively, where p is the number of X-variables and q is the number of Y-variables. In the first case, the first and second elements of the vector determine the graphical attributes for the X- and Y-variables respectively. In the second case, each point (variable) can be given its own graphical attributes: the first component of the list corresponds to the X attributes and the second component to the Y attributes. Default values exist for these arguments.
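
The two accepted forms can be made concrete with a small sketch. The helper expand_attr below is hypothetical and is not part of mixOmics. It only shows how a length-two vector and a two-component list could each be expanded into one attribute per variable, following the description above.

```r
# Hypothetical helper (not a mixOmics function): expands a graphical attribute
# into one value per variable, X-variables first, then Y-variables.
expand_attr <- function(attr, p, q) {
  if (is.list(attr)) {
    # List form: one vector of length p for X, one of length q for Y.
    stopifnot(length(attr) == 2,
              length(attr[[1]]) == p,
              length(attr[[2]]) == q)
    c(attr[[1]], attr[[2]])
  } else {
    # Vector form: first element for all X-variables, second for all Y-variables.
    stopifnot(length(attr) == 2)
    rep(attr, times = c(p, q))
  }
}

# Same colour within each data set (p = 3 X-variables, q = 2 Y-variables).
col_by_block <- expand_attr(c("blue", "red"), p = 3, q = 2)

# One expansion size per variable.
cex_by_var <- expand_attr(list(c(1, 1, 2), c(0.8, 0.8)), p = 3, q = 2)
```

In both cases the result has length p + q, which is the per-variable length mentioned in the argument descriptions.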

Value

A list containing the following components:

x

a vector of coordinates of the variables on the x-axis.

y

a vector of coordinates of the variables on the y-axis.

Block

the data block name each variable belongs to.

names

the name of each variable, matching their coordinates values.
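
The returned components can be post-processed without replotting. The sketch below uses a hand-made list, res, with invented values that mimic the four components listed above. In practice res would come from a plotVar call, and the exact return type is an assumption here. The threshold of 0.5 simply mirrors the default rad.in.

```r
# Mock return value with invented numbers, standing in for plotVar output.
res <- list(
  x     = c(0.82, -0.15, 0.40, -0.76),
  y     = c(0.31, 0.10, -0.77, -0.52),
  Block = c("X", "X", "Y", "Y"),
  names = c("var1", "var2", "var3", "var4")
)

vars <- as.data.frame(res, stringsAsFactors = FALSE)

# Distance of each variable from the origin of the correlation circle.
vars$radius <- sqrt(vars$x^2 + vars$y^2)

# Variables lying on or outside the inner circle (radius 0.5), grouped by block.
outer_vars <- vars[vars$radius >= 0.5, ]
by_block <- split(outer_vars$names, outer_vars$Block)
print(by_block)
```

Variables far from the origin are the ones most strongly correlated with the displayed components, so this kind of filter gives a quick list of candidates to inspect.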

Author(s)

Ignacio González, Kim-Anh Lê Cao, Benoit Gautier, Florian Rohart, Francois Bartolo.

References

González I., Lê Cao K-A., Davis M.J. and Déjean S. (2012). Visualising associations between paired 'omics' data sets. BioData Mining 5:19. http://www.biodatamining.org/content/5/1/19/abstract

See Also

cim, network, par and http://www.mixOmics.org for more details.

Examples

## variable representation for objects of class 'rcc'
# ----------------------------------------------------
data(nutrimouse)
X <- nutrimouse$lipid
Y <- nutrimouse$gene
nutri.res <- rcc(X, Y, ncomp = 3, lambda1 = 0.064, lambda2 = 0.008)

plotVar(nutri.res) #(default)

## Not run: 
plotVar(nutri.res, comp = c(1,3), cutoff = 0.5)

## End(Not run)

## variable representation for objects of class 'pls' or 'spls'
# ----------------------------------------------------
data(liver.toxicity)
X <- liver.toxicity$gene
Y <- liver.toxicity$clinic
toxicity.spls <- spls(X, Y, ncomp = 3, keepX = c(50, 50, 50), 
                      keepY = c(10, 10, 10))
	
plotVar(toxicity.spls, cex = c(1,0.8))

## variable representation for objects of class 'splsda'
# ----------------------------------------------------
## Not run: 
data(liver.toxicity)
X <- liver.toxicity$gene
Y <- as.factor(liver.toxicity$treatment[, 4])

ncomp <- 2
keepX <- rep(20, ncomp)

splsda.liver <- splsda(X, Y, ncomp = ncomp, keepX = keepX)
plotVar(splsda.liver)

## End(Not run)

## variable representation for objects of class 'sgcca' (or 'rgcca')
# ----------------------------------------------------
## see example in ??wrapper.sgcca
data(nutrimouse)
# need to unmap the Y factor diet
Y = unmap(nutrimouse$diet)
# set up the data as list
data = list(gene = nutrimouse$gene, lipid = nutrimouse$lipid, Y = Y)

# set up the design matrix:
# with this design, gene expression and lipids are connected to the diet factor
# design = matrix(c(0,0,1,
#                   0,0,1,
#                   1,1,0), ncol = 3, nrow = 3, byrow = TRUE)

# with this design, gene expression and lipids are connected to the diet factor
# and gene expression and lipids are also connected
design = matrix(c(0,1,1,
                  1,0,1,
                  1,1,0), ncol = 3, nrow = 3, byrow = TRUE)


#note: the penalty parameters will need to be tuned
wrap.result.sgcca = wrapper.sgcca(X = data, design = design, penalty = c(.3,.3, 1),
                                  ncomp = 2,
                                  scheme = "centroid")
wrap.result.sgcca

#variables selected on component 1 for each block
selectVar(wrap.result.sgcca, comp = 1, block = c(1,2))$'gene'$name
selectVar(wrap.result.sgcca, comp = 1, block = c(1,2))$'lipid'$name

#variables selected on component 2 for each block
selectVar(wrap.result.sgcca, comp = 2, block = c(1,2))$'gene'$name
selectVar(wrap.result.sgcca, comp = 2, block = c(1,2))$'lipid'$name

plotVar(wrap.result.sgcca, comp = c(1,2), block = c(1,2), comp.select = c(1,1),
title = c('Variables selected on component 1 only'))

## Not run: 
plotVar(wrap.result.sgcca, comp = c(1,2), block = c(1,2), comp.select = c(2,2),
        title = c('Variables selected on component 2 only'))

# -> this one shows the variables selected on both components
plotVar(wrap.result.sgcca, comp = c(1,2), block = c(1,2),
        title = c('Variables selected on components 1 and 2'))

## End(Not run)
## variable representation for objects of class 'rgcca'
# ----------------------------------------------------
## Not run: 
data(nutrimouse)
# need to unmap Y for an unsupervised analysis, where Y is included as a data block in data
Y = unmap(nutrimouse$diet)

data = list(gene = nutrimouse$gene, lipid = nutrimouse$lipid, Y = Y)
# with this design, all blocks are connected
design = matrix(c(0,1,1,1,0,1,1,1,0), ncol = 3, nrow = 3, 
                byrow = TRUE, dimnames = list(names(data), names(data)))

nutrimouse.rgcca <- wrapper.rgcca(X = data,
                                         design = design,
                                         tau = "optimal",
                                         ncomp = 2,
                                         scheme = "centroid")

plotVar(nutrimouse.rgcca, comp = c(1,2), block = c(1,2), cex = c(1.5, 1.5))

plotVar(nutrimouse.rgcca, comp = c(1,2), block = c(1,2))

# set up the data as list
data = list(gene = nutrimouse$gene, lipid = nutrimouse$lipid, Y = Y)
# with this design, gene expression and lipids are connected to the diet factor
# design = matrix(c(0,0,1,
#                   0,0,1,
#                   1,1,0), ncol = 3, nrow = 3, byrow = TRUE)

# with this design, gene expression and lipids are connected to the diet factor
# and gene expression and lipids are also connected
design = matrix(c(0,1,1,
                  1,0,1,
                  1,1,0), ncol = 3, nrow = 3, byrow = TRUE)
# note: the tau parameter is the regularization parameter
wrap.result.rgcca = wrapper.rgcca(X = data, design = design, tau = c(1, 1, 0),
                                  ncomp = 2,
                                  scheme = "centroid")
# wrap.result.rgcca
plotVar(wrap.result.rgcca, comp = c(1,2), block = c(1,2))

## End(Not run)

Example output

Loading required package: MASS
Loading required package: lattice
Loading required package: ggplot2

Loaded mixOmics 6.2.0

Visit http://www.mixOmics.org for more details about our methods.
Any bug reports or comments? Notify us at mixomics at math.univ-toulouse.fr or https://bitbucket.org/klecao/package-mixomics/issues

Thank you for using mixOmics!
Warning messages:
1: In rgl.init(initValue, onlyNULL) : RGL: unable to open X11 display
2: 'rgl_init' failed, running with rgl.useNULL = TRUE 
3: .onUnload failed in unloadNamespace() for 'rgl', details:
  call: fun(...)
  error: object 'rgl_quit' not found 

Call:
 wrapper.sgcca(X = data, design = design, penalty = c(0.3, 0.3, 1), ncomp = 2, scheme = "centroid") 

 sGCCA with 2 components on block 1 named gene 
 sGCCA with 2 components on block 2 named lipid 
 sGCCA with 2 components on block 3 named Y 

 Dimension of block 1 is  40 120 
 Dimension of block 2 is  40 21 
 Dimension of block 3 is  40 5 

 Selection of 18 19 variables on each of the sGCCA components on the block 1 
 Selection of 4 2 variables on each of the sGCCA components on the block 2 
 Selection of 5 5 variables on each of the sGCCA components on the block 3 

 Main numerical outputs: 
 -------------------- 
 loading vectors: see object$loadings 
 variates: see object$variates 
 variable names: see object$names 

 Functions to visualise samples: 
 -------------------- 
 plotIndiv, plotArrow 

 Functions to visualise variables: 
 -------------------- 
 plotVar, plotLoadings, network

 Other functions: 
 -------------------- 
 selectVar 
 [1] "ACC2"      "PLTP"      "GSTpi2"    "apoC3"     "S14"       "FAT"      
 [7] "SR.BI"     "HMGCoAred" "i.FABP"    "UCP2"      "cHMGCoAS"  "Ntcp"     
[13] "SPI1.1"    "BSEP"      "CYP3A11"   "i.NOS"     "G6PDH"     "CYP27a1"  
[1] "C18.1n.7" "C18.1n.9" "C16.1n.7" "C14.0"   
 [1] "G6Pase"   "HPNCL"    "Lpin2"    "Lpin"     "Lpin1"    "CYP3A11" 
 [7] "GSTa"     "CYP2c29"  "C16SR"    "GSTmu"    "ACAT2"    "Tpalpha" 
[13] "CIDEA"    "mHMGCoAS" "BIEN"     "Waf1"     "apoC3"    "PPARd"   
[19] "Pex11a"  
[1] "C22.4n.6" "C20.2n.6"
Warning message:
In plotVar(nutrimouse.rgcca, comp = c(1, 2), block = c(1, 2), cex = c(1.5,  :
  We detected negative correlation between the variates of some blocks, which means that some clusters of variables observed on the correlation circle plot are not necessarily positively correlated.
Warning message:
In plotVar(nutrimouse.rgcca, comp = c(1, 2), block = c(1, 2)) :
  We detected negative correlation between the variates of some blocks, which means that some clusters of variables observed on the correlation circle plot are not necessarily positively correlated.

mixOmics documentation built on June 1, 2018, 5:06 p.m.
