plotArrow: Arrow sample plot
In mixOmics: Omics Data Integration Project

Description Usage Arguments Details Value Author(s) References See Also Examples

Represents samples from multiple coordinates to assess the alignment in the latent space.

plotArrow(
  object,
  comp = c(1, 2),
  ind.names = TRUE,
  group = NULL,
  col.per.group = NULL,
  col = NULL,
  ind.names.position = c("start", "end"),
  ind.names.size = 2,
  pch = NULL,
  pch.size = 2,
  arrow.alpha = 0.6,
  arrow.size = 0.5,
  arrow.length = 0.2,
  legend = if (is.null(group)) FALSE else TRUE,
  legend.title = NULL,
  ...
)

`object`	object of class inheriting from mixOmics: `PLS, sPLS, rCC, rGCCA, sGCCA, sGCCDA`
`comp`	integer vector of length two (or three to 3d). The components that will be used on the horizontal and the vertical axis respectively to project the individuals.
`ind.names`	either a character vector of names for the individuals to be plotted, or `FALSE` for no names. If `TRUE`, the row names of the first (or second) data matrix is used as names (see Details).
`group`	Factor indicating the group membership for each sample.
`col.per.group`	character (or symbol) color to be used when 'group' is defined. Vector of the same length as the number of groups.
`col`	character (or symbol) color to be used, possibly vector.
`ind.names.position`	One of c('start', 'end') indicating where to show the ind.names . Not used in block analyses, where centroids are used.
`ind.names.size`	Numeric, sample name size.
`pch`	plot character. A character string or a named vector of single characters or integers whose names match those of `object$variates`.
`pch.size`	Numeric, sample point character size.
`arrow.alpha`	Numeric between 0 and 1 determining the opacity of arrows.
`arrow.size`	Numeric, variable arrow head size.
`arrow.length`	Numeric, length of the arrow head in 'cm'.
`legend`	Logical, whether to show the legend if `group != NULL`.
`legend.title`	Character, the legend title if `group != NULL`.
`...`	Not currently used. sample size to display sample names.

Graphical of the samples (individuals) is displayed in a superimposed manner where each sample will be indicated using an arrow. The start of the arrow indicates the location of the sample in X in one plot, and the tip the location of the sample in Y in the other plot. Short arrows indicate a strong agreement between the matching data sets, long arrows a disagreement between the matching data sets. The representation space is scaled using the range of coordinates so minimum and maximum values are equal for all blocks. Since the algorithm maximises the covariance of these components, the absolute values do not affect the alignment.

For objects of class "GCCA" and if there are more than 2 blocks, the start of the arrow indicates the centroid between all data sets for a given individual and the tips of the arrows the location of that individual in each block.

A ggplot object

Al J Abadi

Lê Cao, K.-A., Martin, P.G.P., Robert-Granie, C. and Besse, P. (2009). Sparse canonical methods for biological data integration: application to a cross-platform study. BMC Bioinformatics 10:34.

arrows, text, points and http://mixOmics.org/graphics for more details.

## plot of individuals for objects with two datasets only (X and Y)
# ----------------------------------------------------
data(nutrimouse)
X <- nutrimouse$lipid
Y <- nutrimouse$gene
nutri.res <- rcc(X, Y, ncomp = 3, lambda1 = 0.064, lambda2 = 0.008)

## plot of individuals for objects of class 'pls' or 'spls'
# ----------------------------------------------------
plotArrow(nutri.res)
## customise the ggplot object as you wish
plotArrow(nutri.res) + geom_vline(xintercept = 0, alpha = 0.5) + 
    geom_hline(yintercept = 0, alpha = 0.5) +
    labs(x = 'Dim 1' , y = 'Dim 2', title = 'Nutrimouse') +
    theme_minimal()
## individual name position
plotArrow(nutri.res, ind.names.position = 'end')
plotArrow(nutri.res, comp = c(1,3))
## custom pch
plotArrow(nutri.res, pch = 10, pch.size = 3)
plotArrow(nutri.res, pch = c(X = 1, Y = 0))
## custom arrow
plotArrow(nutri.res, arrow.alpha = 0.6, arrow.size = 0.6, arrow.length = 0.15)

## group samples
plotArrow(nutri.res, group = nutrimouse$genotype)
plotArrow(nutri.res, group = nutrimouse$genotype, legend.title = 'Genotype')

## custom ind.names
plotArrow(nutri.res,
           ind.names = paste0('ID', rownames(nutrimouse$gene)), 
           ind.names.size = 3)

## plot of individuals for objects of class 'pls' or 'spls'
# ----------------------------------------------------
data(liver.toxicity)
X <- liver.toxicity$gene
Y <- liver.toxicity$clinic
toxicity.spls <- spls(X, Y, ncomp = 3, keepX = c(50, 50, 50),
                      keepY = c(10, 10, 10))

# colors indicate time of necropsy, text is the dose, label at start of arrow
plotArrow(toxicity.spls,  group = liver.toxicity$treatment[, 'Time.Group'],
           ind.names  = liver.toxicity$treatment[, 'Dose.Group'],
           legend = TRUE, position.names = 'start', legend.title = 'Time.Group')

## individual representation for objects of class 'sgcca' (or 'rgcca')
# ----------------------------------------------------
data(nutrimouse)
Y = unmap(nutrimouse$diet)
data = list(gene = nutrimouse$gene, lipid = nutrimouse$lipid, Y = Y)
design1 = matrix(c(0,1,1,1,0,1,1,1,0), ncol = 3, nrow = 3, byrow = TRUE)
nutrimouse.sgcca <- wrapper.sgcca(X = data,
                                  design = design1,
                                  penalty = c(0.3, 0.5, 1),
                                  ncomp = 3,
                                  scheme = "centroid")

plotArrow(nutrimouse.sgcca, group = nutrimouse$genotype, ind.names = TRUE, 
           legend.title = 'Genotype' )

## custom pch by block
blocks <- names(nutrimouse.sgcca$variates)
pch <- seq_along(blocks)
names(pch) <- blocks
pch
#>   gene   lipid     Y 
#>   1       2        3 
plotArrow(nutrimouse.sgcca, group = nutrimouse$genotype, ind.names = TRUE, 
           pch =pch, legend.title = 'Genotype' )

## individual representation for objects of class 'sgccda'
# ----------------------------------------------------
# Note: the code differs from above as we use a 'supervised' GCCA analysis
data(nutrimouse)
Y = nutrimouse$diet
data = list(gene = nutrimouse$gene, lipid = nutrimouse$lipid)
design1 = matrix(c(0,1,0,1), ncol = 2, nrow = 2, byrow = TRUE)

nutrimouse.sgccda1 <- 
    wrapper.sgccda(X = data,
                   Y = Y,
                   design = design1,
                   ncomp = 2,
                   keepX = list(gene = c(10,10), lipid = c(15,15)),
                   scheme = "centroid")


## Default colours correspond to outcome Y
plotArrow(nutrimouse.sgccda1)