
View source: R/wrapper.sgcca.R

Wrapper function to perform Sparse Generalised Canonical Correlation
Analysis (sGCCA), a generalised approach for the integration of multiple
datasets. For more details, see `help(sgcca)` from the RGCCA package.


`X` |
a list of data sets (called 'blocks') matching on the same samples.
Data in the list should be arranged in samples x variables. |

`design` |
numeric matrix of size (number of blocks in X) x (number of
blocks in X) with values between 0 and 1. Each value indicates the strength
of the relationship to be modelled between two blocks using sGCCA; a value
of 0 indicates no relationship, 1 is the maximum value. If |

`penalty` |
numeric vector of length the number of blocks in |

`ncomp` |
the number of components to include in the model. Defaults to 1. |

`keepX` |
A vector of the same length as X. Each entry keepX[i] is the number of variables of X[[i]] kept in the model. |

`scheme` |
Either "horst", "factorial" or "centroid" (Default: "horst"). |

`mode` |
character string. What type of algorithm to use, (partially)
matching one of |

`scale` |
boolean. If scale = TRUE, each block is standardized to zero mean and unit variance (default: TRUE). |

`init` |
Mode of initialization used in the algorithm, either by Singular Value Decomposition of the product of each block of X with Y ("svd") or of each block independently ("svd.single"). Defaults to "svd.single". |

`tol` |
Convergence stopping value. |

`max.iter` |
integer, the maximum number of iterations. |

`near.zero.var` |
boolean, see the internal |

`all.outputs` |
boolean. Computation can be faster when some specific
(and non-essential) outputs are not calculated. Default = |
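
The shape requirements on `X`, `design` and `keepX` described above can be checked before calling the function. The following is a minimal base R sketch on simulated blocks; the block names, dimensions and `keepX` values are made up for illustration, and the checks are an assumption about what is convenient to verify, not something `wrapper.sgcca` is documented to perform.

```r
set.seed(1)
n <- 20  # number of samples shared by all blocks (hypothetical)

# Hypothetical blocks, each arranged as samples x variables
X <- list(
  gene  = matrix(rnorm(n * 50), nrow = n),
  lipid = matrix(rnorm(n * 10), nrow = n),
  Y     = matrix(rnorm(n * 3),  nrow = n)
)

# All blocks must describe the same samples, so row counts must agree
stopifnot(length(unique(vapply(X, nrow, integer(1)))) == 1)

# Full design: every pair of distinct blocks is connected with weight 1,
# and no block is connected to itself
design <- matrix(1, nrow = length(X), ncol = length(X),
                 dimnames = list(names(X), names(X)))
diag(design) <- 0

# One keepX entry per block, never more than that block's number of variables
keepX <- c(gene = 10, lipid = 5, Y = 3)
stopifnot(length(keepX) == length(X),
          all(keepX <= vapply(X, ncol, integer(1))))
```

A weaker link between two blocks could be expressed by replacing the corresponding pair of off-diagonal entries with a value between 0 and 1.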

This wrapper function performs sGCCA (see RGCCA) with 1, …, `ncomp`
components on each block data set. A supervised or unsupervised model can
be run. For a supervised model, the outcome should be converted with the
`unmap` function and supplied as one of the input data sets. More details
can be found in the RGCCA package.
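
To make the supervised case concrete, the sketch below builds, in base R, the kind of indicator (dummy) matrix that `unmap` is used to obtain from a class factor. The `diet` values are made up, and this only illustrates the encoding; it is not the mixOmics implementation of `unmap`.

```r
# Hypothetical class labels for six samples
diet <- factor(c("coc", "fish", "lin", "coc", "ref", "fish"))

# One column per class; an entry is 1 when the sample belongs to that class
Y <- sapply(levels(diet), function(lv) as.numeric(diet == lv))
rownames(Y) <- seq_along(diet)
Y
```

Each row contains exactly one 1, so the resulting matrix can be treated as an ordinary numeric block alongside the other data sets.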

Note that this function is the same as `block.spls` with different default
arguments.

More details about the PLS modes can be found in `?pls`.

`wrapper.sgcca` returns an object of class `"sgcca"`, a list that contains
the following components:

`data` |
the input data set (as a list). |

`design` |
the input design. |

`variates` |
the sgcca components. |

`loadings` |
the loadings for each block data set (outer weight vector). |

`loadings.star` |
the loadings, standardised. |

`penalty` |
the input penalty parameter. |

`scheme` |
the input scheme. |

`ncomp` |
the number of components included in the model for each block. |

`crit` |
the convergence criterion. |

`AVE` |
Indicators of model quality based on the Average Variance Explained (AVE): AVE (for one block), AVE (outer model), AVE (inner model). |

`names` |
list containing the names to be used for individuals and variables. |

More details can be found in the references.
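
Because the model is sparse, variables that are not selected on a component are expected to have a loading of exactly zero, so the selected variables can be read off the `loadings` component. The sketch below uses a small hand-made list standing in for `loadings`; the block names, variable names and values are hypothetical, and the assumption is that a fitted object stores one variables x components matrix per block.

```r
# Hypothetical stand-in for the `loadings` component of a fitted object:
# one (variables x components) matrix per block, zeros for unselected variables
loadings <- list(
  gene = matrix(c(0.8, 0, -0.6, 0,
                  0, 0.7, 0, -0.7),
                ncol = 2,
                dimnames = list(paste0("g", 1:4), c("comp1", "comp2"))),
  lipid = matrix(c(1, 0, 0,
                   0, 0.6, 0.8),
                 ncol = 2,
                 dimnames = list(paste0("l", 1:3), c("comp1", "comp2")))
)

# For each block and each component, keep the names of the variables
# whose loading is non-zero
selected <- lapply(loadings, function(w) {
  out <- lapply(seq_len(ncol(w)), function(k) rownames(w)[w[, k] != 0])
  names(out) <- colnames(w)
  out
})

selected$gene$comp1
```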

Arthur Tenenhaus, Vincent Guillemot, Kim-Anh Lê Cao, Florian Rohart, Benoit Gautier, Al J Abadi

Tenenhaus A. and Tenenhaus M., (2011), Regularized Generalized Canonical Correlation Analysis, Psychometrika, Vol. 76, Nr 2, pp 257-284.

Tenenhaus A., Philippe C., Guillemot V., Lê Cao K-A., Grill J. and Frouin V., (2013), Variable Selection For Generalized Canonical Correlation Analysis (in revision).

`wrapper.sgcca`, `plotIndiv`, `plotVar`, `wrapper.rgcca` and
http://www.mixOmics.org for more details.

```r
library(mixOmics)
data(nutrimouse)

# The diet factor must be unmapped (converted to an indicator matrix) when the
# problem is not treated as a classification problem.
# See also block.splsda for discriminant analysis, where Y does not need to be
# unmapped.
Y = unmap(nutrimouse$diet)
data = list(gene = nutrimouse$gene, lipid = nutrimouse$lipid, Y = Y)

# Alternative design: gene expression and lipids are each connected to the
# diet factor, but not to each other
# design = matrix(c(0, 0, 1,
#                   0, 0, 1,
#                   1, 1, 0), ncol = 3, nrow = 3, byrow = TRUE)

# Design used here: gene expression and lipids are connected to the diet
# factor and also to each other
design = matrix(c(0, 1, 1,
                  1, 0, 1,
                  1, 1, 0), ncol = 3, nrow = 3, byrow = TRUE)

# Note: the penalty parameters will need to be tuned
wrap.result.sgcca = wrapper.sgcca(X = data, design = design,
                                  penalty = c(0.3, 0.5, 1),
                                  ncomp = 2,
                                  scheme = "centroid")
wrap.result.sgcca

# Did the algorithm converge?
wrap.result.sgcca$crit  # yes
```
