LinkData-LinkData: LinkData: multiple heterogeneous dataset integration
In LinkHD: LinkHD: a versatile framework to explore and integrate heterogeneous data


Integrates multiple heterogeneous datasets stored in a list. This function performs Statis with a choice of distances. Statis belongs to the PCA family and is based on the singular value decomposition (SVD) and the generalized singular value decomposition (GSVD) of a matrix. The methodology analyzes several sets of variables collected on the same set of observations. Originally, the tables were compared through the scalar products computed between them. This approach relaxes that condition, so that different distances can be incorporated.
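The steps described above can be sketched in base R. This is only an illustration of the classical Statis procedure on simulated tables, not the internal implementation of LinkData; the names tables, rv_coefficient, weights and compromise are hypothetical and chosen for this example.

```r
# Minimal base-R sketch of the classical Statis steps (not LinkHD's internals).
set.seed(1)
n <- 6  # number of shared observations (rows)
tables <- list(
  A = matrix(rnorm(n * 3), n, 3),
  B = matrix(rnorm(n * 4), n, 4),
  C = matrix(rnorm(n * 2), n, 2)
)

# 1. One n x n scalar-product matrix per column-centered table.
cross_products <- lapply(tables, function(x) {
  tcrossprod(scale(x, center = TRUE, scale = FALSE))
})

# 2. RV coefficient: vectorial correlation between two cross-product matrices.
rv_coefficient <- function(w1, w2) {
  sum(w1 * w2) / sqrt(sum(w1 * w1) * sum(w2 * w2))
}
k <- length(cross_products)
rv <- outer(seq_len(k), seq_len(k), Vectorize(function(i, j) {
  rv_coefficient(cross_products[[i]], cross_products[[j]])
}))
dimnames(rv) <- list(names(tables), names(tables))

# 3. Table weights from the first eigenvector of the RV matrix.
first_vec <- abs(eigen(rv, symmetric = TRUE)$vectors[, 1])
weights <- first_vec / sum(first_vec)

# 4. Compromise: weighted sum of the cross-product matrices.
compromise <- Reduce(`+`, Map(`*`, cross_products, weights))

# 5. Eigen-decomposition of the compromise gives observation coordinates.
eig <- eigen(compromise, symmetric = TRUE)
coords <- eig$vectors[, 1:2] %*% diag(sqrt(pmax(eig$values[1:2], 0)))
```

Here rv plays the role of the between-table similarity matrix and coords holds the observations on the first two compromise axes. Replacing the scalar products in step 1 with matrices derived from other distances is the relaxation the description refers to.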

LinkData(Data, Distance = c(), Center = FALSE, Scale = FALSE, CorrelVector = TRUE, nCluster = 0, cl_method = "pam")

`Data`	A list of data frames or ExpressionSet objects, one per table to be integrated. In each data frame, the observations (the elements common to all tables in Statis) should be in rows and the variables in columns. Data may also be a MultiAssayExperiment object from the MultiAssayExperiment package, a Bioconductor package for integrating multi-omics experiments.
`Distance`	Vector indicating which distance (including the scalar product) should be applied to each study. If missing, the scalar product is used. The length of the vector must equal the length of Data. Distance options: ScalarProduct, euclidean, manhattan, canberra, pearson, pearsonabs, spearman, spearmanabs, mahalanobis, and the Bray-Curtis distance (use the option Bray). For binary data, the distance can be jaccard, simple_matching, sokal_Sneath, Roger_Tanimoto, Dice, Hamman, Ochiai, Phi_Pearson, or Gower&Legendre. Note that applying the compositional pre-processing option and then the Euclidean distance is equivalent to using the Aitchison distance for compositional data.
`Center`	Logical. If TRUE, each data frame is centered by the mean. Default is FALSE. If your tables have different characteristics (continuous phenotypes, frequencies, compositional data), we strongly recommend normalizing the datasets beforehand with the DataProcessing function.
`Scale`	Logical value indicating whether the column vectors should be standardized by the row weights. Default is FALSE. Note that every table in the list will be scaled. If you do not need to normalize all the data, set this parameter to FALSE and perform the normalization step externally with the DataProcessing function. If your tables have different characteristics (continuous phenotypes, frequencies, compositional data), we strongly recommend normalizing the datasets beforehand with the DataProcessing function.
`CorrelVector`	Logical. If TRUE (default), the RV matrix is computed using vectorial correlation; otherwise the Hilbert-Schmidt distance is used.
`nCluster`	Indicates whether the common elements of the datasets should be grouped into clusters. Default is 0, i.e. no clustering.
`cl_method`	Clustering method, either pam or kmeans. pam is a more robust version of the classical k-means algorithm.
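To make the role of the Distance argument more concrete, the following base-R sketch shows one common way a distance matrix can be turned into a scalar-product matrix (double centering), so that tables measured with different distances become comparable. It is an assumption about the general technique, not a description of what LinkData does internally. The helpers distance_to_cross_product, to_cross_product and hilbert_schmidt are hypothetical, and only distance names understood by stats::dist are handled.

```r
# Sketch: one distance per table, each converted to an n x n cross-product matrix.
set.seed(2)
n <- 5
tables <- list(
  phychem = as.data.frame(matrix(rnorm(n * 3), n, 3)),
  counts  = as.data.frame(matrix(rpois(n * 4, 5), n, 4))
)
distances <- c("ScalarProduct", "manhattan")

# The distance vector needs one entry per table.
stopifnot(length(distances) == length(tables))

# Double centering: W = -1/2 * J %*% D^2 %*% J, with J = I - (1/n) * 11'.
distance_to_cross_product <- function(d) {
  d2 <- as.matrix(d)^2
  n <- nrow(d2)
  j <- diag(n) - matrix(1 / n, n, n)
  -0.5 * j %*% d2 %*% j
}

# Hypothetical dispatcher: scalar product directly, otherwise via stats::dist.
to_cross_product <- function(x, distance) {
  x <- as.matrix(x)
  if (distance == "ScalarProduct") {
    tcrossprod(scale(x, center = TRUE, scale = FALSE))
  } else {
    distance_to_cross_product(stats::dist(x, method = distance))
  }
}

w <- Map(to_cross_product, tables, distances)

# Hilbert-Schmidt (Frobenius) distance between two cross-product matrices.
hilbert_schmidt <- function(w1, w2) sqrt(sum((w1 - w2)^2))
hs <- hilbert_schmidt(w[[1]], w[[2]])

# Sanity check: the Euclidean distance recovers the centered scalar product.
x <- as.matrix(tables$phychem)
w_euc <- distance_to_cross_product(stats::dist(x, method = "euclidean"))
w_sp <- tcrossprod(scale(x, center = TRUE, scale = FALSE))
same <- isTRUE(all.equal(w_euc, w_sp, check.attributes = FALSE))
```

The final check illustrates why the scalar product and the Euclidean distance lead to the same configuration for a centered table, while other distances yield a different cross-product matrix for the same data.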

LinkData

DistStatis class object with the corresponding completed slots according to the given model

Laura M Zingatetti

Escoufier, Y. (1976). Operateur associe a un tableau de donnees. Annales de l'Insee, 22-23, 165-178.
Escoufier, Y. (1987). The duality diagram: a means for better practical applications. In P. Legendre & L. Legendre (Eds.), Developments in Numerical Ecology, pp. 139-156, NATO Advanced Institute, Series G. Berlin: Springer.
L'Hermier des Plantes, H. (1976). Structuration des Tableaux a Trois Indices de la Statistique. [These de Troisieme Cycle]. University of Montpellier, France.

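library(LinkHD)
# Load the Tara Oceans example data shipped with the package
data(Taraoceans)
# Phylum labels, used below as column names
pro.phylo <- Taraoceans$taxonomy[, 'Phylum']
# One table per study; observations in rows
TaraOc <- list(Taraoceans$phychem,
               as.data.frame(Taraoceans$pro.phylo),
               as.data.frame(Taraoceans$pro.NOGs))
# Center and scale the first (phychem) table
TaraOc_1 <- scale(TaraOc[[1]])
# Apply the compositional pre-processing to the other two tables
Normalization <- lapply(list(TaraOc[[2]], TaraOc[[3]]),
                        function(x) DataProcessing(x, Method = 'Compositional'))
colnames(Normalization[[1]]) <- pro.phylo
colnames(Normalization[[2]]) <- Taraoceans$GO
TaraOc <- list(TaraOc_1, Normalization[[1]], Normalization[[2]])
names(TaraOc) <- c('phychem', 'pro_phylo', 'pro_NOGs')
TaraOc <- lapply(TaraOc, as.data.frame)
# Integrate the three tables, one distance per table
Output <- LinkData(TaraOc, Scale = FALSE,
                   Distance = c('ScalarProduct', 'Euclidean', 'Euclidean'))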

LinkHD documentation built on Nov. 8, 2020, 5:08 p.m.
