PLS-Integrating: Integrating Multiple Large Datasets
In bioinfoDZ/RISC: Robust Integration of Single-Cell RNA-Seq Datasets

scPLS

R Documentation

Integrating Multiple Large Datasets

Description

The "scPLS" function integrates multiple datasets using our new algorithm, reference principal components integration (RPCI). RPCI decomposes all the target datasets based on the reference. The output of this function can be used for low-dimensional visualization.
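The idea of decomposing target datasets on a reference can be illustrated with a minimal base R sketch. This is not the RISC implementation of RPCI. It assumes a plain SVD of a toy reference matrix, and the names `ref`, `tgt`, `gene_vecs` and `embedding` are hypothetical. It only shows how gene eigenvectors computed from one reference set can place cells from every set in a shared low-dimensional space.

```r
# Illustrative sketch only: NOT the RISC implementation of RPCI.
set.seed(123)

n_genes <- 50
# Toy gene-by-cell matrices (hypothetical data); the first one is the reference.
ref <- matrix(rnorm(n_genes * 30), nrow = n_genes)
tgt <- matrix(rnorm(n_genes * 40, mean = 0.5), nrow = n_genes)

eigens <- 10

# Center each gene with the reference means so all sets share one origin.
gene_means <- rowMeans(ref)
ref_c <- ref - gene_means
tgt_c <- tgt - gene_means

# Gene eigenvectors of the reference: left singular vectors (genes x eigens).
gene_vecs <- svd(ref_c, nu = eigens, nv = 0)$u

# Project the cells of every dataset onto the reference eigenvectors.
ref_emb <- t(ref_c) %*% gene_vecs   # 30 cells x 10 components
tgt_emb <- t(tgt_c) %*% gene_vecs   # 40 cells x 10 components

# Stack the cells of all sets into one shared embedding.
embedding <- rbind(ref_emb, tgt_emb)
dim(embedding)
```

Because only the reference defines the eigenvectors, the choice of the first dataset in the list matters: the other sets are described in the reference's coordinate system rather than in their own.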

Usage

scPLS(
  objects,
  eigens = 10,
  add.Id = NULL,
  var.gene = NULL,
  npc = 100,
  adjust = TRUE,
  ncore = 1,
  seed = 123
)

Arguments

`objects`	A list of multiple RISC objects: list(object1, object2, object3, ...). The first object is the reference used to generate gene-eigenvectors.
`eigens`	The number of eigenvectors used for data integration.
`add.Id`	A character vector of Ids used to label the different datasets.
`var.gene`	Define the variable genes manually by supplying a vector of gene names.
`npc`	The number of PCs returned by the "scMultiIntegrate" function; they are usually used for subsequent analyses, such as cell embedding and cell clustering.
`adjust`	Whether to adjust the number of eigenvectors.
`ncore`	The number of cores used for data integration.
`seed`	The random seed, set to keep results consistent.
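When more than two datasets are integrated, the `var.gene` and `add.Id` inputs can be prepared with base R alone. The sketch below assumes that the variable genes of each dataset are available as character vectors. The gene names and the `var_lists` object are hypothetical placeholders, not data shipped with RISC.

```r
# Hypothetical variable-gene vectors, one per dataset (placeholders only).
var_lists <- list(
  set1 = c("GeneA", "GeneB", "GeneC", "GeneD"),
  set2 = c("GeneB", "GeneC", "GeneD", "GeneE"),
  set3 = c("GeneC", "GeneD", "GeneF")
)

# Keep only the genes that are variable in every dataset.
var0 <- Reduce(intersect, var_lists)

# One label per dataset, in the same order as the list of objects.
ids <- paste0("Set", seq_along(var_lists))

var0
ids
```

The resulting `var0` and `ids` vectors could then be supplied as `var.gene` and `add.Id`, provided that `ids` has one entry per object in `objects`.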

References

Liu et al., Nature Biotech. (2021)

Examples

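# raw.mat is assumed to be a list of RISC objects already available in the session
obj1 = raw.mat[[3]]
obj2 = raw.mat[[4]]
# The first object in the list serves as the reference
obj0 = list(obj1, obj2)
# Use the variable genes shared by both datasets
var0 = intersect(obj1@vargene, obj2@vargene)
PLS0 = scPLS(obj0, var.gene = var0, npc = 20, add.Id = c("Set1", "Set2"), ncore = 1)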

bioinfoDZ/RISC documentation built on March 30, 2024, 9:19 p.m.