run_pca_cc_genes: Run PCA on Gene Ontology cell cycle genes

View source: R/run_pca_cc_genes.R

run_pca_cc_genes R Documentation

Description

Run PCA on Gene Ontology cell cycle genes and get a new SingleCellExperiment. Users can use this function to learn a new reference projection matrix.

Arguments

sce.o

A SingleCellExperiment containing a library-size-normalized **log-expression** matrix.

gname

Alternative rownames of sce.o. If provided, these are used to match genes against the Gene Ontology cell cycle gene list. If not provided, the rownames of sce.o are used instead. Default: NULL

exprs_values

Integer scalar or string indicating which assay of sce.o contains the **log-expression** values, which will be used to run PCA. Default: 'logcounts'

gname.type

The type of gene names as in gname or rownames of sce.o. It can be either 'ENSEMBL' or 'SYMBOL'. Default: 'ENSEMBL'

species

The species of the samples in sce.o. It can be either 'mouse' or 'human'. If the user uses custom cycleGene.l, this value will have no effect. Default: 'mouse'

AnnotationDb

An AnnotationDb object. It is used to map ENSEMBL IDs to gene SYMBOLs. If no AnnotationDb object is given, the function uses org.Hs.eg.db for human and org.Mm.eg.db for mouse.

ntop

The number of highest-variance genes to use when calculating PCA, as in calculatePCA. Default: 500

ncomponents

The number of principal components to obtain, as in calculatePCA. Default: 20

name

String specifying the name to be used to store the result in the reducedDims of the output. Default: 'PCA'

Details

The function requires as input a SingleCellExperiment object that contains a library-size-normalized **log-expression** matrix. The full dataset is subset to genes in the Gene Ontology cell cycle gene list (GO:0007049). The corresponding AnnotationDb object is org.Mm.eg.db for mouse and org.Hs.eg.db for human. If runSeuratBy is set, the data are integrated with Seurat to remove batch effects between samples/batches.
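To see which genes fall under GO:0007049, the lookup can be sketched with AnnotationDbi. This is an illustration that assumes the AnnotationDbi and org.Mm.eg.db packages are installed; it is not necessarily how the function performs the lookup internally, and the variable names cc_genes and cc_ensembl are hypothetical.

```r
library(AnnotationDbi)
library(org.Mm.eg.db)

# Look up genes annotated to GO:0007049 (cell cycle); the "GOALL" key type
# also includes genes annotated to child terms of that GO term.
cc_genes <- AnnotationDbi::select(org.Mm.eg.db,
                                  keys = "GO:0007049",
                                  keytype = "GOALL",
                                  columns = c("ENSEMBL", "SYMBOL"))

# Unique, non-missing ENSEMBL IDs that could be matched against rownames(sce.o).
cc_ensembl <- unique(as.character(na.omit(cc_genes$ENSEMBL)))
head(cc_ensembl)
```

For human data, org.Hs.eg.db would take the place of org.Mm.eg.db.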

Users can build a new reference projection matrix from the 'rotation' attribute of the PCA results, for example attr(reducedDim(sce.o, 'PCA'), 'rotation')[, 1:2]. See the examples for more details.
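As a rough illustration of what a rotation (loading) matrix is and how it can act as a projection reference, here is a base R sketch on synthetic data. It uses prcomp rather than this package's functions, and it assumes that projection amounts to centering each gene across cells and multiplying by the loadings. The objects expr, ref, and embedding are hypothetical.

```r
set.seed(1)

# Synthetic log-expression matrix: 50 genes (rows) x 30 cells (columns).
expr <- matrix(rnorm(50 * 30), nrow = 50,
               dimnames = list(paste0("gene", 1:50), paste0("cell", 1:30)))

# prcomp() treats rows as observations, so cells go in rows.
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)

# Gene-by-component loadings; the first two columns serve as the reference.
ref <- pca$rotation[, 1:2]

# Project: center each gene across cells, then multiply by the loadings.
centered <- t(expr - rowMeans(expr))
embedding <- centered %*% ref

# The projection reproduces the first two principal-component scores.
stopifnot(isTRUE(all.equal(embedding, pca$x[, 1:2], check.attributes = FALSE)))
```

The same multiplication applies to any new gene-centered dataset that shares the reference's gene names, which is what makes the loadings reusable across datasets.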

Value

A subset SingleCellExperiment object containing only the GO cell cycle genes is returned. The PCA results are saved in reducedDims under the chosen name, reducedDims(..., name). If Seurat integration is performed, an additional reducedDims entry named 'matched.' + name is also included in the SingleCellExperiment.

Author(s)

Shijie C. Zheng

Examples

data(neurosphere_example, package = "tricycle")
### Use internal NeuroRef to project and infer tricyclePosition
neurosphere_example <- estimate_cycle_position(neurosphere_example) 

### Build new reference
gocc_sce.o <- run_pca_cc_genes(neurosphere_example)
new.ref <- attr(reducedDim(gocc_sce.o, "PCA"), "rotation")[, seq_len(2)]

### Use new reference to project and infer tricyclePosition
new_sce <- estimate_cycle_position(neurosphere_example, ref.m = new.ref,
 dimred = "tricycleEmbedding2")
plot(neurosphere_example$tricyclePosition, new_sce$tricyclePosition)

hansenlab/tricycle documentation built on March 19, 2022, 7:24 p.m.