RunPCA.PSI: Principal component analysis for splicing data

View source: R/Script_PLATE_08_PCA_1_RunPCA_PSI.R

RunPCA.PSI    R Documentation

Principal component analysis for splicing data

Description

Performs principal component analysis using PSI values.

Usage

RunPCA.PSI(
  MarvelObject,
  sample.ids = NULL,
  cell.group.column,
  cell.group.order,
  cell.group.colors = NULL,
  features,
  min.cells = 25,
  min.pct.events = NULL,
  point.size = 0.5,
  point.alpha = 0.75,
  point.stroke = 0.1,
  method.impute = "random",
  seed = 1,
  pcs = c(1, 2),
  mode = "pca",
  seed.umap = 42,
  npc.umap = 30,
  remove.outliers = FALSE,
  npc.elbow.plot = 50
)

Arguments

MarvelObject

Marvel object. S3 object generated from ComputePSI function.

sample.ids

Character strings. Specific cells to plot.

cell.group.column

Character string. The name of the sample metadata column whose values will be used to label the cell groups on the PCA.

cell.group.order

Character string. The order of the variables under the sample metadata column specified in cell.group.column to appear in the PCA cell group legend.

cell.group.colors

Character string. Vector of colors for the cell groups specified for PCA analysis using cell.group.column and cell.group.order. If not specified, default ggplot2 colors will be used.

features

Character string. Vector of tran_id for analysis. Should match tran_id column of MarvelObject$ValidatedSpliceFeature.

min.cells

Numeric value. The minimum number of cells expressing the splicing event, above which the splicing event will be retained for analysis.

point.size

Numeric value. Size of data points on reduced dimension space.

point.alpha

Numeric value. Transparency of the data points on reduced dimension space. Takes any value between 0 and 1. The smaller the value, the more transparent the data points will be.

point.stroke

Numeric value. The thickness of the outline of the data points. The larger the value, the thicker the outline of the data points.

method.impute

Character string. Indicate the method for imputing missing PSI values (low coverage). The "random" method randomly assigns a value between 0 and 1. The "Bayesian" method uses the posterior PSI computed from the ComputePSI.Posterior function. Default is "random".

seed

Numeric value. Ensures imputed values for NA PSIs are reproducible when method.impute option set to "random". Default value is 1.
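
As a rough illustration of the "random" option and the role of seed, the sketch below fills the missing PSI values of a toy matrix with uniform draws between 0 and 1. It uses base R only. The impute_random helper and the toy matrix are hypothetical and not part of MARVEL, whose internal implementation may differ.

```r
# Toy PSI matrix (values in [0, 1]): rows are splicing events, columns are cells.
# NA marks events with too little coverage to estimate PSI.
psi <- matrix(
  c(0.10, NA,   0.90,
    NA,   0.50, 0.40,
    0.80, 0.20, NA),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("event", 1:3), paste0("cell", 1:3))
)

# Hypothetical helper, not part of MARVEL: fill each NA with a uniform draw
# between 0 and 1, seeding the RNG first so the draws are reproducible.
impute_random <- function(mat, seed = 1) {
  set.seed(seed)
  missing <- is.na(mat)
  mat[missing] <- runif(sum(missing), min = 0, max = 1)
  mat
}

psi.imputed <- impute_random(psi, seed = 1)

# Re-running with the same seed gives identical imputed values.
identical(psi.imputed, impute_random(psi, seed = 1))
```

Observed PSI values are left untouched; only the NA entries are replaced.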

pcs

Numeric vector. The principal components (PCs) to plot. Default is the first two PCs, i.e., c(1,2). If a vector of 3 is specified, a 3D scatterplot is returned.

mode

Character string. Specify "pca" for linear dimension reduction analysis or "umap" for non-linear dimension reduction analysis. Specify "elbow.plot" to return eigenvalues. Default is "pca".
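
To make the eigenvalues behind an elbow plot concrete, here is a minimal sketch using base R's prcomp on a toy matrix. This is an assumption for illustration only: the toy data are random, and MARVEL may use a different PCA implementation internally.

```r
set.seed(1)

# Toy matrix: 20 cells (rows) by 5 splicing events (columns), values in [0, 1].
psi <- matrix(runif(100), nrow = 20, ncol = 5)

# Base R PCA on centered and scaled columns.
fit <- prcomp(psi, center = TRUE, scale. = TRUE)

# Eigenvalues are the squared standard deviations of the PCs.
eigenvalues <- fit$sdev^2

# Percentage of total variance explained by each PC; these are the
# values an elbow plot would display against the PC index.
pct.variance <- 100 * eigenvalues / sum(eigenvalues)
print(round(pct.variance, 1))
```

The point at which the percentages level off is commonly used to choose how many PCs to carry forward, for example into UMAP.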

seed.umap

Numeric value. Only applicable when mode set to "umap". Ensures reproducibility of the analysis. Default value is 42.

npc.umap

Numeric value. Only applicable when mode set to "umap". Indicate the number of PCs to include for UMAP. Default value is 30.

remove.outliers

Logical value. If set to TRUE, outliers will be removed. Outliers are defined as data points beyond 1.5 times the interquartile range (IQR) from the 1st and 99th percentile. Default is FALSE.
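
The outlier rule above can be read in more than one way. The sketch below shows one assumed reading in base R: a value is flagged if it lies more than 1.5 times the IQR below the 1st percentile or above the 99th percentile. The is_outlier helper and the toy coordinates are hypothetical, and the package's actual computation may differ.

```r
# Hypothetical helper, not part of MARVEL. Assumed reading of the rule:
# flag values more than 1.5 * IQR below the 1st percentile or more than
# 1.5 * IQR above the 99th percentile.
is_outlier <- function(x) {
  bounds <- quantile(x, probs = c(0.01, 0.99), names = FALSE)
  spread <- 1.5 * IQR(x)
  x < bounds[1] - spread | x > bounds[2] + spread
}

# Toy coordinates on one PC: the integers 1 to 100 plus one extreme value.
pc1 <- c(1:100, 1000)

# Positions of the flagged data points.
which(is_outlier(pc1))
```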

npc.elbow.plot

Numeric value. Only applicable when mode set to "elbow.plot". Indicate the number of PCs to include in the elbow plot. Default value is 50.

min.pct.events

Numeric value. The minimum percentage of events expressed in a cell, above which, the cell will be retained for analysis. By default, this option is switched off, i.e., NULL.
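
The two filters, min.cells for splicing events and min.pct.events (as named in Usage) for cells, can be sketched on a toy PSI matrix as follows. This is an illustration under stated assumptions: the thresholds are treated as inclusive, events are filtered before cells, and the toy matrix and threshold values are hypothetical. The package's internal logic may differ.

```r
# Toy PSI matrix: rows are splicing events, columns are cells; NA means
# the event is not expressed (insufficient coverage) in that cell.
psi <- matrix(
  c(0.1, 0.2, 0.3, NA,
    NA,  0.5, NA,  NA,
    0.7, 0.8, 0.9, NA),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("event", 1:3), paste0("cell", 1:4))
)

# Hypothetical thresholds chosen for this toy example.
min.cells <- 2
min.pct.events <- 50

# Keep events expressed in at least min.cells cells (assumed inclusive).
n.cells <- rowSums(!is.na(psi))
psi <- psi[n.cells >= min.cells, , drop = FALSE]

# Keep cells expressing at least min.pct.events percent of the remaining events.
pct.events <- 100 * colSums(!is.na(psi)) / nrow(psi)
psi <- psi[, pct.events >= min.pct.events, drop = FALSE]

dim(psi)
```

Here event2 is dropped because only one cell expresses it, and cell4 is dropped because it expresses none of the remaining events.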

Value

An object of class S3 with new slots MarvelObject$PCA$PSI$Results and MarvelObject$PCA$PSI$Plot.

Examples

marvel.demo <- readRDS(system.file("extdata/data", "marvel.demo.rds", package="MARVEL"))

# Define splicing events for analysis
df <- do.call(rbind.data.frame, marvel.demo$PSI)
tran_ids <- df$tran_id

# PCA
marvel.demo <- RunPCA.PSI(MarvelObject=marvel.demo,
                          sample.ids=marvel.demo$SplicePheno$sample.id,
                          cell.group.column="cell.type",
                          cell.group.order=c("iPSC", "Endoderm"),
                          cell.group.colors=NULL,
                          min.cells=5,
                          features=tran_ids,
                          point.size=2
                          )

# Check outputs
head(marvel.demo$PCA$PSI$Results$ind$coord)
marvel.demo$PCA$PSI$Plot

wenweixiong/MARVEL documentation built on Aug. 5, 2024, 2:54 p.m.