Find most important principal components for a given variable

Description

Find most important principal components for a given variable

Usage

findImportantPCs(object, variable = "total_features",
  plot_type = "pcs-vs-vars", exprs_values = "exprs", ntop = 500,
  feature_set = NULL, scale_features = TRUE, theme_size = 10)

Arguments

object

an SCESet object containing expression values and experimental information. Must have been appropriately prepared.

variable

character scalar providing a variable name (column from pData(object)) for which to determine the most important PCs.

plot_type

character string indicating which type of plot to produce. The default, "pcs-vs-vars", produces plots of the top PCs against the variable of interest. A value of "pairs-pcs" produces a pairs plot for the top 5 PCs based on their R-squared with the variable of interest.

exprs_values

character string indicating which slot of the assayData in the object should be used to define expression. Valid options are "counts", "tpm", "fpkm" and "exprs" (default), or anything else in the object added manually by the user.

ntop

numeric scalar indicating the number of most variable features to use for the PCA. Default is 500, but any ntop argument is overridden if the feature_set argument is non-NULL.

feature_set

character, numeric or logical vector indicating a set of features to use for the PCA. If character, entries must all be in featureNames(object). If numeric, values are taken to be indices for features. If logical, vector is used to index features and should have length equal to nrow(object).
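
The relationship between ntop and feature_set can be illustrated with a small base R sketch. This is not scater's internal code: select_features is a hypothetical helper and the matrix is made-up toy data. It only shows how the three accepted vector types index features, and how a top-variance subset might be chosen when feature_set is NULL.

```r
# Toy expression matrix: 6 features (rows) x 4 cells (columns); values are random.
set.seed(1)
expr <- matrix(rnorm(24), nrow = 6,
               dimnames = list(paste0("gene", 1:6), paste0("cell", 1:4)))

# Hypothetical helper (not part of scater): a non-NULL feature_set takes
# precedence over ntop, mirroring the behaviour described above.
select_features <- function(mat, ntop = 500, feature_set = NULL) {
  if (!is.null(feature_set)) {
    if (is.character(feature_set) && !all(feature_set %in% rownames(mat)))
      stop("all character entries must be feature names")
    if (is.logical(feature_set) && length(feature_set) != nrow(mat))
      stop("logical feature_set must have length nrow(mat)")
    return(mat[feature_set, , drop = FALSE])
  }
  # Otherwise keep the ntop features with the largest variance across cells.
  rv <- apply(mat, 1, var)
  keep <- order(rv, decreasing = TRUE)[seq_len(min(ntop, nrow(mat)))]
  mat[keep, , drop = FALSE]
}

by_name  <- select_features(expr, feature_set = c("gene2", "gene5"))
by_index <- select_features(expr, feature_set = c(2, 5))
by_flag  <- select_features(expr, feature_set = rownames(expr) %in% c("gene2", "gene5"))
top3     <- select_features(expr, ntop = 3)
```

In this sketch the character, numeric and logical forms all select the same two rows, while top3 holds the three highest-variance features.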

scale_features

logical, should the expression values be standardised so that each feature has unit variance? Default is TRUE.

theme_size

numeric scalar providing base font size for ggplot theme.

Details

Plot the top 5 or 6 most important PCs (depending on the plot_type argument) for a given variable. Importance here is defined as the R-squared value from a linear model regressing each PC onto the variable of interest.
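
The importance measure described above can be sketched in base R. This is a minimal illustration on simulated data, assuming a plain prcomp PCA and one lm fit per PC; it is not the function's actual implementation, and the names covariate, expr and r_squared are invented for the example.

```r
set.seed(42)
n_cells <- 40
n_features <- 20

# Hypothetical per-cell covariate standing in for a column of pData(object).
covariate <- rnorm(n_cells)

# Simulated expression matrix (features x cells); the first 5 features
# are made to depend on the covariate, the rest are pure noise.
expr <- matrix(rnorm(n_features * n_cells), nrow = n_features)
expr[1:5, ] <- expr[1:5, ] + outer(rep(1, 5), 3 * covariate)

# prcomp expects observations (cells) in rows, so transpose.
# scale. = TRUE gives each feature unit variance, as scale_features = TRUE would.
pca <- prcomp(t(expr), scale. = TRUE)

# R-squared from regressing each PC on the covariate.
r_squared <- apply(pca$x, 2, function(pc) {
  summary(lm(pc ~ covariate))$r.squared
})

# Names of the 5 PCs with the highest R-squared.
top_pcs <- names(sort(r_squared, decreasing = TRUE))[1:5]
```

Sorting r_squared in decreasing order gives the ranking used to pick which PCs to plot in this sketch.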

Value

a ggplot plot object

Examples

data("sc_example_counts")
data("sc_example_cell_info")
pd <- new("AnnotatedDataFrame", data = sc_example_cell_info)
rownames(pd) <- pd$Cell
example_sceset <- newSCESet(countData = sc_example_counts, phenoData = pd)
drop_genes <- apply(exprs(example_sceset), 1, function(x) {var(x) == 0})
example_sceset <- example_sceset[!drop_genes, ]
example_sceset <- calculateQCMetrics(example_sceset)
findImportantPCs(example_sceset, variable="total_features")
