knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Introduction

findPC is a software tool including six methods to automatically select the number of principal components to retain based on the standard deviations explained by each PC. A major advantage of findPC is that the only information required is a numeric vector of standard deviations explained by each PC.

Installation

findPC software can be installed via Github. Users should have R installed on their computer before installing findPC. R can be downloaded here: http://www.r-project.org/. To install the latest version of findPC package via Github, run following commands in R:

if (!require("devtools"))
install.packages("devtools")
devtools::install_github("haotian-zhuang/findPC")
library(findPC)

findPC function

The synopsis of findPC is:
findPC(sdev,number = 30,method = 'perpendicular line',aggregate = NULL,figure = FALSE)

The default is to return the number of PCs by Perpendicular line with 30 PCs. The following codes take the 50 PCs of the single-cell RNA-seq data in findPC package as an example.

data(procdata)
sdev<-prcomp(t(procdata),scale. = T)$sdev[1:50]
head(sdev)
findPC(sdev = sdev)

The argument 'sdev' should be sorted in decreasing order.

findPC(sdev = -sdev)

Number

The argument 'number' is a vector including number of PCs used in the following function.

findPC(sdev = sdev,number = 51)
findPC(sdev = sdev,number = c(24,36,48))

Method

The argument ‘method’ specifies the six methods or returns the six results simultaneously.

findPC(sdev = sdev,method = 'xxx')
findPC(sdev = sdev,number = c(24,36,48),method = 'all')

Method 1: Piecewise linear model

findPC(sdev = sdev,number = c(24,36,48),method = 'piecewise linear model')

Method 2: First derivative

findPC(sdev = sdev,number = c(24,36,48),method = 'first derivative')

Method 3: Second derivative

findPC(sdev = sdev,number = c(24,36,48),method = 'second derivative')

Method 4: Preceding residual

findPC(sdev = sdev,number = c(24,36,48),method = 'preceding residual')

Method 5: Perpendicular line

findPC(sdev = sdev,number = c(24,36,48),method = 'perpendicular line')

Method 6: K-means clustering

findPC(sdev = sdev,number = c(24,36,48),method = 'k-means clustering')

Aggregate

If users are also interested in the mean, median, or voting (median if all are different, otherwise mode) of the results, the argument ‘aggregate’ will support them.

findPC(sdev = sdev,number = c(24,36,48),method = 'all',aggregate = 'mean')
findPC(sdev = sdev,number = c(24,36,48),method = 'all',aggregate = 'median')
findPC(sdev = sdev,number = c(24,36,48),method = 'all',aggregate = 'voting')

Figure

The last argument ‘figure’ provides the option to draw a heatmap showing the results.

findPC(sdev = sdev,number = c(24,36,48),method = 'all',figure = T)

Session Info

sessionInfo()



haotian-zhuang/findPC documentation built on July 1, 2022, 10:42 a.m.