Linnorm.PCA: Linnorm-PCA Clustering pipeline for subpopulation Analysis



Description

This function first performs Linnorm transformation on the dataset. It then performs principal component analysis (PCA) on the transformed dataset and uses k-means clustering to identify subpopulations of cells.


Usage

Linnorm.PCA(datamatrix, RowSamples = FALSE, input = "Raw", MZP = 0,
  DataImputation = TRUE, num_PC = 3, num_center = c(1:20), Group = NULL,
  Coloring = "kmeans", pca.scale = FALSE, kmeans.iter = 2000,
  plot.title = "PCA K-means clustering", ...)



Arguments

datamatrix: The matrix or data frame that contains your dataset. Each row is a feature (or gene) and each column is a sample (or replicate). Raw counts, CPM, RPKM, FPKM or TPM are supported. Undefined values such as NA are not supported. Log-transformed datasets are not supported.


RowSamples: Logical. If each row of the datamatrix is a sample and each column is a feature, set this to TRUE so that you don't need to transpose it. Linnorm works slightly faster with this argument set to TRUE, but the difference should be negligible for smaller datasets. Defaults to FALSE.


input: Character. "Raw" or "Linnorm". If you have already transformed your dataset with Linnorm, set input to "Linnorm" so that you can pass the Linnorm-transformed dataset as the "datamatrix" argument. Defaults to "Raw".


MZP: Double >= 0, <= 1. Minimum non-Zero Portion threshold for this function. Genes not satisfying this threshold will be removed from the analysis. For example, if set to 0.3, genes that are non-zero in fewer than 30 percent of the samples will be removed. Defaults to 0.


DataImputation: Logical. Perform data imputation on the dataset after transformation. Defaults to TRUE.


num_PC: Integer >= 2. Number of principal components to be used in k-means clustering. Defaults to 3.


num_center: Numeric vector. Numbers of clusters to be tested for k-means clustering. The fpc, vegan, mclust and apcluster packages are used to determine the number of clusters needed. If only one number is supplied, it will be used directly and this test will be skipped. Defaults to c(1:20).


Group: Character vector with length equal to the number of samples. Each element of this vector corresponds to one column (sample) of the datamatrix. In the plot, the shape of the point representing each sample indicates its group assignment. Defaults to NULL.


Coloring: Character. "kmeans" or "Group". If Group is not NULL, coloring in the PCA plot will reflect each sample's group. Otherwise, coloring will reflect k-means clustering results. Defaults to "kmeans".


pca.scale: Logical. Sets the "scale." parameter of the prcomp function (used for principal component analysis). If TRUE, the variables are scaled to unit variance before the analysis takes place. Defaults to FALSE.


kmeans.iter: Numeric. Number of iterations in k-means clustering. Defaults to 2000.


plot.title: Character. Set the title of the plot. Defaults to "PCA K-means clustering".


...: Arguments that will be passed into Linnorm's transformation function.
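
The MZP threshold described in the arguments can be illustrated with a small base R sketch. This is not Linnorm's internal code; the toy matrix `counts` and the variable names `nonzero_portion` and `filtered` are hypothetical, and the sketch only assumes that a gene is kept when its fraction of non-zero samples is at least MZP.

```r
# Toy count matrix (hypothetical data): rows are genes, columns are samples.
counts <- matrix(c(0, 0, 5, 0,
                   3, 0, 2, 1,
                   0, 0, 0, 0),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("geneA", "geneB", "geneC"),
                                 paste0("s", 1:4)))

MZP <- 0.3

# Fraction of samples in which each gene is non-zero.
nonzero_portion <- rowMeans(counts != 0)

# Keep genes whose non-zero portion meets the threshold (assumed rule).
filtered <- counts[nonzero_portion >= MZP, , drop = FALSE]
rownames(filtered)
```

Here geneA is non-zero in 1 of 4 samples (0.25) and geneC in none, so with MZP = 0.3 only geneB remains.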


Details

This function performs PCA clustering using Linnorm transformation.
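
The PCA and k-means steps can be sketched with the base stats functions that this page names (prcomp and kmeans). This is a minimal illustration under stated assumptions, not Linnorm's implementation: the simulated matrix `expr` stands in for an already transformed dataset, the number of clusters is fixed at 2 (comparable to supplying a single value for num_center), and the `nstart` setting and the way samples are oriented for prcomp are choices made for this sketch.

```r
set.seed(1)

# Hypothetical transformed matrix: 50 features (rows) x 20 samples (columns),
# with samples 11-20 shifted to form a second subpopulation.
expr <- matrix(rnorm(50 * 20), nrow = 50)
expr[, 11:20] <- expr[, 11:20] + 3

num_PC <- 3

# PCA with samples as rows; scale. = FALSE mirrors the pca.scale default.
pca <- prcomp(t(expr), scale. = FALSE)

# Keep the first num_PC principal components for clustering.
scores <- pca$x[, 1:num_PC]

# K-means on the PC scores; iter.max mirrors the kmeans.iter default.
km <- kmeans(scores, centers = 2, iter.max = 2000, nstart = 10)

# Cluster assignment of each sample.
km$cluster
```

The `cluster` element of the kmeans output is the same kind of per-sample assignment that the k_means object in the return value exposes.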


Value

This function returns a list with the following objects:

  • k_means: Output of kmeans (k-means clustering) from the stats package. Note: It contains a "cluster" object that indicates each sample's cluster assignment.

  • PCA: Output of prcomp (principal component analysis) from the stats package.

  • plot: Plot of PCA clustering.

  • Linnorm: Linnorm transformed data matrix.


Examples

# Load the package and the example matrix:
library(Linnorm)
data(Islam2011)
PCA.results <- Linnorm.PCA(Islam2011)
