NMF_PCA: PCA plot with NMF cluster information


View source: R/NMF_PCA.R

Description

Clusters the data using NMF (non-negative matrix factorization) after finding the optimal number of clusters in the dataset from the cophenetic coefficients. The clustering results are used along with PCA to check whether all the samples of a batch lie in the same cluster. The cophenetic coefficient plot, the PCA biplot, and the files containing the cophenetic coefficients (for number of clusters: 2 to 7) and the clustering information (for the optimal k) are saved to the NMF folder created in the working directory.
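The batch-versus-cluster check described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the data are simulated, `kmeans` stands in for the NMF clustering step, and the variable names are hypothetical.

```r
# Illustrative sketch: simulated data, with kmeans as a stand-in for NMF clustering.
set.seed(1)
n_genes <- 200
batch <- rep(c("A", "B"), each = 10)

# Non-negative expression matrix: genes in rows, samples in columns
expr <- matrix(rexp(n_genes * 20, rate = 1), nrow = n_genes,
               dimnames = list(paste0("gene", 1:n_genes), paste0("S", 1:20)))

# Simulate a strong batch effect by shifting the first 50 genes in batch B
expr[1:50, batch == "B"] <- expr[1:50, batch == "B"] + 5

# PCA on the samples (prcomp expects observations in rows, hence t())
pca <- prcomp(t(expr), center = TRUE, scale. = FALSE)

# Stand-in clustering of the samples (assumption: the real function uses NMF here)
cluster <- kmeans(t(expr), centers = 2, nstart = 10)$cluster

# Cross-tabulate batch against cluster: if each row has a single non-zero
# cell, every sample of that batch lies in the same cluster
batch_by_cluster <- table(batch = batch, cluster = cluster)
print(batch_by_cluster)

# PCA scatter: colour encodes the cluster, point shape encodes the batch
plot(pca$x[, 1:2], col = cluster, pch = ifelse(batch == "A", 16, 17),
     main = "PCA coloured by cluster (shape = batch)")
```

When clusters line up one-to-one with batches, as in this simulated case, the grouping is likely driven by the batch effect rather than by biology.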

Usage

NMF_PCA(
  expr,
  batch.info,
  nrun = 30,
  batch = "Batch",
  NameString = "",
  when,
  return.plot = FALSE
)
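A call might look like the following sketch. The simulated inputs, the column layout of `batch.info` (a `Sample` column plus a `Batch` column), and the values `"demo"` and `"before"` are assumptions made for illustration; the documentation does not specify them. The call is guarded so the snippet still runs when BatchEC is not installed.

```r
# Hypothetical inputs: the exact layout expected for batch.info is an assumption.
set.seed(42)
expr <- matrix(rexp(500 * 12), nrow = 500,
               dimnames = list(paste0("gene", 1:500), paste0("S", 1:12)))

batch.info <- data.frame(
  Sample = colnames(expr),                      # sample names (assumed column name)
  Batch  = rep(c("B1", "B2", "B3"), each = 4)   # batch labels, matching batch = "Batch"
)

# Only attempt the call if the package is available
if (requireNamespace("BatchEC", quietly = TRUE)) {
  k <- BatchEC::NMF_PCA(
    expr,
    batch.info,
    nrun = 30,
    batch = "Batch",
    NameString = "demo",    # assumed value; appears in the output filenames
    when = "before",        # assumed value; any descriptive string may be accepted
    return.plot = FALSE
  )
}
```

With `return.plot = FALSE`, the output files are expected in the NMF folder of the working directory, as described above.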

Arguments

expr

gene expression dataset (rows should be genes, columns should be samples)

batch.info

contains the sample names and the batches they belong to

nrun

The number of runs for NMF. Default = 30. The consensus matrices from the NMF results are used to compute the cophenetic coefficients.

batch

title of the batch being used for correction

NameString

string that will appear in all output filenames. Default = "" (empty string)

when

String indicating when the clustering is taking place (before batch correction or after batch correction?)

return.plot

Should the plot be returned as an object to the environment? If FALSE, the plot is saved to a PDF file; if TRUE, the plot is returned to the environment. Default = FALSE

Value

Returns the optimal number of clusters (k) that has the maximum average silhouette width (ignoring k=2) and was used for clustering and plotting. If return.plot=TRUE, the cophenetic coefficient plot and the PCA plot denoting NMF clusters are also returned.
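The selection rule (take the best-scoring k while ignoring k = 2) can be illustrated with a short base R sketch. The scores below are made-up numbers for illustration, not output from the function, and the variable names are hypothetical.

```r
# Hypothetical per-k scores for k = 2..7 (made-up numbers for illustration)
scores <- c("2" = 0.98, "3" = 0.91, "4" = 0.95, "5" = 0.88, "6" = 0.84, "7" = 0.80)

# Drop k = 2, then take the k with the highest remaining score
candidates <- scores[names(scores) != "2"]
optimal_k <- as.integer(names(candidates)[which.max(candidates)])
optimal_k
```

Here k = 4 is selected even though k = 2 has the highest score overall, because k = 2 is excluded from the comparison.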


jankinsan/BatchEC documentation built on Sept. 9, 2021, 8:12 p.m.