nmfFitMetrics: CNomplexity::nmfFitMetrics


View source: R/rearrSigs.R

Description

Cluster the signatures extracted from multiple runs of nmfBoot for a given number of signatures, N, into N clusters. Calculate the Frobenius distance between the original data and the data reconstructed from the means of these clustered signatures, as well as the average silhouette width of the members of each clustered signature. The aim is to maximise the silhouette width while minimising the Frobenius distance.
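The two metrics described above can be illustrated on toy data. This is a minimal sketch, not the package's implementation: the matrices `data`, `P`, `E`, and `sigs` are hypothetical, the reconstruction is assumed to be the transpose of `P %*% E` (so that it matches an n*m data matrix), and base R Euclidean k-means stands in for the default cosine option.

```r
library(cluster)  # provides silhouette()

set.seed(1)

# Hypothetical toy inputs: n = 6 samples, m = 4 features, N = 2 signatures
n <- 6; m <- 4; N <- 2
data <- matrix(runif(n * m), nrow = n, ncol = m)  # n x m
P <- matrix(runif(m * N), nrow = m, ncol = N)     # signatures, m x N
E <- matrix(runif(N * n), nrow = N, ncol = n)     # exposures, N x n

# Assumed reconstruction: P %*% E is m x n, so transpose to compare with data
recon <- t(P %*% E)
frob <- norm(data - recon, type = "F")            # Frobenius distance

# Hypothetical pooled signatures from several runs: one signature per row,
# built as two tight groups with a little added noise
sigs <- rbind(
  matrix(rep(c(0.7, 0.1, 0.1, 0.1), 5), ncol = 4, byrow = TRUE),
  matrix(rep(c(0.1, 0.1, 0.1, 0.7), 5), ncol = 4, byrow = TRUE)
) + matrix(runif(40, 0, 0.02), ncol = 4)

# Cluster the pooled signatures into N groups (Euclidean k-means here)
km <- kmeans(sigs, centers = N, iter.max = 1000, nstart = 20,
             algorithm = "Hartigan-Wong")

# Silhouette width of every pooled signature, then the overall average
sil <- silhouette(km$cluster, dist(sigs))
avgSil <- mean(sil[, "sil_width"])
```

A high `avgSil` suggests the repeated runs recover the same signatures, while a low `frob` suggests the signatures reconstruct the data well.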

Usage

nmfFitMetrics(N,res,data,
	plotsil=FALSE,maxiter=1000,nstarts=20,
	algo="Hartigan-Wong",doMetric=TRUE,
	clusterMeth="kmeans",hclustInput="sigs",
	kmeansDist="cosine") 

Arguments

N

String. Which rank of res to generate metrics for.

res

Results object from a replicate of bootNMF.

data

Matrix of n*m that was handed to bootNMF. n=number of samples, m=number of features.

plotsil

Boolean. Whether to plot results of silhouette.

maxiter

Maximum iterations for kmeans. Only used if kmeansDist is not cosine.

nstarts

Number of starts for kmeans. Only used if kmeansDist is not cosine.

algo

Algorithm for kmeans. Only used if kmeansDist is not cosine.

doMetric

Boolean. Whether to calculate metrics.

clusterMeth

Cluster method. Defaults to k-means clustering. Otherwise, hierarchical clustering.

hclustInput

Input to hierarchical clustering. Default is the raw signatures. Otherwise, cosine distances between signatures.

kmeansDist

Distance metric for kmeans. Defaults to cosine. Otherwise, Euclidean distance.
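The hclustInput and kmeansDist arguments both refer to cosine distances between signatures. As a rough illustration of what that means, the sketch below computes pairwise cosine distances and passes them to hierarchical clustering. The helper `cosineDist` and the toy matrix `sigs` are hypothetical and are not part of the package; how the package computes these distances internally is not stated here.

```r
# Hypothetical helper: pairwise cosine distance (1 - cosine similarity)
# between the rows of x, returned as a "dist" object
cosineDist <- function(x) {
  norms <- sqrt(rowSums(x^2))                   # Euclidean length of each row
  sim <- (x %*% t(x)) / (norms %o% norms)       # cosine similarity matrix
  as.dist(1 - sim)
}

# Toy signatures, one per row: rows 1 and 2 point in the same direction
sigs <- rbind(c(1, 0, 0),
              c(2, 0, 0),
              c(0, 1, 0))

d <- cosineDist(sigs)

# Hierarchical clustering on the cosine distances, cut into 2 groups
hc <- hclust(d, method = "average")
groups <- cutree(hc, k = 2)
```

Because cosine distance ignores vector length, rows 1 and 2 fall into the same group even though their magnitudes differ.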

Details

Method for selecting the number of signatures that are appropriate for a given dataset. Described in Alexandrov et al. (2013).

Value

If doMetric=TRUE, a list with 4 elements. sil=average silhouette width across groups. frob=Frobenius distance between the data reconstructed from the signatures and the original data. P=extracted signatures, an m*N matrix, where m is the number of features and N is the number of signatures to extract. E=exposures, an N*n matrix, where n is the number of samples in the original data. If doMetric=FALSE, only P and E are returned.
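One way to use the sil and frob elements is to collect them across candidate ranks and compare them side by side. The sketch below uses hand-written stand-ins for the returned lists; the numbers, the `silCutoff` threshold, and the selection rule are illustrative assumptions, not output from the package or a rule it prescribes.

```r
# Hypothetical stand-ins for the lists returned at ranks 2, 3 and 4
# (values are made up purely for illustration)
metrics <- list(
  "2" = list(sil = 0.98, frob = 40),
  "3" = list(sil = 0.95, frob = 25),
  "4" = list(sil = 0.60, frob = 22)
)

# Gather the two metrics into one table, one row per rank
tab <- data.frame(
  N    = as.integer(names(metrics)),
  sil  = sapply(metrics, function(x) x$sil),
  frob = sapply(metrics, function(x) x$frob)
)

# Illustrative rule: keep ranks whose signatures cluster tightly,
# then take the one with the smallest reconstruction error
silCutoff <- 0.9  # hypothetical threshold
stable <- tab[tab$sil >= silCutoff, ]
chosenN <- stable$N[which.min(stable$frob)]
```

In practice, plotting sil and frob against N and inspecting the trade-off by eye may be more informative than a fixed cutoff.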


SteeleCD/CNomplexity documentation built on May 29, 2019, 2:09 p.m.