nmf.mnnals: Nonnegative Matrix Factorization of Multiple data using...

Description

Given one or more types of data (e.g. DNA methylation, mRNA expression, protein expression, DNA copy number) measured on the same set of samples, and a pre-defined number of clusters, the function clusters the samples and assigns a cluster membership to each sample, using all of the datasets in a single comprehensive step.
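
For orientation, the factorization can be written out as a formula. This is a sketch based on our reading of the cited reference (Chalise and Fridley, 2017), not a statement taken from this help page. The symbols are assumptions for illustration: X_i is the i-th nonnegative data matrix (n samples by p_i features), W is the common basis matrix (n by k), H_i is the coefficient matrix specific to dataset i (k by p_i), and θ_i is the weight supplied through wt.

```latex
\min_{W \ge 0,\; H_1, \dots, H_m \ge 0} \;
  \sum_{i=1}^{m} \theta_i \, \lVert X_i - W H_i \rVert_F^{2}
```

Because W is shared across all datasets, a cluster label for each sample can plausibly be read from its row of W (for example, the column holding the largest entry); this reading is also an assumption.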

Usage

nmf.mnnals(dat = dat, k = k, maxiter = 200, st.count = 20, n.ini = 30,
           ini.nndsvd = TRUE, seed = TRUE,
           wt = if (is.list(dat)) rep(1, length(dat)) else 1)

Arguments

dat

A single data matrix or a list of data matrices measured on the same set of samples. In each data matrix, samples should be on rows and genomic features on columns.

k

Number of clusters

maxiter

Maximum number of iterations. Default is 200.

st.count

Count for stability in connectivity matrix, default is 20.

n.ini

Number of initializations of the random matrices, default is 30.

ini.nndsvd

Initialization of the Hi matrices using nonnegative double singular value decomposition (NNDSVD). If TRUE, one of the initializations of the algorithm uses NNDSVD. Default is TRUE.

seed

Random seed for initialization of the algorithm. Default is TRUE.

wt

Weight for each dataset. Default is 1 for each dataset.
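
The input requirements above (nonnegative matrices, samples on rows, one weight per dataset) can be sketched in base R as follows. The helper `prep_nonneg` and the toy matrices `raw1` and `raw2` are hypothetical and not part of IntNMF. The shift-and-rescale step mirrors the one used in the Examples section, and the `wt` line reproduces the default shown under Usage.

```r
# Hypothetical helper (not part of IntNMF): shift a matrix so that all
# entries are nonnegative, then rescale so that its maximum is 1.
prep_nonneg <- function(x) {
  x <- as.matrix(x)
  if (any(x < 0)) x <- pmax(x - min(x), .Machine$double.eps)
  x / max(x)
}

set.seed(1)
raw1 <- matrix(rnorm(20 * 6), nrow = 20)  # toy data: 20 samples x 6 features
raw2 <- matrix(rnorm(20 * 4), nrow = 20)  # toy data: 20 samples x 4 features

dat <- lapply(list(raw1, raw2), prep_nonneg)

# All matrices must describe the same samples, so row counts must agree.
stopifnot(length(unique(vapply(dat, nrow, integer(1)))) == 1)

# Default weights as given in the Usage section: one weight of 1 per dataset.
wt <- if (is.list(dat)) rep(1, length(dat)) else 1
```

Rescaling each matrix to a common range keeps one dataset from dominating the fit purely because of its measurement scale; unequal values in `wt` would be a separate, deliberate way to favor one dataset.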

Value

consensus

Consensus matrix

W

Common basis matrix across the multiple datasets.

H

List of data-specific coefficient matrices.

convergence

Matrix with five columns and one row per iteration, up to the number of iterations needed for convergence or the maximum number of iterations. The five columns are, in order: the iteration number, the count for stability in the connectivity matrix, the stability indicator (1/0), the absolute difference in reconstruction error between the ith and (i-1)th iterations, and the value of the objective function.

min.f.WH

Values of the objective function at convergence, one for each initialization of the algorithm.

clusters

Cluster membership assignment to samples.
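
The connectivity and consensus matrices mentioned above can be illustrated with a small base R sketch. It assumes the usual NMF-clustering definitions: a connectivity matrix has entry 1 when two samples share a cluster and 0 otherwise, and a consensus matrix is the average of connectivity matrices over runs. The function `connectivity` and the toy label vectors are hypothetical, and the package's internal computation may differ in detail.

```r
# Connectivity matrix: entry (i, j) is 1 if samples i and j share a cluster.
connectivity <- function(cl) outer(cl, cl, "==") * 1

# Toy cluster labels from three hypothetical initializations, four samples.
runs <- list(c(1, 1, 2, 2), c(2, 2, 1, 1), c(1, 2, 2, 2))

# Consensus matrix: average connectivity across the runs.
consensus <- Reduce(`+`, lapply(runs, connectivity)) / length(runs)

# Stability counter: number of consecutive iterations in which the
# connectivity matrix did not change (compare with the st.count argument).
labels_by_iter <- list(c(1, 2, 1, 2), c(1, 1, 2, 2), c(1, 1, 2, 2), c(1, 1, 2, 2))
count <- 0
prev <- NULL
for (cl in labels_by_iter) {
  cur <- connectivity(cl)
  count <- if (!is.null(prev) && identical(cur, prev)) count + 1 else 0
  prev <- cur
}
```

Connectivity does not depend on how clusters are numbered: the first two runs above use swapped labels yet yield the same matrix, which is why it is a convenient quantity for judging stability.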

Author(s)

Prabhakar Chalise, Rama Raghavan, Brooke Fridley

References

Chalise P and Fridley B (2017). Integrative clustering of multi-level 'omic data based on non-negative matrix factorization algorithm. PLOS ONE, 12(5), e0176278.

Chalise P, Raghavan R and Fridley B (2016). InterSIM: Simulation tool for multiple integrative 'omic datasets. Computer Methods and Programs in Biomedicine, 128:69-74.

Examples

#### Simulation of three interrelated datasets
#prop <- c(0.65,0.35)
#prop <- c(0.30,0.40,0.30)
prop <- c(0.20,0.30,0.27,0.23)
effect <- 2.5

library(InterSIM)
sim.D <- InterSIM(n.sample=100, cluster.sample.prop=prop, delta.methyl=effect,
                  delta.expr=effect, delta.protein=effect, p.DMP=0.25, p.DEG=NULL, p.DEP=NULL,
                  do.plot=FALSE, sample.cluster=TRUE, feature.cluster=TRUE)
dat1 <- sim.D$dat.methyl
dat2 <- sim.D$dat.expr
dat3 <- sim.D$dat.protein
true.cluster.assignment <- sim.D$clustering.assignment

## Make all data positive by shifting to positive direction.
## Also rescale the datasets so that they are comparable.
if (!all(dat1>=0)) dat1 <- pmax(dat1 + abs(min(dat1)), .Machine$double.eps)
dat1 <- dat1/max(dat1)
if (!all(dat2>=0)) dat2 <- pmax(dat2 + abs(min(dat2)), .Machine$double.eps)
dat2 <- dat2/max(dat2)
if (!all(dat3>=0)) dat3 <- pmax(dat3 + abs(min(dat3)), .Machine$double.eps)
dat3 <- dat3/max(dat3)
# The function nmf.mnnals requires the samples to be on rows and variables on columns.
dat1[1:5,1:5]
dat2[1:5,1:5]
dat3[1:5,1:5]
dat <- list(dat1,dat2,dat3)

# Find optimum number of clusters for the data
#opt.k <- nmf.opt.k(dat=dat, n.runs=5, n.fold=5, k.range=2:7, result=TRUE,
#make.plot=TRUE, progress=TRUE)
# Find clustering assignment for the samples
fit <- nmf.mnnals(dat=dat, k=length(prop), maxiter=200, st.count=20, n.ini=15,
ini.nndsvd=TRUE, seed=TRUE)
table(fit$clusters)
fit$clusters[1:10]

Example output

Loading required package: MASS
Loading required package: NMF
Loading required package: pkgmaker
Loading required package: registry
Loading required package: rngtools
Loading required package: cluster
NMF - BioConductor layer [OK] | Shared memory capabilities [OK] | Cores 2/2
Loading required package: mclust
Package 'mclust' version 5.4.3
Type 'citation("mclust")' for citing this R package in publications.
Loading required package: InterSIM
Loading required package: tools
         cg20139214 cg10999429 cg23640701 cg02956093 cg08711674
subject1 0.38345761 0.04270010  0.4709816 0.63698800  0.3204910
subject2 0.03882988 0.02616354  0.8556725 0.70186711  0.3457245
subject3 0.01284249 0.02230174  0.6899571 0.10279633  0.4823276
subject4 0.02271478 0.01309580  0.7696969 0.46762651  0.4311446
subject5 0.02009131 0.05519434  0.9891692 0.08300571  0.3813473
             ACACA    ACVRL1      AKT1    AKT1S1     ANXA1
subject1 0.5141161 0.4471168 0.4865514 0.4408803 0.3744637
subject2 0.3588073 0.4306060 0.5044448 0.3074571 0.3280014
subject3 0.3304403 0.4959506 0.3579689 0.5314552 0.4447205
subject4 0.3535681 0.4596985 0.5357364 0.3783785 0.3736336
subject5 0.3073558 0.6495787 0.3513035 0.4717229 0.6005244
              ACC1  ACC_pS79    ACVRL1 Akt_pS473 PRAS40_pT246
subject1 0.5310152 0.5466824 0.3792079 0.6275415    0.5935576
subject2 0.3367035 0.3444259 0.4062084 0.5821532    0.3857230
subject3 0.2950677 0.3195800 0.3880791 0.3745251    0.5831972
subject4 0.3094863 0.3034135 0.4084230 0.6055066    0.3775843
subject5 0.3239476 0.3193604 0.5875222 0.3742243    0.5620514
There were 18 warnings (use warnings() to see them)

 1  2  3  4 
20 23 30 27 
 subject1  subject2  subject3  subject4  subject5  subject6  subject7  subject8 
        1         3         4         3         2         1         4         3 
 subject9 subject10 
        3         1 

IntNMF documentation built on May 1, 2019, 6:35 p.m.
