knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

This package is developed for a new sampling algorithm to find the set of feasible solutions(SFS) from an initial solution of non-negative matrix factorization(NMF). Remember, non-negative matrix factorization takes a non-negative matrix $M(K \times G)$ and approximates it by two other non-negative matrices $P(K \times N)$ and $E(N \times G)$ such that \begin{equation} M \approx PE. \end{equation} Other solutions with the same approximation could be construct with an invertible matrix $A(N \times N)$ such that \begin{equation} \tilde{P} = PA \geq 0 \quad \tilde{E} = A^{-1}E \geq 0, \end{equation} are new solutions. There exist trivial ambiguities where $A$ is either a diagonal matrix or a permutation matrix, but besides these trivial ambiguities others could exist as well. The scaling ambiguity is removed by assuming the columns of $P$ sum to one. The goal of the main function \texttt{sampleSFS()} in this package is to approximate the whole SFS that exist for $P$ and $E$ besides the ambiguities. The advantage of this algorithm is that is has a simple implementation and can be applied for an arbitrary dimension of $N$. A further desciption can be found in the corresponding paper \textit{R. Laursen and A. Hobolth, A sampling algorithm to compute the set of feasible solutions for non-negative matrix factorization for an arbitrary rank.}.

The package includes the following functions:

Workflow of the package

Installation

The most simple way to install the package is using the package \textbf{devtools}.

devtools::install_github("ragnhildlaursen/SFS")
library(SFS)

Example of how to use functions

To illustrate the functions let us assume we have given a matrix of data $M (4 \times 6)$

M = matrix(c(20, 3, 24, 19,  2, 15, 
             9, 14, 25, 30, 15, 10,
             30, 6, 12, 10, 11,  7,
             9, 27, 5, 11, 19, 15),
           nrow = 4, ncol = 6)

First, we need to create an initial NMF solution which is made using the function \texttt{NMFPois}. The input for this function is a matrix $M$ and a rank $N$, that we here choose to be $3$.

rank = 3
initial_fit = NMFPois(M,rank)
initial_fit$P
initial_fit$E
initial_fit$P%*%initial_fit$E #approximation of M

Now, as an initial solution has been constructed one can find the $SFS$ with the function \texttt{sampleSFS}. Here, we just need the initial solutions of $P$ and $E$.

sfs.result = sampleSFS(initial_fit$P,initial_fit$E) 

To see the results in a simple way you could plot it in the following way

par(mfrow = c(rank,1), mar = c(3,3,2,2))
# plot of SFS for P
for(i in 1:rank){
  barplot(rbind(sfs.result$Pminimum[i,],sfs.result$Pmaximum[i,]), col = c('grey','red'), main = paste0("SFS for column ",i, " in P"))
}

# plot of SFS for E
for(i in 1:rank){
  barplot(rbind(sfs.result$Eminimum[i,],sfs.result$Emaximum[i,]), col = c('grey','red'), main = paste0("SFS for row ",i, " in E"))
}


ragnhildlaursen/SFS documentation built on Nov. 15, 2021, 8:28 p.m.