featureScore  R Documentation 
The function featureScore implements different methods to compute basis-specificity scores for each feature in the data.
The function extractFeatures implements different methods to select the most basis-specific features of each basis component.
featureScore(object, ...)

## S4 method for signature 'matrix'
featureScore(object, method = c("kim", "max"))

extractFeatures(object, ...)

## S4 method for signature 'matrix'
extractFeatures(object, method = c("kim", "max"),
    format = c("list", "combine", "subset"), nodups = TRUE)
object 
an object from which scores/features are computed/extracted 
... 
extra arguments to allow extension 
method 
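scoring or selection method. It specifies the name of one of the methods described in sections Feature scores and Feature selection. Additionally, as the examples show, for extractFeatures it may be an integer (e.g. 5L), to extract that many top-contributing features from each basis component, or a numeric value between 0 and 1 (e.g. 0.5), to extract the features whose relative basis contribution is above that threshold. Note that extractFeatures(x, 1) means a relative contribution threshold of 100%, whereas extractFeatures(x, 1L) selects the top contributing feature in each component.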
format 
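output format. The accepted values are "list", "combine" and "subset" (see the usage above); the corresponding return types are described in the Value section.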

nodups 
logical that indicates if duplicated indexes, i.e. features selected on multiple basis components (which should in theory not happen), should appear only once in the result. Only used when

One of the properties of Nonnegative Matrix Factorization is that it tends to produce sparse representations of the observed data, leading to a natural application to biclustering, which characterises groups of samples by a small number of features.
In NMF models, samples are grouped according to the basis component that contributes the most to each sample, i.e. the basis component that has the greatest coefficient in each column of the coefficient matrix (see predict,NMF-method). Each group of samples is then characterised by a set of features selected based on basis-specificity scores that are computed on the basis matrix.
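The grouping rule described above can be sketched in base R. This is only an illustration on a made-up coefficient matrix H (basis components in rows, samples in columns); it is not necessarily how predict is implemented.

```r
# Toy coefficient matrix (hypothetical values):
# 2 basis components (rows) x 4 samples (columns)
H <- rbind(c(0.8, 0.1, 0.6, 0.3),
           c(0.2, 0.9, 0.4, 0.7))

# each sample is assigned to the basis component with the greatest
# coefficient in its column (ties go to the first component)
groups <- apply(H, 2, which.max)
print(groups)
```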
featureScore returns a numeric vector whose length is the number of rows in object (i.e. one score per feature).
extractFeatures returns the selected features as a list of indexes, a single integer vector or an object of the same class as object that only contains the selected features.
extractFeatures, signature(object = "matrix"): Select features on a given matrix that contains the basis components in columns.
extractFeatures, signature(object = "NMF"): Select basis-specific features from an NMF model, by applying the method extractFeatures,matrix to its basis matrix.
featureScore, signature(object = "matrix"): Compute feature scores on a given matrix that contains the basis components in columns.
featureScore, signature(object = "NMF"): Compute feature scores on the basis matrix of an NMF model.
The function featureScore can compute basis-specificity scores using the following methods:
‘kim’: Method defined by Kim et al. (2007).
The score for feature i is defined as:
S_i = 1 + 1/log2(k) * sum_q [ p(i,q) * log2( p(i,q) ) ],
where k is the number of basis components and p(i,q) is the probability that the i-th feature contributes to basis q:
p(i,q) = W(i,q) / (sum_r W(i,r))
The feature scores are real values within the range [0,1]. The higher the feature score, the more basis-specific the corresponding feature.
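As a minimal sketch, the score formula above can be written in base R as follows. The helper name kim_score and the toy matrix are hypothetical, every row is assumed to have a positive sum, and the convention 0 * log2(0) = 0 is an assumption of this sketch rather than something stated on this page.

```r
# Hypothetical helper: basis-specificity scores for a matrix W
# (features in rows, basis components in columns), following the formula above.
kim_score <- function(W) {
  k <- ncol(W)
  # p(i, q): share of feature i's total contribution carried by basis q
  p <- W / rowSums(W)
  # assumed convention: 0 * log2(0) = 0, so zero entries do not yield NaN
  plogp <- ifelse(p > 0, p * log2(p), 0)
  1 + rowSums(plogp) / log2(k)
}

W <- rbind(
  specific = c(1, 0, 0),  # contributes to a single basis component
  uniform  = c(1, 1, 1),  # contributes equally to all components
  mixed    = c(2, 1, 1)
)
scores <- kim_score(W)
print(scores)
```

A feature that loads on a single component gets a score of 1, and a feature spread evenly over all components gets a score of 0, which matches the stated [0,1] range.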
‘max’: Method defined by Carmona-Saez et al. (2006).
The feature scores are defined as the row maximums.
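For comparison, the row-maximum score is a one-liner in base R. The matrix below is made up for illustration.

```r
# Toy basis matrix (hypothetical values): 3 features x 2 basis components
W <- rbind(f1 = c(0.9, 0.1),
           f2 = c(0.2, 0.4),
           f3 = c(0.5, 0.5))

# score of each feature = its largest contribution across basis components
max_score <- apply(W, 1, max)
print(max_score)
```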
The function extractFeatures can select features using the following methods:
‘kim’: uses the scoring scheme and feature selection method of Kim et al. (2007).
The features are first scored using the function featureScore with method ‘kim’. Then only the features that fulfil both of the following criteria are retained:
score greater than \hat{μ} + 3 \hat{σ}, where \hat{μ} and \hat{σ} are the median and the median absolute deviation (MAD) of the scores respectively;
the maximum contribution to a basis component is greater than the median of all contributions (i.e. of all elements of W).
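The two criteria above can be sketched in base R as follows. The helper kim_select and the toy matrix are hypothetical, and the sketch uses stats::mad with its default scaling constant (1.4826); whether the package scales the MAD in the same way is an assumption.

```r
# Hypothetical helper: indexes of the rows of W that pass both criteria.
kim_select <- function(W) {
  # basis-specificity scores, with the convention 0 * log2(0) = 0
  p <- W / rowSums(W)
  s <- 1 + rowSums(ifelse(p > 0, p * log2(p), 0)) / log2(ncol(W))
  # criterion 1: score above median + 3 * MAD
  # (mad() applies its default scaling constant; this is an assumption)
  high_score <- s > median(s) + 3 * mad(s)
  # criterion 2: largest contribution above the median of all entries of W
  high_contrib <- apply(W, 1, max) > median(W)
  which(high_score & high_contrib)
}

# Toy basis matrix: rows 1-8 are roughly balanced, row 9 is specific and
# large, row 10 is specific but its contributions are tiny
W <- cbind(
  c(1.0, 1.1, 1.0, 1.2, 1.0, 1.3, 1.00, 1.15, 5.0, 0.0100),
  c(1.2, 1.0, 1.3, 1.0, 1.1, 1.0, 1.25, 1.00, 0.1, 0.0002)
)
selected <- kim_select(W)
print(selected)
```

In this toy case rows 9 and 10 both have high scores, but row 10 is dropped by the second criterion because its largest contribution is below the median of W.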
‘max’: uses the selection method used in the bioNMF software package and described in Carmona-Saez et al. (2006).
For each basis component, the features are first sorted by decreasing contribution. Then, one selects only the first consecutive features whose highest contribution in the basis matrix is effectively on the considered basis.
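A minimal base R sketch of this selection rule is given below. The helper max_select and the toy matrix are hypothetical, and ties in the row maximum are resolved in favour of the first basis component, which is an assumption of this sketch.

```r
# Hypothetical helper: for each column of W, walk the features by decreasing
# contribution and stop at the first one whose row maximum is on another column.
max_select <- function(W) {
  # basis component holding the highest contribution of each feature
  top_basis <- apply(W, 1, which.max)
  lapply(seq_len(ncol(W)), function(q) {
    ord <- order(W[, q], decreasing = TRUE)
    on_q <- top_basis[ord] == q
    # length of the leading run of features whose maximum is on basis q
    n_keep <- if (all(on_q)) length(on_q) else which(!on_q)[1] - 1
    ord[seq_len(n_keep)]
  })
}

# Toy basis matrix (hypothetical values): 5 features x 2 basis components
W <- cbind(
  c(9, 7, 1, 2, 6),
  c(1, 2, 8, 5, 7)
)
sel <- max_select(W)
print(sel)
```

Feature 5 ranks third on the first component, but its highest contribution is on the second one, so the selection for the first component stops after features 1 and 2.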
Kim H and Park H (2007). "Sparse non-negative matrix factorizations via alternating non-negativity-constrained least squares for microarray data analysis." _Bioinformatics (Oxford, England)_, *23*(12), pp. 1495-502. ISSN 1460-2059, <URL: http://dx.doi.org/10.1093/bioinformatics/btm134>, <URL: http://www.ncbi.nlm.nih.gov/pubmed/17483501>.
Carmona-Saez P, Pascual-Marqui RD, Tirado F, Carazo JM and Pascual-Montano A (2006). "Biclustering of gene expression data by Non-smooth Non-negative Matrix Factorization." _BMC bioinformatics_, *7*, pp. 78. ISSN 1471-2105, <URL: http://dx.doi.org/10.1186/1471-2105-7-78>, <URL: http://www.ncbi.nlm.nih.gov/pubmed/16503973>.
library(NMF)

# random NMF model
x <- rnmf(3, 50, 20)

# probably no feature is selected
extractFeatures(x)
# extract top 5 for each basis
extractFeatures(x, 5L)
# extract features that have a relative basis contribution above a threshold
extractFeatures(x, 0.5)
# ambiguity?
extractFeatures(x, 1)  # means relative contribution above 100%
extractFeatures(x, 1L) # means top contributing feature in each component