Returns a completed matrix of peptide log-intensities in which missing values (NAs) are imputed
by low-rank approximation of the input matrix. Non-NA entries remain unmodified. msImpute
requires at least 4 non-missing measurements per peptide across all samples. It is assumed that
peptide intensities (DDA), or MS1/MS2 normalised peak areas (DIA), are log2-transformed and
normalised (e.g. by quantile normalisation).
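The requirement of at least 4 non-missing measurements per peptide can be checked before imputation. The following is a minimal base R sketch on a made-up toy matrix; the object names (`y`, `n_obs`, `keep`, `y_filtered`) are illustrative and not part of the package.

```r
# Toy log2-intensity matrix: 5 peptides (rows) x 6 samples (columns).
# Values are invented for illustration only.
y <- matrix(
  c(20.1, 21.3,   NA, 19.8, 20.5, 21.0,
      NA,   NA,   NA, 18.2,   NA, 18.9,
    22.4, 22.1, 22.8,   NA, 22.5, 22.0,
      NA, 17.5,   NA,   NA, 17.9,   NA,
    19.0, 19.4, 19.1, 19.6,   NA,   NA),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("pep", 1:5), paste0("s", 1:6))
)

# Count observed (non-NA) measurements per peptide.
n_obs <- rowSums(!is.na(y))

# Keep only peptides with at least 4 observed values across all samples.
keep <- n_obs >= 4
y_filtered <- y[keep, , drop = FALSE]
```

Peptides that fail this check would need to be removed (or handled separately) before the matrix is passed on for imputation.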
y |
Numeric matrix giving log-intensity where missing values are denoted by NA. Rows are peptides, columns are samples. |
method |
character. Allowed values are |
group |
character or factor vector of length |
a |
Numeric. The weight parameter. Defaults to 0.2. |
rank.max |
Numeric. This restricts the rank of the solution. is set to min(dim( |
lambda |
Numeric. Nuclear-norm regularization parameter. Controls the low-rank property of the solution to the matrix completion problem. By default, it is determined at the scaling step. If set to zero the algorithm reverts to "hardImputation", where the convergence will be slower. Applicable to "v1" only. |
thresh |
Numeric. Convergence threshold. 1e-05 by default. Applicable to "v1" only. |
maxit |
Numeric. Maximum number of iterations the algorithm may run before it is stopped, whether or not it has converged. 100 by default. Applicable to "v1" only. |
trace.it |
Logical. Prints traces of progress of the algorithm. Applicable to "v1" only. |
warm.start |
List. An SVD object can be used to initialize the algorithm instead of random initialization. Applicable to "v1" only. |
final.svd |
Logical. Should the final SVD object be saved? The solutions to the matrix completion problem are computed from the U, D and V components of the final SVD. Applicable to "v1" only. |
biScale_maxit |
Number of iterations for the scaling algorithm to converge. See
msImpute builds on the softImpute-als algorithm in the softImpute package.
The algorithm estimates a low-rank matrix (a smaller matrix than the input matrix) that
approximates the data with reasonable accuracy. softImpute-als determines the optimal rank
of the matrix through the lambda parameter, which it learns from the data.
This algorithm is implemented in method="v1".
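To illustrate the idea of imputing from a low-rank approximation, here is a toy base R sketch that repeatedly soft-thresholds the singular values of the filled-in matrix. It is a simplified stand-in, not the softImpute-als algorithm and not msImpute's implementation: the function name `soft_impute_sketch`, the mean-value initialisation, and the fixed `lambda` are assumptions made for the example (the argument names `lambda`, `thresh` and `maxit` only mirror the ones documented above).

```r
# Toy soft-thresholded SVD imputation (illustrative only; NOT softImpute-als).
soft_impute_sketch <- function(y, lambda = 1, maxit = 100, thresh = 1e-05) {
  miss <- is.na(y)
  z <- y
  z[miss] <- mean(y, na.rm = TRUE)        # crude starting values for NAs
  for (i in seq_len(maxit)) {
    s <- svd(z)
    d <- pmax(s$d - lambda, 0)            # shrink singular values towards zero
    low_rank <- s$u %*% (d * t(s$v))      # U diag(d) V'
    z_new <- y
    z_new[miss] <- low_rank[miss]         # observed entries are never changed
    change <- sum((z_new - z)^2) / max(sum(z^2), 1e-12)
    z <- z_new
    if (change < thresh) break            # stop once updates become negligible
  }
  z
}

# Simulated rank-2 data with 15 entries removed at random.
set.seed(1)
truth <- matrix(rnorm(20 * 2), 20, 2) %*% matrix(rnorm(2 * 6), 2, 6) + 20
y_sim <- truth
y_sim[sample(length(y_sim), 15)] <- NA
imputed <- soft_impute_sketch(y_sim, lambda = 0.5)
```

A larger `lambda` zeroes out more singular values and therefore yields a lower-rank reconstruction, which is the sense in which lambda controls the rank of the solution.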
In v2 we use an information-theoretic approach to estimate the optimal rank, instead of relying
on softImpute-als defaults. Similarly, we have implemented a new approach to estimate lambda
from the data. Low-rank approximation is a linear reconstruction of the data, and is only
appropriate for imputation of MAR data. To make the algorithm applicable to MNAR data, we have
implemented method="v2-mnar", which imputes the missing observations as a weighted sum of
values imputed by msImpute v2 (method="v2") and random draws from a Gaussian distribution.
Missing values that tend to be missing completely in one or more experimental groups are
weighted more (shrunken) towards imputation by sampling from a Gaussian parameterised by the
smallest observed values in the sample (similar to minProb, or Perseus). However, if the missing
values are distributed evenly across the samples for a peptide, the imputed values for that
peptide are shrunken towards the low-rank imputed values. The distribution of missing values is
assessed with the EBM metric implemented in selectFeatures, which is also an
information-theoretic measure.
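The weighted-sum idea can be sketched in a few lines of base R. This only illustrates the blending step described above: the helper `blend_imputations`, the weights `w`, and the Gaussian mean and standard deviation are all hypothetical, and the sketch does not reproduce how msImpute derives its weights from the EBM metric or from the `a` parameter.

```r
# Hypothetical helper: blend low-rank and Gaussian imputations per missing entry.
# w close to 1 -> lean on the Gaussian draw (MNAR-like missingness);
# w close to 0 -> lean on the low-rank value (MAR-like missingness).
blend_imputations <- function(low_rank, gauss_draw, w) {
  stopifnot(all(w >= 0 & w <= 1))
  w * gauss_draw + (1 - w) * low_rank
}

set.seed(42)
sample_obs <- c(18.2, 19.5, 21.0, 22.3, 24.1)  # observed log2-intensities in one sample
low_rank   <- c(20.4, 19.8)                    # stand-in low-rank imputed values

# Gaussian centred on the smallest observed value (an assumed parameterisation).
gauss_draw <- rnorm(2, mean = min(sample_obs), sd = 0.3)

# First entry treated as MNAR-like, second as MAR-like (weights are made up).
w <- c(0.9, 0.1)
blended <- blend_imputations(low_rank, gauss_draw, w)
```

With these made-up weights, the first blended value sits near the low end of the observed intensities, while the second stays close to its low-rank estimate.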
Missing values are imputed by low-rank approximation of the input matrix. If input is a numeric matrix, a numeric matrix of identical dimensions is returned.
Soroor Hediyeh-zadeh
Hastie, T., Mazumder, R., Lee, J. D., & Zadeh, R. (2015). Matrix completion and low-rank SVD via fast alternating least squares. The Journal of Machine Learning Research, 16(1), 3367-3402.
Hediyeh-zadeh, S., Webb, A. I., & Davis, M. J. (2020). MSImpute: Imputation of label-free mass spectrometry peptides by low-rank approximation. bioRxiv.
selectFeatures