This function imputes a data set by combining an MCAR-devoted algorithm and an MNAR-devoted algorithm, using the probabilities that missing values are MCAR. If the probability for a missing value is greater than a chosen threshold, the MCAR-devoted algorithm is used; otherwise the MNAR-devoted algorithm is used. For details, see Giai Gianetto, Q. et al. (2020) (doi: 10.1101/2020.05.29.122770).
impute.mix(tab, prob.MCAR, threshold, conditions, repbio = NULL, reptech = NULL,
           methodMCAR = "mle", nknn = 15, weight = 1, selec = "all", ind.comp = 1,
           progress.bar = TRUE, q = 0.95, ncp.max = 5, maxiter = 10, ntree = 100,
           variablewise = FALSE, decreasing = FALSE, verbose = FALSE,
           mtry = floor(sqrt(ncol(tab))), replace = TRUE, classwt = NULL,
           cutoff = NULL, strata = NULL, sampsize = NULL, nodesize = NULL,
           maxnodes = NULL, xtrue = NA, parallelize = c('no', 'variables', 'forests'),
           methodMNAR = "igcda", q.min = 0.025, q.norm = 3, eps = 0,
           distribution = "unif", param1 = 3, param2 = 1, R.q.min = 1)
tab |
A data matrix containing numeric and missing values. Each column of this matrix is assumed to correspond to an experimental sample, and each row to an identified peptide. |
prob.MCAR |
A matrix of probabilities that each missing value is MCAR. For instance such a matrix can be obtained from the function |
threshold |
A value such that if the probability that a missing value is MCAR is greater than it, the MCAR-devoted algorithm is used; otherwise the MNAR-devoted algorithm is used. |
conditions |
A vector of factors indicating the biological condition to which each column (experimental sample) belongs. |
repbio |
A vector of factors indicating the biological replicate to which each column belongs. Default is NULL (no experimental design is considered). |
reptech |
A vector of factors indicating the technical replicate to which each column belongs. Default is NULL (no experimental design is considered). |
methodMCAR |
The method used for imputing MCAR data. If |
methodMNAR |
The method used for imputing MNAR data. If |
nknn |
The number of nearest neighbours used in the SLSA algorithm (see |
weight |
The way of weighting in the algorithm (see |
selec |
A parameter to select a part of the dataset to find nearest neighbours between rows. This can be useful for big data sets (see |
ind.comp |
If |
progress.bar |
If |
q |
A quantile value (see |
ncp.max |
parameter of the |
maxiter |
parameter of the |
ntree |
parameter of the |
variablewise |
parameter of the |
decreasing |
parameter of the |
verbose |
parameter of the |
mtry |
parameter of the |
replace |
parameter of the |
classwt |
parameter of the |
cutoff |
parameter of the |
strata |
parameter of the |
sampsize |
parameter of the |
nodesize |
parameter of the |
maxnodes |
parameter of the |
xtrue |
parameter of the |
parallelize |
parameter of the |
q.min |
parameter of the |
q.norm |
parameter of the |
eps |
parameter of the |
distribution |
parameter of the |
param1 |
parameter of the |
param2 |
parameter of the |
R.q.min |
parameter of the |
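As a minimal sketch of how the design arguments relate to the columns of tab, the following base R code builds a toy matrix together with matching conditions and repbio vectors. The layout (6 samples, 2 conditions with 3 biological replicates each) and all values are hypothetical and chosen only for illustration; the one constraint taken from the argument descriptions above is that each vector has one entry per column.

```r
# Hypothetical layout: 4 peptides (rows) x 6 samples (columns),
# 2 biological conditions with 3 biological replicates each.
set.seed(1)
tab <- matrix(rnorm(24, mean = 25, sd = 2), nrow = 4, ncol = 6)
tab[cbind(c(1, 3), c(2, 5))] <- NA  # two missing intensities

# One factor entry per column of tab, given in column order.
conditions <- factor(rep(c("A", "B"), each = 3))
repbio     <- factor(rep(1:3, times = 2))

# Each design vector must describe every column exactly once.
stopifnot(length(conditions) == ncol(tab),
          length(repbio) == ncol(tab))
```

Checking the lengths against ncol(tab) before calling an imputation function is a cheap way to catch a design vector that was built for a different sample order.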
The missing values for which prob.MCAR is greater than the chosen threshold are imputed with one of the MCAR-devoted imputation methods (impute.mle, impute.RF, impute.PCA or impute.slsa). The other missing values are considered MNAR and imputed with impute.igcda. More details and explanations can be found in Giai Gianetto et al. (2020).
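The routing rule described above can be sketched in base R. This is an illustration, not the package implementation: the two imputers are simple stand-ins (row mean for MCAR, global minimum for MNAR), the data are made up, and prob.MCAR is assumed to have the same dimensions as tab, with entries at observed cells ignored. The comparison is taken to be strict, so a probability equal to the threshold is routed to the MNAR branch.

```r
# Toy data: rows are peptides, columns are samples.
tab <- matrix(c(20, 21, NA, 22,
                NA, 18, 19, NA,
                25, NA, 24, 26), nrow = 3, byrow = TRUE)

# Assumed layout: one probability per cell of tab.
prob.MCAR <- matrix(c(0,   0,   0.9, 0,
                      0.1, 0,   0,   0.7,
                      0,   0.3, 0,   0), nrow = 3, byrow = TRUE)
threshold <- 0.5

# Split the missing cells into the two groups.
missing  <- is.na(tab)
use_mcar <- missing & (prob.MCAR > threshold)
use_mnar <- missing & !use_mcar

# Stand-in imputers (NOT the algorithms of the package):
# row mean for MCAR cells, global minimum for MNAR cells.
row_means <- rowMeans(tab, na.rm = TRUE)
low_value <- min(tab, na.rm = TRUE)

tab_imp <- tab
tab_imp[use_mcar] <- row_means[row(tab)[use_mcar]]
tab_imp[use_mnar] <- low_value
```

Raising threshold moves more missing values into the MNAR group, so more of them receive the low-intensity style of imputation.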
The input matrix tab with imputed values instead of missing values.
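A quick sanity check on a returned matrix might look like the sketch below. The helper check_imputation is hypothetical (it is not part of the package) and uses base R only; it tests that the dimensions are preserved, that no missing values remain, and that the originally observed values were left untouched. Whether every missing value is filled for a given method and data set is an assumption here, not something this page guarantees.

```r
# Hypothetical helper, not part of the package.
check_imputation <- function(original, imputed) {
  stopifnot(identical(dim(original), dim(imputed)))
  observed <- !is.na(original)
  list(
    no_missing_left    = !anyNA(imputed),
    observed_unchanged = isTRUE(all.equal(original[observed],
                                          imputed[observed]))
  )
}

# Toy demonstration with a made-up imputed value.
original <- matrix(c(20, NA, 22, 23, NA, 25), nrow = 2)
imputed  <- original
imputed[is.na(original)] <- 21

res_check <- check_imputation(original, imputed)
```

On real data, the same call would take the matrix passed as tab and the matrix returned by the imputation function.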
Quentin Giai Gianetto <quentin2g@yahoo.fr>
Giai Gianetto, Q., Wieczorek, S., Couté, Y., Burger, T. (2020). A peptide-level multiple imputation strategy accounting for the different natures of missing values in proteomics data. bioRxiv 2020.05.29.122770; doi: 10.1101/2020.05.29.122770
# Simulating data
res.sim <- sim.data(nb.pept = 2000, nb.miss = 600)

# Fast imputation of missing values with the impute.rand algorithm
dat.rand <- impute.rand(tab = res.sim$dat.obs, conditions = res.sim$condition)

# Estimation of the mixture model
res <- estim.mix(tab = res.sim$dat.obs, tab.imp = dat.rand, conditions = res.sim$condition)

# Computing probabilities to be MCAR
born <- estim.bound(tab = res.sim$dat.obs, conditions = res.sim$condition)
proba <- prob.mcar.tab(born$tab.upper, res)

# Imputation under the assumption of MCAR and MNAR values
tabi <- impute.mix(tab = res.sim$dat.obs, prob.MCAR = proba, threshold = 0.5,
                   conditions = res.sim$conditions, repbio = res.sim$repbio,
                   methodMCAR = "slsa", methodMNAR = "igcda", nknn = 15, weight = 1,
                   selec = "all", ind.comp = 1, progress.bar = TRUE)