Multiple imputation from a matrix of probabilities of being MCAR for each missing value.

Description

This function imputes a data set using a multiple imputation strategy.

Usage

mi.mix(tab, tab.imp, prob.MCAR, conditions, repbio=NULL, reptech=NULL, nb.iter=3, nknn=15,
weight=1, selec="all", siz=500, ind.comp=1, methodi="slsa", q=0.95, progress.bar=TRUE)

Arguments

tab

A data matrix containing numeric and missing values. Each column of this matrix is assumed to correspond to an experimental sample, and each row to an identified peptide.

tab.imp

A matrix where the missing values of tab have been imputed under the assumption that they are all MCAR. For instance, such a matrix can be obtained from the function impute.slsa of this package.

prob.MCAR

A matrix of probabilities that each missing value is MCAR. For instance, such a matrix can be obtained from the function prob.mcar.tab of this package.

conditions

A vector of factors indicating the biological condition to which each column (experimental sample) belongs.

repbio

A vector of factors indicating the biological replicate to which each column belongs. Default is NULL (no experimental design is considered).

reptech

A vector of factors indicating the technical replicate to which each column belongs. Default is NULL (no experimental design is considered).

nb.iter

The number of iterations used for the multiple imputation method.

nknn

The number of nearest neighbours used in the SLSA algorithm (see impute.slsa).

selec

A parameter to select a part of the dataset to find nearest neighbours between rows. This can be useful for big data sets (see impute.slsa).

siz

A parameter to select a part of the dataset to perform imputations with the SLSA or the MLE algorithm. This can be useful for big data sets. Note that siz must be smaller than selec.

weight

The way of weighting in the algorithm (see impute.slsa).

ind.comp

If ind.comp=1, only nearest neighbours without missing values are selected to fit linear models (see impute.slsa). Otherwise, the selected neighbours can contain missing values.

methodi

The method used for imputing data. If methodi="mle", then the MLE algorithm is used (function impute.wrapper.MLE of the R package imputeLCMD), else the SLSA algorithm is used (default).

q

A quantile value (see impute.igcda).

progress.bar

If TRUE, a progress bar is displayed.
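
The design arguments conditions and repbio are vectors of factors with one entry per column of tab. The following is a minimal sketch of how such vectors could be built in base R. The design (2 conditions, 2 biological replicates per condition, 3 samples per biological replicate) and the names n.cond, n.bio and n.rep are assumptions made for illustration, as is the labelling convention; reptech is left at its default of NULL.

```r
# Hypothetical design, not taken from the package:
# 2 conditions x 2 biological replicates x 3 samples = 12 columns.
n.cond <- 2
n.bio  <- 2
n.rep  <- 3

# One factor level per biological condition, repeated for all of its columns.
conditions <- factor(rep(seq_len(n.cond), each = n.bio * n.rep))

# One factor level per biological replicate (assumed to be numbered
# consecutively across conditions).
repbio <- factor(rep(seq_len(n.cond * n.bio), each = n.rep))

# Both vectors must have one entry per column of the data matrix.
n.col <- n.cond * n.bio * n.rep
stopifnot(length(conditions) == n.col, length(repbio) == n.col)
```

With vectors of this shape, every column of tab is assigned to exactly one condition and one biological replicate.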

Details

At each iteration, a matrix indicating which missing values are MCAR is generated from Bernoulli distributions whose parameters are given by the matrix prob.MCAR. The values drawn as MCAR are then imputed using the matrix tab.imp. For each row containing MNAR values, the other rows are imputed with the function impute.igcda and the considered row is then imputed with either the function impute.slsa or the function impute.wrapper.MLE of the R package imputeLCMD. In this way, impute.igcda deforms the correlation structure of the data set to bring it closer to that of the true values, while impute.slsa (or impute.wrapper.MLE) imputes by taking this modified correlation structure into account.
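
The first two steps of an iteration (drawing the MCAR indicators and filling the drawn values from tab.imp) can be sketched in base R as follows. This is an illustration only, not the package's internal code: the toy values are made up, prob.MCAR is assumed to have the same dimensions as tab, and the names miss, is.mcar, is.mnar and tab.iter are hypothetical. The later steps, which rely on impute.igcda and impute.slsa, are not reproduced here.

```r
set.seed(1)

# Toy data (made-up values): 2 peptides x 4 samples with 3 missing values.
tab <- matrix(c(20, NA, 22, NA,
                NA, 25, 24, 26), nrow = 2, byrow = TRUE)

# Pretend these are imputations obtained under the all-MCAR assumption.
tab.imp <- tab
tab.imp[is.na(tab)] <- c(21, 23, 24.5)

# Assumed layout: one probability of being MCAR per cell, only used where
# tab is missing.
prob.MCAR <- matrix(0, nrow(tab), ncol(tab))
prob.MCAR[is.na(tab)] <- c(0.9, 0.5, 0.1)

# One Bernoulli draw per missing value: TRUE means "treated as MCAR".
miss <- is.na(tab)
is.mcar <- matrix(FALSE, nrow(tab), ncol(tab))
is.mcar[miss] <- rbinom(sum(miss), size = 1, prob = prob.MCAR[miss]) == 1

# Values drawn as MCAR are filled from tab.imp.
tab.iter <- tab
tab.iter[is.mcar] <- tab.imp[is.mcar]

# The remaining missing values are treated as MNAR in this iteration.
is.mnar <- miss & !is.mcar
```

Because the indicators are drawn at random, each iteration can split the missing values between MCAR and MNAR differently, which is what makes repeated iterations informative.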

Value

The input matrix tab with imputed values instead of missing values.

Author(s)

Quentin Giai Gianetto <quentin2g@yahoo.fr>

Examples

#Simulating data
res.sim=sim.data(nb.pept=2000,nb.miss=600,pi.mcar=0.2,para=10,nb.cond=2,nb.repbio=3,
nb.sample=5,m.c=25,sd.c=2,sd.rb=0.5,sd.r=0.2);

#Fast imputation of missing values with the impute.rand algorithm
dat.rand=impute.rand(tab=res.sim$dat.obs,conditions=res.sim$condition);

#Estimation of the mixture model
res=estim.mix(tab=res.sim$dat.obs, tab.imp=dat.rand, conditions=res.sim$condition);

#Computing probabilities to be MCAR
born=estim.bound(tab=res.sim$dat.obs,conditions=res.sim$condition);
proba=prob.mcar.tab(born$tab.lower,born$tab.upper,res);

#Multiple imputation strategy with 3 iterations (can be time-consuming, depending on the data set!)
data.mi=mi.mix(tab=res.sim$dat.obs, tab.imp=dat.rand, prob.MCAR=proba, conditions=
res.sim$conditions, repbio=res.sim$repbio, nb.iter=3);
