This function imputes data sets using a multiple imputation strategy.


`tab` |
A data matrix containing numeric and missing values. Each column of this matrix is assumed to correspond to an experimental sample, and each row to an identified peptide. |

`tab.imp` |
A matrix where the missing values of |

`prob.MCAR` |
A matrix of probabilities that each missing value is MCAR. For instance such a matrix can be obtained from the function |

`conditions` |
A vector of factors indicating the biological condition to which each column (experimental sample) belongs. |

`repbio` |
A vector of factors indicating the biological replicate to which each column belongs. Default is NULL (no experimental design is considered). |

`reptech` |
A vector of factors indicating the technical replicate to which each column belongs. Default is NULL (no experimental design is considered). |

`nb.iter` |
The number of iterations used for the multiple imputation method. |

`nknn` |
The number of nearest neighbours used in the SLSA algorithm (see |

`selec` |
A parameter to select a part of the dataset to find nearest neighbours between rows. This can be useful for big data sets (see |

`siz` |
A parameter to select a part of the dataset to perform imputations with the SLSA or the MLE algorithm. This can be useful for big data sets. Note that |

`weight` |
The way of weighting in the algorithm (see |

`ind.comp` |
If |

`methodi` |
The method used for imputing data. If |

`q` |
A quantile value (see |

`progress.bar` |
If |

At each iteration, a matrix indicating the MCAR values is generated from Bernoulli distributions whose parameters are given by the matrix `prob.MCAR`. The generated MCAR values are then imputed using the matrix `tab.imp`. For each row containing MNAR values, the other rows are imputed with the function `impute.igcda` and the considered row is then imputed with either the function `impute.slsa` or the function `impute.wrapper.MLE` of the R package imputeLCMD. Thus, `impute.igcda` deforms the correlation structure of the data set so that it is closer to that of the true values, while `impute.slsa` (or `impute.wrapper.MLE`) imputes by taking this modified correlation structure into account.
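The first two steps of an iteration (drawing which missing values are MCAR, then filling those from `tab.imp`) can be illustrated with a minimal base R sketch. This is not the package's internal code: the toy matrices, the constant probability of 0.3, and the names `miss`, `is.mcar` and `tab.iter` are assumptions made for illustration only.

```r
# Hypothetical toy inputs (not taken from the package documentation)
set.seed(1)
tab <- matrix(c(20.1,   NA, 19.8,   NA,
                  NA, 21.3,   NA, 22.0,
                18.5, 18.9,   NA, 19.2),
              nrow = 3, byrow = TRUE)

# Assumed probabilities that each missing value is MCAR
# (only the entries where tab is NA are used)
prob.MCAR <- matrix(0.3, nrow = nrow(tab), ncol = ncol(tab))

# Assumed prior imputation standing in for tab.imp
tab.imp <- tab
tab.imp[is.na(tab)] <- mean(tab, na.rm = TRUE)

# One Bernoulli draw per missing entry: TRUE means "treated as MCAR"
miss <- is.na(tab)
is.mcar <- matrix(FALSE, nrow = nrow(tab), ncol = ncol(tab))
is.mcar[miss] <- rbinom(sum(miss), size = 1, prob = prob.MCAR[miss]) == 1

# Fill the drawn MCAR entries from tab.imp; the remaining NAs are
# the values left for the MNAR-oriented steps described above
tab.iter <- tab
tab.iter[is.mcar] <- tab.imp[is.mcar]
```

Because the draw is random, repeating it `nb.iter` times yields a different MCAR/MNAR split at each iteration, which is what makes the strategy a multiple imputation.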

The input matrix `tab` with imputed values instead of missing values.

Quentin Giai Gianetto <quentin2g@yahoo.fr>

```
# Load the package that provides the functions used below
library(imp4p)
# Simulating data
res.sim=sim.data(nb.pept=2000,nb.miss=600,pi.mcar=0.2,para=10,nb.cond=2,nb.repbio=3,
nb.sample=5,m.c=25,sd.c=2,sd.rb=0.5,sd.r=0.2);
# Fast imputation of missing values with the impute.rand algorithm
dat.rand=impute.rand(tab=res.sim$dat.obs,conditions=res.sim$conditions);
# Estimation of the mixture model
res=estim.mix(tab=res.sim$dat.obs, tab.imp=dat.rand, conditions=res.sim$conditions);
# Computing probabilities to be MCAR
born=estim.bound(tab=res.sim$dat.obs,conditions=res.sim$conditions);
proba=prob.mcar.tab(born$tab.lower,born$tab.upper,res);
# Multiple imputation strategy with 3 iterations (can be time consuming, depending on the data set!)
data.mi=mi.mix(tab=res.sim$dat.obs, tab.imp=dat.rand, prob.MCAR=proba, conditions=
res.sim$conditions, repbio=res.sim$repbio, nb.iter=3);
```
