Imputation using a decision rule under an assumption of a mixture of MCAR and MNAR values.

Description

This function imputes a data set by combining an MCAR-devoted algorithm and an MNAR-devoted algorithm, using the probability that each missing value is MCAR. If this probability is greater than 0.5, the MCAR-devoted algorithm is used; otherwise, the MNAR-devoted algorithm is used.

Usage

impute.mix(tab, prob.MCAR, conditions, repbio=NULL, reptech=NULL, method="slsa", nknn=15,
weight=1, selec="all", ind.comp=1, progress.bar=TRUE, q=0.95)
 

Arguments

tab

A data matrix containing numeric and missing values. Each column of this matrix is assumed to correspond to an experimental sample, and each row to an identified peptide.

prob.MCAR

A matrix of probabilities that each missing value is MCAR. For instance, such a matrix can be obtained with the function prob.mcar.tab of this package.

conditions

A vector of factors indicating the biological condition to which each column (experimental sample) belongs.

repbio

A vector of factors indicating the biological replicate to which each column belongs. Default is NULL (no experimental design is considered).

reptech

A vector of factors indicating the technical replicate to which each column belongs. Default is NULL (no experimental design is considered).

method

The method used for imputing MCAR data. If method="slsa" (default), the SLSA algorithm is used; otherwise, the MLE algorithm is used.

nknn

The number of nearest neighbours used in the SLSA algorithm (see impute.slsa).

weight

The weighting scheme used in the SLSA algorithm (see impute.slsa).

selec

A parameter to select a part of the dataset to find nearest neighbours between rows. This can be useful for big data sets (see impute.slsa).

ind.comp

If ind.comp=1, only nearest neighbours without missing values are selected to fit linear models (see impute.slsa). Otherwise, the selected neighbours may contain missing values.

progress.bar

If TRUE, a progress bar is displayed.

q

A quantile value (see impute.igcda).
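
The argument descriptions above imply a few shape constraints that can be checked before calling impute.mix. The following is a minimal sketch in base R. The helper check_inputs is hypothetical and not part of the package, and it assumes that prob.MCAR has the same dimensions as tab, which this page does not state explicitly.

```r
# Hypothetical helper (NOT part of the package): basic sanity checks on the
# inputs described above, to run before calling impute.mix.
check_inputs <- function(tab, prob.MCAR, conditions,
                         repbio = NULL, reptech = NULL) {
  # tab: numeric matrix, rows = peptides, columns = experimental samples
  stopifnot(is.matrix(tab), is.numeric(tab))
  # Assumption: prob.MCAR has the same dimensions as tab
  stopifnot(is.matrix(prob.MCAR), identical(dim(prob.MCAR), dim(tab)))
  # Probabilities attached to the missing cells must lie in [0, 1]
  p <- prob.MCAR[is.na(tab)]
  stopifnot(all(p >= 0 & p <= 1))
  # One label per column (experimental sample)
  stopifnot(length(conditions) == ncol(tab))
  if (!is.null(repbio)) stopifnot(length(repbio) == ncol(tab))
  if (!is.null(reptech)) stopifnot(length(reptech) == ncol(tab))
  invisible(TRUE)
}

# Toy data: 4 peptides x 6 samples, 2 conditions with 3 samples each
tab <- matrix(seq(20, 31.5, by = 0.5), nrow = 4)
tab[2, 3] <- NA
tab[4, 5] <- NA

prob.MCAR <- matrix(0, nrow = 4, ncol = 6)
prob.MCAR[2, 3] <- 0.8
prob.MCAR[4, 5] <- 0.1

conditions <- factor(rep(c("A", "B"), each = 3))

ok <- check_inputs(tab, prob.MCAR, conditions)
```

If any check fails, stopifnot raises an error naming the failed condition, which is usually easier to diagnose than an error raised deep inside an imputation routine.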

Details

The missing values for which prob.MCAR is greater than 0.5 are considered MCAR and imputed with either the function impute.slsa or the MLE algorithm (function impute.wrapper.MLE of the R package imputeLCMD). The other missing values are considered MNAR and imputed with impute.igcda.
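
The decision rule described above can be illustrated in base R. This is only a sketch of the rule, not the package's implementation. The two imputers impute_mcar_toy and impute_mnar_toy are hypothetical stand-ins (row mean and column minimum) for impute.slsa / impute.wrapper.MLE and impute.igcda, and prob.MCAR is assumed to have the same dimensions as tab.

```r
# Sketch of the decision rule only. The two imputers below are hypothetical
# stand-ins, NOT impute.slsa, impute.wrapper.MLE or impute.igcda.
set.seed(1)

# Toy intensity matrix: 6 peptides (rows) x 4 samples (columns)
tab <- matrix(rnorm(24, mean = 25, sd = 2), nrow = 6)
tab[cbind(c(1, 3, 5), c(2, 4, 1))] <- NA

# Toy probabilities of being MCAR (assumed to have the same shape as tab)
prob.MCAR <- matrix(0, nrow = 6, ncol = 4)
prob.MCAR[1, 2] <- 0.9  # greater than 0.5: treated as MCAR
prob.MCAR[3, 4] <- 0.2  # treated as MNAR
prob.MCAR[5, 1] <- 0.5  # not greater than 0.5: treated as MNAR

# Stand-in MCAR imputer: row mean of the observed values
impute_mcar_toy <- function(x) {
  out <- x
  idx <- which(is.na(x), arr.ind = TRUE)
  out[idx] <- rowMeans(x, na.rm = TRUE)[idx[, "row"]]
  out
}

# Stand-in MNAR imputer: column minimum of the observed values
impute_mnar_toy <- function(x) {
  out <- x
  idx <- which(is.na(x), arr.ind = TRUE)
  out[idx] <- apply(x, 2, min, na.rm = TRUE)[idx[, "col"]]
  out
}

# Decision rule: split the missing cells according to prob.MCAR
is_missing <- is.na(tab)
use_mcar <- is_missing & prob.MCAR > 0.5
use_mnar <- is_missing & !use_mcar

# Fill each group of cells from the corresponding imputer
tab_imputed <- tab
tab_imputed[use_mcar] <- impute_mcar_toy(tab)[use_mcar]
tab_imputed[use_mnar] <- impute_mnar_toy(tab)[use_mnar]
```

Observed values are left untouched. Only the cells flagged in use_mcar or use_mnar are filled, each from its own imputer.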

Value

The input matrix tab with imputed values instead of missing values.

Author(s)

Quentin Giai Gianetto <quentin2g@yahoo.fr>

Examples

# Simulating data
res.sim <- sim.data(nb.pept=2000, nb.miss=600, pi.mcar=0.2, para=10, nb.cond=2,
                    nb.repbio=3, nb.sample=5, m.c=25, sd.c=2, sd.rb=0.5, sd.r=0.2)

# Fast imputation of missing values with the impute.rand algorithm
dat.rand <- impute.rand(tab=res.sim$dat.obs, conditions=res.sim$condition)

# Estimation of the mixture model
res <- estim.mix(tab=res.sim$dat.obs, tab.imp=dat.rand, conditions=res.sim$condition)

# Computing probabilities to be MCAR
born <- estim.bound(tab=res.sim$dat.obs, conditions=res.sim$condition)
proba <- prob.mcar.tab(born$tab.lower, born$tab.upper, res)

# Imputation under the assumption of MCAR and MNAR values
tabi <- impute.mix(tab=res.sim$dat.obs, prob.MCAR=proba, conditions=res.sim$conditions,
                   repbio=res.sim$repbio, method="slsa", nknn=15, weight=1,
                   selec="all", ind.comp=1, progress.bar=TRUE)
