Imputing missing values by assuming that the distribution of complete values is Gaussian in each column of an input matrix. This algorithm is named "Imputation under a Gaussian Complete Data Assumption" (IGCDA).

Description

This function imputes missing values under the assumption that the complete values follow a Gaussian distribution in each column.

Usage

impute.igcda(tab, tab.imp, conditions, q=0.95) 

Arguments

tab

A numeric vector or matrix with observed and missing values.

tab.imp

A matrix where the missing values of tab have been imputed under the assumption that they are all MCAR. For instance, such a matrix can be obtained by using the function impute.slsa of this package.

conditions

A vector of factors indicating the biological condition to which each column (experimental sample) belongs.

q

A quantile value (see Details).

Details

The mean and variance of the Gaussian distribution are determined using a linear regression between the quantiles of the observed values q_{obs} and the ones of the standard normal distribution q_{N(0,1)}.
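The quantile regression described above can be illustrated with a minimal sketch. It is not the code used by impute.igcda: the simulated column, the censoring threshold, the plotting positions and the names mu_hat and sigma_hat are all assumptions made for illustration. The sketch assumes that the missing values lie in the lower tail of a column, so that the observed values occupy the top ranks of the complete sample, and it regresses the observed quantiles on the matching N(0,1) quantiles. The intercept then estimates the mean and the slope estimates the standard deviation.

```r
# Illustrative sketch only: not the imp4p implementation.
set.seed(1)
x <- rnorm(500, mean = 25, sd = 2)  # hypothetical complete intensities of one column
x[x < 22.5] <- NA                   # hypothetical left-censoring of low values

n <- length(x)
obs <- sort(x[!is.na(x)])           # quantiles of the observed values (q_obs)
n_miss <- n - length(obs)

# Assumption: observed values hold ranks n_miss + 1, ..., n of the complete sample.
p <- (seq(n_miss + 1, n) - 0.5) / n
q_norm <- qnorm(p)                  # matching quantiles of N(0, 1)

# Linear regression q_obs = mu + sigma * q_N(0,1)
fit <- lm(obs ~ q_norm)
mu_hat <- unname(coef(fit)[1])      # intercept: estimated mean
sigma_hat <- unname(coef(fit)[2])   # slope: estimated standard deviation
```

With these simulated data, mu_hat and sigma_hat should land close to the values 25 and 2 used to generate the column, even though the lowest intensities were removed.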

The quantile value q is used to determine the minimum of the imputed values. This minimum is the minimum observed value in the dataset minus quant_diff(q), where quant_diff(q) is the quantile of order q of the differences between the maximum and the minimum of the observed values of each peptide in the condition. As a result, if q is close to 1, quant_diff(q) represents an extreme value of the difference between the maximum and the minimum intensity values of a peptide within a condition.
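The lower bound described above can be worked through on a small example. This is a sketch under stated assumptions, not the code used by impute.igcda: the matrix tab_cond and the names ranges, quant_diff and min_imputed are hypothetical, the dataset is reduced to a single condition, and the default quantile definition of R's quantile() is assumed.

```r
# Illustrative sketch only: not the imp4p implementation.
# Hypothetical intensities: 4 peptides (rows) x 3 samples (columns) of one condition.
tab_cond <- matrix(c(20, 21,   23,
                     25, NA,   26,
                     30, 30.5, 31,
                     NA, 22,   27),
                   nrow = 4, byrow = TRUE)

# Difference between the maximum and the minimum observed value of each peptide.
ranges <- apply(tab_cond, 1, function(v) diff(range(v, na.rm = TRUE)))

# quant_diff(q): quantile of order q of these differences.
q <- 0.95
quant_diff <- quantile(ranges, probs = q, names = FALSE)

# Minimum of imputed values: minimum observed value minus quant_diff(q).
min_imputed <- min(tab_cond, na.rm = TRUE) - quant_diff
```

Here the per-peptide differences are 3, 1, 1 and 5, so a value of q close to 1 pushes quant_diff towards the largest difference and lowers the bound below which no value is imputed.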

Value

The numeric input matrix with imputed values. The distribution of the intensity values in each of its columns is assumed to be Gaussian.

Author(s)

Quentin Giai Gianetto <quentin2g@yahoo.fr>

Examples

#Simulating data
res.sim=sim.data(nb.pept=2000,nb.miss=600,pi.mcar=0.2,para=10,nb.cond=2,nb.repbio=3,
nb.sample=5,m.c=25,sd.c=2,sd.rb=0.5,sd.r=0.2);

#Imputation of missing values with the slsa algorithm
dat.slsa=impute.slsa(tab=res.sim$dat.obs,conditions=res.sim$condition,repbio=res.sim$repbio);

#Imputation of missing values under a Gaussian assumption
dat.gauss=impute.igcda(tab=res.sim$dat.obs, tab.imp=dat.slsa, conditions=res.sim$conditions);
