
Wrapper function to select an optimal number of neighbours (`k`) in `impute.knn` from the `impute` package. For several values of `k`, predictions made by `impute.knn` on randomly selected data points are compared with their original values to calculate the root mean squared error (RMSE). In the original matrix, `thres` is the limit under which intensities are considered missing. `perc` is the percentage of "non missing" intensities randomly selected to estimate the RMSE. The optimal number `koptim` is the value of `k` for which the improvement in RMSE falls below 10%. This value is automatically used to compute the resulting matrix `x`.
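The evaluation step described above (hide some known intensities, impute them, compare with the originals) can be sketched in base R. This is only an illustration under stated assumptions: `mask_rmse` and `impute_col_mean` are hypothetical helpers that are not part of FIEmspro, a simple column-mean imputer stands in for `impute.knn`, values strictly below `thres` are treated as missing, and the optional log transformation is left out.

```r
# Sketch: estimate imputation RMSE by hiding known values and re-predicting them.
# `mask_rmse` and `impute_col_mean` are hypothetical names, not FIEmspro functions.
mask_rmse <- function(x, thres, perc, impute_fun) {
  x <- as.matrix(x)
  x[x < thres] <- NA                        # assumption: values under the limit are missing
  known <- which(!is.na(x))                 # positions of "non missing" intensities
  n_hide <- max(1, round(perc * length(known)))
  hidden <- known[sample.int(length(known), n_hide)]  # random subset used for evaluation
  truth <- x[hidden]
  x_masked <- x
  x_masked[hidden] <- NA                    # hide the selected known values
  pred <- impute_fun(x_masked)[hidden]      # predictions for the hidden positions
  sqrt(mean((pred - truth)^2))              # root mean squared error
}

# Stand-in imputer (column means); impute.knn would take its place in practice.
impute_col_mean <- function(m) {
  for (j in seq_len(ncol(m))) {
    miss <- is.na(m[, j])
    m[miss, j] <- mean(m[, j], na.rm = TRUE)
  }
  m
}

set.seed(1)
toy <- matrix(rexp(200, rate = 0.2), nrow = 20)   # toy intensity matrix
err <- mask_rmse(toy, thres = 1, perc = 0.1, impute_fun = impute_col_mean)
```

Repeating the call several times for each candidate `k` would give one RMSE value per iteration and per `k`, which is the kind of information the wrapper is described as collecting.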


| Argument | Description |
| --- | --- |
| `x` | A data frame or matrix to be imputed. |
| `thres` | Threshold below which intensities in `x` are considered missing. |
| `log.t` | A logical specifying whether the log transformation is applied to the data set before imputation. |
| `lk` | A vector of numbers of neighbours to be tested. |
| `perc` | Percentage of non-low values to be randomly selected, given as a fraction (for example `0.1` for 10%). |
| `niter` | Number of iterations. |
| `...` | Arguments passed to or from other methods. |

A list containing the following components:

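| Component | Description |
| --- | --- |
| `x` | The imputed data matrix, computed using `koptim` neighbours. |
| `koptim` | Optimal number of neighbours found among the values in `lk`. |
| `rmse` | Root mean squared error matrix. |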
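As a rough illustration of how an optimal `k` could be derived from RMSE values, the sketch below applies one possible reading of the "less than 10% improvement" rule. Everything here is an assumption made for illustration: `pick_k` is a hypothetical helper, the matrix layout (one row per iteration, one column per candidate `k`) is guessed, and the actual rule used by `koptimp` may differ.

```r
# Hypothetical helper: pick k from a matrix of RMSE values.
# Assumed layout: rows = iterations, columns = candidate k (same order as lk).
pick_k <- function(rmse, lk, tol = 0.10) {
  avg <- colMeans(rmse)
  # relative RMSE reduction obtained by moving from lk[i] to lk[i + 1]
  gain <- (avg[-length(avg)] - avg[-1]) / avg[-length(avg)]
  small <- which(gain < tol)
  # first k after which adding neighbours helps by less than `tol`;
  # fall back to the largest k if every step still helps by at least `tol`
  if (length(small) == 0) lk[length(lk)] else lk[small[1]]
}

lk <- 3:6
rmse <- rbind(c(1.00, 0.80, 0.76, 0.75),   # made-up values, iteration 1
              c(1.10, 0.85, 0.80, 0.79))   # made-up values, iteration 2
k_best <- pick_k(rmse, lk)
```

With these made-up values, going from 3 to 4 neighbours lowers the mean RMSE by about 21%, while going from 4 to 5 lowers it by about 5%, so the sketch selects `k = 4`.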

Version of package `impute` must be 1.8.0 or greater. At the time this package was written, only the version available on the Bioconductor website seemed to be regularly updated.

David Enot

Hastie, T., Tibshirani, R., Sherlock, G., Eisen, M., Brown, P. and
Botstein, D. (1999). Imputing Missing Data for Gene Expression Arrays,
*Stanford University Statistics Department Technical report*.
http://www-stat.stanford.edu/~hastie/Papers/missing.pdf

Olga Troyanskaya, Michael Cantor, Gavin Sherlock, Pat Brown,
Trevor Hastie, Robert Tibshirani, David Botstein and Russ B. Altman (2001).
Missing value estimation methods for DNA microarrays. *Bioinformatics*.
Vol. 17, no. 6, Pages 520-525.

```
## load the packages and the data
library(FIEmspro)
library(impute)
data(abr1)
mat <- abr1$pos[, 110:300]
## find an optimal number of k between 3 and 6 to impute values lower than 1
## 10 perc. of intensities >1 are used to evaluate each solution
## imputation is done with the log transformed matrix
res <- koptimp(mat, thres = 1, log.t = TRUE, lk = 3:6, perc = 0.1, niter = 5)
names(res)
## check RMSE of the solutions at various k
boxplot(res$rmse, xlab = "Number of neighbours", ylab = "Root mean square error")
## Do the imputation with a given k
## thres=1 and log.t=TRUE
mat[mat <= 1] <- NA
mat <- log(mat)
## uses k=6 for example
## rowmax = 1 and colmax = 1 allow any proportion of missing values
mimp <- t(impute.knn(t(mat), k = 6, rowmax = 1, colmax = 1, maxp = ncol(mat))$data)
## transform to the original space
mimp <- exp(mimp)
```

wilsontom/FIEmspro documentation built on Feb. 19, 2018, 9:03 a.m.
