Description Usage Arguments Value Author(s) References See Also Examples
Filling gaps in a data matrix with hierarchical information (e.g. taxonomy) using Gibbs sampling (BHPMF). Will save the mean and standard deviations (uncertainties) of the predictions for both missing and observed values into the given file paths.
1 2 3 4 5 | GapFilling(X, hierarchy.info, prediction.level,
used.num.hierarchy.levels, num.samples=1000, burn=100, gaps=2,
num.latent.feats=10, tuning=FALSE, num.folds.tuning=10, tmp.dir,
mean.gap.filled.output.path, std.gap.filled.output.path,
rmse.plot.test.data=TRUE, verbose=FALSE)
|
X |
The data matrix (including missing values). Rows are the observations, columns are the features. Missing values must be filled by NA. |
hierarchy.info |
A matrix containing the hierarchical information in the same order as matrix X. Data should be ordered in descending order (the lowest level being in the last column). The first column is the observation index. |
prediction.level |
The level at which gaps are filled. For example if set to 4, gaps are filled at the 3rd highest hierarchy level. The default value is at the observation level (filling gaps/missing values at the observation level). |
used.num.hierarchy.levels |
Number of hierarchy levels used for gap filling. For example, if set to 2, only information from the first and second levels of the hierarchy are used. The default value equals the total number of hierarchy levels. |
num.samples |
This is aparameter for Gibbs sampling: total number of generated samples at each fold using gibbs sampling. Note: this is not the effective number of samples. The default value is 1000. |
burn |
This is aparameter for Gibbs sampling: burn in period. Number of initially sampled parameters discarded. The default value is 200. Should be smaller than num.samples. |
gaps |
This is aparameter for Gibbs sampling: gap between sampled parameters kept. The default value is 2. |
num.latent.feats |
Size of latent vectors in BHPMF. If tuning is FALSE, set to value (default value: 10) |
tuning |
If set to |
num.folds.tuning |
Number of cross validation folds for tuning. The default value is 10 if tuning set to |
tmp.dir |
A temporary directory used to save preprocessing files. If not provided, a tmp directory in /R/tmp will be created. If provided, each time a function from this package is called, the preprocessing files saved in this directory will be used. Helps to avoid having to re-run preprocessing functions for the same dataset. WARNING: This directory should be empty for the first time call on a new dataset! |
mean.gap.filled.output.path |
File path for saving the mean imputations (gap filled data) for both missing and observed values. In the same format as the input X. |
std.gap.filled.output.path |
File path for saving the standard deviations (uncertainties) of the imputations (gap filled data). In the same format as the input X. |
rmse.plot.test.data |
If |
verbose |
If |
If tuning is TRUE
, then return a list with the following tags:
best.number.latent.features |
The best value for latent vectors length |
min.rmse |
The minimum RMSE for the latent vectors of size best.number.latent.features |
Farideh Fazayeli farideh@cs.umn.edu
F. Fazayeli, A. Banerjee, J. Kattge, F. Schrodt, P.B. Reich (2014) Uncertainty quantified matrix completion using Bayesian Hierarchical Matrix factorization. International Conference on Machine Learning and Applications (ICMLA)
F. Schrodt, J. Kattge, H. Shan, F. Fazayeli, A. Karpatne, A. Banerjee, M. Reichstein, M. Boenisch, S. Diaz, J. Dickie, A. Gillison, V. Kumar, S. Lavorel, P.W. Leadley, C. Wirth, I. Wright, S.J. Wright, P.B. Reich (2015) HPMF - a hierarchical Bayesian approach to gap-filling and trait prediction for macroecology and functional biogeography. Global Ecology and Biogeography 24, 1510–1521
H. Shan, J. Kattge, P. B. Reich, A. Banerjee, F. Schrodt, and M. Reichstein. (2012) Gap Filling in the Plant Kingdom – Trait Prediction Using Hierarchical Probabilistic Matrix Factorization . International Conference on Machine Learning (ICML)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 | ## Not run:
library(BHPMF)
data(trait.info) # Read the matrix X
data(hierarchy.info) # Read the hierarchy information
# Filling the gaps in X
GapFilling(trait.info, hierarchy.info, num.latent.feats=15, num.samples=100, burn=10, gaps=2
mean.gap.filled.output.path = "/tmp/mean_gap_filled.txt",
std.gap.filled.output.path="/tmp/std_gap_filled.txt")
###################
# Usage 1: Figure 1
###################
GapFilling(trait.info, hierarchy.info,
mean.gap.filled.output.path = "/tmp/mean_gap_filled.txt",
std.gap.filled.output.path="/tmp/std_gap_filled.txt")
###################
# Usage 2: Figure 2
###################
GapFilling(trait.info, hierarchy.info,
prediction.level = 4,
used.num.hierarchy.levels = 2,
mean.gap.filled.output.path = "/tmp/mean_gap_filled.txt",
std.gap.filled.output.path="/tmp/std_gap_filled.txt")
###################
# Usage 3: Figure 3
###################
GapFilling(trait.info, hierarchy.info,
prediction.level = 3,
used.num.hierarchy.levels = 2,
mean.gap.filled.output.path = "/tmp/mean_gap_filled.txt",
std.gap.filled.output.path="/tmp/std_gap_filled.txt")
## End(Not run)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.