The goal of augury is to streamline the process of fitting models to infill and forecast data, particularly for models used within the WHO's Triple Billion framework.
You can install augury from GitHub. The package depends on the INLA package, which is not available on CRAN, so you will need to install it separately before installing augury, following the code below.
```r
if (!require("INLA")) {
  install.packages(
    "INLA",
    repos = c(getOption("repos"), INLA = "https://inla.r-inla-download.org/R/stable"),
    dep = TRUE
  )
}
remotes::install_github("gpw13/augury")
```
Most of the functions available through the package are designed to streamline fitting a model and replacing missing observations within a single data frame in one pass.
- `predict_general_mdl()` takes in a data frame and a generic R modeling function (such as `stats::lm()`), fits the model, and returns newly projected data, error metrics, the fitted model itself, or a list of all three.
- `predict_inla()` is similar, except that rather than taking a generic modeling function it accepts a formula and additional arguments passed to `INLA::inla()` to perform Bayesian analysis of structured additive models.
- `predict_forecast()` instead uses forecasting methods available from `forecast::forecast()` to fit exponential smoothing or other time series models to a response variable.

Given that models might need to be fit separately to different portions of a data frame, such as fitting time series models for each country individually, these functions accept the argument `group_models`, which determines whether to fit separate models to each group in the data frame before joining them back together into the original data frame, as sketched below.
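As a minimal sketch of how this might look, assuming a country-year data frame and argument names like `group_col` and `group_models` (these names are assumptions, so check `?predict_general_mdl` for the actual interface):

```r
library(augury)

# toy data: two countries with annual observations and recent years missing
# (the column names here are an assumption about the expected input format)
df <- data.frame(
  iso3  = rep(c("AFG", "ALB"), each = 10),
  year  = rep(2011:2020, times = 2),
  value = c(50 + 0.5 * (1:7), NA, NA, NA,
            60 - 0.3 * (1:7), NA, NA, NA)
)

# fit a separate linear model per country and infill/forecast the missing years;
# argument names are assumptions, see ?predict_general_mdl
fitted <- predict_general_mdl(
  df,
  model        = stats::lm,
  formula      = value ~ year,
  group_col    = "iso3",
  group_models = TRUE
)
```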
Additional functions are provided, as requested, to streamline the fitting of specific models. For general modeling, there are grouped and individual functions for `stats::lm()`, `stats::glm()`, and `lme4::lmer()` to fit linear models, generalized linear models, and linear mixed-effects models respectively. These wrappers, illustrated briefly below, are:

- `predict_lm()`
- `predict_glm()`
- `predict_lmer()`
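For instance, using the toy `df` from the sketch above, the linear model could be fit through the wrapper directly; again, the argument names are assumptions rather than the documented interface (see `?predict_lm`):

```r
# equivalent in spirit to predict_general_mdl(df, model = stats::lm, ...)
fitted_lm <- predict_lm(
  df,
  formula      = value ~ year,
  group_col    = "iso3",
  group_models = TRUE
)
```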
As well, wrappers are provided around INLA for the models currently in place for the Triple Billion framework: a time series model with no additional covariates (fit individually by country) and a mixed-effects model using covariates (sketched below).

- `predict_inla_me()`
- `predict_inla_ts()`
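A rough sketch of calling the time series wrapper, assuming it accepts a country-year data frame like the toy `df` above and that the package defaults apply; the real interface may require more arguments (see `?predict_inla_ts` and `?predict_inla_me`):

```r
# only attempt this if INLA is installed; the bare call below assumes the
# wrapper's defaults match the toy data, which may not hold in practice
if (requireNamespace("INLA", quietly = TRUE)) {
  fitted_ts <- predict_inla_ts(df)
}
```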
And for forecasting, generic functions are provided for simple exponential smoothing and Holt's linear trend exponential smoothing (sketched below):

- `predict_holt()`
- `predict_ses()`
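A sketch of the forecasting wrappers on the same toy data; the `response`, `group_col`, and `group_models` argument names are assumptions (see `?predict_holt` and `?predict_ses`):

```r
# simple exponential smoothing and Holt's linear trend, one model per country
fc_ses  <- predict_ses(df, response = "value", group_col = "iso3", group_models = TRUE)
fc_holt <- predict_holt(df, response = "value", group_col = "iso3", group_models = TRUE)
```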
For convenience, the covariates used within the default INLA modeling are exported from the package and can easily be joined onto a data frame for use in modeling. These are available through `augury::covariates_df`.
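For example, the covariates could be joined onto a country-year data frame like the toy `df` above; the join keys `iso3` and `year` are an assumption about the columns present in `covariates_df`:

```r
library(dplyr)

# attach the packaged covariates before fitting a covariate-based model
df_cov <- df %>%
  left_join(augury::covariates_df, by = c("iso3", "year"))
```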
While we typically want to fit a model to a data set directly, augury has a set of `predict_..._avg_trend()` functions that allow you to average data by group (e.g. by region), fit the model to the grouped and averaged data, and extract that trend to apply to the original data set. See the section below on Average trend modeling for more details.
When doing model selection, it is always important to use metrics to evaluate model fit and predictive accuracy. All `predict_...` functions in augury use the same evaluation framework, implemented through `model_error()`. If a `test` column is defined, then the evaluation framework is only applied to that test set of data.
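As a sketch, a test set could be flagged and the error metrics returned instead of the infilled data frame; the `test_col` and `ret` argument names are assumptions (see `?predict_lm` and `?model_error`):

```r
# hold out the most recent observed years as a test set
df$test <- !is.na(df$value) & df$year >= 2016

# return error metrics for the held-out rows rather than the data frame;
# argument names are assumptions about the interface
errors <- predict_lm(
  df,
  formula  = value ~ year,
  test_col = "test",
  ret      = "error"
)
```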
To help streamline other portions of the modeling process, some other functions are included in the package.
- `probit_transform()` applies a probit transformation to selected columns of a data frame, and inverses it if specified.
- `scale_transform()` scales a vector by a single number, and can inverse the scaling if specified.
- `predict_simple()` performs linear interpolation or flat extrapolation in the same manner as the other `predict_...` functions, but without modeling or confidence bounds.
- `predict_average()` performs averaging by groups of columns in the same manner as the other `predict_...` functions, but without modeling or confidence bounds.

Together, they can be used to transform and scale data for better modeling, and then to inverse these transformations to get data back into the original feature space, as sketched below.
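A sketch of that workflow on the toy data, assuming a 0-100 indicator stored in `value` and a predicted column named `pred`; the exact arguments to `probit_transform()` and `scale_transform()` are assumptions, so check the package documentation:

```r
# scale the indicator to 0-1 and probit transform it before modeling
# (assumed usage of scale_transform() and probit_transform())
df2 <- df
df2$value <- scale_transform(df2$value, 100)
df2 <- probit_transform(df2, "value")

fitted2 <- predict_lm(df2, formula = value ~ year)

# inverse the probit transform (and the scaling, if applied) to return the
# response and predictions to the original feature space
fitted2 <- probit_transform(fitted2, c("value", "pred"), inverse = TRUE)
```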