knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "README-"
)

R/origami

Project Status: Active. The project has reached a stable, usable state and is being actively developed. License: GPL v3.

High-powered framework for cross-validation. Fold your data like it's paper!

Authors: Jeremy Coyle, Nima Hejazi, Ivana Malenica, and Rachael Phillips


What's origami?

The origami R package provides a general framework for applying cross-validation schemes to user-specified functions. Because those functions may return arbitrary lists of results, origami accommodates a wide range of cross-validation applications.
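To make the idea concrete, here is a minimal base-R sketch of what "applying a function across folds and combining its list of results" means. It deliberately does not use origami and is not how the package is implemented; the names `fold_id`, `fold_fun`, and `per_fold`, the choice of 5 folds, and the model `mpg ~ wt` are all illustrative assumptions.

```r
# Base-R sketch of V-fold cross-validation (illustrative only; this is
# NOT origami's implementation).
set.seed(1)
n <- nrow(mtcars)
V <- 5

# randomly assign each row to one of V folds of near-equal size
fold_id <- sample(rep(seq_len(V), length.out = n))

# hypothetical fold-wise function: fit on the training rows, evaluate on
# the validation rows, and return a list of results
fold_fun <- function(v) {
  train <- mtcars[fold_id != v, ]
  valid <- mtcars[fold_id == v, ]
  fit <- lm(mpg ~ wt, data = train)
  list(
    coef = coef(fit),
    sq_err = (valid$mpg - predict(fit, newdata = valid))^2
  )
}

per_fold <- lapply(seq_len(V), fold_fun)

# combine each named element of the per-fold lists across folds
coefs <- do.call(rbind, lapply(per_fold, `[[`, "coef"))
sq_err <- unlist(lapply(per_fold, `[[`, "sq_err"))
cv_mse <- mean(sq_err)
```

Here `coefs` has one row per fold and `sq_err` has one entry per observation, since every row appears in exactly one validation set.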


Installation

For standard use, we recommend installing the package from CRAN via

install.packages("origami")

Alternatively, a stable release of origami can be installed from GitHub via devtools:

devtools::install_github("tlverse/origami")

Usage

For details on how best to use origami, please consult the package documentation and introductory vignette online, or do so from within R.
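As a rough guide, the documentation can be reached from within R using standard base R help utilities. This sketch assumes origami is already installed; it names no specific vignette, since the available vignettes are listed by the calls themselves.

```r
# show the package help index (lists all documented functions)
help(package = "origami")

# list the vignettes shipped with the installed package
vignette(package = "origami")

# open the help page for the main entry point
help("cross_validate", package = "origami")
```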


Example

This minimal example shows how to use origami to apply cross-validation to a linear regression fit on a sample data set. In particular, we obtain a cross-validated estimate of the mean squared error of prediction:

library(stringr)
library(origami)
set.seed(4795)

data(mtcars)
head(mtcars)

# build a cv_fun that wraps around lm
cv_lm <- function(fold, data, reg_form) {
  # get name and index of outcome variable from regression formula
  out_var <- as.character(unlist(str_split(reg_form, " "))[1])
  out_var_ind <- as.numeric(which(colnames(data) == out_var))

  # split up data into training and validation sets
  train_data <- training(data)
  valid_data <- validation(data)

  # fit linear model on training set and predict on validation set
  mod <- lm(as.formula(reg_form), data = train_data)
  preds <- predict(mod, newdata = valid_data)

  # capture results to be returned as output: the fitted coefficients and
  # the squared prediction errors (SE) on the validation set
  out <- list(coef = data.frame(t(coef(mod))),
              SE = ((preds - valid_data[, out_var_ind])^2))
  return(out)
}

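# create the folds (V-fold cross-validation by default)
folds <- make_folds(mtcars)

# run cv_lm on every fold and combine the results across folds
results <- cross_validate(cv_fun = cv_lm, folds = folds, data = mtcars,
                          reg_form = "mpg ~ .")

# cross-validated mean squared error
mean(results$SE)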

For details on how to write wrappers (cv_funs) for use with origami::cross_validate, please consult the documentation and vignettes that accompany the package.
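As a smaller illustration of the cv_fun pattern, the sketch below cross-validates a trivial predictor: the training-set mean of `mpg`. The names `cv_mean`, `sq_err`, and `cv_mse` are hypothetical, and the choices of `V = 5` and `use_future = FALSE` (to run sequentially) are assumptions made for the example rather than recommendations.

```r
library(origami)
set.seed(4795)

# hypothetical cv_fun: predict mpg in the validation set using the
# mean of mpg in the training set
cv_mean <- function(fold, data) {
  train_data <- training(data, fold = fold)
  valid_data <- validation(data, fold = fold)

  train_mean <- mean(train_data$mpg)

  # return a list; each element is combined across folds
  list(sq_err = (valid_data$mpg - train_mean)^2)
}

# 5-fold cross-validation; V is passed through to the fold function
folds <- make_folds(mtcars, V = 5)
res <- cross_validate(cv_fun = cv_mean, folds = folds, data = mtcars,
                      use_future = FALSE)

# cross-validated mean squared error of the mean-only predictor
cv_mse <- mean(res$sq_err)
```

Any cv_fun follows the same shape: it takes a `fold` as its first argument, extracts the training and validation sets, and returns a named list.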


Issues

If you encounter any bugs or have any specific feature requests, please file an issue.


Contributions

Contributions are very welcome. Interested contributors should consult our contribution guidelines prior to submitting a pull request.


Citation

After using the origami R package, please cite it:

    @article{coyle2018origami,
      author = {Coyle, Jeremy R and Hejazi, Nima S},
      title = {origami: A Generalized Framework for Cross-Validation in R},
      journal = {The Journal of Open Source Software},
      volume = {3},
      number = {21},
      month = {January},
      year  = {2018},
      publisher = {The Open Journal},
      doi = {10.21105/joss.00512},
      url = {https://doi.org/10.21105/joss.00512}
    }

License

© 2017-2021 Jeremy R. Coyle

The contents of this repository are distributed under the GPL-3 license. See file LICENSE for details.


