knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-"
)

dpurifyr

Build Status CRAN_Status_Badge

Overview

dpurifyr package gives you a practical way for data preprocessing providing a consistent set of verbs that help you solve the most common data preprocessing challenges.

Installation

You can install from CRAN with:

# Not yet
# install.packages("dpurifyr")

# Or the development version from GitHub:
# install.packages("devtools")
devtools::install_github("teramonagi/dpurifyr")

Simple Example

The following example uses dpurifyr to solve a fairly realistic problem: apply different types of data preprocessing (standard_scale and scale_minmax) to columns (dplyr::starts_with("Sepal") and Petal.Width) selected by the way which is consistent with other tidyverse packages.

library("dpurifyr")
df <- head(iris)
# Create Data Pre-Processing chain while data preproessing for df is done.
pp <- dpurifyr::scale_standard(df, dplyr::starts_with("Sepal")) %>% 
  dpurifyr::scale_minmax(Petal.Width) 

# PreProcessing object(pp) behave like data.frame object
head(pp)

Once you get preprocessing object, preprocessing object can be applied to the other data. This means that you can apply the same preprocessing with fixed parameter to other data.

# You can apply the same preprocessing to different data.frame
pp <- dpurifyr::scale_standard(head(iris), Sepal.Width, Petal.Length) %>% 
  dpurifyr::scale_standard(Sepal.Length) 
# Using the same parameters( `use_param=TRUE`) estimated in pp object.
pp2 <- dpurifyr::apply(head(iris, 10), pp, TRUE) 
pp2

You can find more information in vignettes.

Contribution

Citation

To cite package dpurifyr in publications use:

Nagi Teramo, Shinichi Takayanagi (2017). dpurifyr: A grammar of data preprocessing. R package version 0.1.0. https://github.com/teramonagi/dpurifyr

A BibTeX entry for LaTeX users is

@Manual{,
  title = {dpurifyr: A grammar of data preprocessing},
  author = {Nagi Teramo, Shinichi Takayanagi},
  year = {2017}, 
  note = {R package version 0.1.0},
  url = {https://github.com/teramonagi/dpurifyr},
}


teramonagi/dpurifyr documentation built on May 29, 2019, 9:52 a.m.