intubate-package: Interface to Popular R Functions for Data Science Pipelines.


Description

The aim of intubate (logo <||>) is to offer a painless way to add R functions that are not pipe-aware to data science pipelines implemented by 'magrittr' with the operator %>%, without having to rely on workarounds of varying complexity. In addition, three extensions for pipelines, called 'intubOrders', 'intuEnv', and 'intuBags', are implemented.
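
To make the problem concrete, here is a minimal base-R sketch of why a formula-first function such as lm does not fit a data-first pipeline, and of the kind of wrapper that works around it. The helper data_first is hypothetical and written only for this illustration; it is not part of intubate and does not reflect how the package is implemented.

```r
## Hypothetical helper (NOT part of intubate): accept the data first,
## then forward the remaining arguments to `f`, passing the data by name.
data_first <- function(data, f, ...) {
  f(..., data = data)
}

## lm() expects the formula first and the data second, so a data-first
## call (which is what a pipeline produces) fails.
direct <- try(lm(CO2, conc ~ uptake), silent = TRUE)
inherits(direct, "try-error")

## The helper puts the arguments back where lm() expects them.
fit <- data_first(CO2, lm, conc ~ uptake)
coef(fit)
```

Because the helper takes the data as its first argument, a call such as CO2 %>% data_first(lm, conc ~ uptake) would also work in a 'magrittr' pipeline.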

For a gentle introduction to intubate, please see the vignette that is included with the package.

Currently, there are 461 interfaces for:

adabag: Multiclass AdaBoost.M1, SAMME and Bagging

AER: Applied Econometrics with R

aod: Analysis of Overdispersed Data

ape: Analyses of Phylogenetics and Evolution

arm: Data Analysis Using Regression and Multilevel/Hierarchical Models

betareg: Beta Regression

brglm: Bias reduction in binomial-response generalized linear models

caper: Comparative Analyses of Phylogenetics and Evolution in R

car: Companion to Applied Regression

caret: Classification and Regression Training

coin: Conditional Inference Procedures in a Permutation Test Framework

CORElearn: Classification, Regression and Feature Evaluation

drc: Analysis of Dose-Response Curves

e1071: Support Vector Machines

earth: Multivariate Adaptive Regression Splines

EnvStats: Environmental Statistics, Including US EPA Guidance

fGarch: Rmetrics - Autoregressive Conditional Heteroskedastic Modelling

flexmix: Flexible Mixture Modeling

forecast: Forecasting Functions for Time Series and Linear Models

frontier: Stochastic Frontier Analysis

gam: Generalized Additive Models

gbm: Generalized Boosted Regression Models

gee: Generalized Estimation Equation Solver

glmnet: Lasso and Elastic-Net Regularized Generalized Linear Models

glmx: Generalized Linear Models Extended

gmnl: Multinomial Logit Models with Random Parameters

gplots: Various R Programming Tools for Plotting Data

graphics: The R Graphics Package

gss: General Smoothing Splines

hdm: High-Dimensional Metrics

Hmisc: Harrell Miscellaneous

ipred: Improved Predictors

iRegression: Regression Methods for Interval-Valued Variables

ivfixed: Instrumental fixed effect panel data model

kernlab: Kernel-Based Machine Learning Lab

kknn: Weighted k-Nearest Neighbors

klaR: Classification and Visualization

lars: Least Angle Regression, Lasso and Forward Stagewise

lattice: Trellis Graphics for R

latticeExtra: Extra Graphical Utilities Based on Lattice

leaps: Regression Subset Selection

lfe: Linear Group Fixed Effects

lme4: Linear Mixed-Effects Models using 'Eigen' and S4

lmtest: Testing Linear Regression Models

MASS: Robust Regression, Linear Discriminant Analysis, Ridge Regression, Probit Regression, ...

MCMCglmm: MCMC Generalised Linear Mixed Models

mda: Mixture and Flexible Discriminant Analysis

metafor: Meta-Analysis Package for R

mgcv: Mixed GAM Computation Vehicle with GCV/AIC/REML Smoothness Estimation

mhurdle: Multiple Hurdle Tobit Models

minpack.lm: R Interface to the Levenberg-Marquardt Nonlinear Least-Squares Algorithm Found in MINPACK, Plus Support for Bounds

mlogit: Multinomial logit model

mnlogit: Multinomial Logit Model

modeltools: Tools and Classes for Statistical Models

nlme: Linear and Nonlinear Mixed Effects Models

nlreg: Higher Order Inference for Nonlinear Heteroscedastic Models

nnet: Feed-Forward Neural Networks and Multinomial Log-Linear Models

ordinal: Regression Models for Ordinal Data

party: A Laboratory for Recursive Partytioning

partykit: A Toolkit for Recursive Partytioning

plotrix: Various Plotting Functions

pls: Partial Least Squares and Principal Component Regression

pROC: Display and Analyze ROC Curves

pscl: Political Science Computational Laboratory, Stanford University

psychomix: Psychometric Mixture Models

psychotools: Infrastructure for Psychometric Modeling

psychotree: Recursive Partitioning Based on Psychometric Models

quantreg: Quantile Regression

randomForest: Random Forests for Classification and Regression

Rchoice: Discrete Choice (Binary, Poisson and Ordered) Models with Random Parameters

rminer: Data Mining Classification and Regression Methods

rms: Regression Modeling Strategies

robustbase: Basic Robust Statistics

rpart: Recursive Partitioning and Regression Trees

RRF: Regularized Random Forest

RWeka: R/Weka Interface

sampleSelection: Sample Selection Models

sem: Structural Equation Models

spBayes: Univariate and Multivariate Spatial-temporal Modeling

stats: The R Stats Package (glm, lm, loess, lqs, nls, ...)

strucchange: Testing, Monitoring, and Dating Structural Changes

survey: Analysis of Complex Survey Samples

survival: Survival Analysis

SwarmSVM: Ensemble Learning Algorithms Based on Support Vector Machines

systemfit: Estimating Systems of Simultaneous Equations

tree: Classification and Regression Trees

vcd: Visualizing Categorical Data

vegan: Community Ecology Package

The aim is to provide interfaces to most methodologies used in data science.

The intubate core depends only on the base, stats, and utils libraries. To keep it as lean as possible, intubate will neither install nor load any library. You need to make sure that the library containing the functions to be interfaced is loaded (before or after intubate). Moreover, you can interface the functions of any library directly, without the need to create interfaces (see ntbt), so perhaps in the future that will be the preferred way of using intubate.
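
As a short sketch of that direct route, the call below passes lm itself to ntbt instead of going through a dedicated interface such as ntbt_lm. It assumes that ntbt takes the data as its first argument and the function to interface as its second, followed by that function's own arguments; please check the ntbt help page for the exact signature in your installed version.

```r
library(intubate)
library(magrittr)

## Assumption: ntbt(data, <function to interface>, ...) forwards the
## remaining arguments (here, the formula) to the interfaced function.
fit <- CO2 %>%
  ntbt(lm, conc ~ uptake)

## The result is passed along the pipeline like any other value.
fit %>%
  summary()
```

The same pattern should apply to functions from libraries that have no ready-made ntbt_ interface, provided the library that defines the function is loaded.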

intubate is still a work in progress. As such, the implementation may change in future versions until it stabilizes.

Details

Package: intubate
Type: Package
Version: 1.4.0
Date: 2016-09-09
License: GPL (>=2)

See examples of use below.

Author(s)

Roberto Bertolusso

Maintainer: Roberto Bertolusso <rbertolusso@rice.edu>

See Also

intubate

Examples

## Not run: 
library(intubate)
library(magrittr)

######### Interface to lm #########
## Original function to interface
lm(conc ~ uptake, CO2)

## The interface reverses the order of data and formula
ntbt_lm(CO2, conc ~ uptake)

## so it can be used easily in a pipeline.
CO2 %>%
  ntbt_lm(conc ~ uptake)

CO2 %>%
  ntbt_lm(conc ~ uptake) %>%
  summary()

######### Interface to cor.test #########
## Original function to interface
cor.test(~ CONT + INTG, data = USJudgeRatings)

## The interface reverses the order of data and formula
ntbt_cor.test(data = USJudgeRatings, ~ CONT + INTG)

## so it can be used easily in a pipeline.
USJudgeRatings %>%
  ntbt_cor.test(~ CONT + INTG)
  
######### Interfaces to aggregate and xtabs #########
## Original function to interface
ag <- aggregate(len ~ ., data = ToothGrowth, mean)
xtabs(len ~ ., data = ag)

## The interface reverses the order of data and formula
ag <- ntbt_aggregate(ToothGrowth, len ~ ., mean)
ntbt_xtabs(ag, len ~ .)

## so it can be used easily in a pipeline.
ToothGrowth %>%
  ntbt_aggregate(len ~ ., mean) %>%
  ntbt_xtabs(len ~ .)

## End(Not run)

rbertolusso/intubate documentation built on May 27, 2019, 3 a.m.