meshML: Mesh ML


View source: R/meshML.R

Description

The function meshML is designed to mesh (or embed) the score calculated by a machine learning (ML) method into a parametric regression, in particular a generalized linear model (GLM). The motivation is to harness the power of ML methods for variables whose marginal effects do not need a simple interpretation, while keeping a transparent structure, with a simple interpretation and direct understanding of marginal effects, for the variables of interest.

Since collinearities may exist between the variables used by the two methods, and since the parameters of interest are the ones included in the parametric regression, the ML regression is run on an orthogonalized version of its explanatory variables. The orthogonalization is done using simple linear regression, and the ML regression is then run on the residuals.
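The idea described above can be sketched by hand as follows. This is an illustration only, not the package's internal code, and it rests on two assumptions that the text does not spell out: that each ML variable is regressed on the parametric variable(s) and replaced by its residuals, and that the ML score enters the GLM as one extra covariate. The names p_vars, ml_vars, ml_data, p_data and ml_score are hypothetical; randomForest is used because it is the documented default ML method.

```r
library(randomForest)  # documented default ML method

set.seed(1)
p_vars  <- "disp"                # variables kept in the parametric model
ml_vars <- c("cyl", "hp", "wt")  # variables handed to the ML method

# 1. Orthogonalization (assumed form): one "lm" fit per ML variable,
#    regressing it on the parametric variable(s).
ort <- lapply(ml_vars, function(v) {
  lm(reformulate(p_vars, response = v), data = mtcars)
})
names(ort) <- ml_vars

# The residuals are uncorrelated (in-sample) with the parametric variables.
ml_data <- as.data.frame(lapply(ort, residuals))
ml_data$mpg <- mtcars$mpg

# 2. ML regression of the response on the orthogonalized variables.
ML <- randomForest(mpg ~ cyl + hp + wt, data = ml_data)

# 3. Embed the ML score in the parametric regression (assumed form).
#    predict() without newdata returns out-of-bag predictions.
p_data <- mtcars
p_data$ml_score <- predict(ML)
param <- glm(mpg ~ disp + ml_score, data = p_data, family = gaussian)

summary(param)
```

In this sketch the coefficient on disp keeps its usual GLM interpretation, while the remaining variables act only through ml_score. Out-of-bag predictions are used for the score so that it is not fitted to the same observations it is later evaluated on; that is a design choice of the sketch, not something the documentation states.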

Usage

meshML(data, p_FUN = glm, p_formula, p_args = list(family = gaussian),
  ml_FUN = NULL, ml_formula, ml_args = list())

Arguments

data

The data-frame containing the data

p_FUN

A function to calculate the blended parametric regression. Defaults to R's glm

p_formula

A two-sided formula with the dependent variable and the explanatory variables to be included directly in the parametric regression

p_args

A named list of arguments to be passed to the parametric regression function (e.g. list(family = binomial) for logistic regression)

ml_FUN

A function to calculate the (non-parametric) ML regression. When set to NULL (the default), the function will try to load and use randomForest.

ml_formula

A one-sided formula with the explanatory variables to be included in the ML regression (the dependent variable is taken from the parametric formula)

ml_args

A named list of arguments to be passed to the ML regression function

Value

The function blendML returns an S3 object with class "blendML" that contains:

ort

A list containing the outputs of the orthogonalization process (a list of "lm" objects)

ML

The output of the non-parametric regression (by default an object of class "randomForest")

param

The output of the parametric regression (by default an object of class "glm")

Examples

# Linear regression & random forests
x <- blendML(data = mtcars, p_formula = mpg ~ disp,
  ml_formula = ~ cyl + hp + wt)
summary(x$param)
# Logistic regression & random forests
x <- blendML(data = mtcars, p_formula = vs ~ disp,
  p_args = list(family = binomial), ml_formula = ~ cyl + hp + wt)
summary(x$param)

ytoren/blendML documentation built on May 4, 2019, 5:33 p.m.