Visualizing the relationship between y and x in a partition model

Description

Attempts to show how a partition or random forest model represents the relationship between y and x.

Usage

visualize_relationship(TREE,interest,on,smooth=TRUE,marginal=TRUE,nplots=5,
  seed=NA,pos="topright")

Arguments

TREE

A partition or random forest model (many other regression models work as well).

interest

The name of the predictor variable for which the plot of y vs. x is to be made.

on

A dataframe giving the values of the other predictor variables for which the relationship is to be visualized. Typically this is the dataframe on which the partition model was built.

smooth

If TRUE, the relationship is drawn as a loess-smoothed curve.

marginal

If TRUE, the modeled value of y at a particular value of x is the average of the predicted values of y over all rows that share that value of x. If FALSE, nplots rows are selected from on; for each selected row, all other predictors are held fixed at that row's values, and the plot shows the relationship between y and x for that particular set of characteristics.

nplots

The number of rows of on for which the relationship is plotted (used only when marginal is FALSE).

seed

The seed for the random number generator, if reproducibility is required.

pos

The location of the legend (default "topright").
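
The averaging described for marginal=TRUE can be illustrated with a minimal sketch. It uses only base R and the built-in mtcars data, with lm standing in for TREE and cyl standing in for the predictor of interest; these choices are assumptions made for illustration, and the code is not the package's own implementation.

```r
# Sketch only: mimics the averaging described for marginal=TRUE,
# using base R and the built-in mtcars data (not the package's own code).
fit <- lm(mpg ~ wt + hp + cyl, data = mtcars)

# Predicted y for every row of the data
pred <- predict(fit, newdata = mtcars)

# Average the predictions over the rows that share a common value of x (here cyl)
marginal_curve <- tapply(pred, mtcars$cyl, mean)
print(marginal_curve)
```

Each entry of marginal_curve is the modeled value of y at one observed value of x, averaged over whatever the other predictors happen to be in those rows.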

Details

The function draws a scatterplot of y vs. x from the on dataframe, then overlays how TREE models that relationship: the predicted value of y for each row in the data, plus a curve summarizing the relationship. This makes it possible to see what the relationship between y and x, as modeled by TREE, "looks like", both overall and for particular combinations of the other variables. If marginal is FALSE, differences in shape between the curves indicate an interaction between x and at least one other variable.
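
The idea behind marginal=FALSE can be sketched in the same spirit: pick a few rows, hold every other predictor at that row's values, sweep x over a grid, and compare the predicted curves. The sketch below is an assumption-laden illustration in base R on mtcars (lm with a wt:hp interaction in place of TREE, three rows in place of nplots), not the function's actual code.

```r
# Sketch only: mimics the idea behind marginal=FALSE with base R and mtcars.
# The model contains a wt:hp interaction, so the slope in wt depends on hp.
fit <- lm(mpg ~ wt * hp, data = mtcars)

set.seed(1)                                   # reproducible choice of rows
rows <- sample(nrow(mtcars), 3)               # hypothetical stand-in for nplots = 3
grid <- seq(min(mtcars$wt), max(mtcars$wt), length.out = 25)

# One column of predictions per selected row: vary wt, hold the rest fixed
curves <- sapply(rows, function(i) {
  newdata <- mtcars[rep(i, length(grid)), ]   # repeat the row so other predictors stay fixed
  newdata$wt <- grid                          # sweep the predictor of interest
  predict(fit, newdata = newdata)
})

matplot(grid, curves, type = "l", lty = 1, xlab = "wt", ylab = "predicted mpg")
legend("topright", legend = rownames(mtcars)[rows], col = 1:3, lty = 1)
```

With a purely additive model (mpg ~ wt + hp) the curves would be vertical shifts of one another; curves that differ in shape are the visual sign of an interaction.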

Author(s)

Adam Petrie

References

Introduction to Regression and Modeling

See Also

loess, lm, glm

Examples

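  # Assumption: visualize_relationship() and the SALARY and WINE data come from
  # the regclass package; randomForest and rpart supply the model-fitting functions.
  library(regclass)
  library(randomForest)
  library(rpart)

  # Random forest: marginal view, then seven individual rows
  data(SALARY)
  FOREST <- randomForest(Salary~.,data=SALARY)
  visualize_relationship(FOREST,interest="Experience",on=SALARY)
  visualize_relationship(FOREST,interest="Months",marginal=FALSE,nplots=7,on=SALARY)

  # Partition model: unsmoothed marginal view, then five individual rows
  data(WINE)
  TREE <- rpart(Quality~.,data=WINE)
  visualize_relationship(TREE,interest="alcohol",on=WINE,smooth=FALSE)
  visualize_relationship(TREE,interest="alcohol",on=WINE,marginal=FALSE,nplots=5,smooth=FALSE)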
