visualize_relationship: Visualizing the relationship between y and x in a partition...
In regclass: Tools for an Introductory Class in Regression and Modeling

visualize_relationship

R Documentation

Visualizing the relationship between y and x in a partition model

Description

Attempts to show how the relationship between y and x is being modeled in a partition or random forest model

Usage

visualize_relationship(TREE,interest,on,smooth=TRUE,marginal=TRUE,nplots=5,
  seed=NA,pos="topright",...)

Arguments

`TREE`	A partition or random forest model (though it works with many regression models as well)
`interest`	The name of the predictor variable for which the plot of y vs. x is to be made.
`on`	A dataframe giving the values of the other predictor variables for which the relationship is to be visualized. Typically this is the dataframe on which the partition model was built.
`smooth`	If `TRUE`, the relationship is drawn as a curve smoothed with `loess`
`marginal`	If `TRUE`, the modeled value of y at a particular value of x is the average of the predicted values of y over all rows which have that common value of x. If `FALSE`, then `nplots` rows from `on` will be selected and all other predictors will be fixed, showing the relationship between y and x for that particular set of characteristics.
`nplots`	The number of rows of `on` for which the relationship is plotted (if `marginal` is set to `FALSE`)
`seed`	The seed for the random number generator, if reproducibility is required
`pos`	The location of the legend
`...`	Additional arguments passed to `plot`, namely `xlim` and `ylim`

Details

The function shows a scatterplot of y vs. x from the `on` dataframe, then overlays how `TREE` models the relationship: the predicted value of y for each row in the data, plus a curve illustrating the relationship. It is useful for seeing what the relationship between y and x, as modeled by `TREE`, "looks like", both as a whole and for particular combinations of the other variables. If `marginal` is `FALSE`, differences in the shapes of the curves indicate the presence of some interaction between x and another variable.
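
The two modes described above can be sketched in base R. This is only an illustration of the idea, not the code `visualize_relationship` actually runs: the `lm` fit on the built-in `mtcars` data stands in for `TREE`, `cyl` stands in for the predictor of interest, and the names `fit`, `pred`, `marginal_curve`, `rows`, `x_grid` and `profiles` are made up for this example.

```r
# Illustrative sketch only; assumptions are noted in the text above.
on  <- mtcars                               # stand-in for the `on` dataframe
fit <- lm(mpg ~ cyl * wt, data = on)        # stand-in for TREE (includes an interaction)
pred <- predict(fit, newdata = on)          # predicted y for every row of `on`

# marginal = TRUE analogue: average the predictions over all rows
# that share a common value of x (here x is cyl)
marginal_curve <- tapply(pred, on$cyl, mean)
print(marginal_curve)

# marginal = FALSE analogue: pick a few rows, hold their other predictors
# fixed, and vary only x across its observed values
set.seed(1)
rows   <- sample(nrow(on), 3)
x_grid <- sort(unique(on$cyl))
profiles <- sapply(rows, function(i) {
  newdata <- on[rep(i, length(x_grid)), ]   # copies of row i
  newdata$cyl <- x_grid                     # only x changes
  predict(fit, newdata = newdata)
})
print(profiles)                             # one column per selected row
```

Because the stand-in model contains a `cyl:wt` interaction, the columns of `profiles` generally change at different rates as `cyl` varies. That kind of difference between curves is what the `marginal=FALSE` plot is meant to reveal.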

Author(s)

Adam Petrie

References

Introduction to Regression and Modeling

Examples

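  library(regclass)       # provides visualize_relationship, SALARY, and WINE
  library(randomForest)   # randomForest()
  library(rpart)          # rpart()

  # Random forest: average modeled relationship between Salary and one predictor
  data(SALARY)
  FOREST <- randomForest(Salary~.,data=SALARY)
  visualize_relationship(FOREST,interest="Experience",on=SALARY)
  # Restrict the axes via the arguments passed on to plot
  visualize_relationship(FOREST,interest="Months",on=SALARY,xlim=c(1,15),ylim=c(2500,4500))

  # Partition model: unsmoothed curve, then separate curves for 7 rows of WINE
  data(WINE)
  TREE <- rpart(Quality~.,data=WINE)
  visualize_relationship(TREE,interest="alcohol",on=WINE,smooth=FALSE)
  visualize_relationship(TREE,interest="alcohol",on=WINE,marginal=FALSE,nplots=7,smooth=FALSE)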

regclass documentation built on June 8, 2025, 12:40 p.m.