visualize_relationship: Visualizing the relationship between y and x in a partition... In regclass: Tools for an Introductory Class in Regression and Modeling

Description

Attempts to show how the relationship between y and x is being modeled in a partition or random forest model.

Usage

```
visualize_relationship(TREE, interest, on, smooth=TRUE, marginal=TRUE, nplots=5,
                       seed=NA, pos="topright", ...)
```

Arguments

| Argument | Description |
|---|---|
| `TREE` | A partition or random forest model (though it works with many regression models as well). |
| `interest` | The name of the predictor variable for which the plot of y vs. x is to be made. |
| `on` | A dataframe giving the values of the other predictor variables for which the relationship is to be visualized. Typically this is the dataframe on which the partition model was built. |
| `smooth` | If `TRUE`, the relationship is plotted using a `loess` curve to smooth it out. |
| `marginal` | If `TRUE`, the modeled value of y at a particular value of x is the average of the predicted values of y over all rows which have that common value of x. If `FALSE`, then `nplots` rows from `on` will be selected and all other predictors will be fixed, showing the relationship between y and x for that particular set of characteristics. |
| `nplots` | The number of rows of `on` for which the relationship is plotted (if `marginal` is set to `FALSE`). |
| `seed` | The seed for the random number generator, if reproducibility is required. |
| `pos` | The location of the legend. |
| `...` | Additional arguments passed to `plot`, namely `xlim` and `ylim`. |
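The averaging described for `marginal=TRUE` can be sketched in base R. This is only an illustration of the idea, not the regclass source code: it assumes an `lm` fit on the built-in `mtcars` data as a stand-in for `TREE`, and the names `fit`, `pred`, and `marginal_curve` are hypothetical.

```r
# Illustrative sketch only; not how regclass implements the function.
fit <- lm(mpg ~ wt + hp + cyl, data = mtcars)   # stand-in for TREE
on <- mtcars                                    # plays the role of `on`
interest <- "cyl"                               # plays the role of `interest`

# Predicted y for every row of `on`
pred <- predict(fit, newdata = on)

# marginal = TRUE idea: average the predictions over all rows
# that share the same value of the predictor of interest
marginal_curve <- tapply(pred, on[[interest]], mean)

# Observed data, per-row predictions, and the averaged curve
plot(on[[interest]], on$mpg, xlab = interest, ylab = "mpg")
points(on[[interest]], pred, pch = 20, col = "red")
lines(as.numeric(names(marginal_curve)), marginal_curve, col = "blue")
```

Here `cyl` takes only three distinct values, so the averaged curve has three points. For a predictor with many distinct values, the per-value averages would be noisy, which is presumably where the `loess` smoothing controlled by `smooth` helps.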

Details

The function shows a scatterplot of y vs. x in the `on` dataframe, then shows how `TREE` is modeling the relationship between y and x with predicted values of y for each row in the data and also a curve illustrating the relationship. It is useful for seeing what the relationship between y and x as modeled by `TREE` "looks like", both as a whole and for particular combinations of other variables. If `marginal` is `FALSE`, then differences in the curves indicate the presence of some interaction between x and another variable.
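The `marginal=FALSE` behavior can be sketched the same way: pick a few rows, hold their other predictors fixed, and vary only x. Again, this is a hedged illustration in base R rather than the package's implementation. The `lm` model with a `wt * hp` interaction on `mtcars` is an assumed stand-in for `TREE`, and `rows`, `grid`, and `curves` are hypothetical names.

```r
# Illustrative sketch only; not how regclass implements the function.
set.seed(1)                                  # plays the role of `seed`
fit <- lm(mpg ~ wt * hp, data = mtcars)      # stand-in for TREE, with an interaction
on <- mtcars
interest <- "wt"
nplots <- 3

# Randomly choose the rows whose characteristics will be held fixed
rows <- sample(nrow(on), nplots)

# Values of x over which each curve is traced
grid <- seq(min(on[[interest]]), max(on[[interest]]), length.out = 50)

# One column of predictions per selected row
curves <- sapply(rows, function(i) {
  newdata <- on[rep(i, length(grid)), ]      # copies of row i: other predictors fixed
  newdata[[interest]] <- grid                # vary only the predictor of interest
  predict(fit, newdata = newdata)
})

matplot(grid, curves, type = "l", lty = 1,
        xlab = interest, ylab = "Predicted mpg")
legend("topright", legend = paste("row", rows), col = seq_len(nplots), lty = 1)
```

In a purely additive model such as `mpg ~ wt + hp`, the curves would be parallel and differ only by a vertical shift. With the interaction term, the slope with respect to `wt` depends on `hp`, so curves for rows with different `hp` are not parallel.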

References

Introduction to Regression and Modeling

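See Also

`loess`, `lm`, `glm`

Examples

```
library(regclass)      # visualize_relationship, SALARY, WINE
library(randomForest)  # randomForest
library(rpart)         # rpart

# Random forest: averaged relationship, then with custom axis limits
data(SALARY)
FOREST <- randomForest(Salary~.,data=SALARY)
visualize_relationship(FOREST,interest="Experience",on=SALARY)
visualize_relationship(FOREST,interest="Months",on=SALARY,xlim=c(1,15),ylim=c(2500,4500))

# Partition model: unsmoothed, averaged and then for 7 individual rows
data(WINE)
TREE <- rpart(Quality~.,data=WINE)
visualize_relationship(TREE,interest="alcohol",on=WINE,smooth=FALSE)
visualize_relationship(TREE,interest="alcohol",on=WINE,marginal=FALSE,nplots=7,smooth=FALSE)
```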