plotVar: Plot of the (estimated) dependency structure of a variable...

View source: R/plotVar.R

plotVar    R Documentation

Plot of the (estimated) dependency structure of a variable x on a categorical variable y

Description

This function visualises the (estimated) distributions of a variable x for each of the categories of a categorical variable y, which makes it possible to study the dependency structure of y on x. Two types of visualisations are available: density plots and boxplots.

Usage

plotVar(
  x,
  y,
  plot_type = c("both", "density", "boxplot")[1],
  x_label = "",
  y_label = "",
  plot_title = "",
  plotit = TRUE
)
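The default for plot_type in the usage above is written as an indexing expression. As a minimal base-R illustration of why this resolves to "both" (the variable names allowed and default_type are ours and purely illustrative, not part of the package):

```r
# Hypothetical names; this only mirrors the default expression shown in the usage.
allowed <- c("both", "density", "boxplot")

# Indexing with [1] picks the first element of the vector.
default_type <- allowed[1]
print(default_type)
```

Calling plotVar without specifying plot_type therefore requests both plot types.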

Arguments

x

Metric variable or ordered categorical variable that has at least as many unique values as y.

y

Factor variable with at least three categories.

plot_type

Plot type, one of "both" (the default), "density", or "boxplot". With "density", a density plot is produced; with "boxplot", a boxplot is produced; with "both", both a density plot and a boxplot are produced. See the 'Details' section of plotMcl for details.

x_label

Optional. The label of the x-axis.

y_label

Optional. The label (heading) of the legend that differentiates the categories of y.

plot_title

Optional. The title of the plot.

plotit

Logical. Whether the plots are actually drawn (TRUE, the default) or merely returned as ggplot objects (FALSE).
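The requirements stated for x and y can be checked before calling plotVar. The following is only a sketch: the helper check_plotvar_inputs and the toy data are hypothetical and not part of diversityForest, and it assumes that only categories that actually occur in y count and that missing values in x are ignored.

```r
# Hypothetical helper (not part of diversityForest): checks the documented
# requirements on 'x' and 'y'.
check_plotvar_inputs <- function(x, y) {
  stopifnot(is.factor(y))
  # Assumption: unused factor levels do not count as categories.
  n_cat <- nlevels(droplevels(y))
  # Assumption: missing values in 'x' are not counted as a unique value.
  n_val <- length(unique(x[!is.na(x)]))
  c(
    y_has_three_categories = n_cat >= 3,
    x_has_enough_values = n_val >= n_cat
  )
}

# Simulated toy data, for illustration only:
set.seed(1)
x_toy <- rnorm(90)
y_toy <- factor(rep(c("a", "b", "c"), each = 30))

check_plotvar_inputs(x_toy, y_toy)
```

Both entries of the returned logical vector should be TRUE for inputs that meet the documented requirements.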

Details

See the 'Details' section of plotMcl.

Value

A list, returned invisibly, containing:

  • Only the element dens_pl if plot_type = "density";

  • Only the element boxplot_pl if plot_type = "boxplot";

  • The elements dens_pl, boxplot_pl, and combined_pl if plot_type = "both".

All returned plots are ggplot2 objects, with combined_pl being a patchwork object.
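Because the list is returned invisibly, it has to be assigned to a variable before its elements can be inspected. A short sketch, assuming the package and its "ctg" data set are installed (the variable name res is ours):

```r
library("diversityForest")
data(ctg)

# The rug plot shows only a random subset of observations, so fix the seed.
set.seed(1234)

# Build the plots without drawing them; with the default plot_type = "both",
# the list should contain dens_pl, boxplot_pl, and combined_pl.
res <- plotVar(x = ctg$Mean, y = ctg$Tendency, plotit = FALSE)
names(res)

# Draw a single element on demand.
print(res$dens_pl)
```

Each element can then be modified with the usual ggplot2 functions, as in the examples at the end of this page.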

Author(s)

Roman Hornung

References

  • Hornung, R. (2022). Diversity forests: Using split sampling to enable innovative complex split procedures in random forests. SN Computer Science 3(2):1, <doi:10.1007/s42979-021-00920-1>.

See Also

plotMcl, plot.multifor

Examples

## Not run: 

## Load package:

library("diversityForest")



## Load the "ctg" data set:

data(ctg)


## Set seed to make results reproducible (this is necessary because
## the rug plot produced by 'plotVar' does not show all observations, but
## only a random subset of 1000 observations):

set.seed(1234)


## Using a "density" plot and a "boxplot", visualise the (estimated) 
## distributions of the variable "Mean" for each of the categories of the 
## variable "Tendency":

plotVar(x = ctg$Mean, y = ctg$Tendency)


## Re-create this plot with labels:

plotVar(x = ctg$Mean, y = ctg$Tendency, x_label = "Mean of the histogram ('Mean')",
        y_label = "Histogram tendency ('Tendency')", 
        plot_title = "Relationship between 'Mean' and 'Tendency'")


## Re-create this plot, but only show the "density" plot:

plotVar(x = ctg$Mean, y = ctg$Tendency, plot_type = "density",
        x_label = "Mean of the histogram ('Mean')", 
        y_label = "Histogram tendency ('Tendency')", 
        plot_title = "Relationship between 'Mean' and 'Tendency'")


## Use ggplot2 and RColorBrewer functionalities to change the line colors and
## the labels of the categories of "Tendency":

library("ggplot2")
library("RColorBrewer")
p <- plotVar(x = ctg$Mean, y = ctg$Tendency, plot_type = "density",
             x_label = "Mean of the histogram ('Mean')", 
             y_label = "Histogram tendency ('Tendency')", 
             plot_title = "Relationship between 'Mean' and 'Tendency'",
             plotit = FALSE)$dens_pl +
  scale_color_manual(values = brewer.pal(n = 3, name = "Set2"),
                     labels = c("left asymmetric", "symmetric", 
                                "right asymmetric")) +
  scale_linetype_manual(values = rep(1, 3),
                        labels = c("left asymmetric", "symmetric", 
                                   "right asymmetric"))

p

## # Save as PDF:
## ggsave(file="mypathtofolder/FigureXY1.pdf", width=10, height=7)



## Further customizations:

# Create plot without plotting it:

plotobj <- plotVar(x = ctg$Mean, y = ctg$Tendency, 
                   x_label = "Mean of the histogram ('Mean')", 
                   y_label = "Histogram tendency ('Tendency')", 
                   plotit = FALSE)


# Customize the density plot:

dens_pl <- plotobj$dens_pl + theme(legend.position = "inside", 
                                   legend.position.inside = c(0.25, 0.9), 
                                   legend.title = element_text(size = 16), 
                                   legend.text = element_text(size = 12), 
                                   axis.title = element_text(size=16), 
                                   axis.text = element_text(size=12)) + 
  ylab("(Scaled) density")


# Customize the boxplot:

boxplot_pl <- plotobj$boxplot_pl + 
  theme(axis.text.x = element_text(color = "transparent"), 
        axis.ticks.x = element_line(color = "transparent"), 
        axis.title = element_text(size=16), 
        axis.text = element_text(size=12))


# Create a title with increased font size:

library("grid")
title_grob <- textGrob(
  "Title of the combined plot", 
  gp = gpar(fontsize = 18) 
)


# Arrange plots with title:

library("gridExtra")
p <- arrangeGrob(
  dens_pl, boxplot_pl, 
  top = title_grob,
  nrow = 1
)
# arrangeGrob() only builds the layout; draw it explicitly:
grid.newpage()
grid.draw(p)

## # Save as PDF:
## ggsave(file="mypathtofolder/FigureXY2.pdf", p, width=16, height=7)


## End(Not run)


diversityForest documentation built on June 8, 2025, 1:23 p.m.