statBoxPlotMultipleVar: creates a ggplot object showing a boxplot of a single column,...

View source: R/generalStat.R

statBoxPlotMultipleVar R Documentation

creates a ggplot object showing a boxplot of a single column, split (cut) along another column (or itself)

Description

creates a ggplot object showing boxplots of a single column, with the data split into groups by cutting another column (or the column itself) into intervals

Usage

statBoxPlotMultipleVar(
  data,
  column = 1,
  varColumn = 2,
  varBreaks = 4,
  varLabels = NA,
  varIncludeLowest = FALSE,
  varRight = TRUE,
  sampleSize = NA,
  removeNA = TRUE,
  variableName = "variable",
  outlineColor = "black",
  fillColor = NA,
  jitter = 0.05,
  alpha = 0.5,
  size = 3,
  shape = 16,
  jitterFill = "black",
  whiskerWidth = 0.5,
  boxWidth = 0.5,
  vertical = FALSE,
  xAxis = TRUE,
  yAxis = TRUE,
  yDefault = TRUE,
  yLimits = c(0, NA),
  xLabel = "",
  yLabel = "",
  title = "",
  showMean = TRUE,
  meanShape = 23,
  meanColor = "black",
  meanFill = "orange",
  meanSize = 5,
  showLegend = TRUE,
  legend.position = "bottom",
  legend.title = "cut",
  returnData = FALSE,
  ...
)

Arguments

data

the data to be plotted. Can be a vector (numeric, character, etc.) or a data.frame-like object (data.frame, tibble, etc.). If it is a data.frame or similar, the column argument defines which column is used

column

defines which column is to be used for the boxplot. Can be an integer or a character string (column name). Note that if both a character column and yLabel are defined, column is used as the label for the Y-axis. If not defined, then all columns of the data.frame will be used.

varColumn

defines which column is used to split the data column (argument: column). Can be an integer or a character string (column name). The splitting is performed via the function cut(), see ?base::cut for details. Note that if the varColumn contains non-numeric data (e.g. character or factor), no split is performed

varBreaks

specifies how to split the varColumn, see ?base::cut (breaks argument). Note that varBreaks and the other arguments specifying the split are ignored if the varColumn is not numeric

varLabels

specifies the labels to use when splitting the varColumn, see ?base::cut (labels argument).

varIncludeLowest

specifies the include.lowest argument of base::cut, see ?base::cut

varRight

specifies the right argument of base::cut, see ?base::cut

sampleSize

allows a sample of the data to be used for the boxplot instead of the full data set. By default sampleSize = NA, in which case all data are used

removeNA

if TRUE, the NA values in the vector will be removed prior to plotting. Note: this prevents the warning messages and errors that NA values would otherwise cause

variableName

sets the 'combined' name of the columns, must be a single word

outlineColor

defines the color of the line around the box

fillColor

defines the fill color of the boxes. Note: if the number of colors does not match the number of columns, then the ggplot2 default colors will be used

jitter

if NA, the individual data points are not shown (only outliers). Otherwise, it sets the amount of random jitter added to the x-values of the plotted data points. Note: if set to 0, the points lie on a straight line

alpha

alpha ('see through' value) of the data (jitter) points

size

size of the data (jitter) points

shape

shape of the data (jitter) points (default = 16), see vignette("ggplot2-specs", package = "ggplot2")

jitterFill

defines the color of the jitter points (a single color only)

whiskerWidth

defines the width of the whiskers (0-1)

boxWidth

defines the width of the box (0-1)

vertical

if TRUE, flips x- and y-axis

xAxis

defines if the x-axis is shown

yAxis

defines if the y-axis is shown

yDefault

default is TRUE. Together with yLimits, this can be used to define the exact range of the Y-axis

yLimits

default = c(0, NA). Together with yDefault, this can be used to define the exact range of the Y-axis

xLabel

set x-axis title

yLabel

set y-axis title

title

sets title of graph

showMean

defines if the mean value of the data should be shown

meanShape

shape of the mean symbol (default = 23)

meanColor

color of the line around the mean symbol

meanFill

fill color of the shape of the mean symbol

meanSize

size of the mean symbol

showLegend

defines if the legend is to be shown or not

legend.position

defines where a legend is to be placed

legend.title

if not NA, sets a non-default title for the legend

returnData

if TRUE then a list with 2 elements is returned. The first element is the data.frame used to generate the graph and the second element is the graph itself

...

can be used to pass on other arguments to graphAdjust() (like xLimits, xExpand, etc)
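
Because varBreaks, varLabels, varIncludeLowest and varRight are described above as being handed to the breaks, labels, include.lowest and right arguments of base::cut, the effect of each can be previewed with cut() directly. The following is a minimal sketch using only base R; the vector x is hypothetical example data and the snippet does not call statBoxPlotMultipleVar itself.

```r
# Hypothetical numeric values to be grouped
x <- c(1, 2, 3, 4, 5, 6, 7, 8)

# breaks = 4: cut() divides the range of x into 4 equal-width intervals
groups <- cut(x, breaks = 4)
table(groups)

# Explicit break points with right = TRUE give intervals of the form (a, b],
# so the lowest value (1) falls outside every interval and becomes NA
rightClosed <- cut(x, breaks = c(1, 4, 8), right = TRUE)

# include.lowest = TRUE also closes the first interval on the left: [1, 4]
withLowest <- cut(x, breaks = c(1, 4, 8), include.lowest = TRUE, right = TRUE)

# Custom labels replace the default interval notation
labelled <- cut(x, breaks = c(1, 4, 8), labels = c("low", "high"),
                include.lowest = TRUE)
table(labelled)
```

Checking the split this way first can help spot values that end up as NA because they lie on or outside the outermost break points.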

Value

a ggplot object or, if returnData = TRUE, a list containing the data.frame used to generate the graph and the graph itself
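
A call might look like the sketch below. It assumes that the BBPersonalR package is installed and exports statBoxPlotMultipleVar; the call is skipped if the package is not available. The data frame exampleData and its column names are hypothetical, and only arguments listed under Usage are used.

```r
# Hypothetical example data: 'mass' is plotted, 'score' is cut into groups
exampleData <- data.frame(
  mass  = c(10.2, 11.5, 9.8, 12.1, 10.9, 11.7, 13.0, 9.5),
  score = c(1, 2, 3, 4, 5, 6, 7, 8)
)

# Assumption: BBPersonalR is installed and exports the function
if (requireNamespace("BBPersonalR", quietly = TRUE)) {
  result <- BBPersonalR::statBoxPlotMultipleVar(
    exampleData,
    column = "mass",      # column shown in the boxplots
    varColumn = "score",  # numeric column that is split via cut()
    varBreaks = 2,        # two groups
    returnData = TRUE     # return both the data and the graph
  )
  plotData   <- result[[1]]  # data.frame used to generate the graph
  plotObject <- result[[2]]  # the ggplot object, can be printed or modified
}
```

With returnData = FALSE (the default), the ggplot object is returned directly, so no list indexing is needed.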

Note

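box itself: bottom = 25th percentile, top = 75th percentile; lower whisker = lowest value within 1.5 x IQR below the 25th percentile; upper whisker = highest value within 1.5 x IQR above the 75th percentile; IQR = (75th percentile) - (25th percentile)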
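
The quantities named in this note (box edges, whiskers, IQR) can be reproduced in base R. The sketch below assumes the usual ggplot2 boxplot convention, in which the whiskers reach the most extreme data points lying within 1.5 times the IQR of the box edges; whether this function uses exactly that convention is an assumption. The vector y is hypothetical example data.

```r
# Hypothetical data with one clear outlier
y <- c(2, 4, 4, 5, 6, 7, 8, 9, 30)

# Box edges: 25th and 75th percentiles (default quantile type)
q25 <- quantile(y, 0.25, names = FALSE)
q75 <- quantile(y, 0.75, names = FALSE)

# Interquartile range: height of the box
iqr <- q75 - q25

# Whiskers: most extreme data points within 1.5 * IQR of the box edges
lowerWhisker <- min(y[y >= q25 - 1.5 * iqr])
upperWhisker <- max(y[y <= q75 + 1.5 * iqr])

# Points beyond the whiskers are drawn as outliers
outliers <- y[y < lowerWhisker | y > upperWhisker]
```

For this vector the box spans 4 to 8, the whiskers end at 2 and 9, and the value 30 is the only outlier.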


BenBruyneel/BBPersonalR documentation built on Aug. 23, 2024, 8:28 p.m.