summaries.plot.x: Plot of central and non-central conditional tendency measures...

View source: R/UBStats_Main_Visible_ALL_202406.R

summaries.plot.x    R Documentation

Plot of central and non-central conditional tendency measures for a single numeric variable

Description

summaries.plot.x() plots location statistics for a numeric vector, conditional on the levels of one or two other variables (layers).
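
As a rough illustration of what a conditional location statistic is, the following base R sketch computes means and medians within each level of a grouping variable. The vectors spend and group are made-up data, and the code is only an analogy, not the package's internal implementation.

```r
# Toy data (hypothetical, not taken from MktDATA)
spend <- c(10, 20, 30, 40, 50, 60)
group <- factor(c("a", "a", "b", "b", "b", "c"))

# Mean and median of spend within each level of group
cond_mean <- tapply(spend, group, mean)
cond_median <- tapply(spend, group, median)

# Collect the summaries in a data frame, one row per level
cond_summ <- data.frame(level = names(cond_mean),
                        mean = as.vector(cond_mean),
                        median = as.vector(cond_median))
print(cond_summ)
```

A plot of these values against the levels of the grouping variable is the kind of display the function produces.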

Usage

summaries.plot.x(
  x,
  stats = "mean",
  plot.type = "bars",
  conf.level = 0.95,
  by1,
  by2,
  breaks.by1,
  interval.by1 = FALSE,
  breaks.by2,
  interval.by2 = FALSE,
  adj.breaks = TRUE,
  bw = FALSE,
  color = NULL,
  legend = TRUE,
  use.scientific = FALSE,
  data,
  ...
)

Arguments

x

An unquoted string identifying the numerical variable whose tendency measures are to be plotted. x can be the name of a vector in the workspace or the name of one of the columns in the data frame specified in the data argument.

stats

A single character string specifying the conditional tendency measure(s) to display in the plot. The available options are "mean", "median", "ci.mean" (to plot the means and the confidence intervals for the means), and specific sets of quantiles, namely "quartiles", "quintiles", "deciles", and "percentiles" (note that for quantiles only a single layer can be specified).
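
The quantile sets can be pictured with base R's quantile(). The sketch below assumes each set consists of the interior cut points only (for example 25%, 50%, 75% for quartiles) and uses R's default quantile type; the documentation does not state which quantile definition the function uses or whether extremes are also displayed. The names quantile_probs, x and group are hypothetical.

```r
# Probabilities assumed to correspond to each quantile set
quantile_probs <- list(
  quartiles   = seq(0.25, 0.75, by = 0.25),
  quintiles   = seq(0.20, 0.80, by = 0.20),
  deciles     = seq(0.10, 0.90, by = 0.10),
  percentiles = seq(0.01, 0.99, by = 0.01)
)

# Toy data: 100 values split into two groups of 50
x <- 1:100
group <- rep(c("a", "b"), each = 50)

# Conditional quartiles: one column per group, one row per quantile
cond_quart <- sapply(split(x, group), quantile,
                     probs = quantile_probs$quartiles)
print(cond_quart)
```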

plot.type

A single character string specifying the type of plot used to compare the requested measures conditional on the levels of one variable, by1, possibly broken down by the levels of a second variable, by2, if specified. The available options are:

  • "bars": Available only when stats is "mean", "median", or "ci.mean" and one single layer (by1) is specified. For each level of by1 a bar is built whose height coincides with the conditional mean or median. Confidence intervals for the means are reported when stats = "ci.mean".

  • "points": Available only when stats is "mean", "median", or "ci.mean". Confidence intervals for the means are reported when stats = "ci.mean" and a single layer is specified.

  • "lines": Points joined by lines; this is the only option available for quantiles.

conf.level

A number between 0 and 1 indicating the confidence level of the intervals for the conditional means when stats = "ci.mean" is specified (default: 0.95).
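
The documentation does not say how the intervals are computed. As a hedged sketch, the code below assumes the standard t-based interval for a mean, applied separately within each group. The helper ci_mean() and the toy vectors are hypothetical and are not part of UBStats.

```r
# Toy data (hypothetical)
spend <- c(10, 20, 30, 40, 50, 60, 70)
group <- c("a", "a", "a", "b", "b", "b", "b")

# t-based confidence interval for a mean (assumed method)
ci_mean <- function(v, conf.level = 0.95) {
  n <- length(v)
  half <- qt(1 - (1 - conf.level) / 2, df = n - 1) * sd(v) / sqrt(n)
  c(lower = mean(v) - half, mean = mean(v), upper = mean(v) + half)
}

# One row per group: lower bound, mean, upper bound
ci_by_group <- t(sapply(split(spend, group), ci_mean, conf.level = 0.95))
print(ci_by_group)
```

Lowering conf.level (say to 0.90) shrinks the half-width, which is the effect of the conf.level argument on the plotted intervals.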

by1, by2

Unquoted strings identifying variables (typically taking few values/levels) used to build conditional summaries; they can be specified in the same way as x. At least one layer has to be specified. The conditional measures are plotted against the values of by1, broken down by the levels of by2, if specified.

breaks.by1, breaks.by2

Allow the variables by1 and/or by2, if numerical, to be classified into intervals. Each can be an integer giving the number of equal-width intervals to use, or a vector of increasing numeric values defining the endpoints of the intervals (closed on the left and open on the right; the last interval is also closed on the right). To cover the entire range of values, the first and last breaks should enclose the minimum and maximum values. It is also possible to specify a set of breaks covering only a portion of the range of by1 and/or by2.
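
The interval convention described above (closed on the left, open on the right, last interval closed on both sides) matches what base R's cut() produces with right = FALSE and include.lowest = TRUE. The sketch below only illustrates that convention on made-up values; it is not a claim about how the function classifies internally, and the way equal-width breaks are derived here (seq() over the observed range) is an assumption.

```r
# Toy numeric layer (hypothetical)
by1 <- c(0, 2.5, 5, 7.5, 10)

# Explicit endpoints: [0,5) and [5,10]
cls <- cut(by1, breaks = c(0, 5, 10),
           right = FALSE, include.lowest = TRUE)
print(table(cls))

# An integer number of equal-width intervals, here 4,
# assumed to span the observed range of the variable
eq_breaks <- seq(min(by1), max(by1), length.out = 4 + 1)
cls_eq <- cut(by1, breaks = eq_breaks,
              right = FALSE, include.lowest = TRUE)

# Breaks covering only part of the range: in base R the
# values outside the breaks become NA
partial <- cut(by1, breaks = c(2, 8),
               right = FALSE, include.lowest = TRUE)
print(partial)
```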

interval.by1, interval.by2

Logical values indicating whether by1 and/or by2 are variables measured in classes (TRUE). If the intervals for one variable are not consistent (e.g. overlapping intervals, or intervals whose upper endpoint is lower than the lower endpoint), the variable is analysed as it is, even if the results are not necessarily consistent; default: FALSE.

adj.breaks

Logical value indicating whether the endpoints of the intervals of the numerical variables by1 or by2, when classified into intervals, should be displayed without scientific notation; default: TRUE.

bw

Logical value indicating whether plots should be drawn in greyscale (TRUE) rather than using a standard palette (FALSE, default).

color

Optional character vector specifying the colors to use in the plot in place of a standard palette (NULL, default).

legend

Logical value indicating whether a legend should be displayed in the plot (legend = TRUE; default) or not (legend = FALSE).

use.scientific

Logical value indicating whether numbers on axes should be displayed using scientific notation (TRUE); default: FALSE.
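
To see what the two notations look like, base R's format() can render the same numbers either way. This only illustrates the notations that use.scientific (for axes) and adj.breaks (for interval endpoints) switch between; it is not the formatting code used by the function, and the vector big is made up.

```r
# Large values of the kind that trigger scientific notation
big <- c(1.5e7, 2.5e8)

# Fixed notation, as with use.scientific = FALSE
plain <- format(big, scientific = FALSE)

# Scientific notation, as with use.scientific = TRUE
sci <- format(big, scientific = TRUE)

print(trimws(plain))
print(sci)
```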

data

An optional data frame containing x and/or the variables specifying the layers, by1 and by2. If not found in data, the variables are taken from the environment from which summaries.plot.x() is called.

...

Additional arguments to be passed to low level functions.

Value

A table (converted to a data frame) reporting the requested statistics conditional on the levels of the specified layers.

Author(s)

Raffaella Piccarreta raffaella.piccarreta@unibocconi.it

See Also

distr.summary.x() for tabulating summary measures of a univariate distribution.

distr.plot.x() for plotting a univariate distribution.

distr.table.x() for tabulating a univariate distribution.

Examples

data(MktDATA, package = "UBStats")

# Means (and their CI) or medians by a single variable
# - Barplot of means (default) by a character variable
summaries.plot.x(x = TotVal, stats = "mean",
                 by1 = Gender, data = MktDATA)
# - Barplot of medians by a numerical variable
#   classified into intervals: user-defined color
summaries.plot.x(x = TotVal, stats = "median", 
                 by1 = AOV, breaks.by1 = 5, 
                 color = "purple", data = MktDATA)
# - Lineplot of means and their CI by a variable 
#   measured in classes
summaries.plot.x(x = TotVal, 
                 stats = "ci.mean", plot.type = "lines",
                 by1 = Income.S, interval.by1 = TRUE,
                 data = MktDATA)
# - Barplot of means and their CI by a 
#   numerical variable; change the confidence level
summaries.plot.x(x = TotVal, 
                 stats = "ci.mean", conf.level = 0.90,
                 plot.type = "bars", 
                 by1 = NWeb_Purch, data = MktDATA)
# - Note: no plot built for a variable with 
#   too many levels (>20)
# summaries.plot.x(x = TotVal, 
#                  stats = "ci.mean", plot.type = "lines",
#                  by1 = AOV, data = MktDATA)

# Quantiles by a single variable
# - Only line plots are allowed for quantiles
summaries.plot.x(x = Baseline, 
                 stats = "deciles", plot.type = "lines",
                 by1 = NDeals, data = MktDATA)
summaries.plot.x(x = Baseline, 
                 stats = "quartiles", plot.type = "lines",
                 by1 = Marital_Status, data = MktDATA)

# Means and medians by two variables
# - Default: only lines allowed
summaries.plot.x(x = TotVal, stats = "mean", 
                 by1 = Education, by2 = Kids, 
                 data = MktDATA)
summaries.plot.x(x = TotVal, stats = "median", 
                 by1 = Income.S, by2 = Gender,
                 interval.by1 = TRUE,
                 data = MktDATA)
summaries.plot.x(x = Baseline, stats = "mean", 
                 by1 = CustClass, by2 = AOV,
                 breaks.by2 = 5, data = MktDATA)
# - "ci.mean" is not allowed with two layers: combine the
#   two layers into a single variable instead
CustClass_Kids <- paste0(MktDATA$CustClass, "-", MktDATA$Kids)
summaries.plot.x(x = Baseline, stats = "ci.mean", 
                 conf.level = 0.99, by1 = CustClass_Kids,
                 color = "gold", data = MktDATA)

# Arguments adj.breaks and use.scientific
#  Variables with a very wide range
LargeX<-MktDATA$TotVal*1000000
LargeBY<-MktDATA$AOV*5000000 
#  - Default: no scientific notation
summaries.plot.x(LargeX, plot.type = "bars",
                 by1=LargeBY, breaks.by1 = 5, data = MktDATA)
#  - Scientific notation for summaries (axes) 
summaries.plot.x(LargeX, plot.type = "lines",
                 by1=LargeBY, breaks.by1 = 5, 
                 use.scientific = TRUE, data = MktDATA)
#  - Scientific notation for interval endpoints
summaries.plot.x(LargeX, stats = "ci.mean",
                 plot.type = "lines",
                 by1=LargeBY, breaks.by1 = 5,
                 adj.breaks = FALSE, data = MktDATA)
#  - Scientific notation for interval endpoints and summaries
summaries.plot.x(LargeX, stats = "quartiles",
                 plot.type = "lines", 
                 by1=LargeBY, breaks.by1 = 5, 
                 adj.breaks = FALSE, use.scientific = TRUE,
                 data = MktDATA)

# Output the table with the requested summaries 
Out_TotVal<-summaries.plot.x(x = TotVal, stats = "ci.mean", 
                             by1 = Education, data = MktDATA)


UBStats documentation built on Sept. 11, 2024, 6:52 p.m.