distr.table.xy: Analysis of a bivariate distribution using cross-tables
In UBStats: Basic Statistics

distr.table.xy

R Documentation

Description

distr.table.xy() displays tables of joint or conditional distributions.

Usage

distr.table.xy(
  x,
  y,
  freq = "counts",
  freq.type = "joint",
  total = TRUE,
  breaks.x,
  breaks.y,
  adj.breaks = TRUE,
  interval.x = FALSE,
  interval.y = FALSE,
  f.digits = 2,
  p.digits = 0,
  force.digits = FALSE,
  data,
  ...
)

Arguments

`x`, `y`	Unquoted names of the variables whose joint distribution is to be analysed. Each of `x` and `y` can be a vector or a factor in the workspace, or one of the columns of the data frame specified in the `data` argument. In the table, `x` is displayed on the rows and `y` on the columns.
`freq`	A character vector specifying the set of frequencies to be displayed (more than one option may be specified). Allowed options (possibly abbreviated) are `"counts"`, `"percentages"` and `"proportions"`.
`freq.type`	A character vector specifying the types of frequencies to be displayed (more than one type may be requested). Allowed options are `"joint"` (default) for joint frequencies, `"x|y"` (or `"column"`) for the distributions of `x` conditional on `y`, and `"y|x"` (or `"row"`) for the distributions of `y` conditional on `x`.
`total`	Logical value indicating whether the sum of the requested frequencies should be added to the table; defaults to `TRUE`.
`breaks.x`, `breaks.y`	Allow classifying the variables `x` and/or `y`, if numerical, into intervals. Each can be an integer indicating the number of equal-width intervals to use, or a vector of increasing numeric values defining the endpoints of the intervals (closed on the left and open on the right; the last interval is also closed on the right). To cover the entire range of values taken by a variable, its minimum and maximum must lie between the first and the last break. Breaks covering only a portion of the variable's range may also be specified.
`adj.breaks`	Logical value indicating whether the endpoints of the intervals of a numerical variable (`x` or `y`) classified into intervals should be displayed without scientific notation; defaults to `TRUE`.
`interval.x`, `interval.y`	Logical values indicating whether `x` and/or `y` are variables measured in classes (`TRUE`); both default to `FALSE`. If the detected intervals are not consistent (e.g. overlapping intervals, or intervals whose upper endpoint is lower than the lower one), the variable is tabulated as it is, even though the results are not necessarily consistent.
`f.digits`, `p.digits`	Integer values specifying the number of decimals used to round proportions (default: `f.digits=2`) and percentages (default: `p.digits=0`), respectively. If the chosen rounding would display some non-zero values as zero, the number of decimals is increased so that all values have at least one significant digit, unless the argument `force.digits` is set to `TRUE`.
`force.digits`	Logical value indicating whether proportions and percentages should be rounded to exactly the number of decimals specified in `f.digits` and `p.digits`, even if some non-zero values are then rounded to zero; defaults to `FALSE`.
`data`	An optional data frame containing `x` and/or `y`. If not found in `data`, the variables are taken from the environment from which `distr.table.xy()` is called.
`...`	Additional arguments to be passed to low-level functions.
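
Two of the conventions described in the arguments above (the interval convention of `breaks.x`/`breaks.y` and the rounding rule of `f.digits`/`p.digits`) can be mimicked in base R. The sketch below is only an illustration under stated assumptions: it does not use UBStats and is not the package's implementation; the data values and the helper `min_digits()` are hypothetical.

```r
# Illustrative values only (not taken from MktDATA)
purchases <- c(0, 3, 5, 9.5, 10, 20, 35)
brk <- c(0, 5, 10, 15, 20, 35)

# Left-closed, right-open classes; include.lowest = TRUE also closes
# the last class on the right, so the maximum (35) is retained
classes <- cut(purchases, breaks = brk, right = FALSE, include.lowest = TRUE)
table(classes)

# Hypothetical helper: smallest number of decimals (starting from
# `digits`, capped at 15) at which no non-zero value rounds to zero
min_digits <- function(values, digits) {
  nonzero <- values[!is.na(values) & values != 0]
  while (digits < 15 && length(nonzero) > 0 &&
         any(round(nonzero, digits) == 0)) {
    digits <- digits + 1
  }
  digits
}

pct <- c(0.2, 12.3, 87.5)          # percentages; 0.2 rounds to 0 with 0 decimals
round(pct, min_digits(pct, 0))     # decimals increased so 0.2 stays visible
round(pct, 0)                      # strict rounding, analogous to force.digits = TRUE
```

In base R, values falling outside the supplied breaks are returned as `NA` by `cut()`; how `distr.table.xy()` treats such values is not stated in this section.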

Value

A list whose elements are the requested tables (converted to data frames). Each table lists the values taken by the two variables, arranged in standard order (logical, alphabetical or numerical order for vectors, order of levels for factors, ordered intervals for classified variables or for variables measured in classes), together with the requested joint or conditional frequencies.
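
For readers who want to check what the joint and conditional frequencies mean, the same quantities can be computed with base R's `table()` and `prop.table()`. This is a minimal sketch on invented toy data, not the function's actual code; it assumes, as stated in the arguments, that `x` is placed on the rows and `y` on the columns.

```r
# Toy data, invented for illustration (not from MktDATA)
x <- c("a", "a", "b", "b", "b", "c")
y <- c("no", "yes", "no", "no", "yes", "yes")

joint_counts <- table(x, y)                           # x on rows, y on columns

joint_prop <- prop.table(joint_counts)                # joint: all cells sum to 1
y_given_x  <- prop.table(joint_counts, margin = 1)    # "y|x" (row): each row sums to 1
x_given_y  <- prop.table(joint_counts, margin = 2)    # "x|y" (column): each column sums to 1

addmargins(joint_counts)      # joint counts with row and column totals
round(100 * x_given_y, 0)     # column-conditional percentages
```

The `margin` argument selects the conditioning variable: `margin = 1` conditions on the row variable and `margin = 2` on the column variable.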

Author(s)

Raffaella Piccarreta raffaella.piccarreta@unibocconi.it

Examples

data(MktDATA, package = "UBStats")

# Character vectors, factors, and discrete numeric vectors
# - Default: joint counts
distr.table.xy(LikeMost, Children, data = MktDATA) 

# - Joint and conditional distribution of x|y
#   counts and proportions, no totals
distr.table.xy(LikeMost, Education, freq = c("counts","Prop"), 
               freq.type = c("joint","x|y"), total = FALSE,
               data = MktDATA)
# - Joint and conditional row and column distributions (%) 
distr.table.xy(CustClass, Children, freq = "Percentages", 
               freq.type = c("joint","row","column"),
               data = MktDATA)

# Numerical variables classified or measured in classes
# - A numerical variable classified into intervals 
#   and a factor
distr.table.xy(CustClass, TotPurch, 
               breaks.y = c(0,5,10,15,20,35),
               freq = c("Counts","Prop"), freq.type = "y|x", 
               data = MktDATA)

# - Two numerical variables, one measured in classes
#   and the other classified into intervals 
distr.table.xy(Income.S, TotPurch, interval.x = TRUE,
               breaks.y = c(0,5,10,15,20,35),
               freq = c("Counts","Prop"), 
               freq.type = c("row","col"), data = MktDATA)

# Argument force.digits
# - Default: manages possible excess of rounding
distr.table.xy(CustClass, Children, freq = "Percentages",
               freq.type = "x|y", data = MktDATA)
# - Force to the required rounding
distr.table.xy(CustClass, Children, freq = "Percentages",
               freq.type = "x|y",
               force.digits = TRUE, data = MktDATA)

# Store the list with the requested tables
tables.xy <- distr.table.xy(Income.S, TotPurch,
                            interval.x = TRUE,
                            breaks.y = c(0, 5, 10, 15, 20, 35),
                            freq = c("Counts", "Prop"),
                            freq.type = c("joint", "row", "col"),
                            data = MktDATA)
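
Since the returned value is documented as a list of tables converted to data frames, it can be inspected with generic base R tools. The sketch below assumes that UBStats is installed and repeats the last call; the names of the list elements are not documented in this section, so they are printed rather than assumed.

```r
# Runs only if UBStats is available; otherwise nothing is computed
has_ubstats <- requireNamespace("UBStats", quietly = TRUE)

if (has_ubstats) {
  data(MktDATA, package = "UBStats")
  tables.xy <- UBStats::distr.table.xy(Income.S, TotPurch,
                                       interval.x = TRUE,
                                       breaks.y = c(0, 5, 10, 15, 20, 35),
                                       freq = c("Counts", "Prop"),
                                       freq.type = c("joint", "row", "col"),
                                       data = MktDATA)
  print(names(tables.xy))            # element names (not assumed here)
  str(tables.xy, max.level = 1)      # one line per stored table
}
```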

UBStats documentation built on Sept. 11, 2024, 6:52 p.m.