order_multiway | R Documentation |
Transform a data frame such that two independent categorical variables are factors with levels ordered for display in a multiway dot plot. Multiway data comprise a single quantitative value (or response) for every combination of levels of two categorical variables. The ordering of the rows and panels is crucial to the perception of effects (Cleveland, 1993).
order_multiway(
  dframe,
  quantity,
  categories,
  ...,
  method = NULL,
  ratio_of = NULL
)
dframe |
Data frame containing a single quantitative value (or response) for every combination of levels of two categorical variables. Categories may be class character or factor. Two additional numeric columns are required when using the "percent" ordering method. |
quantity |
Character, name (in quotes) of the single multiway quantitative variable |
categories |
Character, vector of names (in quotes) of the two multiway categorical variables |
... |
Not used for passing values; forces subsequent arguments to be referable only by name. |
method |
Character, "median" (default) or "percent", method of ordering the levels of the categories. The median method computes the medians of the quantitative column grouped by category. The percent method computes percentages from the same ratio that underlies the quantitative percentage variable, but grouped by category. |
ratio_of |
Character vector with the names (in quotes) of the numerator and denominator columns that produced the quantitative variable, required when method = "percent". |
In our context, "multiway" refers to the data structure and graph design defined by Cleveland (1993), not to the methods of analysis described by Kroonenberg (2008).
Multiway data comprise three variables: a categorical variable of m levels; a second independent categorical variable of n levels; and a quantitative variable (or response) of length mn that cross-classifies the categories, that is, there is a value of the response for each combination of levels of the two categorical variables.
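The structure described above can be illustrated with a minimal base R sketch. The category names, level names, and values here are hypothetical and chosen only for illustration; they are not taken from the package data.

```r
# Hypothetical multiway data: m = 2 programs, n = 3 groups of people
toy <- expand.grid(
  program = c("EE", "ME"),
  people = c("group_a", "group_b", "group_c"),
  stringsAsFactors = FALSE
)

# One quantitative value per combination of levels (made-up numbers)
toy$value <- c(60, 75, 35, 50, 45, 65)

# The response has length m * n
n_rows <- nrow(toy)

# Every combination of levels appears exactly once
is_multiway <- all(table(toy$program, toy$people) == 1)
```

A data frame with missing or duplicated combinations would fail the last check and would not meet the definition of multiway data used here.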
In a multiway dot plot, one category is encoded by the panels, the second category is encoded by the rows of each panel, and the quantitative variable is encoded along identical horizontal scales.
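As a rough sketch of this encoding, the following uses the lattice package (not part of this function) with hypothetical data. The levels are ordered by median with base R's reorder() as a stand-in for the ordering step; this is an assumption for illustration, not the function's own implementation.

```r
library(lattice)

# Hypothetical multiway data (made-up values)
toy <- expand.grid(
  program = c("EE", "ME"),
  people = c("group_a", "group_b", "group_c"),
  stringsAsFactors = FALSE
)
toy$value <- c(60, 75, 35, 50, 45, 65)

# Convert both categories to factors with levels ordered by median value
toy$program <- reorder(toy$program, toy$value, FUN = median)
toy$people <- reorder(toy$people, toy$value, FUN = median)

# Panels encode program, rows encode people, and the horizontal
# scale is shared across panels (the lattice default)
p <- dotplot(people ~ value | program, data = toy, layout = c(1, 2))
print(p)
```

Because both factors are ordered by median, rows within a panel and the panels themselves appear in increasing order of the typical response.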
A data frame in data.table format with the following properties: rows are preserved; columns specified by categories are converted to factors and ordered; the column specified by quantity is converted to type double; other columns are preserved with the exception that columns added by the function overwrite existing columns of the same name (if any); grouping structures are not preserved. The added columns are:
CATEGORY_median columns (when ordering method is "median")
Numeric. Two columns of medians of the quantitative variable grouped by the categorical variables. The CATEGORY placeholder in the column name is replaced by a category name from the categories argument. For example, suppose categories = c("program", "people") and method = "median". The two new column names would be program_median and people_median.
CATEGORY_QUANTITY columns (when ordering method is "percent")
Numeric. Two columns of percentages based on the same ratio that produces the quantitative variable except grouped by the categorical variables. The CATEGORY placeholder in the column name is replaced by a category name from the categories argument; the QUANTITY placeholder is replaced by the quantitative variable name in the quantity argument. For example, suppose categories = c("program", "people"), quantity = "grad_rate", and method = "percent". The two new column names would be program_grad_rate and people_grad_rate.
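To make the naming and grouping of the added columns concrete, here is a minimal base R sketch on hypothetical data. It assumes that the grouped percentage is formed by summing the numerator and denominator within each category level before dividing; that detail is an assumption for illustration and is not confirmed by this page.

```r
# Hypothetical data: counts and the percentage derived from them
toy <- data.frame(
  program = c("EE", "EE", "ME", "ME"),
  people = c("group_a", "group_b", "group_a", "group_b"),
  graduates = c(20, 30, 45, 10),
  ever_enrolled = c(50, 40, 60, 40)
)
toy$grad_rate <- 100 * toy$graduates / toy$ever_enrolled

# "median" method: medians of the quantity grouped by each category
toy$program_median <- ave(toy$grad_rate, toy$program, FUN = median)
toy$people_median <- ave(toy$grad_rate, toy$people, FUN = median)

# "percent" method (assumed form): the same ratio, grouped by each category
toy$program_grad_rate <- 100 *
  ave(toy$graduates, toy$program, FUN = sum) /
  ave(toy$ever_enrolled, toy$program, FUN = sum)
toy$people_grad_rate <- 100 *
  ave(toy$graduates, toy$people, FUN = sum) /
  ave(toy$ever_enrolled, toy$people, FUN = sum)
```

The grouped values are constant within each level, so they can serve as a single sort key per level when ordering the factor.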
Cleveland WS (1993). Visualizing Data. Hobart Press, Summit, NJ.
Kroonenberg PM (2008). Applied Multiway Data Analysis. Wiley, Hoboken, NJ.
# Subset of built-in data set
dframe <- study_results[program == "EE" | program == "ME"]
dframe[, people := paste(race, sex)]
dframe[, c("race", "sex") := NULL]
data.table::setcolorder(dframe, c("program", "people"))
# Class before ordering
class(dframe$program)
class(dframe$people)
# Class and levels after ordering
mw1 <- order_multiway(dframe,
                      quantity = "stickiness",
                      categories = c("program", "people"))
class(mw1$program)
levels(mw1$program)
class(mw1$people)
levels(mw1$people)
# Display category medians
mw1
# Existing factors (if any) are re-ordered
mw2 <- dframe
mw2$program <- factor(mw2$program, levels = c("ME", "EE"))
# Levels before conditioning
levels(mw2$program)
# Levels after conditioning
mw2 <- order_multiway(mw2,
                      quantity = "stickiness",
                      categories = c("program", "people"))
levels(mw2$program)
# Ordering using percent method
order_multiway(dframe,
               quantity = "stickiness",
               categories = c("program", "people"),
               method = "percent",
               ratio_of = c("graduates", "ever_enrolled"))