#' Parallel coordinate plot for both numeric and categorical data
#'
#' The parallel coordinate plot displays multiple y-axes, and shows the observations across
#' several dimensions as lines. This function work well with both numeric and categorical
#' variables at the same time after proper scaling.
#'
#' \code{method} is a character string that denotes how to scale the variables
#' in the parallel coordinate plot. Options are named in the same way as the options in `ggparcoord` (GGally):
#' \itemize{
#' \item{\code{raw}}{: raw data used, no scaling will be done.}
#' \item{\code{std}}{: univariately, subtract mean and divide by standard deviation. To get values into a [0,1] interval we use a linear transformation of f(y) = y/4+0.5. }
#' \item{\code{robust}}{: univariately, subtract median and divide by median absolute deviation. To get values into a [0,1] interval we use a linear transformation of f(y) = y/4+0.5. }
#' \item{\code{uniminmax}}{: univariately, scale so the minimum of the variable is zero, and the maximum is one.}
#' \item{\code{globalminmax}}{: global scaling; the global maximum is mapped to 1,
#' global minimum across the variables is mapped to 0. }
#' }
#'
#' \code{overplot} is a character string that denotes how to conduct overplotting
#' in the parallel coordinate plot. The lines from \code{geom_pcp()} are drawn according to the order they shown in your data set in default.
#' Note that this argument provides a framework, the order in the original data still has a role in overplotting,
#' especially for lines outside factor blocks(for \code{hierarchical} only), plots with \code{resort} turned on(for methods except \code{hierarchical}):
#' \itemize{
#' \item{\code{original}}{: use the original order, first shown first drawn.}
#' \item{\code{hierarchical}}{: hierarchically drawn according to the combinations of levels of factor variables,
#' which will change according to different level structures of factor variables you provided.
#' This was done separately for each factor block. The right most factor variables have the largest weight across a sequence of factor variables,
#' the last level of a factor variable has the largest weight within a factor variable.
#' Groups of lines with larger weight will be drawn on top. Lines outside of factor blocks still use the original order, which is different from other methods.}
#' \item{\code{smallfirst}}{: smaller groups of lines are drawn first, placing large groups of lines on top.}
#' \item{\code{largefirst}}{: larger groups of lines are drawn first, placing small groups of lines on top.}
#' }
#' @param mapping Set of aesthetic mappings created by [aes()] or
#' [aes_()]. If specified and `inherit.aes = TRUE` (the
#' default), it is combined with the default mapping at the top level of the
#' plot. You must supply `mapping` if there is no plot mapping.
#' @param data The data to be displayed in this layer. There are three
#' options:
#'
#' If `NULL`, the default, the data is inherited from the plot
#' data as specified in the call to [ggplot()].
#'
#' A `data.frame`, or other object, will override the plot
#' data. All objects will be fortified to produce a data frame. See
#' [fortify()] for which variables will be created.
#'
#' A `function` will be called with a single argument,
#' the plot data. The return value must be a `data.frame`, and
#' will be used as the layer data.
#' @param stat The statistical transformation to use on the data for this
#' layer, as a string.
#' @param position Position adjustment, either as a string, or the result of
#' a call to a position adjustment function.
#' @param na.rm If `FALSE`, the default, missing values are removed with
#' a warning. If `TRUE`, missing values are silently removed.
#' @param show.legend logical. Should this layer be included in the legends?
#' `NA`, the default, includes if any aesthetics are mapped.
#' `FALSE` never includes, and `TRUE` always includes.
#' It can also be a named logical vector to finely select the aesthetics to
#' display.
#' @param inherit.aes If `FALSE`, overrides the default aesthetics,
#' rather than combining with them. This is most useful for helper functions
#' that define both data and aesthetics and shouldn't inherit behaviour from
#' the default plot specification, e.g. [borders()].
#' @param ... Other arguments passed on to [layer()]. These are
#' often aesthetics, used to set an aesthetic to a fixed value, like
#' `colour = "red"` or `size = 3`. They may also be parameters
#' to the paired geom/stat.
#'
#' @param method string specifying the method that should be used for scaling the values
#' in a parallel coordinate plot (see Details).
#' @param freespace A number in 0 to 1 (excluded). The total gap space among levels within each factor variable
#' @param boxwidth A number or a numeric vector (length equal to the number of factor variables) for the widths of the boxes for each factor variable
#' @param rugwidth A number or a numeric vector (length equal to the number of numeric variables) for the widths of the rugs for numeric variable
#' @param interwidth A number or a numeric vector (length equal to the number of variables minus 1) for the width for the lines between every neighboring variables, either
#' a scalar or a vector.
#' @param resort A integer or a integer vector to indicate the positions of vertical axes inside (can't be the boundary of) a sequence of factors.
#' To break three or more factors into sub factor blocks,
#' and conduct resort at the axes. Makes the plot clearer for adjacent factor variables.
#' @param overplot methods used to conduct overplotting when overplotting becomes an issue.
#' @param reverse reverse the plot, useful especially when you want to reverse the structure in factor blocks,
#' i.e. to become more ordered from right to left
#' @param arrow specification for arrow heads, as created by arrow()
#' @param arrow.fill fill colour to use for the arrow head (if closed). NULL means use colour aesthetic
#' @param lineend Line end style (round, butt, square)
#' @param linejoin Line join style (round, mitre, bevel)
#' @import ggplot2
#' @export geom_pcp
#' @examples
#' library(ggplot2)
#' library(dplyr)
#' data(mtcars)
#' mtcars %>%
#' mutate(cyl = factor(cyl),
#' vs = factor(vs),
#' am = factor(am),
#' gear = factor(gear)) %>%
#' ggplot(aes(vars = vars(cyl, vs:gear))) +
#' geom_pcp(aes(color = vs), boxwidth = 0.2, resort = 2:3) +
#' geom_pcp_box(boxwidth = 0.2) +
#' geom_pcp_band(boxwidth = 0.2, resort = 2:3) +
#' geom_pcp_text(boxwidth = 0.2)
geom_pcp <- function(
mapping = NULL, data = NULL,
stat = "pcp", position = "identity",
...,
method = "uniminmax",
freespace = 0.1,
boxwidth = 0,
rugwidth = 0,
interwidth = 1,
resort = NULL,
overplot = "hierarchical",
reverse = FALSE,
arrow = NULL,
arrow.fill = NULL,
lineend = "butt",
linejoin = "round",
na.rm = FALSE,
show.legend = NA,
inherit.aes = TRUE) {
#browser()
ll <- layer(
data = data,
mapping = mapping,
stat = stat,
geom = GeomPcp,
position = position,
show.legend = show.legend,
inherit.aes = inherit.aes,
params = list(
method = method,
freespace = freespace,
boxwidth = boxwidth,
rugwidth = rugwidth,
interwidth = interwidth,
resort = resort,
overplot = overplot,
reverse = reverse,
arrow = arrow,
arrow.fill = arrow.fill,
lineend = lineend,
linejoin = linejoin,
na.rm = na.rm,
...
)
)
ll$comp_aes <- ll$compute_aesthetics
ll$compute_aesthetics <- compute_aesthetics_pcp
ll$setup_layer <- setup_layer_pcp
ll
}
GeomPcp <- ggproto(
"GeomPcp", Geom,
default_aes = aes(
colour = "grey30", size = 0.5, linetype = "solid", alpha = 1,
linewidth=.1, stroke = 2, method = "uniminmax", vars = NULL
),
draw_panel = function(data, panel_params, coord,
arrow = NULL,
arrow.fill = NULL,
lineend = "butt",
linejoin = "round",
na.rm = na.rm) {
# if (arrange)
# data <- data %>% group_by(group) %>% mutate(n = n()) %>% arrange(desc(n))
GeomSegment$draw_panel(data, panel_params, coord,
arrow = arrow,
arrow.fill = arrow.fill,
lineend = lineend,
linejoin = linejoin,
na.rm = na.rm)
}
)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.