venndir: Directional Venn diagram

venndir R Documentation

Directional Venn diagram

Description

Directional Venn diagram

Usage

venndir(
  setlist,
  overlap_type = c("detect", "concordance", "each", "overlap", "agreement"),
  sets = NULL,
  set_colors = NULL,
  proportional = FALSE,
  return_items = FALSE,
  show_set = c("main", "all", "none"),
  show_label = NA,
  show_items = c(NA, "none", "sign item", "sign", "item"),
  max_items = 3000,
  display_counts = TRUE,
  show_zero = FALSE,
  font_cex = c(1, 1, 0.8),
  poly_alpha = 0.8,
  alpha_by_counts = FALSE,
  label_preset = c("none"),
  label_style = c("basic", "fill", "shaded", "shaded_box", "lite", "lite_box"),
  padding = 3,
  r = 2,
  circle_nudge = NULL,
  rotate_degrees = 0,
  unicode = TRUE,
  big.mark = ",",
  sep = "&",
  curate_df = NULL,
  venn_sp = NULL,
  inside_percent_threshold = NULL,
  plot_style = c("base", "gg"),
  item_cex = NULL,
  item_style = c("text", "gridtext"),
  item_buffer = -0.15,
  sign_count_delim = ": ",
  do_plot = TRUE,
  verbose = FALSE,
  ...
)

Arguments

setlist

list of named vectors, whose names represent set items, and whose values represent direction using values c(-1, 0, 1).

overlap_type

character value indicating the type of overlap logic:

  • "each" records each combination of signs;

  • "overlap" disregards the sign and returns any matching item overlap;

  • "concordance" represents counts for full agreement, or "mixed" for any inconsistent overlapping direction;

  • "agreement" represents full agreement in direction as "agreement", and "mixed" for any inconsistent direction.

sets

integer index of sets in setlist to display in the Venn diagram. This subset is useful when creating a Venn diagram for a subset of a list, because it defines consistent colors across all sets, and uses the appropriate subset of colors in the Venn diagram.

set_colors

character vector of R colors, or NULL to use default colors defined by colorjam::rainbowJam().

proportional

logical indicating whether the Venn circles are proportionally sized, also known as an Euler diagram. Note that proportionally sized circles are not guaranteed to represent every possible overlap.

return_items

logical indicating whether to return the items within each overlap set.

show_items

character indicating the type of item label: "none" does not display item labels; "sign" displays the sign; "item" displays the item; "sign item" will display the sign and item together; "item sign" will reverse the order, with item followed by sign. In this context "sign" refers to the incidence values concatenated by space, sent to curate_venn_labels(..., type="sign").

show_zero

logical indicating whether empty overlaps are labeled with zero (0) when show_zero=TRUE, or left blank when show_zero=FALSE.

font_cex

numeric vector with length up to 3, to specify the relative font size for: (1) the main set label, (2) the main count label, (3) the directional count label. The default c(1, 1, 0.8) defines the main set label with 1x size, the main count label with 1x size, and the directional labels with 80% that size. It is usually helpful to make the directional count labels slightly smaller than the main count labels, although the best ratio depends upon the figure.

poly_alpha

numeric value between 0 and 1, indicating the alpha transparency of the polygon background color, where poly_alpha=1 is completely opaque (no transparency), and poly_alpha=0.4 is 40% opaque, therefore 60% transparent.
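As a rough illustration of the alpha scale, base R's grDevices::adjustcolor() can apply the same kind of opacity to a fill color. This sketch only demonstrates the 0-to-1 scale; it is not necessarily how venndir() applies poly_alpha internally, and the color "red" is an arbitrary choice.

```r
# Illustration only: apply 40% opacity to an arbitrary fill color.
fill_color <- "red"
poly_alpha <- 0.4
faded_color <- grDevices::adjustcolor(fill_color, alpha.f = poly_alpha)

# Recover the alpha channel on the 0-255 scale, then as a fraction.
alpha_255 <- grDevices::col2rgb(faded_color, alpha = TRUE)["alpha", 1]
alpha_fraction <- alpha_255 / 255
print(alpha_fraction)
```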

alpha_by_counts

logical indicating whether to apply alpha transparency to the Venn polygon fill based upon the counts contained in each polygon.

label_style

character string indicating the style of label to display: "basic" displays text with no background shading or border; "fill" displays text on an opaque colored background; "shaded" displays text on a partially transparent colored background; "shaded_box" displays text on a partially transparent colored background with a border; "lite" displays text on a partially transparent lite background; "lite_box" displays text on a lite background with a border.

unicode

logical passed to curate_venn_labels() indicating whether the directional label can include special Unicode characters.

big.mark

character passed to format() for numeric labels.

sep

character used as a delimiter between set names; the default is "&".

curate_df

data.frame or NULL passed to curate_venn_labels().

venn_sp

NULL or sp::SpatialPolygons that contains one polygon per entry in setlist. This argument is intended to allow custom Venn circles to be supplied. When venn_sp is NULL, then get_venn_shapes() is called.

inside_percent_threshold

numeric value indicating the percent threshold, below which a polygon label is moved outside the polygon by default. The threshold is calculated by area of the polygon divided by total area of the enclosing polygon, multiplied by 100. Therefore inside_percent_threshold=5 will require a polygon to represent at least 5 percent of the total area.
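The threshold arithmetic can be sketched with plain numbers. The areas below are hypothetical values chosen for illustration, and the variable names are not part of venndir.

```r
# Hypothetical areas, in arbitrary plot units.
polygon_area <- 1.2
total_area <- 30
inside_percent_threshold <- 5

# Percent of the enclosing area covered by this overlap polygon.
percent_area <- polygon_area / total_area * 100

# The label stays inside only when the polygon meets the threshold.
label_inside <- percent_area >= inside_percent_threshold
print(percent_area)
print(label_inside)
```

Here the polygon covers about 4 percent of the total area, which is below the threshold of 5, so its label would be placed outside by default.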

plot_style

character indicating the style of graphics plot: "gg" uses ggplot2; "base" uses base R graphics. This argument is passed to render_venndir().

item_cex

numeric value used to resize item labels when show_items is enabled; passed to render_venndir().

  • When item_cex=NULL or is a single value, auto-scaling is performed based upon the number of items in each overlap polygon, and the relative polygon areas. Any numeric value for item_cex is multiplied by the auto-scaled value.

  • When two or more values are supplied as a vector, the values are recycled and applied to the number of Venn overlap polygons, in the order of polygons with type="overlap" represented in venndir_output$venn_spdf, which is also the order returned by signed_overlaps(), for those overlaps represented by a polygon.

item_style

character string indicating the style used to display item labels when they are enabled. The "gridtext" option is substantially slower for a large number of labels, but enables use of markdown. The "text" option is substantially faster, but does not allow markdown. Therefore the default is "text", and "gridtext" is mostly useful for venn_meme() which usually only has one or a small number of labels in each polygon.

  • "text" uses text() for base R, or geom_text() for ggplot2. This option does not allow markdown, but is very fast.

  • "gridtext" uses gridtext::richtext_grob() for base R, or ggtext::geom_richtext() for ggplot2. This option does allow markdown, but for many item labels (more than 300) this option is notably slower, on the order of several seconds to render.

item_buffer

numeric value representing a fractional buffer width inside each polygon applied before placing labels inside each polygon. This argument is passed to polygon_label_fill() as argument scale_width. The value should be negative, because the value represents the size relative to the full polygon size, and negative values make the polygon smaller.

sign_count_delim

character string used as a delimiter between the sign and counts, when overlap_type is not "overlap".

...

additional arguments are passed to render_venndir().

Details

This function displays a Venn diagram, or when proportional=TRUE it displays an Euler diagram, representing counts in each overlap set.

Input data is supplied as a list, an incidence matrix, or a signed incidence matrix (whose values indicate direction), and is passed to signed_overlaps() to summarize counts by Venn overlaps.

By default, when input data contains signed direction, the counts include summary counts for the different forms of agreement in direction. See signed_overlaps() for description of each overlap_type, for different methods of summarizing the overlapping directions.
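The difference between the "concordance" and "agreement" summaries can be sketched in base R. The helper summarize_direction() below is hypothetical and is not the implementation used by signed_overlaps(); in particular, the label returned for full concordance (the shared signs joined by spaces) is an assumption made for illustration.

```r
# Hypothetical helper, not part of venndir: summarize the signs of one item.
# `signs` holds one value per set, where 0 means the item is absent.
summarize_direction <- function(signs,
   overlap_type = c("concordance", "agreement")) {
   overlap_type <- match.arg(overlap_type)
   present <- signs[signs != 0]
   # Any disagreement among the sets containing the item is "mixed".
   if (length(unique(present)) > 1) {
      return("mixed")
   }
   if (overlap_type == "agreement") {
      return("agreement")
   }
   # Assumed label for full concordance: the shared signs, space-delimited.
   paste(present, collapse = " ")
}

summarize_direction(c(1, 1, 0))
summarize_direction(c(1, -1, 1))
summarize_direction(c(-1, -1, -1), overlap_type = "agreement")
```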

Detailed workflow

In more detail, input setlist is named by set names, and each vector contains items.

When the vector elements in setlist are not named, the values are considered items. In this case, the values are all defined as 1 for the purpose of defining overlaps, and overlap_type is automatically set to overlap_type="overlap". At this point the "sign" is no longer used.
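A minimal base R sketch of this conversion follows. The item names are hypothetical, and the code only illustrates the idea; it is not taken from the venndir source.

```r
# Hypothetical unnamed input: the values themselves are the items.
set_A <- c("gene1", "gene2", "gene3")

# Equivalent signed form: items become names, and every value is 1.
signed_A <- stats::setNames(rep(1, length(set_A)), set_A)
print(signed_A)
```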

When the vector elements in setlist are named, the vector element names are considered items, and vector values are considered the "sign". For common scenarios, the "sign" is usually one of the values c(-1, 1) to indicate "down" and "up", respectively. However, the "sign" may contain any atomic value, such as numeric, integer, or character values.

The setlist data is passed to signed_overlaps() which in turn calls list2im_signed(). At this point the incidence matrix values represent the values from each vector in setlist.

For each item, the "sign" is defined as the concatenated signs from each vector in setlist for that item. For example the "sign" may be "1 1 -1", which indicates the item is present in all three vectors of setlist, and is "up", "up", "down" in these vectors. The sign "0 1 0" indicates an item is present only in the second vector of setlist and is "up".
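The concatenation step can be sketched in base R with a toy setlist. This is an illustration under stated assumptions, not the code used by list2im_signed() or signed_overlaps(); the set and item names are hypothetical.

```r
# Hypothetical toy input: names are items, values are signs.
setlist <- list(
   set_A = c(gene1 = 1, gene2 = -1, gene3 = 1),
   set_B = c(gene1 = 1, gene3 = -1, gene4 = 1),
   set_C = c(gene1 = -1, gene4 = 1))

# Build a signed incidence matrix: rows are items, columns are sets,
# and 0 marks an item that is absent from a set.
items <- unique(unlist(lapply(setlist, names)))
im <- matrix(0,
   nrow = length(items),
   ncol = length(setlist),
   dimnames = list(items, names(setlist)))
for (set_name in names(setlist)) {
   set_values <- setlist[[set_name]]
   im[names(set_values), set_name] <- set_values
}

# Concatenate the per-set signs into one "sign" string per item.
item_signs <- apply(im, 1, paste, collapse = " ")
print(im)
print(item_signs)
```

In this toy input, gene1 is present in all three sets and its sign string matches the "1 1 -1" example described above.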

Each item sign is curated by calling curate_venn_labels(). This function is used to convert "sign" to visual symbols, for example "1" may be converted to a Unicode up arrow "\u2191". Unicode output can be disabled with unicode=FALSE. The same function converts "sign" to color, which can be a helpful visual cue. This step can be customized to use any output valid in R and recognized by gridtext::richtext_grob() or ggtext::geom_richtext(). Specifically, it can contain Unicode characters, or limited markdown format recognized by these functions.
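The idea of converting a sign string into symbols and colors can be sketched with a simple lookup. The lookup tables and colors below are assumptions chosen for illustration; curate_venn_labels() uses its own curation rules, which can be customized with curate_df.

```r
# Assumed lookup tables, for illustration only.
sign_symbols <- c("1" = "\u2191", "-1" = "\u2193")
sign_colors <- c("1" = "red", "-1" = "blue")

# Split one item sign into its per-set values.
item_sign <- "1 1 -1"
tokens <- strsplit(item_sign, " ", fixed = TRUE)[[1]]

# Sets where the item is absent ("0") get no symbol.
tokens <- tokens[tokens != "0"]

# Convert each remaining value to an arrow and a color.
sign_label <- paste(sign_symbols[tokens], collapse = "")
label_colors <- unname(sign_colors[tokens])
print(sign_label)
print(label_colors)
```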

Display of Venn or Euler circles

The overlap counts are used to define suitable Venn circles or ellipses when proportional=FALSE, or Euler proportional circles when proportional=TRUE. This step is performed by get_venn_shapes().

For Venn circles, the method allows 1, 2, or 3 sets.

For Venn ellipses, the method allows 4 or 5 sets.

For Euler circles, the method allows as many sets as are supported by eulerr::euler().
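The three rules above can be summarized as a small decision function. The function choose_shape_type() is hypothetical and only restates the rules in this section; it does not reproduce the logic inside get_venn_shapes().

```r
# Hypothetical helper, not part of venndir: name the shape type
# described above for a given number of sets.
choose_shape_type <- function(n_sets, proportional = FALSE) {
   if (proportional) {
      # The supported number of sets is determined by eulerr::euler().
      return("Euler circles")
   }
   if (n_sets >= 1 && n_sets <= 3) {
      return("Venn circles")
   }
   if (n_sets %in% c(4, 5)) {
      return("Venn ellipses")
   }
   stop("Venn shapes are described only for 1 to 5 sets.")
}

choose_shape_type(3)
choose_shape_type(5)
choose_shape_type(6, proportional = TRUE)
```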

In the event the circles or ellipses do not include an overlap, a label is printed below the plot. See render_venndir() and the argument plot_warning=TRUE. For proportional Euler diagrams, even for 3-way diagrams there are often missing overlaps, and this warning is helpful to reinforce what is missing.

Adjusting Venn or Euler circles

As indicated above, when proportional=TRUE sometimes the Euler circles do not represent all set overlaps. It may be helpful to nudge one or more circles to represent a missing overlap, using the argument circle_nudge. This argument takes a list named by one or more names(setlist), of vectors with c(x, y) values to "nudge" that set circle.

Display of counts

By default, total counts are displayed for each set overlap. When setlist contains signed data, count signs are summarized and displayed beside the total counts. The summary options are defined by overlap_type.

Count labels can be styled using label_style, which customizes the background color fill and optional border.

Display of items

Displaying item labels inside the polygons can be a convenient way to answer the question, "What are those shared items?" This step can also include the "sign", showing which shared items also have the same or different "sign" values.

Note that when items are displayed, summary counts are currently hidden. In the future the counts may be positioned outside the polygons.

More customizations

This function calls render_venndir() to display the diagram. The output from venndir() can be customized and passed to render_venndir() or ggrender_venndir() for further customization.
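A two-step workflow might look like the sketch below. It assumes that do_plot=FALSE (listed in Usage) suppresses drawing, and that render_venndir() accepts the venndir() output as its first argument; both are assumptions based on this page rather than verified behavior. The block is skipped when the venndir package is not installed.

```r
# Requires the venndir package; skipped when it is not installed.
vo <- NULL
if (requireNamespace("venndir", quietly = TRUE)) {
   setlist <- venndir::make_venn_test(100, 3, do_signed = TRUE)

   # Compute the diagram data without drawing it (assumed effect of do_plot).
   vo <- venndir::venndir(setlist, do_plot = FALSE)

   # Inspect the returned components before customizing them.
   print(names(vo))

   # Draw the result, after any optional modifications to `vo`.
   venndir::render_venndir(vo)
}
```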

See Also

Other venndir core: render_venndir(), signed_overlaps(), textvenn(), venn_meme()

Examples

setlist <- make_venn_test(100, 3);
print(setlist);
venndir(setlist)

setlist <- make_venn_test(100, 3, do_signed=TRUE);
venndir(setlist)

venndir(setlist, label_style="basic")
venndir(setlist, label_style="fill")
venndir(setlist, label_style="shaded")
venndir(setlist, label_style="shaded_box")
venndir(setlist, label_style="lite")
venndir(setlist, label_style="lite_box")

# proportional Euler diagram
venndir(setlist,
   proportional=TRUE);

# nudge circles and hide the zero
venndir(setlist,
   proportional=TRUE,
   show_zero=FALSE,
   circle_nudge=list(set_C=c(-0.5, 0.1))
)

# nudge circles so one overlap is no longer shown
venndir(setlist,
   proportional=TRUE,
   show_zero=FALSE,
   circle_nudge=list(set_C=c(-1.4, 0.1))
)

setlist2k <- make_venn_test(2000, 3, 80, do_signed=TRUE);
venndir(setlist2k)
venndir(setlist2k, proportional=TRUE)

# example using character values
setlist <- make_venn_test(100, 3, do_signed=TRUE)
# make a simple character vector list
setlistv <- lapply(setlist, function(i){
   j <- letters[i+3];
   names(j) <- names(i);
   j;
});
# make custom curate_df
curate_df <- data.frame(
   from=c("b", "d"),
   sign=c("b", "d"),
   color=c("blue", "red"),
   stringsAsFactors=FALSE)
vo <- venndir(setlistv,
   overlap_type="each",
   font_cex=c(1.5, 1.5, 0.9), 
   curate_df=curate_df,
   show_zero=TRUE);



jmw86069/venndir documentation built on Sept. 26, 2023, 3:43 a.m.