contrast2comp: Convert contrast to short-form comp, convert comp to contrast

contrast2comp R Documentation

Convert contrast to short-form comp, convert comp to contrast

Description

Convert contrast to short-form comp, and convert comp back to original contrast.

Usage

contrast2comp(
  contrast_names,
  contrast_delim = "-",
  contrast_factor_delim = "_",
  comp_factor_delim = ":",
  add_attr = FALSE,
  verbose = FALSE,
  ...
)

comp2contrast(
  comps,
  contrast_delim = "-",
  contrast_factor_delim = "_",
  comp_factor_delim = ":",
  factor_order = NULL,
  add_attr = FALSE,
  verbose = FALSE,
  ...
)

names_contrast2comp(
  contrast_names,
  contrast_delim = "-",
  contrast_factor_delim = "_",
  comp_factor_delim = ":",
  add_attr = FALSE,
  verbose = FALSE,
  ...
)

names_comp2contrast(
  comps,
  contrast_delim = "-",
  contrast_factor_delim = "_",
  comp_factor_delim = ":",
  factor_order = NULL,
  add_attr = FALSE,
  verbose = FALSE,
  ...
)

Arguments

contrast_names

character vector of statistical contrasts

contrast_delim

character string delimiter between groups, typically "-" to indicate subtraction of group means.

contrast_factor_delim

character string delimiter between design factors in a contrast.

comp_factor_delim

character string delimiter between design factors in a comp.

add_attr

logical indicating whether to add attributes to the output, containing the input values provided.

verbose

logical indicating whether to print verbose output. Use verbose=2 for much more detailed output.

...

additional arguments are ignored.

factor_order

integer vector, list of integer vectors, or NULL. When supplied as an integer vector, it is converted to a list and expanded to the length of the input. The integer values are used by comp2contrast() to force the order of factor comparisons for two-way and higher-order contrasts.

Details

These functions are intended to reduce the number of characters required to represent a statistical contrast. contrast2comp() converts long to short form, and comp2contrast() converts short to long form.

  1. "contrast": the fully-defined contrast

  2. "comp": equivalent abbreviated form, a short comparison

Note that one goal is to shorten Excel worksheet names, which are currently limited to 31 characters. Also note that the ":" delimiter is not permitted in Excel sheet names, so save_sestats() uses semicolon ";" instead. This limitation may warrant using a different default delimiter between factors, such as comma ",", or pipe "|", or forward-slash "/".
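As a minimal sketch of the worksheet-name constraint described above, the following base R snippet replaces ":" with ";" and flags names longer than 31 characters. The helper check_sheet_names() is hypothetical and is not part of jamses; save_sestats() may handle this step differently.

```r
# Hypothetical helper, not part of jamses: prepare comp strings for use
# as Excel worksheet names and report whether each one fits.
check_sheet_names <- function(comps, max_chars=31) {
   # ":" is not permitted in Excel sheet names, so substitute ";"
   sheet_names <- gsub(":", ";", comps, fixed=TRUE)
   data.frame(comps,
      sheet_names,
      nchar_names=nchar(sheet_names),
      fits=nchar(sheet_names) <= max_chars)
}

check_sheet_names(c(
   "CellA:Treated-Control",
   "CellA_Treated-CellA_Control",
   "(CellA_Treated-CellA_Control)-(CellB_Treated-CellB_Control)"))
```

In this sketch the two-way contrast in long form exceeds the limit, which is the situation the short-form comp is intended to avoid.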

Assumptions

The key assumption is that an experimental group name is a character string composed of its factor levels, with a delimiter between factors. For example:

  • CellA_Treated - is interpreted as "CellA" and "Treated"

  • CellA_Control - is interpreted as "CellA" and "Control"

A contrast can therefore be re-written in short form:

  • contrast: CellA_Treated-CellA_Control

  • comp: CellA:Treated-Control

Factors must be in identical order for all groups, and there must be no empty factor levels. For example, do not combine "CellA_Treated_Time0" with "CellA_Time0", because the second group name omits the treatment factor.
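The assumption above can be checked before building contrasts. This is a rough base R sketch, not a jamses function: the helper name groups_are_consistent() is hypothetical, and it only tests that every group name splits into the same number of non-empty factor levels.

```r
# Hypothetical check, not part of jamses: confirm every group name has
# the same number of factor levels, with no empty level between delimiters.
groups_are_consistent <- function(group_names, factor_delim="_") {
   factor_levels <- strsplit(group_names, factor_delim, fixed=TRUE)
   # number of factor levels found in each group name
   n_factors <- lengths(factor_levels)
   all(n_factors == n_factors[1]) &&
      all(nchar(unlist(factor_levels)) > 0)
}

# consistent group names: two factors in every group
groups_are_consistent(c("CellA_Treated", "CellA_Control",
   "CellB_Treated", "CellB_Control"))

# inconsistent group names: the second group omits one factor
groups_are_consistent(c("CellA_Treated_Time0", "CellA_Time0"))
```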

Finally, the overall assumption is that contrasts are composed of reasonable comparisons between factor levels, with no more factors being compared than the depth of the contrast. For example, a one-way contrast can compare one factor, a two-way contrast can compare two factors, and so on. In most cases where the assumptions above are broken, the output should be the same as the input, with no change.

When using groups_to_sedesign(), the output contrasts should all meet these requirements, therefore all contrasts can be converted to "comp" form for plot labels, and converted back to "contrasts" as needed.

Delimiters can be customized; however, they must all be single-character values, avoiding ()[] which are reserved. For example, sometimes factors are separated by "." such as in the contrast: "A.B-C.B". In this case, use: contrast2comp("A.B-C.B", contrast_factor_delim="."). The corresponding conversion back to contrast would be: comp2contrast("A-C:B", contrast_factor_delim=".").

Design goals for conversion to short form comp

  1. "comp" should be interchangeable with "contrast"

    • use contrast2comp() and comp2contrast()

  2. when a contrast cannot be abbreviated, comp will use contrast

    • see examples

    • when more factors are being compared than the contrast order, the function will leave the contrast as-is

    • Consider "CellA_Treated-CellB_Control". Both "CellA-CellB" and "Treated-Control" are compared in a one-way contrast, therefore it cannot be abbreviated.

  3. "comp" shall not create any whitespace

    • factors will be delimited with ":"

    • factor levels will be delimited with "-"

    • other potential delimiters "*", "+" already have meaning in formula context.

  4. "comp" shall not use parentheses "()", where possible

    • the goal is to reduce characters

    • parentheses are not necessary for balanced contrasts

    • unbalanced contrasts (see point 2) will retain the original syntax

  5. the order of factors should be maintained in "comp"

    • goal is to reproduce the original correct group name in contrast form

    • the original group name is necessary for the design matrix
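To make design goals 2, 3, and 5 concrete, here is a simplified base R sketch of the one-way case only. It is an illustration under stated assumptions and not the jamses implementation: the function name oneway_comp_sketch() is hypothetical, it handles exactly two groups, and it returns the input unchanged whenever more than one factor differs.

```r
# Hypothetical illustration of the one-way case, not the jamses code.
oneway_comp_sketch <- function(contrast,
   contrast_delim="-",
   contrast_factor_delim="_",
   comp_factor_delim=":") {
   # split the contrast into groups, then each group into factor levels
   groups <- strsplit(contrast, contrast_delim, fixed=TRUE)[[1]]
   levels_list <- strsplit(groups, contrast_factor_delim, fixed=TRUE)
   # require exactly two groups with the same number of factors
   if (length(groups) != 2 ||
      length(unique(lengths(levels_list))) != 1) {
      return(contrast)
   }
   g1 <- levels_list[[1]]
   g2 <- levels_list[[2]]
   changed <- (g1 != g2)
   # a one-way contrast may compare exactly one factor,
   # otherwise the contrast is left as-is
   if (sum(changed) != 1) {
      return(contrast)
   }
   # changed factor becomes "level1-level2", unchanged factors keep
   # their single level, and factor order is preserved
   comp_parts <- ifelse(changed,
      paste(g1, g2, sep=contrast_delim),
      g1)
   paste(comp_parts, collapse=comp_factor_delim)
}

oneway_comp_sketch("CellA_Treated-CellA_Control")
oneway_comp_sketch("CellA_Treated_WT-CellA_Control_WT")
# two factors differ in a one-way contrast, so the input is returned
oneway_comp_sketch("CellA_Treated-CellB_Control")
```

Two-way and higher-order contrasts contain more than two groups, so this sketch returns them unchanged; use contrast2comp() for those.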

Worked examples

  1. One-way contrast

    • contrast: CellA_Treated-CellA_Control

    • comment: CellA is unchanged, Treated-Control is changed

    • comp: CellA:Treated-Control

  2. Two-way contrast

    • contrast: (CellA_Treated-CellA_Control)-(CellB_Treated-CellB_Control)

    • comment: CellA-CellB is changed, Treated-Control is changed

    • comp: CellA-CellB:Treated-Control

    • note: when converting comp CellA-CellB:Treated-Control back to contrast, two forms are mathematically equivalent:

    # form 1
    (CellA_Treated-CellA_Control)-(CellB_Treated-CellB_Control)
    # form 2
    (CellA_Treated-CellB_Treated)-(CellA_Control-CellB_Control)
    # both are equivalent
    CellA_Treated - CellB_Treated - CellA_Control + CellB_Control
    
    • These two forms can be controlled in comp2contrast() with argument factor_order.

  3. Three-way contrast (it happens rarely, but does happen)

    • contrast:

    ((CellA_Treated_Mut-CellA_Control_Mut)-(CellB_Treated_Mut-CellB_Control_Mut))-
    ((CellA_Treated_WT-CellA_Control_WT)-(CellB_Treated_WT-CellB_Control_WT))
    
    • comment: CellA-CellB, Treated-Control, Mut-WT are changed

    • comp: CellA-CellB:Treated-Control:Mut-WT

  4. One-way contrast with additional unchanged factors

    • contrast: CellA_Treated_WT-CellA_Control_WT

    • comment: CellA, WT are unchanged, Treated-Control is changed

    • comp: CellA:Treated-Control:WT

  5. Unbalanced one-way contrast

    • contrast: CellA_Treated-CellB_Control

    • comment: CellA-CellB and Treated-Control are changed

    • comp: CellA_Treated-CellB_Control

  6. Mis-directed two-way contrast

    • contrast: (CellA_Treated-CellA_Control)-(CellB_Control-CellB_Treated)

    • comment: CellA-CellB are changed, Treated-Control/Control-Treated are changed

    • Note: The Treated-Control and Control-Treated do not agree in direction. The output is partially abbreviated, and maintains the original direction to prevent loss of information.

    • comp: (CellA:Treated-Control)-(CellB:Control-Treated)
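The equivalence of the two forms in worked example 2 can be confirmed numerically. The group means below are made-up values used only for illustration; any values would give the same agreement.

```r
# Hypothetical group means, for illustration only
group_means <- c(
   CellA_Treated=5.0,
   CellA_Control=2.0,
   CellB_Treated=4.5,
   CellB_Control=3.0)

# form 1: treatment effect within each cell type, then compared
form1 <- with(as.list(group_means),
   (CellA_Treated - CellA_Control) - (CellB_Treated - CellB_Control))
# form 2: cell type effect within each treatment, then compared
form2 <- with(as.list(group_means),
   (CellA_Treated - CellB_Treated) - (CellA_Control - CellB_Control))
# fully expanded form
expanded <- with(as.list(group_means),
   CellA_Treated - CellB_Treated - CellA_Control + CellB_Control)

# all three forms give the same value
all.equal(form1, form2)
all.equal(form1, expanded)
```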

Examples

contrast_names <- c(
   "CellA_Treated-CellA_Control",
   "CellB_Treated-CellB_Control",
   "CellB_Treated-CellA_Control",
   "(CellA_Treated-CellA_Control)-(CellB_Treated-CellB_Control)",
   "(CellB_Treated-CellB_Control)-(CellA_Treated-CellA_Control)",
   "(CellA_Treated-CellB_Treated)-(CellA_Control-CellB_Control)"
);
contrast2comp(contrast_names)

contrast2comp(contrast_names, comp_factor_delim=";")

comps <- contrast2comp(contrast_names)
data.frame(contrast_names,
   nchar_contrasts=nchar(contrast_names),
   comps,
   nchar_comps=nchar(comps))

# compare conversion back to contrast
data.frame(contrast_names,
   comps=comps,
   contrast_again=comp2contrast(comps),
   changed=contrast_names != comp2contrast(comps))

# factors can be ordered by contrast
contrasts2 <- comp2contrast(comps,
   factor_order=list(1:2, 1:2, 1:2,
      2:1, 2:1, 1:2))
# compare conversion back to contrast
data.frame(contrast_names,
   comps=comps,
   contrasts2,
   changed=contrast_names != contrasts2)

# note change in direction for two-way contrasts
# Treated-Control and Control-Treated
contrast_diff <- "(CellA_Treated-CellA_Control)-(CellB_Control-CellB_Treated)";
comp_diff <- contrast2comp(contrast_diff)
# partially abbreviated comp
comp_diff
# it is converted back to original form
comp2contrast(comp_diff)

data.frame(contrast_diff,
   nchar_contrasts=nchar(contrast_diff),
   comp_diff,
   nchar_comps=nchar(comp_diff))

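# evaluate the rare three-way contrast
# first build the contrasts programmatically, using gsub() to append
# a third factor level "_Mut" or "_WT" to each group name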
contrast_names_3way <- c(
   contrast_names[4],
   gsub("([a-zA-Z])([-)])", "\\1_Mut\\2", contrast_names[4]),
   gsub("([a-zA-Z])([-)])", "\\1_WT\\2", contrast_names[4]),
   paste0("(",
   gsub("([a-zA-Z])([-)])", "\\1_Mut\\2", contrast_names[4]),
   ")-(",
   gsub("([a-zA-Z])([-)])", "\\1_WT\\2", contrast_names[4]),
   ")"))
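# define the contrasts explicitly, which replaces the vector above;
# element 4 is an alternate factor order of the three-way contrast
contrast_names_3way <- c(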
   paste0("(CellA_Treated-CellA_Control)-",
      "(CellB_Treated-CellB_Control)"),
   paste0("(CellA_Treated_Mut-CellA_Control_Mut)-",
      "(CellB_Treated_Mut-CellB_Control_Mut)"),
   paste0("(CellA_Treated_WT-CellA_Control_WT)-",
      "(CellB_Treated_WT-CellB_Control_WT)"),
   paste0("((CellA_Treated_Mut-CellB_Treated_Mut)-",
      "(CellA_Control_Mut-CellB_Control_Mut))-",
      "((CellA_Treated_WT-CellB_Treated_WT)-",
      "(CellA_Control_WT-CellB_Control_WT))"),
   paste0("((CellA_Treated_Mut-CellA_Control_Mut)-",
      "(CellB_Treated_Mut-CellB_Control_Mut))-",
      "((CellA_Treated_WT-CellA_Control_WT)-",
      "(CellB_Treated_WT-CellB_Control_WT))"))
comp_3way <- contrast2comp(contrast_names_3way);
data.frame(contrast_names_3way,
   nchar_contrasts=nchar(contrast_names_3way),
   comp_3way,
   nchar_comps=nchar(comp_3way));

# compare to input
contrasts2_3way <- comp2contrast(comp_3way);
# mathematically correct contrasts but in different order from input
data.frame(contrast_names_3way,
   contrasts2_3way,
   changed=contrast_names_3way != contrasts2_3way);

# custom factor order produces the same contrasts as input
contrasts2_3way_v2 <- comp2contrast(comp_3way,
   factor_order=list(c(2,1,3), c(2,1,3), c(2,1,3),
      c(1,2,3), c(2,1,3)));
data.frame(contrast_names_3way,
   contrasts2_3way_v2,
   changed=contrast_names_3way != contrasts2_3way_v2);


jmw86069/jamses documentation built on May 31, 2024, 1:36 p.m.