convert-pair-value: Convert between long pair-value data and matrix

convert-pair-value    R Documentation

Convert between long pair-value data and matrix

Description

Functions for converting between long pair-value data (a data frame with columns for pair identifiers and a value column) and a matrix.

Usage

long_to_mat(tbl, row_key, col_key, value = NULL, fill = NULL,
  silent = FALSE)

mat_to_long(mat, row_key, col_key, value, drop = FALSE)

Arguments

tbl

Data frame with pair-value data.

row_key

String name of column for first key in pair.

col_key

String name of column for second key in pair.

value

String name of column for value (or NULL for long_to_mat()).

fill

Value to fill for missing pairs.

silent

Use TRUE to omit message about guessed value column (see Details).

mat

Matrix with pair-value data.

drop

Use TRUE to drop rows with missing value (see Details).

Details

Pair-value data is commonly used to describe pairs of objects. A pair is identified by two keys (usually integer or character), and its value can be an object of arbitrary nature.

In long format there are at least three columns: one for the first key in a pair, one for the second key, and one for the value (there might be more). In matrix format, pair-value data is represented as a matrix of values whose row names are the character representation of the first key and whose column names are the character representation of the second key.
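As a minimal illustration of the two formats, the base R sketch below builds a small long table and a matrix holding the same pairs. It does not call long_to_mat() and is not the package's implementation; the names long_tbl, pair_mat, player, opponent and score are made up for this example.

```r
# Long format: one row per pair (hypothetical example data)
long_tbl <- data.frame(
  player = c("a", "a", "b"),
  opponent = c("c", "d", "c"),
  score = c(10, 20, 30),
  stringsAsFactors = FALSE
)

# Matrix format: first key goes to row names, second key to column names
row_names <- sort(unique(long_tbl$player))
col_names <- sort(unique(long_tbl$opponent))
pair_mat <- matrix(
  NA_real_,
  nrow = length(row_names), ncol = length(col_names),
  dimnames = list(row_names, col_names)
)

# A two-column character matrix addresses cells by (row name, column name)
pair_mat[cbind(long_tbl$player, long_tbl$opponent)] <- long_tbl$score

print(pair_mat)
```

The pair "b"-"d" does not occur in the long table, so its cell stays NA in the matrix.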

long_to_mat() works as follows:

  • Pair identifiers are taken from the columns named row_key (to be used as row names) and col_key (to be used as column names). Unique identifiers (and thus the future dimension names) are determined with levels2(). This makes it possible to target the function at a specific set of pairs by using factor columns. Note that NAs are treated as a single unknown key and are put in last place (for non-factor columns).

  • Values are taken from the column named value. Note that if value has length 0 (typically NULL), long_to_mat() takes the first non-key column. If there is no such column, it uses a vector of dummy values (NAs or fills). In both cases a message is given if silent = FALSE.

  • Output is a matrix with the described row and column names. The value of the pair "key_1" and "key_2" is stored at the intersection of row "key_1" and column "key_2". Note that in case of duplicated pairs, the value from the first occurrence is taken.
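The steps above can be tried out with a short sketch like the following. It assumes the comperes package is installed; the unused factor level "z" and the fill value 0 are chosen only for illustration, and the comments describe what the text above leads one to expect instead of quoting verified output.

```r
library(comperes)

long_data <- data.frame(
  # Factor with an extra, unused level "z" (illustrative assumption)
  key_1 = factor(c("a", "a", "b"), levels = c("a", "b", "z")),
  key_2 = c("c", "d", "c"),
  val = 1:3,
  stringsAsFactors = FALSE
)

# Row names come from the factor levels of key_1, so "z" should get its own
# row; pairs absent from long_data are filled with 0 instead of NA
mat <- long_to_mat(long_data, "key_1", "key_2", "val", fill = 0)
print(mat)

# With value left as NULL, the first non-key column ("val") is used;
# silent = TRUE suppresses the message about the guessed column
mat_guessed <- long_to_mat(long_data, "key_1", "key_2", silent = TRUE)
print(mat_guessed)
```

Using a factor for a key column is the documented way to fix the set and order of dimension names in advance, for example to make several matrices share the same layout.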

mat_to_long() basically performs the inverse operation to long_to_mat(), but pair identifiers are always character. If drop = TRUE, it drops rows whose values (but not keys) are missing.
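A small round trip can make the point about character identifiers concrete. The sketch below assumes the comperes package is installed; the data and the column names id_1, id_2 and val are hypothetical.

```r
library(comperes)

# Keys are integers in the original long data (hypothetical example)
int_data <- data.frame(
  id_1 = c(1L, 2L),
  id_2 = c(10L, 20L),
  val = c(0.5, 1.5)
)

mat <- long_to_mat(int_data, "id_1", "id_2", "val")

# Pairs (1, 20) and (2, 10) are absent from int_data, so their matrix cells
# are missing; drop = TRUE removes the corresponding rows
back <- mat_to_long(mat, "id_1", "id_2", "val", drop = TRUE)

# Keys come back as character, not integer
str(back)
```

If the original key type matters downstream, it has to be restored explicitly after the conversion, for example with as.integer().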

Value

long_to_mat() returns a matrix with the selected values, where row names indicate the first key in a pair and column names the second.

mat_to_long() returns a tibble with three columns: one for the first key in a pair, one for the second, and one for the value.

Examples

library(comperes)

long_data <- data.frame(
  key_1 = c("a", "a", "b"),
  key_2 = c("c", "d", "c"),
  val = 1:3,
  stringsAsFactors = FALSE
)

mat_data <- long_data %>% long_to_mat("key_1", "key_2", "val")
print(mat_data)

# Convert the matrix back to a long tibble
mat_data %>% mat_to_long("new_key_1", "new_key_2", "new_val")

# Drop rows with missing values (here the pair "b"-"d", absent from long_data)
mat_data %>% mat_to_long("new_key_1", "new_key_2", "new_val", drop = TRUE)

echasnovski/comperes documentation built on March 5, 2023, 4:27 p.m.