row: Find the sum of selected columns within a row

row R Documentation

Find the sum of selected columns within a row

Description

Sums across columns within a row, while accounting for missing values. Specify the desired columns either by passing their explicit column names or by passing a regular expression that matches the column names.

Usage

row_sum(
  d,
  columns_to_process = character(0),
  pattern = "",
  new_column_name = "row_sum",
  threshold_proportion = 0.75,
  nonmissing_count_name = NA_character_,
  verbose = FALSE
)

row_mean(
  d,
  columns_to_process = character(0),
  pattern = "",
  new_column_name = "row_mean",
  threshold_proportion = 0.75,
  nonmissing_count_name = NA_character_,
  verbose = FALSE
)

Arguments

d

The data.frame containing the values to sum. Required.

columns_to_process

A character vector containing the column names to process (e.g., to average or to sum). If empty, pattern is used to select columns. Optional.

pattern

A regular expression pattern passed to base::grep() (with perl = TRUE). Optional.

new_column_name

The name of the new column that represents the sum (or, for row_mean(), the mean) of the specified columns. Required.

threshold_proportion

Designates the minimum proportion of columns that must have nonmissing values (within each row) in order to return a sum. Required; defaults to 0.75. In other words, by default, if fewer than 75% of the specified cells are nonmissing within a row, the row sum will be NA.

nonmissing_count_name

If a non-NA value is passed, a second column will be added to d that contains the row's count of nonmissing items among the selected columns. Must be a valid column name. Optional.

verbose

A logical value designating whether extra information is displayed in the console, such as which columns are matched by pattern.
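The effect of threshold_proportion can be sketched in base R. This is only an illustration of the rule described above, not the package's implementation: the data frame and its item_* columns are hypothetical, and the use of >= (rather than >) at the boundary is an assumption based on the word "minimum".

```r
# Hypothetical data: four item columns with scattered missing values.
items <- data.frame(
  item_1 = c(1, 2, NA, NA),
  item_2 = c(1, NA, NA, NA),
  item_3 = c(1, 2, 3, NA),
  item_4 = c(1, 2, 3, 4)
)
threshold_proportion <- 0.75

# Count the nonmissing cells in each row, then convert to a proportion.
nonmissing_count      <- rowSums(!is.na(items))
nonmissing_proportion <- nonmissing_count / ncol(items)

# Sum the nonmissing cells, then blank out rows that fall below the threshold.
# Assumption: a row exactly at the threshold still receives a sum.
row_total <- rowSums(items, na.rm = TRUE)
row_sum   <- ifelse(nonmissing_proportion >= threshold_proportion, row_total, NA_real_)

data.frame(nonmissing_count, row_sum)
```

In this sketch the first two rows have at least three of four cells present and keep their sums, while the last two rows fall below 75% and become NA.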

Details

If the specified columns are all logicals or integers, the new column will be an integer. Otherwise the new column will be a double.
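The type rule above can be illustrated with a small base R check. This is a sketch under stated assumptions, not the package's code: the data frame, its columns, and the helper all_integerish() are hypothetical names introduced only for illustration.

```r
# Hypothetical data: one logical, one integer, and one double column.
scores <- data.frame(
  flag  = c(TRUE, FALSE),
  count = c(2L, 5L),
  score = c(1.5, 2.5)
)

# Hypothetical helper: TRUE when every selected column is logical or integer.
all_integerish <- function(d, columns) {
  all(vapply(d[columns], function(x) is.logical(x) || is.integer(x), logical(1)))
}

all_integerish(scores, c("flag", "count"))           # logical + integer
all_integerish(scores, c("flag", "count", "score"))  # includes a double

# base::rowSums() always returns a double, so an integer result
# requires an explicit conversion when the check above passes.
integer_sum <- as.integer(rowSums(scores[c("flag", "count")]))
typeof(integer_sum)
```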

Value

The data.frame d, with an additional column containing the row sum (or, for row_mean(), the row mean). If a valid value is passed to nonmissing_count_name, a second column will be added as well.

Author(s)

Will Beasley

Examples

mtcars |>
  OuhscMunge::row_sum(
    columns_to_process = c("cyl", "disp", "vs", "carb"),
    new_column_name    = "engine_sum"
  )

mtcars |>
  OuhscMunge::row_sum(
    columns_to_process     = c("cyl", "disp", "vs", "carb"),
    new_column_name        = "engine_sum",
    nonmissing_count_name  = "engine_nonmissing_count"
  )

mtcars |>
  OuhscMunge::row_mean(
    columns_to_process     = c("cyl", "disp", "vs", "carb"),
    new_column_name        = "engine_mean",
    nonmissing_count_name  = "engine_nonmissing_count"
  )

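# The pattern-based examples need tidyr (for the billboard data) and dplyr.
has_packages <-
  requireNamespace("tidyr", quietly = TRUE) &&
  requireNamespace("dplyr", quietly = TRUE)

# Sum every weekly column (one- or two-digit week numbers),
# returning a sum when at least 10% of the weeks are nonmissing.
if (has_packages) {
  tidyr::billboard |>
    OuhscMunge::row_sum(
      pattern               = "^wk\\d{1,2}$",
      new_column_name       = "week_sum",
      threshold_proportion  = .1,
      verbose               = TRUE
    ) |>
    dplyr::select(
      artist,
      date.entered,
      week_sum
    )
}

# Sum only the single-digit weeks, using the default threshold.
if (has_packages) {
  tidyr::billboard |>
    OuhscMunge::row_sum(
      pattern               = "^wk\\d$",
      new_column_name       = "week_sum",
      verbose               = TRUE
    ) |>
    dplyr::select(
      artist,
      date.entered,
      week_sum
    )
}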
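To preview which columns a pattern would select before summing, the same two patterns can be passed directly to base::grep() with perl = TRUE, as the pattern argument describes. The column names below are hypothetical stand-ins rather than the real billboard names, so the sketch runs without tidyr.

```r
# Hypothetical column names, standing in for a wide dataset.
column_names <- c("artist", "track", "wk1", "wk9", "wk10", "wk76")

# Single-digit weeks only.
single_digit <- grep("^wk\\d$", column_names, perl = TRUE, value = TRUE)

# One- or two-digit weeks.
one_or_two_digits <- grep("^wk\\d{1,2}$", column_names, perl = TRUE, value = TRUE)

single_digit
one_or_two_digits
```

The anchors ^ and $ keep the pattern from matching partial names, which is why the first pattern excludes wk10 and wk76.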

OuhscBbmc/OuhscMunge documentation built on Dec. 5, 2024, 4:34 a.m.