row: Find the sum of selected columns within a row
In OuhscBbmc/OuhscMunge: Data Manipulation Operations

row	R Documentation

Find the sum of selected columns within a row

Description

Sums across columns within a row, while accounting for nonmissingness. Specify the desired columns by passing their explicit column names or by passing a regular expression that matches the column names.
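
As a rough illustration of the regular-expression route, the sketch below uses only base R to show which column names a pattern would pick up when handed to `base::grep()` with `perl = TRUE` (the call named under Arguments). The pattern and the use of `mtcars` are assumptions made for this illustration; it is not the package's internal code.

```r
# Illustrative pattern (an assumption for this sketch): match exactly "cyl" or "carb".
pattern <- "^(cyl|carb)$"

# Return the matching column names, using Perl-compatible regex syntax.
selected <- grep(pattern, colnames(mtcars), perl = TRUE, value = TRUE)

selected
# [1] "cyl"  "carb"
```

Previewing the matches this way can help confirm a pattern before passing it to `row_sum()` or `row_mean()`.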

Usage

row_sum(
  d,
  columns_to_process = character(0),
  pattern = "",
  new_column_name = "row_sum",
  threshold_proportion = 0.75,
  nonmissing_count_name = NA_character_,
  verbose = FALSE
)

row_mean(
  d,
  columns_to_process = character(0),
  pattern = "",
  new_column_name = "row_mean",
  threshold_proportion = 0.75,
  nonmissing_count_name = NA_character_,
  verbose = FALSE
)

Arguments

`d`	The data.frame containing the values to sum. Required.
`columns_to_process`	A character vector containing the column names to process (e.g., to average or to sum). If empty, `pattern` is used to select columns. Optional.
`pattern`	A regular expression pattern passed to `base::grep()` (with `perl = TRUE`). Optional.
`new_column_name`	The name of the new column that represents the sum of the specified columns. Required.
`threshold_proportion`	Designates the minimum proportion of columns that must have nonmissing values (within each row) in order to return a sum. Required; defaults to 0.75. In other words, by default, if fewer than 75% of the specified cells are nonmissing within a row, the row sum will be `NA`.
`nonmissing_count_name`	If a non-NA value is passed, a second column will be added to `d` that contains the row's count of nonmissing items among the selected columns. Must be a valid column name. Optional.
`verbose`	A logical value to designate if extra information is displayed in the console, such as which columns are matched by `pattern`. Optional.
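
The effect of `threshold_proportion` and `nonmissing_count_name` can be sketched in base R. This is a minimal approximation under stated assumptions, not the package's implementation: the toy data frame, the variable names, and the use of `>=` for the comparison (inferred from the word "minimum" above) are all choices made for this sketch.

```r
# Toy data (hypothetical): four items with increasing missingness per row.
d <- data.frame(
  a = c(1, 1, NA, NA),
  b = c(2, 2,  2, NA),
  c = c(3, 3,  3,  3),
  e = c(4, NA, NA, NA)
)

items                <- c("a", "b", "c", "e")
threshold_proportion <- 0.75

# Count the nonmissing cells in each row, then convert to a proportion.
nonmissing_count      <- rowSums(!is.na(d[items]))
nonmissing_proportion <- nonmissing_count / length(items)

# Sum the available values, but keep the sum only where the row meets the
# threshold (assumed here to be an inclusive ">=" comparison).
raw_sum <- rowSums(d[items], na.rm = TRUE)
d$row_sum          <- ifelse(nonmissing_proportion >= threshold_proportion, raw_sum, NA_real_)
d$nonmissing_count <- nonmissing_count

d$row_sum           # 10  6 NA NA
d$nonmissing_count  #  4  3  2  1
```

Rows one and two have at least 75% of their cells present, so they receive a sum; rows three and four fall below the threshold and receive `NA`.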

Details

If the specified columns are all logicals or integers, the new column will be an integer. Otherwise the new column will be a double.
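
The type rule above can be mimicked in base R as follows. The helper `all_integerish()` is hypothetical and exists only for this sketch; how the package actually makes the decision is not shown in this page.

```r
# Hypothetical helper (not part of OuhscMunge): TRUE when every column
# is a logical or an integer vector.
all_integerish <- function(d) {
  all(vapply(d, function(x) is.logical(x) || is.integer(x), logical(1)))
}

d_int <- data.frame(a = c(1L, 2L), b = c(TRUE, FALSE))  # integer + logical
d_dbl <- data.frame(a = c(1L, 2L), b = c(0.5, 1.5))     # integer + double

# base::rowSums() always returns a double, so cast back to integer when
# every input column is a logical or an integer.
s_int <- rowSums(d_int)
if (all_integerish(d_int)) s_int <- as.integer(s_int)

s_dbl <- rowSums(d_dbl)
if (all_integerish(d_dbl)) s_dbl <- as.integer(s_dbl)

typeof(s_int)  # "integer"
typeof(s_dbl)  # "double"
```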

Value

The data.frame `d`, with the additional column containing the row sum. If a valid value is passed to `nonmissing_count_name`, a second column will be added as well.

Author(s)

Will Beasley

Examples

mtcars |>
  OuhscMunge::row_sum(
    columns_to_process = c("cyl", "disp", "vs", "carb"),
    new_column_name    = "engine_sum"
  )

mtcars |>
  OuhscMunge::row_sum(
    columns_to_process     = c("cyl", "disp", "vs", "carb"),
    new_column_name        = "engine_sum",
    nonmissing_count_name  = "engine_nonmissing_count"
  )

mtcars |>
  OuhscMunge::row_mean(
    columns_to_process     = c("cyl", "disp", "vs", "carb"),
    new_column_name        = "engine_mean",
    nonmissing_count_name  = "engine_nonmissing_count"
  )

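# Braces keep both billboard examples inside the `if`, so neither runs
# when tidyr is unavailable.
if (require(tidyr)) {
  # Match one- and two-digit week columns; require only 10% nonmissing.
  tidyr::billboard |>
    OuhscMunge::row_sum(
      pattern               = "^wk\\d{1,2}$",
      new_column_name       = "week_sum",
      threshold_proportion  = .1,
      verbose               = TRUE
    ) |>
    dplyr::select(
      artist,
      date.entered,
      week_sum,
    )

  # Match only single-digit week columns; use the default threshold (0.75).
  tidyr::billboard |>
    OuhscMunge::row_sum(
      pattern               = "^wk\\d$",
      new_column_name       = "week_sum",
      verbose               = TRUE
    ) |>
    dplyr::select(
      artist,
      date.entered,
      week_sum,
    )
}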

OuhscBbmc/OuhscMunge documentation built on Dec. 5, 2024, 4:34 a.m.