fillByGroup: Fill NAs in data.frame by grouping
In HelenLindsay/AbNames: Standardize Antibody Names

fillByGroup

R Documentation

Fill NAs in data.frame by grouping

Description

Wrapper around tidyr::fill that checks for NAs in the grouping values and can optionally fill with the majority value when a group contains more than one value. If an error is raised, an example group that causes the error is returned.
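For orientation, a plain grouped call to tidyr::fill looks roughly like the sketch below. This is not the fillByGroup implementation: the use of dplyr::group_by and the .direction = "downup" setting are assumptions chosen for illustration, and the sketch performs none of the NA or multiple-value checks described above.

```r
library(dplyr)
library(tidyr)

df <- data.frame(A = rep(c("A", "B"), each = 3),
                 B = c(NA, "C", "D", "E", NA, "E"))

# Fill within each level of A: first downwards, then upwards so that a
# leading NA in a group is also filled (direction is an assumption here).
filled <- df %>%
  group_by(A) %>%
  fill(B, .direction = "downup") %>%
  ungroup()
```

With a bare tidyr::fill, the value used depends on row order within the group rather than on which value is most common, which is the gap the multiple argument is meant to address.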

Usage

fillByGroup(
  df,
  group,
  fill,
  method = c("only_na", "all"),
  multiple = c("stop", "mode", "ignore")
)

Arguments

`df`	A data.frame or tibble with missing (NA) values to be filled
`group`	(character(n)) Names of column(s) to group by
`fill`	(character(n)) Name(s) of column(s) to fill
`method`	Either "only_na", if only missing entries should be filled, or "all", if all entries should be replaced with their group mode
`multiple`	(Default: "stop") How to handle groups that contain more than one value in a column to be filled. Either "stop" (raise an error), "mode" (select the most common value) or "ignore" (entries with multiple possible modes are set to NA).

Value

df, with NA entries filled where possible by grouping and taking the most common value within each column and group.
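The idea of "taking the most common value within each column and group" can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's code: group_mode is a hypothetical helper, ties are broken by taking the first value encountered, and only NA entries are replaced.

```r
# Hypothetical helper (not part of AbNames): most common non-NA value,
# taking the first value encountered when there is a tie.
group_mode <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) return(NA_character_)
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

df <- data.frame(A = rep(c("A", "B"), each = 3),
                 B = c(NA, "C", "D", "E", NA, "E"))

# One mode per group, looked up by group name for every row
modes <- vapply(split(df$B, df$A), group_mode, character(1))

# Replace only the missing entries with their group's mode
df$B_filled <- ifelse(is.na(df$B), modes[df$A], df$B)
```

Here group "A" has two equally common values ("C" and "D"), so the sketch fills its missing entry with "C", the first one seen; group "B" has the single value "E".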

Author(s)

Helen Lindsay

Examples


df <- data.frame(A = rep(c("A", "B"), each = 3),
                 B = c(NA, "C", "D", "E", NA, "E"))
# Setting multiple = "mode" means that the most common value will be
# used for filling, or the first if there are ties.
fillByGroup(df, group = "A", fill = "B", multiple = "mode")

# Setting multiple = "ignore" means that groups with multiple values
# will not be filled.
fillByGroup(df, group = "A", fill = "B", multiple = "ignore")

HelenLindsay/AbNames documentation built on June 6, 2023, 1:18 p.m.