fillByGroup: Fill NAs in data.frame by grouping

fillByGroup    R Documentation

Description

Wrapper around tidyr::fill that checks for NAs in the grouping values and can optionally fill with the majority value when a group contains more than one value. If an error is raised, an example of a group that causes the error is returned.
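
For orientation, the underlying grouped fill can be sketched directly with dplyr and tidyr. This is only an illustration of the tidyr::fill behaviour that is being wrapped, not the implementation of fillByGroup; the choice of .direction = "downup" is an assumption made for the sketch, and none of the NA checks or multiple-value handling described here are performed.

```r
library(dplyr)
library(tidyr)

df <- data.frame(A = rep(c("A", "B"), each = 3),
                 B = c(NA, "C", "D", "E", NA, "E"))

# Group by A, then fill missing values of B within each group.
# "downup" (an assumption for this sketch) fills downwards first,
# then upwards, so a leading NA takes the next value in its group.
grouped <- group_by(df, A)
filled <- ungroup(fill(grouped, B, .direction = "downup"))

filled$B
# "C" "C" "D" "E" "E" "E"
```

Note that plain tidyr::fill takes whichever neighbouring value comes first, with no check that a group holds a single value. That is the situation the multiple argument is meant to control.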

Usage

fillByGroup(
  df,
  group,
  fill,
  method = c("only_na", "all"),
  multiple = c("stop", "mode", "ignore")
)

Arguments

df

A data.frame or tibble with missing (NA) values to be filled

group

(character(n)) Names of column(s) to group by

fill

(character(n)) Name(s) of column(s) to fill

method

Either "only_na", if only missing entries should be filled, or "all", if all entries should be replaced with their group mode

multiple

(Default: "stop") How should groups with more than one distinct value in a column to be filled be handled? Either "stop" (raise an error), "mode" (select the most common value) or "ignore" (entries with multiple possible modes are set to NA).

Value

df, with NA entries filled where possible by grouping and taking the most common value within each column and group.
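
The idea of "taking the most common value within each column and group" can be sketched in base R as follows. This is a rough illustration only, not the package's code: first_mode is a hypothetical helper, and breaking ties by taking the first value seen is an assumption based on the description of multiple = "mode" in the Examples.

```r
df <- data.frame(A = rep(c("A", "B"), each = 3),
                 B = c(NA, "C", "D", "E", NA, "E"))

# Hypothetical helper: most common non-NA value, first seen wins ties
first_mode <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) return(NA)
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Split B by group, replace only the NA entries with the group mode,
# then put the pieces back in the original row order
pieces <- lapply(split(df$B, df$A), function(x) {
  x[is.na(x)] <- first_mode(x)
  x
})
df$B_filled <- unsplit(pieces, df$A)

df$B_filled
# "C" "C" "D" "E" "E" "E"
```

Only the NA entries are replaced here, which corresponds roughly to method = "only_na"; the existing "D" in group "A" is left unchanged.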

Author(s)

Helen Lindsay

Examples


df <- data.frame(A = rep(c("A", "B"), each = 3),
                B = c(NA, "C", "D", "E", NA, "E"))
# Setting multiple = "mode" means that the most common value will be
# used for filling, or the first if there are ties.
fillByGroup(df, group = "A", fill = "B", multiple = "mode")

# Setting multiple = "ignore" means that groups with multiple values
# will not be filled.
fillByGroup(df, group = "A", fill = "B", multiple = "ignore")

HelenLindsay/AbNames documentation built on June 6, 2023, 1:18 p.m.