find_starts: Find start positions of groups in data

find_starts R Documentation

Find start positions of groups in data

Description

Lifecycle: maturing

Finds values or indices of values that are not the same as the previous value.

Useful, for example, for creating the start values or indices passed to group() with the "l_starts" method.

Wraps differs_from_previous().
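
The idea of "values that are not the same as the previous value" can be sketched in a few lines of base R. This is only an illustration under simplifying assumptions (a plain vector without NAs), not the package's implementation; `starts_sketch` is a hypothetical helper name.

```r
# Illustrative base-R sketch; NOT the groupdata2 implementation.
# `starts_sketch` is a hypothetical name. Assumes `x` contains no NAs.
starts_sketch <- function(x, return_index = FALSE) {
  if (length(x) == 0) {
    return(if (return_index) integer(0) else x)
  }
  # TRUE where an element differs from the one before it;
  # the first element always counts as a start
  is_start <- c(TRUE, x[-1] != x[-length(x)])
  if (return_index) which(is_start) else x[is_start]
}

x <- c("a", "a", "b", "b", "c", "c")
starts_sketch(x)                       # "a" "b" "c"
starts_sketch(x, return_index = TRUE)  # 1 3 5
```

The same comparison works for numeric vectors; NA handling is left out here on purpose.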

Usage

find_starts(
  data,
  col = NULL,
  return_index = FALSE,
  handle_na = "ignore",
  factor_conversion_warning = TRUE
)

Arguments

data

data.frame or vector.

N.B. If checking a factor, it is converted to a character vector. Conversion will generate a warning, which can be turned off by setting `factor_conversion_warning` to FALSE.

N.B. If `data` is a grouped data.frame, the function is applied group-wise and the output is a list of vectors. The names are based on the group indices (see dplyr::group_indices()).

col

Name of column to find starts in. Used when `data` is a data.frame. (Character)

return_index

Whether to return indices of starts. (Logical)

handle_na

How to handle NAs in the column.

"ignore"

Removes the NAs before finding the differing values, ensuring that the first value after an NA will be correctly identified as new, if it differs from the value before the NA(s).

"as_element"

Treats all NAs as the string "NA". This means that `threshold` (an argument of the wrapped differs_from_previous()) must be NULL when using this method.

Numeric scalar

A numeric value to replace NAs with.
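
As a rough illustration of how the three `handle_na` strategies differ, the following base-R sketch applies each described preprocessing step by hand and then picks the values that differ from their predecessor. The helper `is_start` and the replacement value -1 are assumptions made for the example; the sketch mirrors the descriptions above and is not the package's own code.

```r
# Illustrative base-R sketch of the three strategies described above;
# NOT the groupdata2 implementation. `is_start` is a hypothetical helper.
is_start <- function(v) c(TRUE, v[-1] != v[-length(v)])

x <- c(1, 1, NA, 1, 2, NA, 2)

# "ignore": drop the NAs, then look for changes
x_ignore <- x[!is.na(x)]
starts_ignore <- x_ignore[is_start(x_ignore)]
starts_ignore  # 1 2

# "as_element": treat every NA as the string "NA"
x_elem <- ifelse(is.na(x), "NA", as.character(x))
starts_elem <- x_elem[is_start(x_elem)]
starts_elem    # "1" "NA" "1" "2" "NA" "2"

# Numeric scalar: replace every NA with a number (here -1, chosen arbitrarily)
x_num <- replace(x, is.na(x), -1)
starts_num <- x_num[is_start(x_num)]
starts_num     # 1 -1 1 2 -1 2
```

With "ignore", the NA between two equal values does not create a new start, whereas the other two strategies treat the NA (or its replacement) as a value of its own.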

factor_conversion_warning

Whether to throw a warning when converting a factor to a character vector. (Logical)

Value

A vector with either the start values or the indices of the start values.

N.B. If `data` is a grouped data.frame, the output is a list of vectors. The names are based on the group indices (see dplyr::group_indices()).
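
The shape of the group-wise output can be approximated in base R. The sketch below assumes integer group indices built with factor() as a stand-in for dplyr::group_indices(); the helper `starts_of` and the column `g` are hypothetical, and the exact names produced by the package may differ.

```r
# Illustrative base-R sketch of a group-wise result;
# NOT the groupdata2 implementation. `starts_of` is a hypothetical helper.
starts_of <- function(v) v[c(TRUE, v[-1] != v[-length(v)])]

df_grouped <- data.frame(
  g = c("x", "x", "x", "y", "y", "y"),
  a = c("a", "a", "b", "b", "c", "c"),
  stringsAsFactors = FALSE
)

# Integer group indices, standing in for dplyr::group_indices()
grp_idx <- as.integer(factor(df_grouped$g))

# One vector of start values per group, named by group index
starts_by_group <- lapply(split(df_grouped$a, grp_idx), starts_of)
starts_by_group
# $`1`: "a" "b"
# $`2`: "b" "c"
```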

Author(s)

Ludvig Renbo Olsen, r-pkgs@ludvigolsen.dk

See Also

Other l_starts tools: differs_from_previous(), find_missing_starts(), group(), group_factor()

Examples

# Attach packages
library(groupdata2)

# Create a data frame
df <- data.frame(
  "a" = c("a", "a", "b", "b", "c", "c"),
  stringsAsFactors = FALSE
)

# Get start values for new groups in column 'a'
find_starts(df, col = "a")

# Get indices of start values for new groups
# in column 'a'
find_starts(df,
  col = "a",
  return_index = TRUE
)

## Use found starts with l_starts method
# Notice: This is equivalent to n = 'auto'
# with l_starts method

# Get start values for new groups in column 'a'
starts <- find_starts(df, col = "a")

# Use starts in group() with 'l_starts' method
group(df,
  n = starts, method = "l_starts",
  starts_col = "a"
)

# Similar but with indices instead of values

# Get indices of start values for new groups
# in column 'a'
starts_ind <- find_starts(df,
  col = "a",
  return_index = TRUE
)

# Use starts in group() with 'l_starts' method
group(df,
  n = starts_ind, method = "l_starts",
  starts_col = "index"
)

LudvigOlsen/R-splitters documentation built on March 7, 2024, 6:59 p.m.