pivot_longer_multicol: pivot_longer_multicol

View source: R/pivot_longer_multicol.R

pivot_longer_multicol    R Documentation

pivot_longer_multicol

Description

This function chains two calls to pivot_longer with one call to pivot_wider to pivot several sets of columns from wide to long format at once. The columns to be pivoted must all have a "tag" in the original column name (e.g., "sch_1" and "sch_2" carry the tags "_1" and "_2", indicating that data from two schools is present in the dataset). The names must all share some common formatting/symbols before the tag so that all variables can be manipulated at the same time using this function.
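The general long-then-wide idea can be sketched with tidyr directly. This is only an illustrative sketch, not the package's implementation: it uses a single pivot_longer step rather than two, and the data set and the names `wide`, `long`, `result`, `sch`, and `grd` are made up for the example.

```r
library(tidyr)

# Hypothetical wide data: two variables (sch, grd) measured at two time points
wide <- tibble::tibble(
  id    = 1:2,
  sch_1 = c(10, 20),
  sch_2 = c(11, 21),
  grd_1 = c(3, 4),
  grd_2 = c(4, 5)
)

# Step 1: stack every tagged column, splitting each name into
# a variable part and a tag part with the default regular expression
long <- pivot_longer(
  wide,
  cols          = matches("_"),
  names_to      = c("variable", "time"),
  names_pattern = "(.+)_(.+)",
  values_to     = "value"
)

# Step 2: spread the variable part back out, one column per variable
result <- pivot_wider(long, names_from = variable, values_from = value)

result
```

Here `result` has one row per combination of `id` and `time`, with one column each for `sch` and `grd`.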

Usage

pivot_longer_multicol(
  .dat,
  .cols,
  .tag_group = 2,
  .capture_groups = "(.+)_(.+)",
  .tag_name = "time"
)

Arguments

.dat

A dataframe containing the data to be pivoted.

.cols

Either a character vector listing the names of the columns to be pivoted or a selection helper such as starts_with or matches.

.tag_group

Numeric scalar. Indicates the capture group (as defined by the regular expression in .capture_groups) that contains the commonly formatted "tag". Defaults to 2 (e.g., "sch_1" contains the tag in the second group, while "t1_sch" contains a tag in the first). The tags will eventually be passed to a variable (named by .tag_name), so it is helpful if the tags are meaningful. For example, for variables "sch_1" and "sch_2", a new variable will be created that will take values of "1" and "2".

.capture_groups

A string containing a regular expression defining two capture groups, one capturing the name of the variable to be saved and one capturing the commonly formatted "tag" of the variables. Defaults to "(.+)_(.+)", which splits the variable names at the last "_" character present in the variable name (the final "_" is removed).
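To see how the default pattern splits names, the two capture groups can be extracted with base R's sub(). This is a small sketch for illustration; the column names below are hypothetical and the function itself may apply the pattern differently.

```r
pattern   <- "(.+)_(.+)"
col_names <- c("sch_1", "sch_id_2", "t1_sch")

# First capture group: everything before the last underscore
group_1 <- sub(pattern, "\\1", col_names)

# Second capture group: everything after the last underscore
group_2 <- sub(pattern, "\\2", col_names)

data.frame(col_names, group_1, group_2)
```

Because the first `(.+)` is greedy, "sch_id_2" is split into "sch_id" and "2". For a name such as "t1_sch" the tag sits in the first group, which is the situation .tag_group = 1 is meant for.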

.tag_name

String. Indicates the variable name that should be given to the new variable containing the variable tags. Because this function was created to manage the longitudinal school mobility process, the argument defaults to "time".

Value

This function returns a dataframe in which the columns in .cols are pivoted into a single column per unique value of the non-tag capture group, with one row per unique tag value for each row of the original data. NOTE: If the variables to be pivoted have different numbers of tags (e.g., "sch_1" and "sch_2" have two tags, while "id_1", "id_2", "id_3", and "id_4" have four), the variable with fewer tags will have missing values for the tags it lacks.
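The shape described above, including the missing values for variables with fewer tags, can be reproduced with tidyr's built-in ".value" mechanism. This is a stand-in sketch under the assumption that pivot_longer_multicol yields an equivalent layout; it does not call the package function, and the data are invented.

```r
library(tidyr)

# Hypothetical data: z has two tags (1, 2), g has four tags (1 to 4)
wide <- tibble::tibble(
  id  = 1:3,
  z_1 = c(1, 0, 1),
  z_2 = c(5, 6, 7),
  g_1 = c(0.1, 0.2, 0.3),
  g_2 = c(1.1, 1.2, 1.3),
  g_3 = c(2.1, 2.2, 2.3),
  g_4 = c(3.1, 3.2, 3.3)
)

# ".value" tells tidyr to use the first capture group as the output
# column name and the second capture group as the value of "time"
long <- pivot_longer(
  wide,
  cols          = matches("_"),
  names_to      = c(".value", "time"),
  names_pattern = "(.+)_(.+)"
)

dim(long)                               # rows = 3 original rows x 4 tags
long[long$time %in% c("3", "4"), "z"]   # z is missing for tags 3 and 4
```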

Examples

## Not run: 

# prepare some data (tibble() comes from the tibble package)

library(tibble)

temp_dat <- tibble(
  x = rnorm(100),
  y = rnorm(100),
  z_1 = rep(c(1, 0), 50),
  z_2 = rep(c(5, 6, 7, 8), 25),
  g_1 = rnorm(100),
  g_2 = rnorm(100),
  g_3 = rnorm(100),
  g_4 = rnorm(100)
)

# pivot every column whose name contains "_" (z_1, z_2, g_1, ..., g_4)

pivot_longer_multicol(
  .dat = temp_dat,
  .cols = tidyr::matches("_")
)


## End(Not run)

tessaleejohnson/corclus documentation built on Oct. 11, 2022, 3:46 a.m.