at: Scoped Helper Verbs

center_atR Documentation

Scoped Helper Verbs

Description

Scoped helper verbs included in this R Documentation file allow for targeted commands on specified columns. They also rename the ensuing output to conform to my preferred style. The commands here are multiple and explained in the details section below.

Usage

center_at(data, x, prefix = "c", na = TRUE, .by = NULL)

diff_at(data, x, o = 1, prefix = "d", .by = NULL)

group_mean_center_at(
  data,
  x,
  mean_prefix = "mean",
  prefix = "b",
  na = TRUE,
  .by
)

lag_at(data, x, prefix = "l", o = 1, .by = NULL)

log_at(data, x, prefix = "ln", plus_1 = FALSE)

mean_at(data, x, prefix = "mean", na = TRUE, .by = NULL)

r1sd_at(data, x, prefix = "s", na = TRUE, .by = NULL)

r2sd_at(data, x, prefix = "z", na = TRUE, .by = NULL)

Arguments

data

a data frame

x

a vector, likely in your data frame

prefix

Allows the user to rename the prefix of the new variables. Each function has defaults (see details section).

na

a logical about whether missing values should be ignored in the creation of means and re-scaled variables. Defaults to TRUE (i.e. pass over/remove missing observations). Not applicable to diff_at, lag_at, and log_at.

.by

a selection of columns by which to group the operation. Defaults to NULL. This will eventually become a standard feature of the functions as this operator moves beyond the experimental in dplyr. The argument is not applicable to log_at (why would it be) and is optional for all functions except group_mean_center_at. group_mean_center_at must have something specified for grouped mean-centering.

o

The order of lags for calculating differences or lags in diff_at or lag_at. Applicable only to these functions.

mean_prefix

Applicable only to group_mean_center_at. Specifies the prefix of the (assumed) total population mean variables. Default is "mean", though the user can change this as they see fit.

plus_1

Applicable only to log_at. If TRUE, adds 1 to the variables prior to log transformation. If FALSE, performs logarithmic transformation on variables no matter whether 0 occurs (i.e. 0s will come back as -Inf). Defaults to FALSE.

Details

center_at is a wrapper for mutate_at and rename_at from dplyr. It takes supplied vectors and effectively centers them from the mean. It then renames these new variables to have a prefix of c_. The default prefix ("c") can be changed by way of an argument in the function.

diff_at is a wrapper for mutate and across from dplyr. It takes supplied vectors and creates differences from the previous value recorded above it. It then renames these new variables to have a prefix of d_ (in the case of a first difference), or something like d2_ in the case of second differences, or d3_ in the case of third differences (and so on). The exact prefix depends on the o argument, which communicates the order of lags you want. It defaults to 1. The default prefix ("d") can be changed by way of an argument in the function, though the naming convention will omit a numerical prefix for first differences.

group_mean_center_at is a wrapper for mutate and across in dplyr. It takes supplied vectors and centers an (assumed) group mean of the variables from an (assumed) total population mean of the variables provided to it. It then returns the new variables with a prefix, whose default is b_. This prefix communicates, if you will, a kind of "between" variable in the panel model context, in juxtaposition to "within" variables in the panel model context.

lag_at is a wrapper for mutate and across from dplyr. It takes supplied vector(s) and creates lag variables from them. These new variables have a prefix of l[o]_ where o corresponds to the order of the lag (specified by an argument in the function, which defaults to 1). This default prefix ("l") can be changed by way of an another argument in the function.

log_at is a wrapper for mutate and across from dplyr. It takes supplied vectors and creates a variable that takes a natural logarithmic transformation of them. It then renames these new variables to have a prefix of ln_. This default prefix ("ln") can be changed by way of an argument in the function. Users can optionally specify that they want to add 1 to the vector before taking its natural logarithm, which is a popular thing to do when positive reals have naturally occurring zeroes.

mean_at is a wrapper for mutate and across from dplyr. It takes supplied vectors and creates a variable communicating the mean of the variable. It then renames these new variables to have a prefix of mean_. This default prefix ("mean") can be changed by way of an argument in the function.

r1sd_at is a wrapper for mutate and across from dplyr. It both rescales the supplied vectors to new vectors and renames the vectors to each have a prefix of s_. Note the rescaling here is just by one standard deviation and not two. The default prefix ("s") can be changed by way of an argument in the function.

r2sd_at is a wrapper for mutate and across from dplyr. It both rescales the supplied vectors to new vectors and renames the vectors to each have a prefix of z_. Note the rescaling here is by two standard deviations and not one. The default prefix ("z") can be changed by way of an argument in the function.

All functions, except for lag_at, will fail in the absence of a character vector of a length of one. They are intended to work across multiple columns instead of just one. If you are wanting to create one new variable, you should think about using some other dplyr verb on its own.

Value

The function returns a set of new vectors in a data frame after performing relevant functions. The new vectors have distinct prefixes corresponding with the action performed on them.

Examples


set.seed(8675309)
Example <- data.frame(category = c(rep("A", 5),
                                   rep("B", 5),
                                   rep("C", 5)),
                      x = runif(15), y = runif(15),
                      z = sample(1:20, 15, replace=TRUE))

my_vars <- c("x", "y", "z")
center_at(Example, my_vars)

diff_at(Example, my_vars)

diff_at(Example, my_vars, o=3)

lag_at(Example, my_vars)

lag_at(Example, my_vars, o=3)

log_at(Example, my_vars)

log_at(Example, my_vars, plus_1 = TRUE)

mean_at(Example, my_vars)

r1sd_at(Example, my_vars)

r2sd_at(Example, my_vars)

stevemisc documentation built on Sept. 11, 2024, 5:23 p.m.