| group_var | R Documentation |
Recode numeric variables into equal-ranged, grouped factors,
i.e. a variable is cut into a smaller number of groups, where each group
has the same value range. group_labels() creates the related value
labels. group_var_if() and group_labels_if() are scoped
variants of group_var() and group_labels(), where grouping
will be applied only to those variables that match the logical condition
of predicate.
group_var(
  x,
  ...,
  size = 5,
  as.num = TRUE,
  right.interval = FALSE,
  n = 30,
  append = TRUE,
  suffix = "_gr"
)
group_var_if(
  x,
  predicate,
  size = 5,
  as.num = TRUE,
  right.interval = FALSE,
  n = 30,
  append = TRUE,
  suffix = "_gr"
)
group_labels(x, ..., size = 5, right.interval = FALSE, n = 30)
group_labels_if(x, predicate, size = 5, right.interval = FALSE, n = 30)
x |
A vector or data frame. |
... |
Optional, unquoted names of variables that should be selected for
further processing. Required, if |
size |
Numeric; group-size, i.e. the range for grouping. By default,
for each 5 categories of |
as.num |
Logical, if |
right.interval |
Logical; if |
n |
Sets the maximum number of groups that are defined when auto-grouping is on
( |
append |
Logical, if |
suffix |
Indicates which suffix will be added to each dummy variable.
Use |
predicate |
A predicate function to be applied to the columns. The
variables for which |
If size is set to a specific value, the variable is recoded
into several groups, where each group has a maximum range of size.
Hence, the number of groups differs depending on the range of x.
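The fixed-size case can be illustrated with a minimal base R sketch. It uses cut() with evenly spaced break points and is not the code group_var() runs internally; the sample vector and the way the breaks are built are assumptions made for illustration.

```r
# Base R sketch of equal-range grouping (illustration only, not sjmisc's code).
x <- c(50, 53, 57, 61, 68, 74, 80)
size <- 5  # width of each group

# Break points start at the minimum and advance in steps of `size`;
# one extra step ensures the maximum falls inside the last interval.
breaks <- seq(min(x), max(x) + size, by = size)

# Left-closed intervals [a, b): for integers 50-54, 55-59, 60-64, ...
grp <- cut(x, breaks = breaks, right = FALSE, labels = FALSE)

# The number of groups follows from the range of x and the chosen size.
n_groups <- length(breaks) - 1
print(grp)
print(n_groups)
```

Widening the range of x while keeping size fixed adds more break points and therefore more groups.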
If size = "auto", the variable is recoded into a maximum of
n groups. Hence, regardless of the range of x, the number of
groups is bounded by n, so the range within each group differs
(depending on x's range).
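One way to picture the auto-grouping idea is to derive the group width from the range of x so that no more than n groups result. The helpers auto_width() and count_groups() below are hypothetical and written for this illustration only; the formula group_var() actually uses for size = "auto" may differ.

```r
# Hypothetical helper: choose a group width so that at most `n`
# left-closed groups are needed to cover the range of x.
auto_width <- function(x, n = 30) {
  rng <- max(x, na.rm = TRUE) - min(x, na.rm = TRUE)
  ceiling((rng + 1) / n)
}

# Hypothetical helper: number of groups of a given width, starting at min(x).
count_groups <- function(x, width) {
  rng <- max(x, na.rm = TRUE) - min(x, na.rm = TRUE)
  floor(rng / width) + 1
}

narrow <- 0:90
wide <- 0:900

w_narrow <- auto_width(narrow)  # small range, small group width
w_wide <- auto_width(wide)      # large range, large group width

print(c(w_narrow, count_groups(narrow, w_narrow)))
print(c(w_wide, count_groups(wide, w_wide)))
```

In both cases the group count stays within n = 30, while the width of each group grows with the range of the input.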
right.interval determines which boundary values to include when
grouping is done. If TRUE, grouping starts with the lower
bound of size. For example, having a variable ranging from
50 to 80, groups cover the ranges from 50-54, 55-59, 60-64 etc.
If FALSE (default), grouping starts with the upper bound
of size. In this case, groups cover the ranges from
46-50, 51-55, 56-60, 61-65 etc. Note: This will cover
a range from 46-50 as first group, even if values from 46 to 49
are not present. See 'Examples'.
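The two boundary conventions can be reproduced with the right argument of base R's cut(). The sketch below only demonstrates left-closed versus right-closed intervals on made-up values; how right.interval maps onto them inside group_var() is best verified by running the examples from this page.

```r
# Values between 50 and 80, grouped with a width of 5.
x <- c(50, 54, 55, 60, 80)
size <- 5

# Left-closed intervals [50, 55), [55, 60), ...
# For integer values these are the ranges 50-54, 55-59, 60-64, ...
left_breaks <- seq(50, 85, by = size)
left <- cut(x, breaks = left_breaks, right = FALSE, labels = FALSE)

# Right-closed intervals (45, 50], (50, 55], ...
# For integer values these are the ranges 46-50, 51-55, 56-60, ...
right_breaks <- seq(45, 80, by = size)
right <- cut(x, breaks = right_breaks, right = TRUE, labels = FALSE)

print(left)
print(right)
```

Note how the value 50 is the only member of the first right-closed group (46-50), and how 54 and 55 share a group under one convention but not under the other.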
If you want to split a variable into a certain number of equal-sized
groups (instead of having groups where values all have the same
range), use the split_var function.
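The difference between equal-range and equal-sized groups can be seen with a small base R sketch on skewed data. It relies on cut() and quantile() only and does not call split_var(); the data vector is made up for illustration.

```r
# Skewed data: nine small values and one large outlier.
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)

# Equal range: two intervals of identical width across the range of x.
equal_range <- cut(x, breaks = 2, labels = FALSE)

# Equal size: break at the median so both groups hold the same number of cases.
q <- quantile(x, probs = c(0, 0.5, 1))
equal_size <- cut(x, breaks = q, include.lowest = TRUE, labels = FALSE)

print(table(equal_range))
print(table(equal_size))
```

With equal ranges almost all cases end up in the first group, whereas the quantile-based split puts five cases in each group.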
group_var() also works on grouped data frames (see group_by).
In this case, grouping is applied to the subsets of variables
in x. See 'Examples'.
For group_var(), a grouped variable, either as numeric or as factor (see argument as.num). If x is a data frame, only the grouped variables will be returned.
For group_labels(), a string vector or a list of string vectors containing labels based on the grouped categories of x, formatted as "from lower bound to upper bound", e.g. "10-19" "20-29" "30-39" etc. See 'Examples'.
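The label format can be sketched in base R. The helper make_labels() below is hypothetical and assumes integer-valued data with left-closed groups; it is not how group_labels() is implemented.

```r
# Hypothetical helper: build "lower-upper" labels from break points,
# assuming integer values and left-closed intervals [lower, next lower).
make_labels <- function(breaks) {
  lower <- breaks[-length(breaks)]  # every break except the last
  upper <- breaks[-1] - 1           # one below the next break
  paste0(lower, "-", upper)
}

labs <- make_labels(seq(10, 40, by = 10))
print(labs)
```

The resulting labels follow the "10-19" "20-29" "30-39" pattern described above.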
Variable label attributes (see, for instance,
set_label) are preserved. Usually you should use
the same values for size and right.interval in
group_labels() as used in the group_var function if you want
matching labels for the related recoded variable.
split_var to split variables into equal sized groups,
group_str for grouping string vectors or
rec_pattern and rec for another convenient
way of recoding variables into smaller groups.
library(sjmisc)
age <- abs(round(rnorm(100, 65, 20)))
age.grp <- group_var(age, size = 10)
hist(age)
hist(age.grp)
age.grpvar <- group_labels(age, size = 10)
table(age.grp)
print(age.grpvar)
# histogram with EUROFAMCARE sample dataset
# variable not grouped
library(sjlabelled)
data(efc)
hist(efc$e17age, main = get_label(efc$e17age))
# bar plot with EUROFAMCARE sample dataset
# grouped variable
ageGrp <- group_var(efc$e17age)
ageGrpLab <- group_labels(efc$e17age)
barplot(table(ageGrp), main = get_label(efc$e17age), names.arg = ageGrpLab)
# within a pipe-chain
library(dplyr)
efc %>%
  select(e17age, c12hour, c160age) %>%
  group_var(size = 20)
# create vector with values from 50 to 80
dummy <- round(runif(200, 50, 80))
# labels with grouping starting at lower bound
group_labels(dummy)
# labels with grouping starting at upper bound
group_labels(dummy, right.interval = TRUE)
# also works with grouped data frames
mtcars %>%
  group_var(disp, size = 4, append = FALSE) %>%
  table()
mtcars %>%
  group_by(cyl) %>%
  group_var(disp, size = 4, append = FALSE) %>%
  table()