dicho | R Documentation |
Dichotomizes variables into dummy variables (0/1). Dichotomization is
either done by median, mean or a specific value (see dich.by).
dicho_if() is a scoped variant of dicho(), where recoding
will be applied only to those variables that match the logical condition
of predicate.
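As a rough illustration of the split rule, the following base-R sketch mimics what a median, mean or fixed-value split amounts to. The helper name dicho_sketch is hypothetical and not part of the package. The convention that values at or below the cutoff become 0 is inferred from the example comment further down (dich.by = 2 maps 1 to 2 onto 0); the handling of missing values and the plain integer return type are assumptions made for simplicity.

```r
# Hypothetical helper for illustration only; NOT the package's implementation.
dicho_sketch <- function(x, dich.by = "median") {
  cutoff <- if (is.numeric(dich.by)) {
    # a specific value was given as split criterion
    dich.by
  } else {
    switch(
      match.arg(dich.by, c("median", "mean")),
      median = stats::median(x, na.rm = TRUE),
      mean = mean(x, na.rm = TRUE)
    )
  }
  # values at or below the cutoff become 0, values above it become 1;
  # missing values stay missing (assumption)
  as.integer(x > cutoff)
}

x <- c(1, 2, 3, 4, 10)
dicho_sketch(x)                    # split at the median (3)
dicho_sketch(x, dich.by = "mean")  # split at the mean (4)
dicho_sketch(x, dich.by = 2)       # split at the fixed value 2
```

The sketch only covers plain numeric vectors; labels, factors and data frames are left out on purpose.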
dicho(
  x,
  ...,
  dich.by = "median",
  as.num = FALSE,
  var.label = NULL,
  val.labels = NULL,
  append = TRUE,
  suffix = "_d"
)
dicho_if(
  x,
  predicate,
  dich.by = "median",
  as.num = FALSE,
  var.label = NULL,
  val.labels = NULL,
  append = TRUE,
  suffix = "_d"
)
x | A vector or data frame. |
... | Optional, unquoted names of variables that should be selected for further processing. Required, if |
dich.by | Indicates the split criterion where a variable is dichotomized. Must be one of the following values (may be abbreviated): |
as.num | Logical, if |
var.label | Optional string, to set variable label attribute for the returned variable (see vignette Labelled Data and the sjlabelled-Package). If |
val.labels | Optional character vector (of length two), to set value label attributes of dichotomized variable (see |
append | Logical, if |
suffix | Indicates which suffix will be added to each dummy variable. Use |
predicate | A predicate function to be applied to the columns. The variables for which |
dicho() also works on grouped data frames (see group_by).
In this case, dichotomization is applied to the subsets of variables
in x. See 'Examples'.
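To make the effect of grouping concrete, here is a base-R sketch (not the package's code) that splits mtcars$disp once at the overall median and once at the median of each cyl group. The variable names group_median, disp_grouped and disp_overall are made up for this illustration, and the rule "above the cutoff becomes 1" is an assumption carried over from the examples.

```r
# Base-R sketch of a per-group median split; NOT the package's implementation.
# ave() returns, for every row, the median of disp within that row's cyl group.
group_median <- ave(mtcars$disp, mtcars$cyl, FUN = median)

# split at the group-specific median versus the overall median
disp_grouped <- as.integer(mtcars$disp > group_median)
disp_overall <- as.integer(mtcars$disp > median(mtcars$disp))

# cross-tabulate both codings to see where they disagree
table(overall = disp_overall, grouped = disp_grouped)
```

Rows in the off-diagonal cells of the table are cars whose coding changes once the split value is computed per group.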
x, dichotomized. If x is a data frame, for append = TRUE, x including
the dichotomized variables as new columns is returned; if
append = FALSE, only the dichotomized variables will be returned.
If append = TRUE and suffix = "", recoded variables will replace
(overwrite) existing variables.
Variable label attributes are preserved (unless changed via the
var.label argument).
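The interplay of append and suffix can be pictured with a small base-R sketch. The helper append_sketch is hypothetical and only imitates the behavior described above for a median split; in particular, keeping the suffix in the column names when append = FALSE is an assumption of this sketch, not a documented guarantee.

```r
# Hypothetical helper illustrating append/suffix; NOT the package's code.
append_sketch <- function(data, cols, append = TRUE, suffix = "_d") {
  # median split of each selected column (values above the median become 1)
  recoded <- lapply(data[cols], function(v) {
    as.integer(v > stats::median(v, na.rm = TRUE))
  })
  recoded <- as.data.frame(recoded)
  names(recoded) <- paste0(cols, suffix)
  # append = FALSE: return only the recoded columns
  if (!append) return(recoded)
  # append = TRUE: add the recoded columns to the data; with suffix = ""
  # the names coincide, so the original columns are overwritten
  data[names(recoded)] <- recoded
  data
}

df <- data.frame(a = c(1, 2, 3, 4), b = c(10, 20, 30, 40))
names(append_sketch(df, "a"))                 # original columns plus "a_d"
ncol(append_sketch(df, "a", append = FALSE))  # only the recoded column
append_sketch(df, "a", suffix = "")$a         # column a replaced in place
```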
library(sjmisc)
data(efc)
summary(efc$c12hour)
# split at median
table(dicho(efc$c12hour))
# split at mean
table(dicho(efc$c12hour, dich.by = "mean"))
# split at a specific value: lowest value to 30 = 0, above 30 = 1
table(dicho(efc$c12hour, dich.by = 30))
# sample data frame, values from 1-4
head(efc[, 6:10])
# dichotomized values (1 to 2 = 0, 3 to 4 = 1)
library(dplyr)
efc %>%
  select(6:10) %>%
  dicho(dich.by = 2) %>%
  head()
# dichotomize several variables in a data frame
dicho(efc, c12hour, e17age, c160age, append = FALSE)
# dichotomize and set labels
frq(dicho(
  efc, e42dep,
  var.label = "Dependency (dichotomized)",
  val.labels = c("lower", "higher"),
  append = FALSE
))
# also works with grouped data frames
mtcars %>%
  dicho(disp, append = FALSE) %>%
  table()
mtcars %>%
  group_by(cyl) %>%
  dicho(disp, append = FALSE) %>%
  table()
# dichotomizing grouped data frames leads to different
# results for a dichotomized variable, because the split
# value is different for each group.
# compare:
mtcars %>%
  group_by(cyl) %>%
  summarise(median = median(disp))
median(mtcars$disp)
# dichotomize only variables with more than 10 unique values
p <- function(x) dplyr::n_distinct(x) > 10
dicho_if(efc, predicate = p, append = FALSE)
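For readers who want to see what a predicate-based selection does, the following base-R sketch applies the same kind of predicate to the built-in mtcars data. It is not the package's code: the names selected and recoded are made up, and the median split with "above the cutoff becomes 1" is an assumption used only for illustration.

```r
# Base-R sketch of predicate-based column selection; NOT the package's code.
# predicate: TRUE for columns with more than 10 unique values
p <- function(x) length(unique(x)) > 10

# apply the predicate to every column and keep the names of the matches
selected <- names(mtcars)[vapply(mtcars, p, logical(1))]
selected

# dichotomize only the selected columns at their median
recoded <- as.data.frame(lapply(mtcars[selected], function(v) {
  as.integer(v > median(v))
}))
head(recoded)
```

Columns with few distinct values, such as cyl or gear, fail the predicate and are left out of the result.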