View source: R/data_management.R
prepare_data | R Documentation |
This function prepares choice data for estimation.
prepare_data(
  form,
  choice_data,
  re = NULL,
  alternatives = NULL,
  ordered = FALSE,
  ranked = FALSE,
  base = NULL,
  id = "id",
  idc = NULL,
  standardize = NULL,
  impute = "complete_cases"
)
form |
A
Multiple covariates (of one type) are separated by a + sign.
In the ordered probit model ( |
choice_data |
A |
re |
A character (vector) of covariates of |
alternatives |
A character vector with the names of the choice alternatives.
If not specified, the choice set is defined by the observed choices.
If |
ordered |
A boolean, |
ranked |
TBA |
base |
A character, the name of the base alternative for covariates that are not
alternative specific (i.e. type 2 covariates and ASCs). Ignored and set to
|
id |
A character, the name of the column in |
idc |
A character, the name of the column in |
standardize |
A character vector of names of covariates that get standardized.
Covariates of type 1 or 3 have to be addressed by
|
impute |
A character that specifies how to handle missing covariate entries in
|
Requirements for the data.frame choice_data:
It must contain a column named id, which contains a unique identifier for each decision maker.
It can contain a column named idc, which contains a unique identifier for each choice situation of each decision maker. If this information is missing, these identifiers are generated automatically from the order in which the choices appear in the data set.
It can contain a column named choice with the observed choices, where choice must match the name of the dependent variable in form. Such a column is required for model fitting but not for prediction.
It must contain a numeric column named p_j for each alternative-specific covariate p in form and each choice alternative j in alternatives.
It must contain a numeric column named q for each covariate q in form that is constant across alternatives.
In the ranked case (ranked = TRUE), the column choice must contain the full ranking of the alternatives in each choice occasion as a character string, where the alternatives are separated by commas; see the examples.
See the vignette on choice data for more details.
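The column requirements above can be illustrated with a small base R sketch. All data values, the covariate names price and income, and the alternatives "A" and "B" are hypothetical; the numbering of choice situations via ave() only mimics the "order of appearance" rule described above and is an assumption, not necessarily how the package generates these identifiers internally.

```r
# Hypothetical wide-format choice data: two deciders, alternatives "A" and "B".
choice_data <- data.frame(
  id      = c(1, 1, 2, 2, 2),            # decision maker identifier
  choice  = c("A", "B", "B", "A", "A"),  # observed choices
  price_A = c(10, 12, 9, 11, 10),        # alternative-specific covariate p = price, j = A
  price_B = c(8, 13, 10, 9, 12),         # alternative-specific covariate p = price, j = B
  income  = c(50, 50, 70, 70, 70)        # covariate q that is constant across alternatives
)

# Number the choice situations of each decider by order of appearance
# (assumed to correspond to what happens when no idc column is supplied).
choice_data$idc <- ave(seq_along(choice_data$id), choice_data$id, FUN = seq_along)

# Build the expected column names p_j for every alternative-specific
# covariate p and every alternative j.
alternatives <- c("A", "B")
alt_specific <- "price"
required <- as.vector(outer(alt_specific, alternatives, paste, sep = "_"))

# Check that all required columns exist and that the covariates are numeric.
missing_cols <- setdiff(c("id", required, "income"), names(choice_data))
all_numeric <- all(vapply(choice_data[c(required, "income")], is.numeric, logical(1)))
```

If missing_cols is empty and all_numeric is TRUE, the data frame has the layout described above for a formula with price as an alternative-specific covariate and income as a covariate that is constant across alternatives.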
An object of class RprobitB_data.
check_form() for checking the model formula
overview_effects() for an overview of the model effects
create_lagged_cov() for creating lagged covariates
as_cov_names() for re-labeling alternative-specific covariates
simulate_choices() for simulating choice data
train_test() for splitting choice data into a train and test subset
data <- prepare_data(
  form = choice ~ price + time + comfort + change | 0,
  choice_data = train_choice,
  re = c("price", "time"),
  id = "deciderID",
  idc = "occasionID",
  standardize = c("price", "time")
)

### ranked case
choice_data <- data.frame(
  "id" = 1:3, "choice" = c("A,B,C", "A,C,B", "B,C,A"), "cov" = 1
)
data <- prepare_data(
  form = choice ~ 0 | cov + 0,
  choice_data = choice_data,
  ranked = TRUE
)
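When rankings are stored as one rank column per alternative, the comma-separated choice strings used in the ranked example can be assembled in base R. The following is a minimal sketch under that assumption; the ranks data frame and its layout are hypothetical and not part of the package.

```r
# Hypothetical input: one row per choice occasion, one column per alternative,
# entries are ranks (1 = most preferred).
ranks <- data.frame(A = c(1, 1, 3), B = c(2, 3, 1), C = c(3, 2, 2))

# Order the alternative names by rank within each row and join them with commas.
ranking <- unname(apply(
  ranks, 1, function(r) paste(names(r)[order(r)], collapse = ",")
))

# Same layout as the choice_data used in the ranked example:
# the choice column becomes "A,B,C", "A,C,B", "B,C,A".
choice_data <- data.frame(id = 1:3, choice = ranking, cov = 1)
```

The resulting choice_data can then be passed to prepare_data() with ranked = TRUE, as in the example.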