The package RUMdesignSimulator provides convenient tools for generating synthetic data for decision theory.
First, alternatives, decision makers and preference coefficients are generated. Then, experimental designs are produced in long or wide format. In addition, the effect of each variable can be visualized in a 3D graph.
devtools::install_github("AntoineDubois/RUMdesignSimulator")
This R Markdown document presents the package RUMdesignSimulator. The package generates experimental designs from real data or from probability distributions. First of all, we need to install and load the package. Install the package devtools beforehand if necessary.
knitr::opts_chunk$set(echo = TRUE)
#devtools::install_github("AntoineDubois/RUMdesignSimulator")
library(RUMdesignSimulator)
Now, we can define the setup of the experiment: the names of the alternatives and their attributes, as well as the number of decision makers and their characteristics.
DM_att_names <- list("X1", "X2", "X3") # the list of the decision makers' characteristics
AT_names <- list("good1", "good2", "good3", "good4") # the list of the alternatives' names
AT_att_names <- list("Z1", "Z2", "Z3") # the list of the alternatives' attributes
groups <- c(10, 20) # the groups of decision makers
Then, we initialize an instance of the class Experiment. Note that Experiment is a \textit{reference class}, the most flexible type of class built into R.
FD <- Experiment(DM_att_names=DM_att_names, AT_att_names=AT_att_names, AT_names=AT_names, groups=groups, no_choice=TRUE) # creation of an instance of the class Experiment
Since FD is an instance of the class Experiment, we can use its methods to generate the decision makers' characteristics from distributions or from data. To see which distributions are implemented in this package, we call the function \textbf{information}.
information()
We have chosen the distributions below for generating the decision makers' characteristics.
# the characteristics of X1 are drawn from a data set
FD$gen_DM_attributes("empirical", data = data.frame(X1=c(0.5, 0, 12, 6, 7.3)), which = "X1")
# the characteristics X2 and X3 follow a standardized normal distribution within group 1
FD$gen_DM_attributes("normal", which=c("X2", "X3"), group=1)
# the characteristics X2 and X3 follow a normal distribution with mean 1 and standard deviation 2 within group 2
FD$gen_DM_attributes("normal", mu=1, sd=2, which=c("X2","X3"), group=2)
FD$X
In addition, we can observe cross effects between the decision makers' characteristics.
FD$gen_DM_attributes(observation=~X1+X2+X3+I(X1*X2))
FD$X
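The observation formula above resembles an ordinary R model formula. As a minimal sketch of what a cross effect such as I(X1*X2) produces, the snippet below expands a similar formula with base R's model.matrix on a toy data frame. The values are hypothetical, and it is only an assumption that the package builds its observed characteristics in a comparable way.

```r
# Toy characteristics (hypothetical values, for illustration only)
X <- data.frame(X1 = c(0.5, 0, 12),
                X2 = c(1, -1, 2),
                X3 = c(0.3, 0.1, -0.4))

# Expand the formula without an intercept:
# three main effects plus the product X1 * X2 as a fourth column
obs <- model.matrix(~ X1 + X2 + X3 + I(X1 * X2) - 1, data = X)

colnames(obs)
obs
```

The fourth column holds the element-wise product of X1 and X2, which is what a cross effect adds to the observed variables.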
Similarly, we generate alternatives' attributes
# generation of a random covariance matrix of size 3
sigma <- clusterGeneration::genPositiveDefMat(3)$sigma
# all the attributes are generated by a multivariate normal distribution of mean (-1, 2, 0) and covariance matrix sigma
FD$gen_AT_attributes(mu=c(-1,2,0), sd=sigma)
# observation of complex effects between the alternatives' attributes
FD$gen_AT_attributes(observation=~Z1+Z2+Z3+I(Z1^2))
FD$Z
and decision makers' preferences
# Generation of beta, whose components follow different distributions:
# generation of the variables from 1 to 4 of the alternatives within group 1
FD$gen_preference_coefficients("student", heterogeneity=TRUE, location=-2, scale=1, df=4, which=c(1:4), group=1)
# generation of the variables from 1 to 4 of the alternatives within group 2
FD$gen_preference_coefficients("student", heterogeneity=FALSE, location=2, scale=1, df=4, which=c(1:4), group=2)
# generation of the fifth variable within every group
FD$gen_preference_coefficients("normal", heterogeneity=FALSE, mu=0, sd=2, which=5)
# correction: the variable Z2 follows a discrete uniform distribution
FD$gen_preference_coefficients("discrete_uniform", heterogeneity=TRUE, a=1, b=5, which="Z2")
# generation of the variables Z3 and I(Z1^2) according to the default distribution: the standardized normal distribution
FD$gen_preference_coefficients(heterogeneity=TRUE, which=c("Z3", "I(Z1^2)"))
FD$beta
Finally, we compute the utility that each alternative provides to each decision maker. To do so, we generate a measurement error.
# computation of the decision makers' utility according to the standardized Gumbel distribution
FD$utility()
# computation of the decision makers' utility according to the discrete uniform distribution
FD$utility("discrete_uniform")
# it is possible to have correlation between the alternatives' preferences (for both student and normal distributions)
FD$utility("normal", mu=0, sd=2)
# computation of the decision makers' utility according to a student distribution
FD$utility("student", location=0, scale=2, df=4)
Here, we take a look at:
FD$V # the representative utility
FD$Epsilon # the measurement error
FD$U # the utility of each alternative for each decision maker
FD$choice_order # the order of alternative preference for each decision maker
FD$choice # the most useful alternative for each decision maker
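To make the relationship between these quantities concrete, here is a minimal base R sketch of a random utility model: the utility is the representative utility plus a measurement error, and each decision maker chooses the alternative with the highest utility. The dimensions and values are hypothetical, and the sketch does not claim to reproduce the package's internal computation.

```r
set.seed(1)   # reproducibility of the toy example
n_dm  <- 5    # hypothetical number of decision makers
n_alt <- 3    # hypothetical number of alternatives

# Representative utility: one row per decision maker, one column per alternative
V <- matrix(rnorm(n_dm * n_alt), nrow = n_dm, ncol = n_alt)

# Standard Gumbel errors by inverse-transform sampling: -log(-log(u)), u ~ U(0, 1)
Epsilon <- matrix(-log(-log(runif(n_dm * n_alt))), nrow = n_dm, ncol = n_alt)

# Utility = representative utility + measurement error
U <- V + Epsilon

# Alternatives ranked from most to least preferred, one row per decision maker
choice_order <- t(apply(U, 1, order, decreasing = TRUE))

# Chosen alternative: the column with the highest utility in each row
choice <- max.col(U, ties.method = "first")
```

By construction, the chosen alternative is the first entry of each decision maker's preference order.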
A notable advantage of the package RUMdesignSimulator is its plot method. The method \textbf{$map(...)} returns a 3D scatter plot. On this graph, the x-axis and y-axis represent the values of two parameters (attributes or characteristics), and the z-axis represents the utility provided by the optimal alternative for each decision maker.
# Drawing a 3D preference mapping:
# map representing the choice of the decision makers and the utility provided by this choice according to the values of Z1 and Z3
FD$map("Z1", "Z3")
FD$map("X1", "Z3")
FD$map("X1", "X2")
At this point, the alternatives, decision makers, preference coefficients and associated utilities are fully set up. Consequently, we can draw experimental designs. Below, we build a full factorial experimental design in which each choice set contains 2 alternatives. Moreover, the attributes and characteristics are left untreated (clustered=0).
# Generation of designs:
# generation of the full factorial design with raw data
FFD <- FD$design(choice_set_size=2, clustered=0) # by default, name="FuFD", choice_set_size = nb_alternatives
View(FFD)
Often, the data is treated before use, and one such treatment is clustering. The number assigned to each cluster is called its \textit{level}. The clusters are formed by running the k-means algorithm. After clustering, the new value of an attribute or a characteristic is the average of its cluster.
FFD <- FD$design(name="FuFD",choice_set_size=2, clustered=1, nb_levels_DM=c(3, 3, 4, 2), nb_levels_AT=c(3, 2, 2, 4)) # generation of the full factorial design with clustered data
Alternatively, after clustering, the new value of an attribute or a characteristic may be the number of its cluster. This is done by setting \textbf{clustered=2}.
# generation of the full factorial design with categorical data
FFD1 <- FD$design(choice_set_size=2, clustered=2, nb_levels_DM=c(2, 3, 4, 2), nb_levels_AT=c(2, 2, 2, 2))
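The two treatments described above can be illustrated with base R's kmeans on a single toy attribute. This is only a sketch under the assumption that each variable is clustered on its own; the data and the number of levels are hypothetical, and the package may proceed differently internally.

```r
set.seed(42)
# Toy attribute with two well-separated groups of values (hypothetical data)
x <- c(rnorm(20, mean = 0), rnorm(20, mean = 10))

# One-dimensional k-means with 2 levels
km <- kmeans(x, centers = 2)

# Analogue of clustered=1: each value is replaced by the average of its cluster
x_mean <- as.numeric(km$centers[km$cluster])

# Analogue of clustered=2: each value is replaced by the number of its cluster
x_level <- km$cluster

head(data.frame(x, x_mean, x_level))
```

The first treatment keeps the variable numeric with only as many distinct values as there are levels, while the second turns it into a categorical code.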
Unfortunately, the number of questions asked to each decision maker is usually too large to be realistic. Consequently, only a random subset of questions can be asked to the decision makers. The result is a \textit{random fractional factorial design}. Here, the number of questions asked to each decision maker is set with \textbf{nb_questions=2}.
FFD2 <- FD$design(name="FrFD", choice_set_size=2, clustered=2, nb_levels_DM=c(2, 3, 4, 2), nb_levels_AT=c(2, 2, 2, 2), nb_questions = 2) # generation of a random fractional factorial design with categorical data
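The idea of drawing a random subset of questions can be sketched in base R as follows: enumerate every choice set of the requested size, then sample a few of them for one decision maker. The variable names mirror the arguments used above purely for readability. This illustrates the principle only, not the package's implementation, and it ignores the no-choice option.

```r
set.seed(123)
alternatives    <- c("good1", "good2", "good3", "good4")
choice_set_size <- 2
nb_questions    <- 2

# All possible choice sets of the requested size (one column per choice set)
all_sets <- combn(alternatives, choice_set_size)

# Draw, without replacement, the choice sets shown to one decision maker
picked    <- sample(ncol(all_sets), nb_questions)
questions <- t(all_sets[, picked, drop = FALSE])

questions   # one row per question, one column per alternative in the choice set
```

With 4 alternatives and choice sets of size 2, the full factorial design contains 6 choice sets per decision maker, of which only 2 are kept here.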
We may also want to express this design in wide format.
FFD3 <- FD$design(name="FrFD", choice_set_size=2, clustered=2, nb_levels_DM=c(2, 3, 4, 2), nb_levels_AT=c(2, 2, 2, 2), nb_questions = 2, format="wide")
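For readers unfamiliar with the two layouts, the sketch below converts a toy long-format table (one row per decision maker and alternative) into wide format (one row per decision maker) with base R's reshape. The column names are hypothetical and are not necessarily those returned by the design method.

```r
# Toy long-format design: one row per (decision maker, alternative) pair
long <- data.frame(
  DM_id       = rep(1:2, each = 2),
  alternative = rep(c("good1", "good2"), times = 2),
  Z1          = c(0.1, 0.4, 0.1, 0.4),
  utility     = c(1.2, 0.7, -0.3, 0.9)
)

# Wide format: one row per decision maker, one block of columns per alternative
wide <- reshape(long, idvar = "DM_id", timevar = "alternative",
                direction = "wide")

names(wide)
wide
```

In the wide table, each varying column is suffixed with the name of the alternative it refers to.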
Finally, a small summary function recalls the main elements of the experiment.
summary.Exepriment(FD) # a summary of the experimental design
Some users may need more tools than those currently available. Anticipating future needs, we
organized the R files so that only one file needs to be altered.
To add new distributions:
open the file distribution.R
add a new distribution
reference the new distribution in the function generation* and give it a relevant name for calling
To add new designs:
open the file designs.R
implement a new design
reference the new design in the function call_design* and give it a relevant name for calling
For more information, do not hesitate to contact me at antoine.dubois.fr@gmail.com