CEA: Coordinate Exchange algorithm for MNL models.
In idefix: Efficient Designs for Discrete Choice Experiments

CEA	R Documentation

Description

The algorithm improves an initial start design by considering changes on an attribute-by-attribute basis. By doing this, it tries to minimize the chosen error (A(B) or D(B)-error) based on a multinomial logit model. This routine is repeated for multiple starting designs.

Usage

CEA(
  lvls,
  coding,
  c.lvls = NULL,
  n.sets,
  n.alts,
  par.draws,
  optim = "D",
  alt.cte = NULL,
  no.choice = FALSE,
  start.des = NULL,
  parallel = TRUE,
  max.iter = Inf,
  n.start = 12,
  overlap = NULL,
  n.blocks = 1,
  blocking.iter = 50,
  constraints = NULL
)

Arguments

`lvls`	A numeric vector which contains for each attribute the number of levels.
`coding`	Type of coding that needs to be used for each attribute.
`c.lvls`	A list containing numeric vectors with the attribute levels for each continuous attribute. The default is `NULL`.
`n.sets`	Numeric value indicating the number of choice sets.
`n.alts`	Numeric value indicating the number of alternatives per choice set.
`par.draws`	A numeric vector, a matrix, or a list of two matrices (when `alt.cte` specifies alternative specific constants). See Details.
`optim`	A character value to choose between "D" and "A" optimality. The default is `"D"`.
`alt.cte`	A binary vector indicating for each alternative whether an alternative specific constant is desired. The default is `NULL`.
`no.choice`	A logical value indicating whether a no choice alternative should be added to each choice set. The default is `FALSE`.
`start.des`	A list containing one or more matrices corresponding to initial start design(s). The default is `NULL`.
`parallel`	Logical value indicating whether computations should be done over multiple cores. The default is `TRUE`.
`max.iter`	A numeric value indicating the maximum number of iterations allowed. The default is `Inf`.
`n.start`	A numeric value indicating the number of random start designs to use. The default is 12.
`overlap`	A numeric value indicating the minimum number of attributes to overlap in every choice set to create partial profiles. The default is `NULL`.
`n.blocks`	A numeric value indicating the desired number of blocks to create out of the most efficient design. The default is 1.
`blocking.iter`	A numeric value indicating the maximum number of iterations for optimising the blocks. The default value is 50.
`constraints`	A list of constraints to enforce on the attributes and alternatives in every choice set. The default is `NULL`.

Details

Each iteration loops through all profiles of the initial design, evaluating the change in A(B)- or D(B)-error (as specified) for every level of each attribute. The algorithm stops when an iteration completes without replacing a profile or when max.iter is reached.

If par.draws is a numeric vector, the A- or D-error is calculated and the design is optimised locally. If it is a matrix in which each row is a draw from a multivariate distribution, the AB- or DB-error is calculated and the design is optimised globally. Whenever there are alternative specific constants, par.draws should be a list containing two matrices: the first contains the parameter draws for the alternative specific constants, and the second contains the draws for the remaining parameters.
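
The three forms of par.draws can be sketched in base R as follows. This is an illustration only: the object names (pd.local, pd.bayes, pd.list) are hypothetical, and placing the alternative specific constant first in the prior vector is an assumption that mirrors the first example in the Examples section.

```r
set.seed(123)

# Prior means: 1 alternative specific constant followed by 6 attribute
# parameters (assumed ordering, as in the Examples section).
mu <- c(1.2, 0.8, 0.2, -0.3, -1.2, 1.6, 2.2)
n.draws <- 10

# (1) Numeric vector: a single point prior (local optimisation).
pd.local <- mu

# (2) Matrix: one draw per row (here independent normals with unit variance).
pd.bayes <- matrix(rnorm(n.draws * length(mu), mean = rep(mu, each = n.draws)),
                   nrow = n.draws)

# (3) List of two matrices, needed when alt.cte contains non-zero entries:
#     first the draws for the constants, then the draws for everything else.
alt.cte <- c(0, 1)
n.cte <- sum(alt.cte)
pd.list <- list(pd.bayes[, seq_len(n.cte), drop = FALSE],
                pd.bayes[, -seq_len(n.cte), drop = FALSE])

sapply(pd.list, dim)  # rows = draws, columns = parameters per block
```

Using drop = FALSE keeps the first element a one-column matrix rather than a plain vector, which matches the "list containing two matrices" requirement.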

The AB/DB-error is calculated by taking the mean over A- / D-errors, respectively. It could be that for some draws the design results in an infinite error. The percentage of draws for which this was true for the final design can be found in the output inf.error.
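
The averaging over draws can be illustrated with a minimal base R sketch. The helper d_error is hypothetical and is not idefix's internal code; it assumes the standard MNL information matrix and a D-error defined as det(I)^(-1/K) for K parameters. Treating a non-positive determinant as an infinite error, and leaving infinite draws out of the mean, are also assumptions made for this illustration.

```r
# Hypothetical helper: D-error of a design for one parameter vector.
# design: matrix with n.sets * n.alts rows (profiles), one column per parameter.
d_error <- function(design, beta, n.alts) {
  K <- ncol(design)
  info <- matrix(0, K, K)
  set.id <- rep(seq_len(nrow(design) / n.alts), each = n.alts)
  for (s in unique(set.id)) {
    X <- design[set.id == s, , drop = FALSE]
    u <- as.vector(X %*% beta)
    p <- exp(u - max(u))
    p <- p / sum(p)                      # MNL choice probabilities
    info <- info + t(X) %*% (diag(p) - tcrossprod(p)) %*% X
  }
  d <- det(info)
  if (d <= 0) Inf else d^(-1 / K)        # assumed convention for singular cases
}

# Toy design: 4 choice sets, 2 alternatives, 2 dummy coded columns.
des <- rbind(c(0, 0), c(1, 1),
             c(1, 0), c(0, 1),
             c(0, 0), c(1, 0),
             c(0, 0), c(0, 1))

set.seed(123)
draws <- matrix(rnorm(5 * 2), nrow = 5)  # 5 prior draws, one per row
errs <- apply(draws, 1, function(b) d_error(des, b, n.alts = 2))

db.error  <- mean(errs[is.finite(errs)])    # mean D-error over finite draws
inf.error <- 100 * mean(is.infinite(errs))  # percentage of infinite draws
```

Replacing the determinant by the trace of the inverse information matrix would give the corresponding A-error sketch.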

Alternative specific constants can be specified in alt.cte. The length of this binary vector should equal n.alts, where 0 indicates the absence of an alternative specific constant and 1 its presence.

start.des is a list with one or several matrices corresponding to initial start design(s). In each matrix, each row is a profile. The number of rows equals n.sets * n.alts, and the number of columns equals the number of columns of the design matrix + the number of non-zero elements in alt.cte. Consider that for a categorical attribute with p levels, there are p - 1 columns in the design matrix, whereas for a continuous attribute there is only one column. If start.des = NULL, n.start random initial designs will be generated. If start designs are provided, n.start is ignored.
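
As a quick check of these dimension rules, the expected size of a start design can be computed in base R. The helper expected_dim is hypothetical (it is not part of idefix) and simply applies the counting rule stated above, with "C" denoting a continuous attribute as in the coding argument.

```r
# Hypothetical helper: expected dimensions of each matrix in start.des.
expected_dim <- function(lvls, coding, n.sets, n.alts, alt.cte = NULL) {
  # p - 1 columns per categorical attribute, 1 column per continuous one
  attr.cols <- sum(ifelse(coding == "C", 1, lvls - 1))
  # one extra column per non-zero element of alt.cte
  cte.cols <- if (is.null(alt.cte)) 0 else sum(alt.cte != 0)
  c(rows = n.sets * n.alts, cols = attr.cols + cte.cols)
}

# 3 dummy coded attributes with 3 levels each, 8 sets of 2 alternatives
expected_dim(c(3, 3, 3), c("D", "D", "D"), n.sets = 8, n.alts = 2)

# Same attributes with one alternative specific constant
expected_dim(c(3, 3, 3), c("D", "D", "D"), n.sets = 8, n.alts = 2,
             alt.cte = c(0, 1))
```

Comparing the result with dim() of a candidate matrix before calling CEA can catch a mismatched start design early.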

Note: To make sure the code works well, the names of the variables in the starting design should be aligned with variable names that the function Profiles produces. For example, if attribute 1 is a dummy variable of 3 levels then its corresponding columns should have numbered names such as: var11 and var12, or (if labelled) price1 and price2, for instance.

If no.choice is TRUE, in each choice set an alternative with one alternative specific constant is added. The return value of the A(B) or D(B)-error is however based on the design without the no choice option.

When parallel is TRUE, detectCores will be used to decide upon the number of available cores. That number minus 1 cores will be used to search for efficient designs. The computation time will decrease significantly when parallel = TRUE.
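
The number of cores that would be used under this rule can be previewed with the parallel package that ships with R. This is only a sketch of the "available cores minus 1" rule; the fallback to a single core when detection fails is an assumption added here, not documented behaviour of CEA.

```r
# Number of cores reported by the system (may be NA on some platforms).
avail <- parallel::detectCores()

# "Available minus 1", falling back to 1 core (assumed safeguard).
n.cores <- if (is.na(avail)) 1L else max(1L, avail - 1L)
n.cores
```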

Partial profiles/overlapping attributes

If overlap is set to 1 or more, then partial profiles will be used in the resulting efficient designs. The value of overlap determines the minimum number of attributes to overlap in each choice set. The optimising algorithm will enforce this constraint across all choice sets. Note that the running time may increase significantly, as the algorithm searches through all possible (combinations of) attributes to achieve optimisation.

Blocking

If n.blocks is greater than 1, a new list with the specified number of blocks of the best design (the one with the lowest A(B)- or D(B)-error) is added to the output. The algorithm strives to distribute the choice sets of the best design evenly among the blocks, while maintaining level balance across them. The choice sets are assigned sequentially to the blocks, aiming to maintain the closest possible balance among them up to that stage in the sequence. The algorithm therefore runs several iterations, in each of which the choice sets in the design are shuffled randomly. The argument blocking.iter specifies the maximum number of these iterations. This functionality is also available as a separate function, Blocks, that works with a given design.

Adding constraints to the design

The argument constraints can be used to determine a list of constraints to be enforced on the resulting efficient design. The package offers flexibility in the possible constraints. The basic syntax for the constraint should determine an attribute Y within an alternative X (AltX.AttY) and an operator to be applied on that attribute followed by a list of values or another attribute. In addition to this basic syntax, conditional If statements can be included in the conditions as will be shown in the examples below. The following operators can be used:

=
!=
< or <=
> or >=
AND
OR
+, -, *, / operations for continuous attributes.

For example, if attributes 1, 2 and 3 are continuous attributes, then possible constraints include:

"Alt2.Att1 = list(100, 200)": restrict values of attribute 1 in alternative 2 to 100 and 200.
"Alt1.Att1 > Alt2.Att1": enforce that attribute 1 in alternative 1 is higher than the attribute's value in alternative 2.
"Alt1.Att1 + Alt1.Att2 < Alt1.Att3": enforce that the sum of attributes 1 and 2 is less than the value of attribute 3 in alternative 1.
"Alt1.Att1 > Alt1.Att3 OR Alt1.Att2 > Alt1.Att3": either attribute 1 or attribute 2 should be higher than attribute 3 in alternative 1.

For dummy and effect coded attributes, the levels are indicated with the number of the attribute followed by a letter of the alphabet. For example, 1A is the first level of attribute 1 and 3D is the fourth level of attribute 3. Examples of constraints with dummy/effect coded variables:

"Alt2.Att1 = list(1A,1B)": restrict attribute 1 in alternative 2 to the reference level (A) and the second level (B).
"Alt1.Att1 = list(1B,1C) AND Alt2.Att2 != list(2A, 2E)": restrict attribute 1 in alternative 1 to the second and third levels, and at the same time, attribute 2 in alternative 2 cannot take the first or fifth level of the attribute.

Additionally, and as aforementioned, conditional If statements can be included in the conditions. Examples:

"if Alt1.Att1 != Alt2.Att1 then Alt2.Att2 = list(100,200)"
"if Alt1.Att1 = Alt2.Att1 OR Alt1.Att1 = 0 then Alt2.Att1 > 3"

Lastly, more than one constraint can be specified at the same time. For example: constraints = list("if Alt1.Att1 != Alt2.Att1 then Alt2.Att2 = list(100,200)", "Alt1.Att3 = list (3A, 3C)").

To ensure the best use of constraints in optimising designs, please keep in mind the following:

Proper spacing should be respected between the terms, to make sure the syntax translates properly into R code. To clarify, spaces should be placed before and after the operators listed above. Otherwise, the console will return an error.
Lists should be used for constrained values as shown in the examples above.
Constraints should not be imposed on the no.choice alternative because it is fixed with zeros for all attributes. The no.choice alternative, if included, will be the last alternative in every choice set in the design. Therefore, if no.choice is TRUE and the no.choice alternative number (= n.alts) is included in the constraints, the console will return an error.
Attention should be given when a starting design that does not satisfy the constraint is provided. It is possible that the algorithm might not find a design that is more efficient and, at the same time, that satisfies the constraints.
With tight constraints, the algorithm might fail to find a design that satisfies all the specified constraints.
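
Before running the algorithm, a constraint list can be screened for references to the no.choice alternative. The sketch below is plain base R; the helper uses_alt is hypothetical and only pattern-matches the AltX.AttY syntax described above, so it does not validate the constraints in any other way.

```r
constraints <- list(
  "Alt2.Att1 = list(1A,1B)",
  "if Alt1.Att3 = list(4) then Alt2.Att4 = list(4C, 4D)"
)

# With no.choice = TRUE the no.choice alternative is the last one (= n.alts).
n.alts <- 3

# Hypothetical helper: does a constraint string mention alternative `alt`?
uses_alt <- function(constraint, alt) {
  grepl(paste0("\\bAlt", alt, "\\."), constraint, perl = TRUE)
}

bad <- vapply(constraints, uses_alt, logical(1), alt = n.alts)
if (any(bad)) {
  stop("Constraints refer to the no.choice alternative: ",
       paste(unlist(constraints[bad]), collapse = "; "))
}
```

Here neither constraint mentions Alt3, so the check passes silently and the list could be supplied through the constraints argument.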

Value

Two lists of designs and statistics are returned: First, the list BestDesign contains the design with the lowest A(B)- or D(B)-error. The method print can be used to return this list. Second, the list AllDesigns contains the results of all (provided) start designs. The method summary can be used to return this list.

`design`	A numeric matrix which contains an efficient design.
`optimality`	`"A"` or `"D"`, depending on the chosen optimality criterion.
`inf.error`	Numeric value indicating the percentage of draws for which the D-error was `Inf`.
`probs`	Numeric matrix containing the probabilities of each alternative in each choice set. If a sample matrix was provided in `par.draws`, this is the average over all draws.
`AB.error`	Numeric value indicating the A(B)-error of the design.
`DB.error`	Numeric value indicating the D(B)-error of the design.
`SD`	The standard deviations of the parameters. Calculated by taking the diagonal of the varcov matrix, averaged over all draws if a sample matrix was provided in `par.draws`.
`level.count`	The count of all levels of each attribute in the design.
`level.overlap`	The count of overlapping levels across alternatives in every choice set in the design.
`Orthogonality`	Numeric value indicating the degree of orthogonality of the design. The closer the value to 1, the more orthogonal the design is.
`Blocks`	A list showing the created blocks of the best design, along with the level counts in each block. For more details, see function `Blocks`.

Examples


# DB-efficient designs
# 3 Attributes, all dummy coded. 1 alternative specific constant = 7 parameters
mu <- c(1.2, 0.8, 0.2, -0.3, -1.2, 1.6, 2.2) # Prior parameter vector
v <- diag(length(mu)) # Prior variance.
set.seed(123) 
pd <- MASS::mvrnorm(n = 10, mu = mu, Sigma = v) # 10 draws.
p.d <- list(matrix(pd[,1], ncol = 1), pd[,2:7])
CEA(lvls = c(3, 3, 3), coding = c("D", "D", "D"), par.draws = p.d,
n.alts = 2, n.sets = 8, parallel = FALSE, alt.cte = c(0, 1))
# Or AB-efficient design
set.seed(123) 
CEA(lvls = c(3, 3, 3), coding = c("D", "D", "D"), par.draws = p.d,
n.alts = 2, n.sets = 8, parallel = FALSE, alt.cte = c(0, 1), optim = "A")

# DB-efficient design with categorical and continuous factors
# 2 categorical attributes with 4 and 2 levels (effect coded) and 1 
# continuous attribute (= 5 parameters)
mu <- c(0.5, 0.8, 0.2, 0.4, 0.3) 
v <- diag(length(mu)) # Prior variance.
set.seed(123) 
pd <- MASS::mvrnorm(n = 3, mu = mu, Sigma = v) # 3 draws.
CEA(lvls = c(4, 2, 3), coding = c("E", "E", "C"), par.draws = pd,
c.lvls = list(c(2, 4, 6)), n.alts = 2, n.sets = 6, parallel = FALSE)
# The same can be done if A-optimality is chosen
set.seed(123)
CEA(lvls = c(4, 2, 3), coding = c("E", "E", "C"), par.draws = pd,
c.lvls = list(c(2, 4, 6)), n.alts = 2, n.sets = 6, parallel = FALSE, optim = "A")

# DB-efficient design with start design provided.  
# 3 Attributes with 3 levels, all dummy coded (= 6 parameters).
mu <- c(0.8, 0.2, -0.3, -0.2, 0.7, 0.4) 
v <- diag(length(mu)) # Prior variance.
sd <- list(example_design)
set.seed(123)
ps <- MASS::mvrnorm(n = 10, mu = mu, Sigma = v) # 10 draws.
CEA(lvls = c(3, 3, 3), coding = c("D", "D", "D"), par.draws = ps,
n.alts = 2, n.sets = 8, parallel = FALSE, start.des = sd)

# DB-efficient design with partial profiles
# 3 Attributes, all dummy coded. = 6 parameters
mu <- c(1.2, 0.8, 0.2, -0.3, -1.2, 1.6) # Prior parameter vector
v <- diag(length(mu)) # Prior variance.
set.seed(123) 
pd <- MASS::mvrnorm(n = 10, mu = mu, Sigma = v) # 10 draws.
CEA(lvls = c(3, 3, 3), coding = c("D", "D", "D"), par.draws = pd,
n.alts = 2, n.sets = 8, parallel = FALSE, alt.cte = c(0, 0), overlap = 1)
# The same function but asking for blocks (and no overlap)
set.seed(123)
CEA(lvls = c(3, 3, 3), coding = c("D", "D", "D"), par.draws = pd,
n.alts = 2, n.sets = 8, parallel = FALSE, alt.cte = c(0, 0), n.blocks = 2)

# AB-efficient design with constraints
# 2 dummy coded attributes, 1 continuous attribute and 1 effect coded
# attribute (with 4 levels). = 8 parameters
mu <- c(1.2, 0.8, 0.2, 0.5, -0.3, -1.2, 1, 1.6) # Prior parameter vector
v <- diag(length(mu)) # Prior variance.
set.seed(123) 
pd <- MASS::mvrnorm(n = 10, mu = mu, Sigma = v) # 10 draws.
constraints <- list("Alt2.Att1 = list(1A,1B)",
                    "if Alt1.Att3 = list(4) then Alt2.Att4 = list(4C, 4D)")
CEA(lvls = c(3, 3, 2, 4), coding = c("D", "D", "C", "E"), c.lvls = list(c(4,7)), par.draws = pd,
n.alts = 2, n.sets = 8, parallel = FALSE, alt.cte = c(0, 0), optim = "A", constraints = constraints)

idefix documentation built on April 4, 2025, 1:51 a.m.