| gen_design | R Documentation |
Creates an experimental design given a model, desired number of runs, and a data frame of candidate
test points. gen_design chooses points from the candidate set and returns a design that is optimal for the given
statistical model.
gen_design(
candidateset,
model,
trials,
splitplotdesign = NULL,
blocksizes = NULL,
optimality = "D",
augmentdesign = NULL,
repeats = 20,
custom_v = NULL,
varianceratio = 1,
contrasts = contr.simplex,
aliaspower = 2,
minDopt = 0.8,
k = NA,
moment_sample_density = 10,
high_resolution_candidate_set = NA,
parallel = FALSE,
progress = TRUE,
add_blocking_columns = FALSE,
randomized = TRUE,
advancedoptions = NULL,
timer = NULL
)
candidateset |
A data frame of candidate test points; each run of the optimal design will be chosen (with replacement)
from this candidate set. Each row of the data frame is a candidate test point. Each row should be unique.
Usually this is a full factorial test matrix generated for the factors in the model unless there are disallowed combinations of runs.
Factors present in the candidate set but not present in the model are stripped out, and the duplicate entries in the candidate set are removed.
Disallowed combinations can be specified by simply removing them from the candidate set. Disallowed combinations between a
hard-to-change and an easy-to-change factor are detected by comparing an internal candidate set generated by the unique levels
present in the candidate set and the split plot design. Those points are then excluded from the search.
If a factor is continuous, its column should be type |
model |
The statistical model used to generate the test design. |
trials |
The number of runs in the design. |
splitplotdesign |
If |
blocksizes |
Default |
optimality |
Default |
augmentdesign |
Default NULL. A |
repeats |
Default |
custom_v |
Default |
varianceratio |
Default |
contrasts |
Default |
aliaspower |
Default |
minDopt |
Default |
k |
Default |
moment_sample_density |
Default |
high_resolution_candidate_set |
Default |
parallel |
Default |
progress |
Default |
add_blocking_columns |
Default |
randomized |
Default |
advancedoptions |
Default |
timer |
Deprecated: Use |
Split-plot designs can be generated with repeated applications of gen_design; see examples for details.
A data frame containing the run matrix for the optimal design. The returned data frame contains supplementary
information in its attributes, which can be accessed with the get_attribute() and get_optimality() functions.
#Generate the basic factorial candidate set with expand.grid.
#Generating a basic 2 factor candidate set:
basic_candidates = expand.grid(x1 = c(-1, 1), x2 = c(-1, 1))
#This candidate set is used as an input in the optimal design generation for a
#D-optimal design with 11 runs.
design = gen_design(candidateset = basic_candidates, model = ~x1 + x2, trials = 11)
#We can also use the dot formula to automatically use all of the terms in the model:
design = gen_design(candidateset = basic_candidates, model = ~., trials = 11)
#Here we add categorical factors, specified by using "as.factor" in expand.grid:
categorical_candidates = expand.grid(a = c(-1, 1),
b = as.factor(c("A", "B")),
c = as.factor(c("High", "Med", "Low")))
#This candidate set is used as an input in the optimal design generation.
design2 = gen_design(candidateset = categorical_candidates, model = ~a + b + c, trials = 19)
#We can also increase the number of times the algorithm repeats
#the search to increase the probability that the globally optimal design was found.
design2 = gen_design(candidateset = categorical_candidates,
model = ~a + b + c, trials = 19, repeats = 100)
#We can perform a k-exchange algorithm instead of a full search to help speed up
#the search process, although this can lead to less optimal designs. Here, we only
#exchange the 10 lowest variance runs in each search iteration.
if(skpr:::run_documentation()) {
design_k = gen_design(candidateset = categorical_candidates,
model = ~a + b + c, trials = 19, repeats = 100, k = 10)
}
#To speed up the design search, you can turn on multicore support with the parallel option.
#You can also customize the number of cores used by setting the cores option. By default,
#all cores are used.
if(skpr:::run_documentation()) {
options(cores = 2)
design2 = gen_design(categorical_candidates,
model = ~a + b + c, trials = 19, repeats = 1000, parallel = TRUE)
}
#You can also use a higher order model when generating the design:
design2 = gen_design(categorical_candidates,
model = ~a + b + c + a * b * c, trials = 12, repeats = 10)
#To evaluate a response surface design, include center points
#in the candidate set and include quadratic effects (but not for the categorical factors).
quad_candidates = expand.grid(a = c(1, 0, -1), b = c(-1, 0, 1), c = c("A", "B", "C"))
gen_design(quad_candidates, ~a + b + I(a^2) + I(b^2) + a * b * c, 20)
#The optimality criterion can also be changed:
gen_design(quad_candidates, ~a + b + I(a^2) + I(b^2) + a * b * c, 20,
optimality = "I", repeats = 10)
gen_design(quad_candidates, ~a + b + I(a^2) + I(b^2) + a * b * c, 20,
optimality = "A", repeats = 10)
#A blocked design can be generated by specifying the `blocksizes` argument. Passing a single
#number will create designs with blocks of that size, while passing multiple values in a list
#will specify multiple layers of blocking.
#Specify a single layer
gen_design(quad_candidates, ~a + b + c, 21, blocksizes=3, add_blocking_column=TRUE)
#Manually specify the block sizes for a single layer, must add to `trials``
gen_design(quad_candidates, ~a + b + c, 21, blocksizes=c(4,3,2,3,3,3,3),
add_blocking_column=TRUE)
#Multiple layers of blocking
gen_design(quad_candidates, ~a + b + c, 21, blocksizes=list(7,3),
add_blocking_column=TRUE)
#Multiple layers of blocking, specified individually
gen_design(quad_candidates, ~a + b + c, 21, blocksizes=list(7,c(4,3,2,3,3,3,3)),
add_blocking_column=TRUE)
#A split-plot design can be generated by first generating an optimal blocking design using the
#hard-to-change factors and then using that as the input for the split-plot design.
#This generates an optimal subplot design that accounts for the existing split-plot settings.
splitplotcandidateset = expand.grid(Altitude = c(-1, 1),
Range = as.factor(c("Close", "Medium", "Far")),
Power = c(1, -1))
hardtochangedesign = gen_design(splitplotcandidateset, model = ~Altitude,
trials = 11, repeats = 10)
#Now we can use the D-optimal blocked design as an input to our full design.
#Here, we add the easy to change factors from the candidate set to the model,
#and input the hard-to-change design along with the new number of trials. `gen_design` will
#automatically allocate the runs in the blocks in the most balanced way possible.
designsplitplot = gen_design(splitplotcandidateset, ~Altitude + Range + Power, trials = 33,
splitplotdesign = hardtochangedesign, repeats = 10)
#If we want to allocate the blocks manually, we can do that with the argument `blocksizes`. This
#vector must sum to the number of `trials` specified.
#Putting this all together:
designsplitplot = gen_design(splitplotcandidateset, ~Altitude + Range + Power, trials = 33,
splitplotdesign = hardtochangedesign,
blocksizes = c(4, 2, 3, 4, 2, 3, 4, 2, 3, 4, 2), repeats = 10)
#The split-plot structure is encoded into the row names, with a period
#demarcating the blocking level. This process can be repeated for arbitrary
#levels of blocking (i.e. a split-plot design can be entered in as the hard-to-change
#to produce a split-split-plot design, which can be passed as another
#hard-to-change design to produce a split-split-split plot design, etc).
#In the following, note that the model builds up as we build up split plot strata.
splitplotcandidateset2 = expand.grid(Location = as.factor(c("East", "West")),
Climate = as.factor(c("Dry", "Wet", "Arid")),
Vineyard = as.factor(c("A", "B", "C", "D")),
Age = c(1, -1))
#6 blocks of Location:
temp = gen_design(splitplotcandidateset2, ~Location, trials = 6, varianceratio = 2, repeats = 10)
#Each Location block has 2 blocks of Climate:
temp = gen_design(splitplotcandidateset2, ~Location + Climate,
trials = 12, splitplotdesign = temp, blocksizes = 2,
varianceratio = 1, repeats = 10)
#Each Climate block has 4 blocks of Vineyard:
temp = gen_design(splitplotcandidateset2, ~Location + Climate + Vineyard,
trials = 48, splitplotdesign = temp, blocksizes = 4,
varianceratio = 1, repeats = 10)
#Each Vineyard block has 4 runs with different Age:
if(skpr:::run_documentation()) {
splitsplitsplitplotdesign = gen_design(splitplotcandidateset2, ~Location + Climate + Vineyard + Age,
trials = 192, splitplotdesign = temp, blocksizes = 4,
varianceratio = 1, add_blocking_columns = TRUE)
}
#gen_design also supports user-defined optimality criterion. The user defines a function
#of the model matrix named customOpt, and gen_design will attempt to generate a design
#that maximizes that function. This function needs to be in the global environment, and be
#named either customOpt or customBlockedOpt, depending on whether a split-plot design is being
#generated. customBlockedOpt should be a function of the model matrix as well as the
#variance-covariance matrix, vInv. Due to the underlying C + + code having to call back to the R
#environment repeatedly, this criterion will be significantly slower than the built-in algorithms.
#It does, however, offer the user a great deal of flexibility in generating their designs.
#We are going to write our own D-optimal search algorithm using base R functions. Here, write
#a function that calculates the determinant of the information matrix. gen_design will search
#for a design that maximizes this function.
customOpt = function(currentDesign) {
return(det(t(currentDesign) %*% currentDesign))
}
#Generate the whole plots for our split-plot design, using the custom criterion.
candlistcustom = expand.grid(Altitude = c(10000, 20000),
Range = as.factor(c("Close", "Medium", "Far")),
Power = c(50, 100))
htcdesign = gen_design(candlistcustom, model = ~Altitude + Range,
trials = 11, optimality = "CUSTOM", repeats = 10)
#Now define a function that is a function of both the model matrix,
#as well as the variance-covariance matrix vInv. This takes the blocking structure into account
#when calculating our determinant.
customBlockedOpt = function(currentDesign, vInv) {
return(det(t(currentDesign) %*% vInv %*% currentDesign))
}
#And finally, calculate the design. This (likely) results in the same design had we chosen the
#"D" criterion.
design = gen_design(candlistcustom,
~Altitude + Range + Power, trials = 33,
splitplotdesign = htcdesign, blocksizes = 3,
optimality = "CUSTOM", repeats = 10)
#gen_design can also augment an existing design. Input a dataframe of pre-existing runs
#to the `augmentdesign` argument. Those runs in the new design will be fixed, and gen_design
#will perform a search for the remaining `trials - nrow(augmentdesign)` runs.
candidateset = expand.grid(height = c(10, 20), weight = c(45, 55, 65), range = c(1, 2, 3))
design_to_augment = gen_design(candidateset, ~height + weight + range, 5)
#As long as the columns in the augmented design match the columns in the candidate set,
#this design can be augmented.
augmented_design = gen_design(candidateset,
~height + weight + range, 16, augmentdesign = design_to_augment)
#A design's diagnostics can be accessed via the `get_optimality()` function:
get_optimality(augmented_design)
#And design attributes can be accessed with the `get_attribute()` function:
get_attribute(design)
#A correlation color map can be produced by calling the plot_correlation command with the output
#of gen_design()
if(skpr:::run_documentation()) {
plot_correlations(design2)
}
#A fraction of design space plot can be produced by calling the plot_fds command
if(skpr:::run_documentation()) {
plot_fds(design2)
}
#Evaluating the design for power can be done with eval_design, eval_design_mc (Monte Carlo)
#eval_design_survival_mc (Monte Carlo survival analysis), and
#eval_design_custom_mc (Custom Library Monte Carlo)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.