R/gen_design.R

Defines functions gen_design

Documented in gen_design

#'@title Generate optimal experimental designs
#'
#'@description Creates an experimental design given a model, desired number of runs, and a data frame of candidate
#'test points. \code{gen_design} chooses points from the candidate set and returns a design that is optimal for the given
#'statistical model.
#'
#'@param candidateset A data frame of candidate test points; each run of the optimal design will be chosen (with replacement)
#'from this candidate set. Each row of the data frame is a candidate test point. Each row should be unique.
#'Usually this is a full factorial test matrix generated for the factors in the model unless there are disallowed combinations of runs.
#'Factors present in the candidate set but not present in the model are stripped out, and the duplicate entries in the candidate set are removed.
#'Disallowed combinations can be specified by simply removing them from the candidate set. Disallowed combinations between a
#'hard-to-change and an easy-to-change factor are detected by comparing an internal candidate set generated by the unique levels
#'present in the candidate set and the split plot design. Those points are then excluded from the search.
#'If a factor is continuous, its column should be type \code{numeric}. If a factor is categorical, its column should be type \code{factor} or \code{character}.
#'@param model The statistical model used to generate the test design.
#'@param trials The number of runs in the design.
#'@param splitplotdesign If `NULL`, a fully randomized design is generated. If not NULL, a split-plot design is generated, and
#'this argument specifies the design for all of the factors harder to change than the current set of factors.
#'Each row corresponds to a block in which the harder to change factors will be held
#'constant. Each row of \code{splitplotdesign} will be replicated as specified in \code{blocksizes},
#'and the optimal design is found for all of the factors given in the
#'\code{model} argument, taking into consideration the fixed and replicated hard-to-change factors. If \code{blocksizes} is missing,
#'`gen_design` will attempt to allocate the runs in the most balanced design possible,
#'given the number of blocks given in the argument `splitplotdesign` and the total number of `trials`.
#'@param blocksizes Default `NULL`. Specifies the block size(s) for design generation. If only one number is passed, `gen_design()`
#'will create blocks of the specified size, and if the total number of run specified in `trials` is not divisible by the number,
#'`gen_design()` will attempt to allocate the runs in the most balanced design possible. If a list is passed, each entry
#'in the list will specify an additional layer of blocking. If `splitplotdesign` is not `NULL`, this argument
#'specifies the number of subplots within each whole plot (each whole plot corresponding to a row in the `splitplotdesign` data.frame).
#'@param optimality Default `D`. The optimality criterion used in generating the design. Full list of supported criteria: "D", "I", "A", "ALIAS", "G", "T", "E", or "CUSTOM". If "CUSTOM", user must also
#' define a function of the model matrix named `customOpt` in their namespace that returns a single value, which the algorithm will attempt to optimize. For
#' `CUSTOM` optimality split-plot designs, the user must instead define `customBlockedOpt`, which should be a function of the model matrix and the variance-covariance matrix. For
#'information on the algorithm behind Alias-optimal designs, see \emph{Jones and Nachtsheim. "Efficient Designs With Minimal Aliasing." Technometrics, vol. 53, no. 1, 2011, pp. 62-71}.
#'@param augmentdesign Default NULL. A `data.frame` of runs that are fixed during the optimal search process. The columns of `augmentdesign` must match those of the candidate set.
#'The search algorithm will search for the optimal `trials` - `nrow(augmentdesign)` remaining runs.
#'@param repeats Default `20`. The number of times to repeat the search for the best optimal design.
#'@param custom_v Default `NULL`. The user can pass a custom variance-covariance matrix to be used during blocked design generation.
#'@param varianceratio Default `1`. The ratio between the block and run-to-run variance for a given stratum in
#'a split plot/blocked design. This requires a design passed into `splitplotdesign`, so it will be overridden to `1`
#'if no split plot design is entered.
#'@param contrast Default `contr.simplex`, an orthonormal sum contrast. Function used to generate the encoding for categorical variables.
#'@param aliaspower Default `2`. Degree of interactions to be used in calculating the alias matrix for alias optimal designs.
#'@param minDopt Default `0.8`. Minimum value for the D-Optimality of a design when searching for Alias-optimal designs.
#'@param k Default `NA`. For D-optimal designs, this changes the search to a k-exchange algorithm
#'\emph{Johnson and Nachtsheim. "Some Guidelines for Constructing Exact D-Optimal Designs on Convex Design Spaces." Technometrics, vol. 25, 1983, pp. 271-277}.
#' This exchanges only the k lowest variance runs in the design at each search iteration. Lower numbers can result
#' in a faster search, but are less likely tofind an optimal design. Values of `k >= n/4` have been shown empirically to generate similar designs to the full
#' search. When `k == trials`, this results in the default modified Federov's algorithm.
#' A `k` of 1 is a form of Wynn's algorithm \emph{Wynn. "Results in the Theory and Construction of D-Optimum Experimental Designs," Journal of the Royal Statistical Society, Ser. B,vol. 34, 1972, pp. 133-14}.
#'@param parallel Default `FALSE`. If `TRUE`, the optimal design search will use all the available cores. This can lead to a substantial speed-up in the search for complex designs. If the user wants to set the number of cores manually, they can do this by setting options("cores") to the desired number. NOTE: If you have installed BLAS libraries that include multicore support (e.g. Intel MKL that comes with Microsoft R Open), turning on parallel could result in reduced performance.
#'@param timer Default `FALSE`. If `TRUE`, will print an estimate of the optimal design search time.
#'@param add_blocking_columns Default `FALSE`. The blocking structure of the design will be indicated in the row names of the returned
#'design. If `TRUE`, the design also will have extra columns to indicate the blocking structure. If no blocking is detected, no columns will be added.
#'@param randomized Default `TRUE`, due to the intrinsic randomization of the design search algorithm. If `FALSE`,
#'the randomized design will be re-ordered from left to right.
#'@param advancedoptions Default `NULL`. An named list for advanced users who want to adjust the optimal design algorithm parameters. Advanced option names
#'are `design_search_tolerance` (the smallest fractional increase below which the design search terminates), `alias_tie_power` (the degree of the aliasing
#'matrix when calculating optimality tie-breakers), `alias_tie_tolerance` (the smallest absolute difference in the optimality criterion where designs are
#'considered equal before considering the aliasing structure),  `alias_compare`` (which if set to FALSE turns off alias tie breaking completely),
#'`aliasmodel` (provided if the user does not want to calculate Alias-optimality using all `aliaspower` interaction terms),
#'and `progressBarUpdater`` (a function called in non-parallel optimal searches that can be used to update an external progress bar). Finally, there's
#'`g_efficiency_method`, which sets the method used to calculate G-efficiency (default is "random" for a random Monte Carlo sampling of the design space,
#'"optim" for to use simulated annealing, or "custom" to explicitly define the points in the design space, which is the fastest method
#'and the only way to calculate prediction variance with disallowed combinations). With this, there's also `g_efficiency_samples`, which specifies
#'the number of random samples  (default 1000 if `g_efficiency_method = "random"`), attempts at simulated annealing (default 1 if `g_efficiency_method = "optim"`),
#'or a data.frame defining the exact points of the design space if `g_efficiency_method = "custom"`.
#'@return A data frame containing the run matrix for the optimal design. The returned data frame contains supplementary
#'information in its attributes, which can be accessed with the `get_attributes()` and `get_optimality()` functions.
#'@import doRNG
#'@export
#'@details
#'Split-plot designs can be generated with repeated applications of \code{gen_design}; see examples for details.
#'@examples #Generate the basic factorial candidate set with expand.grid.
#'#Generating a basic 2 factor candidate set:
#'basic_candidates = expand.grid(x1 = c(-1, 1), x2 = c(-1, 1))
#'
#'#This candidate set is used as an input in the optimal design generation for a
#'#D-optimal design with 11 runs.
#'design = gen_design(candidateset = basic_candidates, model = ~x1 + x2, trials = 11)
#'
#'#We can also use the dot formula to automatically use all of the terms in the model:
#'design = gen_design(candidateset = basic_candidates, model = ~., trials = 11)
#'
#'#Here we add categorical factors, specified by using "as.factor" in expand.grid:
#'categorical_candidates = expand.grid(a = c(-1, 1),
#'                                      b = as.factor(c("A", "B")),
#'                                      c = as.factor(c("High", "Med", "Low")))
#'
#'#This candidate set is used as an input in the optimal design generation.
#'design2 = gen_design(candidateset = categorical_candidates, model = ~a + b + c, trials = 19)
#'
#'#We can also increase the number of times the algorithm repeats
#'#the search to increase the probability that the globally optimal design was found.
#'design2 = gen_design(candidateset = categorical_candidates,
#'                     model = ~a + b + c, trials = 19, repeats = 100)
#'
#'#We can perform a k-exchange algorithm instead of a full search to help speed up
#'#the search process, although this can lead to less optimal designs. Here, we only
#'#exchange the 10 lowest variance runs in each search iteration.
#'if(skpr:::run_documentation()) {
#'design_k = gen_design(candidateset = categorical_candidates,
#'                     model = ~a + b + c, trials = 19, repeats = 100, k = 10)
#'}
#'
#'#To speed up the design search, you can turn on multicore support with the parallel option.
#'#You can also customize the number of cores used by setting the cores option. By default,
#'#all cores are used.
#'if(skpr:::run_documentation()) {
#'options(cores = 2)
#'design2 = gen_design(categorical_candidates,
#'                     model = ~a + b + c, trials = 19, repeats = 1000, parallel = TRUE)
#'}
#'
#'#You can also use a higher order model when generating the design:
#'design2 = gen_design(categorical_candidates,
#'                     model = ~a + b + c + a * b * c, trials = 12, repeats = 10)
#'
#'#To evaluate a response surface design, include center points
#'#in the candidate set and include quadratic effects (but not for the categorical factors).
#'
#'quad_candidates = expand.grid(a = c(1, 0, -1), b = c(-1, 0, 1), c = c("A", "B", "C"))
#'
#'gen_design(quad_candidates, ~a + b + I(a^2) + I(b^2) + a * b * c, 20)
#'
#'#The optimality criterion can also be changed:
#'gen_design(quad_candidates, ~a + b + I(a^2) + I(b^2) + a * b * c, 20,
#'           optimality = "I", repeats = 10)
#'gen_design(quad_candidates, ~a + b + I(a^2) + I(b^2) + a * b * c, 20,
#'           optimality = "A", repeats = 10)
#'
#'#A blocked design can be generated by specifying the `blocksizes` argument. Passing a single
#'#number will create designs with blocks of that size, while passing multiple values in a list
#'#will specify multiple layers of blocking.
#'
#'#Specify a single layer
#'gen_design(quad_candidates, ~a + b + c, 21, blocksizes=3, add_blocking_column=TRUE)
#'
#'#Manually specify the block sizes for a single layer, must add to `trials``
#'gen_design(quad_candidates, ~a + b + c, 21, blocksizes=c(4,3,2,3,3,3,3),
#'           add_blocking_column=TRUE)
#'
#'#Multiple layers of blocking
#'gen_design(quad_candidates, ~a + b + c, 21, blocksizes=list(7,3),
#'           add_blocking_column=TRUE)
#'
#'#Multiple layers of blocking, specified individually
#'gen_design(quad_candidates, ~a + b + c, 21, blocksizes=list(7,c(4,3,2,3,3,3,3)),
#'           add_blocking_column=TRUE)
#'
#'#A split-plot design can be generated by first generating an optimal blocking design using the
#'#hard-to-change factors and then using that as the input for the split-plot design.
#'#This generates an optimal subplot design that accounts for the existing split-plot settings.
#'
#'splitplotcandidateset = expand.grid(Altitude = c(-1, 1),
#'                                     Range = as.factor(c("Close", "Medium", "Far")),
#'                                     Power = c(1, -1))
#'hardtochangedesign = gen_design(splitplotcandidateset, model = ~Altitude,
#'                                trials = 11, repeats = 10)
#'
#'#Now we can use the D-optimal blocked design as an input to our full design.
#'
#'#Here, we add the easy to change factors from the candidate set to the model,
#'#and input the hard-to-change design along with the new number of trials. `gen_design` will
#'#automatically allocate the runs in the blocks in the most balanced way possible.
#'
#'designsplitplot = gen_design(splitplotcandidateset, ~Altitude + Range + Power, trials = 33,
#'                              splitplotdesign = hardtochangedesign, repeats = 10)
#'
#'#If we want to allocate the blocks manually, we can do that with the argument `blocksizes`. This
#'#vector must sum to the number of `trials` specified.
#'
#'#Putting this all together:
#'designsplitplot = gen_design(splitplotcandidateset, ~Altitude + Range + Power, trials = 33,
#'                              splitplotdesign = hardtochangedesign,
#'                              blocksizes = c(4, 2, 3, 4, 2, 3, 4, 2, 3, 4, 2), repeats = 10)
#'
#'#The split-plot structure is encoded into the row names, with a period
#'#demarcating the blocking level. This process can be repeated for arbitrary
#'#levels of blocking (i.e. a split-plot design can be entered in as the hard-to-change
#'#to produce a split-split-plot design, which can be passed as another
#'#hard-to-change design to produce a split-split-split plot design, etc).
#'#In the following, note that the model builds up as we build up split plot strata.
#'
#'splitplotcandidateset2 = expand.grid(Location = as.factor(c("East", "West")),
#'                                      Climate = as.factor(c("Dry", "Wet", "Arid")),
#'                                      Vineyard = as.factor(c("A", "B", "C", "D")),
#'                                      Age = c(1, -1))
#'#6 blocks of Location:
#'temp = gen_design(splitplotcandidateset2, ~Location, trials = 6, varianceratio = 2, repeats = 10)
#'
#'#Each Location block has 2 blocks of Climate:
#'temp = gen_design(splitplotcandidateset2, ~Location + Climate,
#'                   trials = 12, splitplotdesign = temp, blocksizes = 2,
#'                   varianceratio = 1, repeats = 10)
#'
#'#Each Climate block has 4 blocks of Vineyard:
#'temp = gen_design(splitplotcandidateset2, ~Location + Climate + Vineyard,
#'                   trials = 48, splitplotdesign = temp, blocksizes = 4,
#'                   varianceratio = 1, repeats = 10)
#'
#'#Each Vineyard block has 4 runs with different Age:
#'if(skpr:::run_documentation()) {
#'splitsplitsplitplotdesign = gen_design(splitplotcandidateset2, ~Location + Climate + Vineyard + Age,
#'                                        trials = 192, splitplotdesign = temp, blocksizes = 4,
#'                                        varianceratio = 1, add_blocking_columns = TRUE)
#'}
#'#gen_design also supports user-defined optimality criterion. The user defines a function
#'#of the model matrix named customOpt, and gen_design will attempt to generate a design
#'#that maximizes that function. This function needs to be in the global environment, and be
#'#named either customOpt or customBlockedOpt, depending on whether a split-plot design is being
#'#generated. customBlockedOpt should be a function of the model matrix as well as the
#'#variance-covariance matrix, vInv. Due to the underlying C + + code having to call back to the R
#'#environment repeatedly, this criterion will be significantly slower than the built-in algorithms.
#'#It does, however, offer the user a great deal of flexibility in generating their designs.
#'
#'#We are going to write our own D-optimal search algorithm using base R functions. Here, write
#'#a function that calculates the determinant of the information matrix. gen_design will search
#'#for a design that maximizes this function.
#'
#'customOpt = function(currentDesign) {
#'  return(det(t(currentDesign) %*% currentDesign))
#'}
#'
#'#Generate the whole plots for our split-plot design, using the custom criterion.
#'
#'candlistcustom = expand.grid(Altitude = c(10000, 20000),
#'                             Range = as.factor(c("Close", "Medium", "Far")),
#'                             Power = c(50, 100))
#'htcdesign = gen_design(candlistcustom, model = ~Altitude + Range,
#'                       trials = 11, optimality = "CUSTOM", repeats = 10)
#'
#'#Now define a function that is a function of both the model matrix,
#'#as well as the variance-covariance matrix vInv. This takes the blocking structure into account
#'#when calculating our determinant.
#'
#'customBlockedOpt = function(currentDesign, vInv) {
#'  return(det(t(currentDesign) %*% vInv %*% currentDesign))
#'}
#'
#'#And finally, calculate the design. This (likely) results in the same design had we chosen the
#'#"D" criterion.
#'
#'design = gen_design(candlistcustom,
#'                    ~Altitude + Range + Power, trials = 33,
#'                    splitplotdesign = htcdesign, blocksizes = 3,
#'                    optimality = "CUSTOM", repeats = 10)
#'
#'#gen_design can also augment an existing design. Input a dataframe of pre-existing runs
#'#to the `augmentdesign` argument. Those runs in the new design will be fixed, and gen_design
#'#will perform a search for the remaining `trials - nrow(augmentdesign)` runs.
#'
#'candidateset = expand.grid(height = c(10, 20), weight = c(45, 55, 65), range = c(1, 2, 3))
#'
#'design_to_augment = gen_design(candidateset, ~height + weight + range, 5)
#'
#'#As long as the columns in the augmented design match the columns in the candidate set,
#'#this design can be augmented.
#'
#'augmented_design = gen_design(candidateset,
#'                              ~height + weight + range, 16, augmentdesign = design_to_augment)
#'
#'#A design's diagnostics can be accessed via the `get_optimality()` function:
#'
#'get_optimality(augmented_design)
#'
#'#And design attributes can be accessed with the `get_attribute()` function:
#'
#'get_attribute(design)
#'
#'#A correlation color map can be produced by calling the plot_correlation command with the output
#'#of gen_design()
#'
#'if(skpr:::run_documentation()) {
#'plot_correlations(design2)
#'}
#'
#'#A fraction of design space plot can be produced by calling the plot_fds command
#'if(skpr:::run_documentation()) {
#'plot_fds(design2)
#'}
#'
#'#Evaluating the design for power can be done with eval_design, eval_design_mc (Monte Carlo)
#'#eval_design_survival_mc (Monte Carlo survival analysis), and
#'#eval_design_custom_mc (Custom Library Monte Carlo)
gen_design = function(candidateset, model, trials,
                      splitplotdesign = NULL, blocksizes = NULL, optimality = "D",
                      augmentdesign = NULL, repeats = 20,
                      custom_v = NULL, varianceratio = 1, contrast = contr.simplex,
                      aliaspower = 2, minDopt = 0.8, k = NA,
                      parallel = FALSE, timer = TRUE, add_blocking_columns = FALSE, randomized = TRUE,
                      advancedoptions = NULL) {
  check_model_validity(model)
  #standardize and check optimality inputs
  optimality_uc = toupper(tolower(optimality))
  if (!(optimality_uc %in% c("D", "I", "A", "E", "T", "G", "ALIAS", "CUSTOM"))) {
    stop(sprintf("skpr: %s not recognized as a supported optimality criterion.",optimality))
  } else {
    optimality = optimality_uc
  }

  if(is.na(k)) {
    kexchange = trials
  } else {
    kexchange = k
  }
  if(kexchange > trials || kexchange < 1) {
    kexchange = trials
    warning("`k` must be an integer between `1` and `trials`. Setting to `trials`.")
  }

  if (is.null(advancedoptions$design_search_tolerance)) {
    tolerance = 10e-5
  } else {
    tolerance = advancedoptions$design_search_tolerance
  }

  if (is.null(advancedoptions$alias_compare)) {
    advancedoptions$alias_compare = TRUE
  }
  #Check for progress bar and GUI options
  if (!is.null(advancedoptions)) {
    if (is.null(advancedoptions$GUI)) {
      advancedoptions$GUI = FALSE
    }
    if (!is.null(advancedoptions$progressBarUpdater)) {
      progressBarUpdater = advancedoptions$progressBarUpdater
    } else {
      progressBarUpdater = NULL
    }
     if (!is.null(advancedoptions$aliasmodel)) {
      amodel = advancedoptions$aliasmodel
    } else {
      amodel = NULL
    }
    if(!is.null(advancedoptions$g_efficiency_method) && advancedoptions$g_efficiency_method != "none") {
      if(advancedoptions$g_efficiency_method == "random") {
        if(is.null(advancedoptions$g_efficiency_samples))  {
          advancedoptions$g_efficiency_samples = 1000
        }
      } else if (advancedoptions$g_efficiency_method == "optim") {
        if(is.null(advancedoptions$g_efficiency_samples))  {
          advancedoptions$g_efficiency_samples = 10
        }
      } else if (advancedoptions$g_efficiency_method == "custom") {
        if(is.null(advancedoptions$g_efficiency_samples) || is.numeric(advancedoptions$g_efficiency_samples))  {
          warning("no data.frame passed to advancedoptions$g_efficiency_samples, ignoring and using random sampling")
          advancedoptions$g_efficiency_method == "random"
          advancedoptions$g_efficiency_samples = 1000
        }
        contrastslisttemp = list()
        for (x in names(advancedoptions$g_efficiency_samples[lapply(advancedoptions$g_efficiency_samples, class) %in%
                                                             c("factor", "character")])) {
          contrastslisttemp[[x]] = contrast
        }
        if (length(contrastslisttemp) == 0) {
          advancedoptions$g_efficiency_samples = model.matrix(model,
                                        normalize_design(advancedoptions$g_efficiency_samples))
        } else {
          advancedoptions$g_efficiency_samples = suppressWarnings(model.matrix(model,
                                                         normalize_design(advancedoptions$g_efficiency_samples),
                                                         contrasts.arg = contrastslisttemp))
        }
      } else {
        warning("advancedoptions$g_efficiency_method not recognized, defaulting to random search")
        advancedoptions$g_efficiency_method = "random"
        advancedoptions$g_efficiency_samples = 1000
      }
    } else {
      advancedoptions$g_efficiency_method = "none"
    }
  } else {
    advancedoptions = list()
    advancedoptions$GUI = FALSE
    advancedoptions$g_efficiency_method = "none"
    progressBarUpdater = NULL
  }
  #turn off progress bar for non-interactive sessions
  timer = timer && interactive()

  if(is.null(splitplotdesign)) {
    if(varianceratio != 1) {
      warning("varianceratio cannot be set when split-plot design is not null.")
      varianceratio = 1
    }
  }

  #covert tibbles
  candidateset = as.data.frame(candidateset)
  if (!is.null(splitplotdesign)){
    splitplotdesign = as.data.frame(splitplotdesign)
  }

  #Remove skpr-generated REML blocking indicators if present
  splitplotdesign = remove_skpr_blockcols(splitplotdesign)
  #Throw error if backticks detected
  if (grepl("`", as.character(model)[2], fixed = TRUE)) {
    stop("skpr: skpr does not support backticks in gen_design. Use variable names without backticks and try again.")
  }

  model = convert_model_dots(candidateset, model, splitplotdesign)

  #---- Convert all character vectors to factors in candidate set ----#

  for(i in seq_len(ncol(candidateset))) {
    if(is.character(candidateset[,i])) {
      candidateset[,i] = as.factor(candidateset[,i])
    }
  }

  #----- Rearrange formula terms by order -----#
  model = rearrange_formula_by_order(model, data = candidateset)

  if (is.null(contrast)) {
    contrast = function(n) contr.simplex(n, size = sqrt(n - 1))
  }

  #---- generate contrast list-----#

  contrastslist = list()
  for (x in names(candidateset[lapply(candidateset, class) %in% c("factor", "character")])) {
    contrastslist[[x]] = contrast
  }

  if (length(contrastslist) == 0) {
    contrastslist = NULL
  }

  if (!is.null(splitplotdesign)) {
    contrastslistspd = list()
    for (x in names(splitplotdesign[lapply(splitplotdesign, class) %in% c("factor", "character")])) {
      contrastslistspd[[x]] = contrast
    }

    if (length(contrastslistspd) == 0) {
      contrastslistspd = NULL
    }
    contrastslistsubplot = list()
    for (x in names(candidateset[lapply(candidateset, class) %in% c("factor", "character")])) {
      contrastslistsubplot[[x]] = contrast
    }

    if (length(contrastslistsubplot) == 0) {
      contrastslistsubplot = NULL
    }

    if (model != as.formula("~.")) {
      if (attr(terms.formula(model), "intercept") == 0) {
        nointercept = TRUE
      } else {
        nointercept = FALSE
      }

      model = as.formula(paste0("~", paste(attr(terms.formula(model), "term.labels"), collapse = " + ")))

      wholeplotterms = colnames(splitplotdesign)
      subplotterms = colnames(candidateset)
      subplotterms = subplotterms[!(subplotterms %in% wholeplotterms)]

      splitterms = unlist(strsplit(as.character(model[2]), split = " + ", fixed = TRUE))

      wholeorwholeinteraction = rep(FALSE, length(splitterms))
      for (term in wholeplotterms) {
        regex = paste0("(\\b", term, "\\b)|(\\b", term, ":)|(:", term, "\\b)|(\\b", term, "\\s\\*)|(\\*\\s", term, "\\b)")
        wholeorwholeinteraction = wholeorwholeinteraction | grepl(regex, splitterms, perl = TRUE)
      }
      #get non-whole plot linear terms with no whole plot interactions
      modelnowholeformula = as.formula(paste0("~", paste(splitterms[!wholeorwholeinteraction], collapse = " + ")))

      regularmodel = rep(FALSE, length(splitterms))
      for (term in subplotterms) {
        regex = paste0("(\\b", term, "\\b)|(\\b", term, ":)|(:", term, "\\b)|(\\b", term, "\\s\\*)|(\\*\\s", term, "\\b)")
        regularmodel = regularmodel | grepl(regex, splitterms, perl = TRUE)
      }
      #get only whole plot linear terms or whole-whole interactions
      modelwholeformula = as.formula(paste0("~", paste(splitterms[!regularmodel], collapse = " + ")))

      #get whole:non-whole interaction terms
      wholeinteractionterms = splitterms[regularmodel & wholeorwholeinteraction]

      fullcontrastlist = c(contrastslistsubplot, contrastslistspd)

      if (length(wholeinteractionterms) > 0) {
        # Find columns of model matrices corresponding to the interactions, to determine what to multiply in the
        # Optimal design generation process.
        lineartermsinteraction = unique(unlist(strsplit(wholeinteractionterms, split = "(\\s\\*\\s)|(:)", perl = TRUE)))
        extract_intnames_formula = as.formula(paste0("~", paste(c(lineartermsinteraction, splitterms[!regularmodel], wholeinteractionterms), collapse = " + ")))
        combinedcand = cbind(candidateset[1, , drop = FALSE], splitplotdesign[1, , drop = FALSE])
        allcolnames = suppressWarnings(colnames(model.matrix(extract_intnames_formula, data = combinedcand, contrasts.arg = fullcontrastlist)))
        interactionnames = allcolnames[grepl("(\\s\\*\\s)|(:)", allcolnames, perl = TRUE)]

        #Get correct order of column names
        submodel = as.formula(paste0("~", paste0(splitterms[!wholeorwholeinteraction], collapse = " + ")))
        if (!nointercept) {
          wholemm = suppressWarnings(colnames(model.matrix(modelwholeformula, data = combinedcand, contrasts.arg = fullcontrastlist)))
        } else {
          wholemm = suppressWarnings(colnames(model.matrix(modelwholeformula, data = combinedcand, contrasts.arg = fullcontrastlist))[-1])
        }
        submm = suppressWarnings(colnames(model.matrix(submodel, data = combinedcand, contrasts.arg = contrastslistsubplot))[-1])
        correct_order_colnames = c(wholemm, submm, interactionnames)

        interactioncounter = 1
        interactionlist = list()
        #get model matrix of everything except whole/interactions
        for (interaction_col in interactionnames) {
          term_vals = unlist(strsplit(interaction_col, split = "(\\s\\*\\s)|(:)", perl = TRUE))
          if (any(term_vals %in% submm)) {
            interactionlist[[interactioncounter]] = which(correct_order_colnames %in% term_vals)
            interactioncounter = interactioncounter + 1
          }
        }
      } else {
        interactionlist = list()
      }
    } else {
      interactionlist = list()
      fullcontrastlist = c(contrastslistsubplot, contrastslistspd)
      modelwholeformula = as.formula(paste("~", paste(colnames(splitplotdesign), collapse = " + "), sep = ""))
      modelnowholeformula = as.formula(paste("~", paste(colnames(candidateset)[!(colnames(candidateset) %in% colnames(splitplotdesign))], collapse = " + "), sep = ""))
    }
  }

  fullcandidateset = unique(reduceRunMatrix(candidateset, model))

  if (is.null(splitplotdesign)) {
    candidateset = unique(reduceRunMatrix(candidateset, model, FALSE))
  } else {
    candidateset = unique(reduceRunMatrix(candidateset, modelnowholeformula, FALSE))
  }

  #------Ensure the candidate set has no single-valued columns------#
  if (nrow(candidateset) == 0) {
    stop("skpr:The candidate set has zero rows.")
  }
  if (ncol(candidateset) == 0) {
    stop("skpr:The candidate set has zero columns.")
  }
  for (colno in seq_len(ncol(candidateset))) {
    if (length(unique(candidateset[[colno]])) == 1) {
      stop(paste("skpr: Column", colnames(candidateset)[colno], "of the candidateset contains only a single value."))
    }
  }
  #----Check for augmented design and normalize/equalize factor levels if present----#
  if (!is.null(augmentdesign)) {
    prev_augmented = ifelse(!is.null(attr(augmentdesign,"augmented")), attr(augmentdesign,"augmented"), FALSE)
    if(prev_augmented) {
      augmented_cols_prev = augmentdesign$Block1
    } else {
      augmented_cols_prev = rep(1,nrow(augmentdesign))
    }
    if (!is.null(splitplotdesign)) {
      stop("skpr: Design augmentation not available with split-plot designs.")
    }
    if(prev_augmented) {
      augmentdesign$Block1 = NULL
    }
    if (any(colnames(augmentdesign)[order(colnames(augmentdesign))] != colnames(candidateset)[order(colnames(candidateset))])) {
      stop("skpr: Column names for augmented design and candidate set must be equal.")
    }
    if (ncol(augmentdesign) != ncol(candidateset)) {
      stop("Number of columns in the augmented design must equal number of columns in the candidate set.")
    }
    aug_coltype = unlist(lapply(augmentdesign, class))[order(colnames(augmentdesign))]
    cand_coltype = unlist(lapply(candidateset, class))[order(colnames(candidateset))]
    if (any(aug_coltype != cand_coltype)) {
      warning("Augmented designed column types '", paste0(aug_coltype[aug_coltype != cand_coltype],collapse=", "),
              "' for columns '",paste0(colnames(candidateset)[aug_coltype != cand_coltype],collapse=", "),
              "' don't match candidate set '",paste0(cand_coltype[aug_coltype != cand_coltype],collapse=", "),
              "'--attempting to covert columns to that of the candidate set")
      for(i in seq_len(ncol(augmentdesign))) {
        augmentdesign[,i] = methods::as(augmentdesign[,i],cand_coltype[i])
      }
    }
    if (nrow(augmentdesign) >= trials) {
      stop("skpr: Total number of trials must exceed the number of runs in augmented design.")
    }
    augmentdesign = augmentdesign[, colnames(candidateset)]
    #Check and make sure factor levels are equal
    for (i in seq_len(ncol(augmentdesign))) {
      if (is.character(augmentdesign[, i]) || is.factor(augmentdesign[, i])) {
        candidateset[, i] = factor(candidateset[, i])
        augmentdesign[, i] = factor(augmentdesign[, i], levels = levels(candidateset[, i]))
      }
    }
    #Normalize
    augmentnormalized = normalize_design(augmentdesign, candidateset)
    augmentedrows = nrow(augmentdesign)
  } else {
    augmentedrows = 0
  }

  #------Normalize/Center numeric columns ------#
  candidatesetnormalized = candidateset

  candidatesetnormalized = normalize_design(candidateset, augmentdesign)
  fullcandidatesetnorm = normalize_design(fullcandidateset, augmentdesign)

  if (!is.null(splitplotdesign)) {
    spdnormalized = normalize_design(splitplotdesign)
  }

  splitplot = FALSE
  blocking = FALSE
  #-----generate blocked design with replicates-----#
  if (!is.null(splitplotdesign)) {
    if (!is.null(attr(splitplotdesign, "varianceratios"))) {
      varianceRatios = c(attr(splitplotdesign, "varianceratios"), varianceratio)
    } else {
      varianceRatios = varianceratio
    }
    splitplot = TRUE
    if (is.null(blocksizes)) {
      if (trials < 2 * nrow(splitplotdesign)) {
        warning("Warning: Some blocks in `splitplotdesign` only have one replicate.")
      }
      if (trials %% nrow(splitplotdesign) == 0) {
        blocksizes = trials / nrow(splitplotdesign)
      } else {
        sizevector = c(rep(ceiling(trials / nrow(splitplotdesign)), nrow(splitplotdesign)))
        unbalancedruns = ceiling(trials / nrow(splitplotdesign)) * nrow(splitplotdesign) - trials
        sizevector[(length(sizevector) - unbalancedruns + 1):length(sizevector)] = sizevector[(length(sizevector) - unbalancedruns + 1):length(sizevector)] - 1
        blocksizes = sizevector
      }
    }
    if(is.list(blocksizes)) {
      warning("For split plot designs, only one layer of blocking can be specified via `blocksizes`--extracted first element of list to use as blocking vector")
      blocksizes = blocksizes[[1]]
    }
    if (length(blocksizes) == 1) {
      blocksizes = rep(blocksizes, nrow(splitplotdesign))
    }
    if (trials != sum(blocksizes)) {
      stop("skpr: Blocked replicates does not equal the number of trials input")
    }
    if (nrow(splitplotdesign) != length(blocksizes)) {
      stop("skpr: Need to specify a size for each row in the given split plot design")
    }
    alreadyBlocking = FALSE
    initialrownames = rownames(splitplotdesign)
    blocklist = strsplit(initialrownames, ".", fixed = TRUE)
    blockgroups = list(blocksizes)

    if (any(lapply(blocklist, length) > 1)) {
      alreadyBlocking = TRUE
      initialrownames = rep(rownames(splitplotdesign), blocksizes)
      blocklist = strsplit(initialrownames, ".", fixed = TRUE)
      existingblockstructure = do.call(rbind, blocklist)
      blockgroups = apply(existingblockstructure, 2, blockingstructure)
    }
    withinBlockRun = function(runs) seq_len(runs)

    blockIndicators = rep(seq_len(length(blocksizes)), blocksizes)

    blockvars = colnames(splitplotdesign)
    blocks = list()
    for (i in seq_len(length(blockIndicators))) {
      blocks[[i]] = spdnormalized[blockIndicators[i], ]
    }

    blockRuns = c()
    for (i in seq_len(length(blocksizes))) {
      blockRuns = c(blockRuns, withinBlockRun(blocksizes[i]))
    }

    if (length(blocks[[1]]) > 1) {
      splitPlotReplicateDesign = as.data.frame(do.call(rbind, blocks))
    } else {
      splitPlotReplicateDesign = as.data.frame(unlist(blocks))
    }

    colnames(splitPlotReplicateDesign) = blockvars
    if (alreadyBlocking) {
      rownames(splitPlotReplicateDesign) = paste(initialrownames, blockRuns, sep = ".")
    } else {
      rownames(splitPlotReplicateDesign) = paste(blockIndicators, blockRuns, sep = ".")
    }
    blockMatrixSize = sum(blocksizes)
    if(length(varianceRatios) > 1) {
      V = diag(blockMatrixSize) * varianceRatios[1]
      blockcounter = 2
    } else {
      V = diag(blockMatrixSize)
      blockcounter = 1
    }
    for (block in blockgroups) {
      V[seq_len(block[1]), seq_len(block[1])] =  V[seq_len(block[1]), seq_len(block[1]) ] + varianceRatios[blockcounter]
      placeholder = block[1]
      for (i in seq_len(length(block))[-1]) {
        V[seq(placeholder + 1, placeholder + block[i]),
          seq(placeholder + 1,placeholder + block[i])] = V[seq(placeholder + 1, placeholder + block[i]),
                                                           seq(placeholder + 1, placeholder + block[i])] + varianceRatios[blockcounter]
        placeholder = placeholder + block[i]
      }
      blockcounter = blockcounter + 1
    }
    zlist = list()
    for (i in seq_len(length(blockgroups))) {
      tempblocks = blockgroups[[i]]
      tempnumberblocks = length(tempblocks)
      ztemp = matrix(0, nrow = trials, ncol = tempnumberblocks)
      currentrow = 1
      for (j in seq_len(tempnumberblocks)) {
        ztemp[seq(currentrow, currentrow + tempblocks[j] - 1), j] = varianceRatios[i]
        currentrow = currentrow + tempblocks[j]
      }
      zlist[[i]] = ztemp
    }
  } else if (!is.null(blocksizes) || !is.null(custom_v)) {
    if (length(varianceratio) == 1) {
      if(is.list(blocksizes)) {
        varianceratio = c(1,rep(varianceratio,length(blocksizes)))
      }
    } else {
      if(is.list(blocksizes)) {
        if(length(varianceratio) < length(blocksizes) + 1) {
          stop("skpr: Number of blocking layers (specified in the list passed to `blocksizes` argument) does not equal number of entries in argument `varianceratio`. Set a ratio for each layer or one value for all layers.` ")
        }
      }
    }
    if(is.list(blocksizes)) {
      for(i in seq_len(length(blocksizes))) {
        if (length(blocksizes[[i]]) == 1) {
          if (trials %% blocksizes[[i]] == 0) {
            sizevector = rep(blocksizes[[i]], trials/blocksizes[[i]])
          } else {
            sizevector = rep(blocksizes[[i]],  ceiling(trials/blocksizes[[i]]))
            unbalancedruns = blocksizes[[i]] *  ceiling(trials/blocksizes[[i]]) - trials
            sizevector[seq(length(sizevector) - unbalancedruns + 1, length(sizevector))] = sizevector[seq(length(sizevector) - unbalancedruns + 1,length(sizevector))] - 1
            message("Specified `trials` (",trials,") not divisible by `blocksizes` (",blocksizes[[i]],
                    "). Generated block sizes (for blocking layer ",i ,") are: ",
                    paste(sizevector, collapse = ", "))
          }
          blocksizes[[i]] = sizevector
        } else {
          if(sum(blocksizes[[i]]) != trials) {
            stop("skpr: sum(blocksize[[",i,"]]) layer not equal to number of trials: ", trials)
          }
        }
      }
    } else {
      if (length(blocksizes) == 1) {
        if (trials %% blocksizes == 0) {
          sizevector = rep(blocksizes, trials/blocksizes)
        } else {
          sizevector = rep(blocksizes,  ceiling(trials/blocksizes))
          unbalancedruns = blocksizes *  ceiling(trials/blocksizes) - trials
          sizevector[(length(sizevector) - unbalancedruns + 1):length(sizevector)] = sizevector[(length(sizevector) - unbalancedruns + 1):length(sizevector)] - 1
          message("Specified `trials` (",trials,") not divisible by `blocksizes` (",blocksizes,
                  "). Generated block sizes are: ", paste(sizevector, collapse = ", "))
        }
        blocksizes = sizevector
      }
    }
    blocking = TRUE
    if(!is.null(blocksizes)) {
      if(is.list(blocksizes)) {
        blockMatrixSize = sum(blocksizes[[1]])
        if(length(varianceratio) > 1) {
          V = diag(blockMatrixSize) * varianceratio[1]
          blockcounter = 2
        } else {
          V = diag(blockMatrixSize)
          blockcounter = 1
        }
        for(i in seq_len(length(blocksizes))) {
          blockgroups = list(blocksizes[[i]])
          for (block in blockgroups) {
            V[seq_len(block[1]), seq_len(block[1])] =  V[seq_len(block[1]), seq_len(block[1])] + varianceratio[blockcounter]
            placeholder = block[1]
            for (i in seq_len(length(block))[-1]) {
              V[seq(placeholder + 1, placeholder + block[i]),
                seq(placeholder + 1, placeholder + block[i])] = V[seq(placeholder + 1, placeholder + block[i]),
                                                                  seq(placeholder + 1, placeholder + block[i])] + varianceratio[blockcounter]
              placeholder = placeholder + block[i]
            }
            blockcounter = blockcounter + 1
          }
        }
      } else {
        blockgroups = list(blocksizes)
        blockMatrixSize = sum(blocksizes)
        if(length(varianceratio) > 1) {
          V = diag(blockMatrixSize) * varianceratio[1]
          blockcounter = 2
        } else {
          V = diag(blockMatrixSize)
          blockcounter = 1
        }
        for (block in blockgroups) {
          V[seq_len(block[1]), seq_len(block[1])] =  V[seq_len(block[1]), seq_len(block[1])] + varianceratio[blockcounter]
          placeholder = block[1]
          for (i in seq_len(length(block))[-1]) {
            V[seq(placeholder + 1, placeholder + block[i]),
              seq(placeholder + 1, placeholder + block[i])] = V[seq(placeholder + 1, placeholder + block[i]),
                                                                seq(placeholder + 1, placeholder + block[i])] + varianceratio[blockcounter]
            placeholder = placeholder + block[i]
          }
          blockcounter = blockcounter + 1
        }
      }
    }
    if(!is.null(custom_v)) {
      V = custom_v
    }
  }

  initialreplace = FALSE
  if (trials > nrow(candidateset)) {
    initialreplace = TRUE
  }

  genOutput = vector(mode = "list", length=repeats)

  if (length(contrastslist) == 0) {
    if (is.null(splitplotdesign)) {
      candidatesetmm = model.matrix(model, candidatesetnormalized)
      if (!is.null(augmentdesign)) {
        augmentdesignmm = model.matrix(model, augmentnormalized)
      }
    } else {
      candidatesetmm = model.matrix(modelnowholeformula, candidatesetnormalized)
    }
  } else {
    if (is.null(splitplotdesign)) {
      candidatesetmm = suppressWarnings(model.matrix(model, candidatesetnormalized, contrasts.arg = contrastslist))
      if (!is.null(augmentdesign)) {
        augmentdesignmm = model.matrix(model, augmentnormalized, contrasts.arg = contrastslist)
      }
    } else {
      candidatesetmm = suppressWarnings(model.matrix(modelnowholeformula, candidatesetnormalized, contrasts.arg = contrastslist))
    }
  }

  if (!splitplot) {
    if (det(t(candidatesetmm) %*% candidatesetmm) < 1e-8) {
      is_singular = function() {
        tryCatch({
          test_solve = solve(t(candidatesetmm) %*% candidatesetmm)
          return(FALSE)
        }, error = (function(e) (return(TRUE))))
      }
      if(is_singular()) {
        stop(paste("skpr: The candidate set does not support the specified model - its rank is too low.",
                   "This usually happens if disallowed combinations",
                   "have introduced a perfect correlation between some variables in the candidate set.",
                   "It can also happen if you have specified a quadratic model term but have only two levels",
                   "of that factor in the candidate set, or (more generally) if two of your terms can be ",
                   "expressed as a linear combination of another term. To solve this, you either need to",
                   "choose a different model or add more factor combinations to your candidate set."))
      }
      eigenvals =  eigen(t(candidatesetmm) %*% candidatesetmm, symmetric = TRUE)$values
      condition_val = max(eigenvals)/min(eigenvals)
      if(min(eigenvals) > 0 && condition_val > 100) {
        warning(paste("This candidate set is highly correlated. \n\nYou can calculate a degree of correlation",
                     "by calculating the condition index (k), the ratio of the maximum eigenvalue of the information",
                     "matrix over the minimum eigenvalue. Generally: k < 100 indicates minimal to no multicollinearity,",
                     "100 < k < 1,000 indicates moderate to strong multicollinearity, and k > 1000 indiates severe",
                     "multicollinearity (Montgomery, \"Introduction to linear regression analysis\", 2001). \n\nThe",
                     sprintf(" condition index calculated for your candidate set and model is %0.1f", condition_val)))
      }
    }
  }

  if (is.null(amodel)) {
   if (is.null(splitplotdesign)) {
     amodel = aliasmodel(model, aliaspower)
   } else {
     amodel = aliasmodel(modelnowholeformula, aliaspower)
   }
  }
  if (model == amodel && optimality == "ALIAS") {
    stop(paste0(c("skpr: Alias optimal selected, but full model specified with no aliasing at current aliaspower: ",
                  aliaspower, ". Try setting aliaspower = ", aliaspower + 1), collapse = ""))
  }
  if (length(contrastslist) == 0) {
    aliasmm = model.matrix(amodel, candidatesetnormalized)
  } else {
    suppressWarnings({
      aliasmm = model.matrix(amodel, candidatesetnormalized, contrasts.arg = contrastslist)
    })
  }

  if (!splitplot) {
    factors = colnames(candidatesetmm)
    levelvector = sapply(lapply(candidateset, unique), length)
    classvector = sapply(candidateset, inherits, c("factor", "character"))

    mm = gen_momentsmatrix(factors, levelvector, classvector)
    if (!parallel) {
      if(timer) {
        pb = progress::progress_bar$new(format = "  Searching [:bar] :percent ETA: :eta",
                                        total = repeats, clear = TRUE, width= 60)
      }
      for (i in seq_len(repeats)) {
        if (!is.null(progressBarUpdater)) {
          progressBarUpdater(1 / repeats)
        }
        if(timer) {
          pb$tick()
        }
        randomindices = sample(nrow(candidatesetmm), trials, replace = initialreplace)
        initialdesign = candidatesetmm[randomindices, ]
        if (!is.null(augmentdesign)) {
          initialdesign[seq_len(augmentedrows), ] = augmentdesignmm
        }
        if(!blocking) {
          genOutput[[i]] = genOptimalDesign(initialdesign = initialdesign, candidatelist = candidatesetmm,
                                            condition = optimality, momentsmatrix = mm, initialRows = randomindices,
                                            aliasdesign = aliasmm[randomindices, ],
                                            aliascandidatelist = aliasmm, minDopt = minDopt,
                                            tolerance = tolerance, augmentedrows = augmentedrows, kexchange = kexchange)
        } else {
          genOutput[[i]] = genBlockedOptimalDesign(initialdesign = initialdesign, candidatelist = candidatesetmm,
                                                  condition = optimality, V = V, momentsmatrix = mm, initialRows = randomindices,
                                                  aliasdesign = aliasmm[randomindices, ],
                                                  aliascandidatelist = aliasmm, minDopt = minDopt,
                                                  tolerance = tolerance, augmentedrows = augmentedrows, kexchange = kexchange)
        }
      }
    } else {
      if (is.null(options("cores")[[1]])) {
        numbercores = parallel::detectCores()
      } else {
        numbercores = options("cores")[[1]]
      }
      cl = parallel::makeCluster(numbercores)
      numbercores = length(cl)
      if(timer) {
        pb = progress::progress_bar$new(format = sprintf("  Searching (%d cores) [:bar] :percent ETA: :eta", numbercores),
                                        total = repeats, clear = TRUE, width= 60)
      }
      tryCatch({
        doParallel::registerDoParallel(cl)
        number_updates = max(c(min(c(repeats/(2*numbercores),100)),1))
        parallel_output = list()
        single_batch_number = repeats/number_updates
        total_remaining = repeats
        counter = 1
        while(total_remaining > 0) {
          if(total_remaining < single_batch_number) {
            single_batch_number = total_remaining
          }
          parallel_output[[counter]] = foreach(i = seq_len(single_batch_number), .export = c("genOptimalDesign","genBlockedOptimalDesign")) %dorng% {

            randomindices = sample(nrow(candidatesetmm), trials, replace = initialreplace)
            initialdesign = candidatesetmm[randomindices, ]
            if (!is.null(augmentdesign)) {
              initialdesign[seq_len(augmentedrows), ] = augmentdesignmm
            }
            if(!blocking) {
              genOptimalDesign(initialdesign = initialdesign, candidatelist = candidatesetmm,
                               condition = optimality, momentsmatrix = mm, initialRows = randomindices,
                               aliasdesign = aliasmm[randomindices, ],
                               aliascandidatelist = aliasmm, minDopt = minDopt,
                               tolerance = tolerance, augmentedrows = augmentedrows, kexchange = kexchange)
            } else {
              genBlockedOptimalDesign(initialdesign = initialdesign, candidatelist = candidatesetmm,
                                      condition = optimality, V = V, momentsmatrix = mm, initialRows = randomindices,
                                      aliasdesign = aliasmm[randomindices, ],
                                      aliascandidatelist = aliasmm, minDopt = minDopt,
                                      tolerance = tolerance, augmentedrows = augmentedrows, kexchange = kexchange)
            }
          }
          total_remaining = total_remaining - single_batch_number
          counter = counter + 1
          if(timer) {
            pb$tick(single_batch_number)
          }
          if (!is.null(progressBarUpdater)) {
            progressBarUpdater(single_batch_number / repeats)
          }
        }
      }, finally = {
        tryCatch({
          parallel::stopCluster(cl)
        }, error = function (e) {})
      })
      genOutput = unlist(parallel_output, recursive  = FALSE)
    }
  } else {
    #Set up split-plot inputs
    blockedContrastsList = list()
    for (x in names(splitPlotReplicateDesign[sapply(splitPlotReplicateDesign, class) == "factor"])) {
      blockedContrastsList[[x]] = contrast
    }
    if (length(blockedContrastsList) == 0) {
      blockedmodelmatrix = model.matrix(modelwholeformula, splitPlotReplicateDesign)
    } else {
      blockedmodelmatrix = model.matrix(modelwholeformula, splitPlotReplicateDesign, contrasts.arg = blockedContrastsList)
    }
    levelvector = c(sapply(lapply(splitplotdesign, unique), length), sapply(lapply(candidateset, unique), length))
    classvector = c(sapply(lapply(splitplotdesign, unique), class) == "factor", sapply(lapply(candidateset, unique), class) == "factor")
    if (length(interactionlist) == 0) {
      blockedFactors = c(colnames(blockedmodelmatrix), colnames(candidatesetmm)[-1])
      blockedmm = gen_momentsmatrix(blockedFactors, levelvector, classvector)
    } else {
      blocked_interactions = colnames(blockedmodelmatrix)[grepl(":",colnames(blockedmodelmatrix),fixed=TRUE)]
      potential_blocked_interactions = c()
      for(i in seq_len(length(blocked_interactions))) {
        potential_blocked_interactions = c(potential_blocked_interactions,
                                           potential_permuted_factors(unlist(strsplit(blocked_interactions[[i]],":"))))
      }
      interactionnames = interactionnames[!(interactionnames %in% potential_blocked_interactions)]
      blockedFactors = c(colnames(blockedmodelmatrix), colnames(candidatesetmm)[-1], interactionnames)
      blockedmm = gen_momentsmatrix(blockedFactors, levelvector, classvector)
    }
    disallowedcombdf = disallowed_combinations(fullcandidatesetnorm)
    if (nrow(disallowedcombdf) > 0) {
      for(i in seq_len(ncol(disallowedcombdf))) {
        column_name = names(disallowedcombdf)[i]
        if(is.factor(disallowedcombdf[,i]) & !is.null(splitPlotReplicateDesign[[column_name]])) {
          disallowedcombdf[,i] = factor(disallowedcombdf[,i], levels = levels(splitPlotReplicateDesign[[column_name]]))
        }
      }
      anydisallowed = TRUE
      attr(disallowedcombdf, "contrasts") = attr(blockedmodelmatrix,"contrast")
      disallowedcomb = suppressWarnings(model.matrix(model, disallowedcombdf, contrasts.arg = fullcontrastlist))
      if ("(Intercept)" %in% colnames(disallowedcomb)) {
        disallowedcomb = disallowedcomb[, c(colnames(blockedmodelmatrix), colnames(candidatesetmm)[-1])]
      } else {
        disallowedcomb = disallowedcomb[, c(colnames(blockedmodelmatrix), colnames(candidatesetmm))]
      }
    } else {
      anydisallowed = FALSE
      disallowedcomb = matrix()
    }

    #Finished setting up split-plot inputs
    if (!parallel) {
      if(timer) {
        pb = progress::progress_bar$new(format = "  Searching [:bar] :percent ETA: :eta",
                                        total = repeats, clear = TRUE, width= 60)
      }
      for (i in seq_len(repeats)) {
        if (!is.null(progressBarUpdater)) {
          progressBarUpdater(1 / repeats)
        }
        if(timer) {
          pb$tick()
        }
        randomindices = sample(nrow(candidateset), trials, replace = initialreplace)
        genOutput[[i]] = genSplitPlotOptimalDesign(initialdesign = candidatesetmm[randomindices, -1, drop = FALSE],
                                                 candidatelist = candidatesetmm[, -1, drop = FALSE], blockeddesign = blockedmodelmatrix,
                                                 condition = optimality, momentsmatrix = blockedmm, initialRows = randomindices,
                                                 blockedVar = V, aliasdesign = aliasmm[randomindices, -1, drop = FALSE],
                                                 aliascandidatelist = aliasmm[, -1, drop = FALSE], minDopt = minDopt, interactions = interactionlist,
                                                 disallowed = disallowedcomb, anydisallowed = anydisallowed, tolerance = tolerance, kexchange = kexchange)
      }
    } else {
      if (is.null(options("cores")[[1]])) {
        numbercores = parallel::detectCores()
      } else {
        numbercores = options("cores")[[1]]
      }
      if(timer) {
        pb = progress::progress_bar$new(format = "  Searching [:bar] :percent ETA: :eta",
                                        total = repeats, clear = TRUE, width= 60)
      }
      cl = parallel::makeCluster(numbercores)
      numbercores = length(cl)
      tryCatch({
        doParallel::registerDoParallel(cl)
        number_updates = max(c(min(c(repeats/(2*numbercores),100)),1))
        parallel_output = list()
        single_batch_number = repeats/number_updates
        total_remaining = repeats
        counter = 1
        while(total_remaining > 0) {
          if(total_remaining < single_batch_number) {
            single_batch_number = total_remaining
          }
          parallel_output[[counter]] = foreach(i = seq_len(repeats), .export = c("genSplitPlotOptimalDesign")) %dorng% {

            randomindices = sample(nrow(candidateset), trials, replace = initialreplace)
            genSplitPlotOptimalDesign(initialdesign = candidatesetmm[randomindices, -1, drop = FALSE],
                                    candidatelist = candidatesetmm[, -1, drop = FALSE], blockeddesign = blockedmodelmatrix,
                                    condition = optimality, momentsmatrix = blockedmm, initialRows = randomindices,
                                    blockedVar = V, aliasdesign = aliasmm[randomindices, -1, drop = FALSE],
                                    aliascandidatelist = aliasmm[, -1, drop = FALSE], minDopt = minDopt, interactions = interactionlist,
                                    disallowed = disallowedcomb, anydisallowed = anydisallowed, tolerance = tolerance, kexchange = kexchange)
          }
          total_remaining = total_remaining - single_batch_number
          counter = counter + 1
          if(timer) {
            pb$tick(single_batch_number)
          }
          if (!is.null(progressBarUpdater)) {
            progressBarUpdater(single_batch_number / repeats)
          }
        }
      }, finally = {
        tryCatch({
          parallel::stopCluster(cl)
        }, error = function (e) {})
      })
      genOutput = unlist(parallel_output, recursive  = FALSE)

    }
  }

  designs = list()
  rowindicies = list()
  criteria = list()
  designcounter = 1

  for (i in seq_len(repeats)) {
    if (!is.na(genOutput[[i]]["criterion"])) {
      designs[designcounter] = genOutput[[i]]["modelmatrix"]
      rowindicies[designcounter] = genOutput[[i]]["indices"]
      criteria[designcounter] = genOutput[[i]]["criterion"]
      designcounter = designcounter + 1
    }
  }

  if (length(designs) == 0) {
    stop(paste0("skpr: For a design with ", trials, " trials and ",
                ncol(candidatesetmm) + ifelse(splitplot, ncol(blockedmodelmatrix) - 1 + length(interactionlist), 0),
                " parameters, skpr was not able to find non-singular design within given number of repeats. ",
                "Increase repeats argument and try again. If still no designs are found, reduce the number ",
                "of model parameters or increase the number of trials. This failure to find a design could ",
                "also be the result of disallowed combinations preventing a non-singular design from existing, given the model."))
  }

  if (!is.null(advancedoptions$alias_tie_tolerance) && advancedoptions$alias_tie_tolerance != 0) {
    if (optimality != "D") {
      warning("alias_tie_tolerance only a percentage for D-optimal--otherwise, number refers to raw criteria value. Use at own discretion.")
    }
  }

  if (optimality == "D" || optimality == "T" || optimality == "E" || optimality == "CUSTOM") {
    maxcriteria = max(unlist(criteria), na.rm = TRUE)
    if (is.null(advancedoptions$alias_tie_tolerance) || advancedoptions$alias_tie_tolerance == 0) {
      bestvec = which(unlist(lapply(criteria, (function(x) isTRUE(all.equal(x, maxcriteria))))))
    } else {
      bestvec = which(abs(maxcriteria - unlist(criteria)) < advancedoptions$alias_tie_tolerance)
    }

    if (length(bestvec) > 1 && ncol(candidateset) > 1 && advancedoptions$alias_compare) {
      aliasvalues = list()
      for (i in bestvec) {
        rowindextemp = round(rowindicies[[i]])
        rowindextemp[rowindextemp == 0] = 1
        if (!is.null(augmentdesign)) {
          rowindextemp[seq_len(nrow(augmentdesign))] = 1
        }
        if (!is.null(splitplotdesign)) {
          if (is.null(advancedoptions$alias_tie_power)) {
            amodel2 = aliasmodel(model, 2)
          } else {
            amodel2 = aliasmodel(model, advancedoptions$alias_tie_power)
          }
          suppressWarnings({
            aliasmatrix = model.matrix(amodel2, cbind(splitPlotReplicateDesign, constructRunMatrix(rowindextemp, candidatesetnormalized)), contrasts.arg = fullcontrastlist)[, -1, drop = FALSE]
          })
        } else {
          suppressWarnings({
            aliasmatrix = model.matrix(amodel, constructRunMatrix(rowindextemp, candidatesetnormalized, augmentdesign), contrasts.arg = contrastslist)[, -1, drop = FALSE]
          })
        }
        #Must test for NaN (can't solve some alias matrices for certain models)
        alias_val = calcAliasTrace(designs[[i]], aliasmatrix)
        if(!is.na(alias_val)) {
          aliasvalues[[i]] = alias_val
        }
      }
      #If no best values found, just return first design
      if(length(aliasvalues) > 0) {
        best = bestvec[which.min(unlist(aliasvalues))]
      } else {
        best = bestvec[1]
      }
    } else {
      best = which.max(criteria)
    }
    designmm = designs[[best]]
    rowindex = round(rowindicies[[best]])
  }

  if (optimality == "A" || optimality == "I" || optimality == "ALIAS" || optimality == "G") {
    criteria = criteria[criteria > 0]
    if (length(criteria) == 0) {
      stop("skpr: No non-singular designs found--increase number of repeats.")
    }
    mincriteria = min(unlist(criteria), na.rm = TRUE)
    if (is.null(advancedoptions$alias_tie_tolerance) || advancedoptions$alias_tie_tolerance == 0) {
      bestvec = which(unlist(lapply(criteria, (function(x) isTRUE(all.equal(x, mincriteria))))))
    } else {
      bestvec = which(abs(mincriteria - unlist(criteria)) < advancedoptions$alias_tie_tolerance)
    }
    if (length(bestvec) > 1 && ncol(candidateset) > 1 && advancedoptions$alias_compare) {
      aliasvalues = list()
      for (i in bestvec) {
        rowindextemp = round(rowindicies[[i]])
        rowindextemp[rowindextemp == 0] = 1
        if (!is.null(augmentdesign)) {
          rowindextemp[seq_len(nrow(augmentdesign))] = 1
        }
        if (!is.null(splitplotdesign)) {
          if (is.null(advancedoptions$alias_tie_power)) {
            amodel2 = aliasmodel(model, 2)
          } else {
            amodel2 = aliasmodel(model, advancedoptions$alias_tie_power)
          }
          suppressWarnings({
            aliasmatrix = model.matrix(amodel2, cbind(splitPlotReplicateDesign, constructRunMatrix(rowindextemp, candidatesetnormalized)), contrasts.arg = fullcontrastlist)[, -1, drop = FALSE]
          })
        } else {
          suppressWarnings({
            aliasmatrix = model.matrix(amodel, constructRunMatrix(rowindextemp, candidatesetnormalized, augmentdesign), contrasts.arg = contrastslist)[, -1, drop = FALSE]
          })
        }
        #Must test for NaN (can't solve some alias matrices for certain models)
        alias_val = calcAliasTrace(designs[[i]], aliasmatrix)
        if(!is.na(alias_val)) {
          aliasvalues[[i]] = alias_val
        }
      }
      #If no best values found, just return first design
      if(length(aliasvalues) > 0) {
        best = bestvec[which.min(unlist(aliasvalues))]
      } else {
        best = bestvec[1]
      }
    } else {
      best = which.min(criteria)
    }
    designmm = designs[[best]]
    rowindex = round(rowindicies[[best]])
  }

  rowindex[rowindex == 0] = 1
  if (!is.null(augmentdesign)) {
    rowindex[seq_len(nrow(augmentdesign))] = 1
  }

  if (!splitplot) {
    colnames(designmm) = factors
  } else {
    colnames(designmm) = blockedFactors
  }

  design = constructRunMatrix(rowIndices = rowindex, candidateList = candidateset, augment = augmentdesign)
  design_normalized = constructRunMatrix(rowIndices = rowindex, candidateList = candidatesetnormalized, augment = augmentdesign)

  if (splitplot) {
    design = cbind(splitPlotReplicateDesign, design)
  }

  if(blocking) {
    if(!is.null(blocksizes)) {
      if(is.list(blocksizes)) {
        for(j in rev(seq_len(length(blocksizes)))) {
          block_col_indicator = c()
          for(i in seq_len(length(blocksizes[[j]]))) {
            block_col_indicator = c(block_col_indicator, rep(i, blocksizes[[j]][i]))
          }
          block_col_df = data.frame(Block1=block_col_indicator)
          allattr = attributes(design)
          design = cbind(block_col_df, design)
          attributes(design) = allattr
          colnames(design) = c(paste0("Block", j),colnames(design)[-1])
        }
        if(!add_blocking_columns) {
          design = convert_blockcolumn_rownames(design, TRUE, varianceratio, FALSE)
        }
      } else {
        block_col_indicator = c()
        for(i in seq_len(length(blocksizes))) {
          block_col_indicator = c(block_col_indicator, rep(i, blocksizes[i]))
        }
        block_col_df = data.frame(Block1=block_col_indicator)
        design = cbind(block_col_df, design)
        if(!add_blocking_columns) {
          design = convert_blockcolumn_rownames(design, TRUE, varianceratio, FALSE)
        }
      }
    }
    attr(design, "blocking") = TRUE
  } else {
    attr(design, "blocking") = FALSE
  }
  attr(design, "D") = 100 * DOptimalityLog(designmm)
  attr(design, "A") = tryCatch({AOptimality(designmm)}, error = function(e) {})
  attr(design, "model.matrix") = designmm
  attr(design, "generating.model") = model
  attr(design, "generating.criterion") = optimality
  attr(design, "generating.contrast") = contrast
  attr(design, "contrastslist") = contrastslist

  if (!splitplot) {
    if(blocking) {
      if(add_blocking_columns) {
        rownames(design) = seq_len(nrow(design))
        blockcolnames = colnames(design)[grepl("(Block|block)(\\s?)+[0-9]+$", colnames(design), perl = TRUE) |
                                           grepl("(Whole Plots|Subplots)", colnames(design), perl = TRUE)]
        attr(design, "splitcolumns") = blockcolnames
      }
    } else {
      rownames(design) = seq_len(nrow(design))
    }
    colnames(mm) = colnames(designmm)
    rownames(mm) = colnames(designmm)
    attr(design, "moments.matrix") = mm
    attr(design, "varianceratios") = varianceratio
  } else {
    rownames(design) = rownames(splitPlotReplicateDesign)
    colnames(blockedmm) = colnames(designmm)
    rownames(blockedmm) = colnames(designmm)
    attr(design, "moments.matrix") = blockedmm
    attr(design, "V") = V
    attr(design, "varianceratios") = varianceRatios
    attr(design, "z.matrix.list") = zlist
    finallist = list()
    counterfinallist = 1
    for (row in seq_len(nrow(splitplotdesign))) {
      for (size in seq_len(blocksizes[row])) {
        finallist[[counterfinallist]] = splitplotdesign[row, ]
        counterfinallist = counterfinallist + 1
      }
    }
    finalspddesign = do.call("rbind", finallist)
    for (col in seq_len(ncol(finalspddesign))) {
      if (is.numeric(finalspddesign[, col])) {
        design[, col] = finalspddesign[, col]
      }
    }
  }

  tryCatch({
    if (ncol(designmm) > 2) {
      row_is_zero = apply(t(designmm) %*% designmm,1,(function(x) all(x == 0)))
      if(any(row_is_zero)) {
        terms_zero = paste0(sprintf("`%s`",names(row_is_zero)[row_is_zero]),collapse = " ")
        warning(sprintf("Covariance matrix has singularities due to term(s) %s: adding small offset (1e-10) along diagonal of information matrix to compute correlation matrix",
                        terms_zero))
        correlation.matrix = abs(cov2cor((t(designmm) %*% designmm) + diag(dim(designmm)[2])*1e-10)[-1, -1])
      } else {
        correlation.matrix = abs(cov2cor(t(designmm) %*% designmm)[-1, -1])
      }
      colnames(correlation.matrix) = colnames(designmm)[-1]
      rownames(correlation.matrix) = colnames(designmm)[-1]
      attr(design, "correlation.matrix") = round(correlation.matrix, 8)
      if (amodel != model) {
        aliasmatrix = suppressWarnings({
          model.matrix(aliasmodel(model, aliaspower), design_normalized, contrasts.arg = contrastslist)[, -1]
        })
        A = solve(t(designmm) %*% designmm) %*% t(designmm) %*% aliasmatrix
        attr(design, "alias.matrix") = A
        attr(design, "trA") = sum(diag(t(A) %*% A))
      } else {
        attr(design, "alias.matrix") = "No alias matrix calculated: full model specified"
        attr(design, "trA") = "No alias trace calculated: full model specified"
      }
    }
  }, error = function(e) {})

  if (!splitplot && !blocking) {
    tryCatch({
      if(advancedoptions$g_efficiency_method == "none") {
        attr(design, "G") = "Not Computed"
      } else if(advancedoptions$g_efficiency_method != "custom") {
        attr(design, "G") = calculate_gefficiency(design, calculation_type = advancedoptions$g_efficiency_method,
                                                  randsearches = advancedoptions$g_efficiency_samples)
      } else {
        attr(design, "G") = calculate_gefficiency(design, calculation_type = advancedoptions$g_efficiency_method,
                                                  design_space_mm = advancedoptions$g_efficiency_samples)
      }
      attr(design, "T") = sum(diag(t(designmm) %*% designmm))
      attr(design, "E") = min(unlist(eigen(t(designmm) %*% designmm)["values"]))
      attr(design, "variance.matrix") = diag(nrow(designmm)) * varianceratio
      attr(design, "I") = IOptimality(as.matrix(designmm), momentsMatrix = mm, blockedVar = diag(nrow(designmm)))
    }, error = function(e) {
      if(is.null(attr(design, "G"))) attr(design, "G") = NA
      if(is.null(attr(design, "T"))) attr(design, "T") = NA
      if(is.null(attr(design, "E"))) attr(design, "E") = NA
      if(is.null(attr(design, "variance.matrix"))) attr(design, "variance.matrix") = NA
      if(is.null(attr(design, "I"))) attr(design, "I") = NA
    })
  } else if (splitplot) {
    tryCatch({
      attr(design, "variance.matrix") = V
      vinv = solve(V)
      attr(design, "G") = 100 * (ncol(designmm)) / (nrow(designmm) * max(diag(designmm %*% solve(t(designmm) %*% vinv %*% designmm) %*% t(designmm) %*% vinv)))
      attr(design, "I") = IOptimality(as.matrix(designmm), momentsMatrix = blockedmm, blockedVar = V)
    }, error = function(e) {})
  } else if (blocking) {
    tryCatch({
      attr(design, "variance.matrix") = V
      vinv = solve(V)
      attr(design, "G") = 100 * (ncol(designmm)) / (nrow(designmm) * max(diag(designmm %*% solve(t(designmm) %*% vinv %*% designmm) %*% t(designmm) %*% vinv)))
      attr(design, "I") = IOptimality(as.matrix(designmm), momentsMatrix = mm, blockedVar = V)
    }, error = function(e) {})
  }

  #Re-order factors so levels with the lowest number of factors come first
  for (i in seq_len(ncol(design))) {
    if (!is.numeric(design[[i]])) {
      if(inherits(design[[i]],"factor")) {
        design[i] = factor(design[[i]], levels = levels(design[[i]])[order(table(design[[i]]))])
      } else {
        design[i] = factor(design[[i]], levels = unique(design[[i]])[order(table(design[[i]]))])
      }
    }
  }

  if (optimality == "D") {
    if (splitplot || blocking) {
      attr(design, "optimalsearchvalues") = unlist(criteria)
    } else {
      attr(design, "optimalsearchvalues") = 100 * unlist(criteria)
    }
  }
  if (optimality == "A") {
    if (splitplot || blocking) {
      attr(design, "optimalsearchvalues") = unlist(criteria)
    } else {
      attr(design, "optimalsearchvalues") = 100 / (nrow(designmm) * unlist(criteria) / ncol(designmm))
    }
  }
  if (optimality == "G") {
    attr(design, "optimalsearchvalues") = 100 * (ncol(designmm)) / (nrow(designmm) * unlist(criteria))
  }
  if (optimality %in% c("ALIAS", "I", "E", "T","CUSTOM")) {
    attr(design, "optimalsearchvalues") = unlist(criteria)
  }
  attr(design, "bestiterations") = best

  if(splitplot) {
    attr(design, "splitanalyzable") = FALSE
    attr(design, "splitplot") = TRUE
  } else {
    attr(design, "splitplot") = FALSE
  }

  if(blocking || !is.null(augmentdesign)) {
    attr(design, "blocking") = TRUE
  } else {
    attr(design, "blocking") = FALSE
  }

  #Add split plot columns if splitanalyzable is TRUE
  if (splitplot) {
    finalrownames = rownames(design)
    blocklist = strsplit(finalrownames, ".", fixed = TRUE)
    existingblockstructure = do.call(rbind, blocklist)
    blockgroups = apply(existingblockstructure, 2, blockingstructure)
    blocklengths = lapply(blockgroups, length)
    blockcols = list()
    blocknames = paste0("Block", seq_len(ncol(existingblockstructure) - 1))
    attr(design, "splitcolumns") = blocknames
    if (add_blocking_columns) {
      #Save attributes
      allattr = attributes(design)

      for (level in seq_len(length(blockgroups))) {
        blockcols[[level]] = unlist(lapply(list(blockgroups[[level]]), (function(x) rep(seq_len(blocklengths[[level]]), x))))
      }
      blockcolumns = do.call(cbind, blockcols)
      blockcolumns = blockcolumns[, -ncol(blockcolumns), drop = FALSE]
      for (col in rev(seq_len(ncol(blockcolumns)))) {
        design[blocknames[col]] = blockcolumns[, col]
        design = design[, c(ncol(design), seq_len(ncol(design) - 1))]
      }
      attributes(design) = allattr
      attr(design, "names") = c(paste0("Block", seq_len(ncol(blockcolumns))), allattr$names)
      attr(design, "splitanalyzable") = TRUE
    }
  }
  #Add block cols for augmented designs
  if (!randomized) {
    if(is.null(augmentdesign)) {
      allattr = attributes(design)
      design_order = do.call(order, design)
      design = design[design_order, , drop = FALSE]
      allattr$model.matrix = allattr$model.matrix[design_order,]
      attributes(design) = allattr
    } else {
      allattr = attributes(design)
      noaugmentdesign = design[-seq_len(nrow(augmentdesign)), , drop = FALSE]
      design_order_augment = do.call(order, noaugmentdesign)
      noaugmentdesign = noaugmentdesign[design_order_augment, , drop = FALSE]
      noaugmentmm = allattr$model.matrix[-seq_len(nrow(augmentdesign)), , drop = FALSE]
      noaugmentmm = noaugmentmm[design_order_augment, , drop = FALSE]
      rownames(noaugmentmm) = (nrow(augmentdesign)+1):nrow(design)
      allattr$model.matrix = rbind(augmentdesignmm,noaugmentmm)
      design = rbind(augmentdesign, noaugmentdesign)
      attributes(design) = allattr
    }
  }
  if(!is.null(augmentdesign)) {
    attr(design,"augmented") = TRUE
    designnames = colnames(design)
    allattr = attributes(design)
    augment_block_col = data.frame(Block1 = rep(max(augmented_cols_prev) + 1,nrow(design)))
    augment_block_col[seq_len(nrow(augmentdesign)),] = augmented_cols_prev
    design = cbind(augment_block_col, design)
    attributes(design) = allattr
    colnames(design) = c("Block1",designnames)
    attr(design,"varianceratios") = varianceratio
  } else {
    attr(design,"augmented") = FALSE
  }
  return(design)
}
globalVariables("i")

Try the skpr package in your browser

Any scripts or data that you put into this service are public.

skpr documentation built on July 9, 2023, 7:23 p.m.