R/R6designspace.R

#' A GLMM Design Space
#' 
#' A class-based representation of a "design space" that contains one or more \link[glmmrBase]{Model} objects.
#' @details 
#' An experimental study comprises a collection of experimental conditions, each consisting of one or more observations made at pre-specified locations/values
#' of covariates. A design space represents the collection of all possible experimental conditions for the study design, along with plausible models describing
#' the data generating process. The main purpose of this class is to identify optimal study designs, that is, the set of `n` experimental conditions 
#' from all possible experimental conditions that minimises the variance of a parameter of interest across the specified GLMMs.
#' 
#' A `DesignSpace` object is initialised using one or more \link[glmmrBase]{Model} objects. Design objects can be added or removed from the collection. 
#' All designs must have the same number of rows in their design matrices (X and Z) and the same number of experimental conditions. 
#' The DesignSpace functions can modify the linked design objects.
#' @return An environment that is a `DesignSpace` class object
DesignSpace <- R6::R6Class("DesignSpace",
                 public = list(
                   #' @field weights A vector denoting the prior weighting of each Design in the design space. Required if robust optimisation is used based on a 
                   #' weighted average variance over the linked designs. If it is not specified in the call to `new()` then designs are assumed
                   #' to have equal weighting.
                   weights = NULL,
                   #' @field experimental_condition A vector indicating the unique identifier of the experimental condition for each observation/row in the matrices X and Z.
                   experimental_condition = NULL,
                   #' @description 
                   #' Create a new Design Space
                   #' 
                   #' Creates a new design space from one or more glmmr designs.
                   #' @details 
                   #' **Initialisation**
                   #' The experimental condition refers to the smallest "unit" of the study design that could be included in the design. For example, in a
                   #' cluster randomised trial, the experimental condition may be single individuals, such that we can observe any number of individuals 
                   #' in any cluster period (including none at all); in this case the experimental condition would be equivalent to row number. Alternatively,
                   #' we may have to observe whole cluster periods, and choose which cluster periods to observe, in which case each observation 
                   #' within the same cluster-period would have the same experimental condition identifier. Finally, we may determine that the whole cluster in 
                   #' all periods (a "sequence") is either observed or not.
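                   #' 
                   #' As an illustrative sketch only (using a hypothetical data frame built here with `expand.grid`, not an object created by this package),
                   #' identifiers at each of these granularities could be constructed as:
                   #' ```r
                   #' df <- expand.grid(cl = 1:3, t = 1:2, ind = 1:2)
                   #' ec_row     <- seq_len(nrow(df))             # every observation is its own condition
                   #' ec_clt     <- paste(df$cl, df$t, sep = "_") # whole cluster-periods
                   #' ec_cluster <- df$cl                         # whole clusters ("sequences")
                   #' ```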
                   #' @param ... One or more glmmrBase \link[glmmrBase]{Model} objects. The designs must have an equal number of observations.
                   #' @param weights Optional. A numeric vector of values between 0 and 1 indicating the prior weights to assign to each of the designs. The weights
                   #' are required for optimisation if a weighted average variance is used across the designs. If not specified then designs are assumed 
                   #' to have equal weighting.
                   #' @param experimental_condition Optional. A vector of the same length as the number of observations in each design indicating the unique
                   #' identifier of the experimental condition that observation belongs to, see Details. If not provided, then it is assumed that all observations
                   #' are separate experimental conditions.
                   #' @return A `DesignSpace` object
                   #' @examples
                   #' \dontshow{
                   #' glmmrBase::setParallel(FALSE) # for the CRAN check
                   #' setParallelOptim(FALSE)
                   #' }
                   #' df <- nelder(~ ((int(2)*t(3)) > cl(3)) > ind(5))
                   #' df$int <- df$int - 1
                   #' des <- Model$new(formula = ~ int + factor(t) - 1 + (1|gr(cl)) + (1|gr(cl,t)),
                   #'                  covariance = c(0.04,0.01),
                   #'                  mean = rep(0,4),
                   #'                  data=df,
                   #'                  family=gaussian())
                   #' ds <- DesignSpace$new(des)
                   #' #add another design
                   #' des2 <- Model$new(formula = ~ int + factor(t) - 1 + (1|gr(cl)) + (1|gr(cl,t)),
                   #'                   covariance = c(0.05,0.8),
                   #'                  mean = rep(0,4),
                   #'                  data=df,
                   #'                  family=gaussian())
                   #' ds$add(des2)
                   #' #report the size of the design
                   #' ds$n()
                   #' #we can access specific designs
                   #' ds$show(2)$n()
                   #' #and then remove it
                   #' ds$remove(2)
                   #' #or we could add them when we construct object
                   #' ds <- DesignSpace$new(des,des2)
                   #' #we can specify weights
                   #' ds <- DesignSpace$new(des,des2,weights=c(0.1,0.9))
                   #' #and add experimental conditions
                   #' ds <- DesignSpace$new(des,des2,experimental_condition = df$cl)                   
                   initialize = function(...,
                                         weights=NULL,
                                         experimental_condition = NULL) {
                     samp.size <- c()
                     i <- 0
                     for (item in list(...)) {
                       i <- i + 1
                       if(!(is(item,"Model")||is(item,"ModelMCML")))stop("All objects must be Model objects")
                       samp.size[i] <- item$n()
                     }
                     #print(samp.size)
                     if(length(samp.size) > 1 && !all(samp.size==samp.size[1]))stop("designs are of different sizes")
                     samp.size <- unique(samp.size)
                     
                     #if the weights are null assign equal weighting
                     if(!is.null(weights)){
                       if(length(weights)!=length(list(...)))stop("weights not same length as designs")
                       self$weights <- weights
                     } else {
                       if(length(list(...))>1)message("weights not provided, assigning equal weighting. weights can be changed manually in self$weights")
                       self$weights <- rep(1/length(list(...)),length(list(...)))
                     }
                     
                     #if the experimental condition is null assign all separate
                     if(!is.null(experimental_condition)){
                       if(length(experimental_condition)!=samp.size)stop("experimental condition not same size as designs")
                       self$experimental_condition <- experimental_condition
                     } else {
                       message("experimental condition not provided, assuming each observation is a separate experimental condition. experimental condition can be changed manually in self$experimental_condition")
                       self$experimental_condition <- 1:samp.size
                     }
                     
                     for (item in list(...)) {
                       self$add(item)
                     }

                   },
                   #' @description 
                   #' Add a design to the design space
                   #' @param x A `Model` object to add to the design space
                   #' @return Nothing
                   #' @examples 
                   #' #See examples for constructing the class
                   add = function(x) {
                     if(length(private$designs)>0 && x$n()!=private$designs[[1]]$n())stop("New design is not same size as designs in this design space.")
                     private$designs <- append(private$designs, list(x))
                     self$weights <- rep(1/length(private$designs),length(private$designs))
                     invisible(self)
                   },
                   #' @description 
                   #' Removes a design from the design space
                   #' @param index Index of the design to remove
                   #' @return Nothing
                   #' @examples 
                   #' #See examples for constructing the class
                   remove = function(index) {
                     if (length(private$designs) == 0) return(NULL)
                     private$designs <- private$designs[-index]
                   },
                   #' @description 
                   #' Print method for the Design Space
                   #' @param ... ignored
                   #' @return Prints to the console all the designs in the design space
                   #' @examples 
                   #' #See examples for constructing the class
                   print = function(...){
                     cat(paste0("Design space with ",self$n()[[1]]," design(s): \n"))
                     for(i in 1:length(private$designs)){
                       cat(paste0("=========================================================\nDESIGN ",i," (weight ",self$weights[i],"):\n"))
                       print(private$designs[[i]])
                     }
                   },
                   #' @description 
                   #' Returns the size of the design space and number of observations
                   #' @examples 
                   #' #See examples for constructing the class
                   n = function(){
                     c("n.designs"=length(private$designs),"n" = private$designs[[1]]$n())
                   },
                   #' @description 
                   #' Approximate c-optimal design of size m
                   #' 
                   #' Algorithms to identify an approximate c-optimal design of size m within the design space.
                   #' @details 
                   #' **Approximate c-Optimal designs**
                   #' The function returns approximate c-optimal design(s) of size m from the design space with N experimental conditions. The objective
                   #' function is
                   #' 
                   #' \deqn{C^TM^{-1}C}
                   #' 
                   #' where M is the information matrix and C is a vector. Typically C will be a vector of zeros with a single 1 in the position of the
                   #' parameter of interest. For example, if the columns of X in the design are an intercept, the treatment indicator, and then time 
                   #' period indicators, the vector C may be `c(0,1,0,0,...)`, such that the objective function is the variance of that parameter. 
                   #' If there are multiple designs in the design space, the C vectors do 
                   #' not have to be the same, as the columns of X in each design might differ. All the algorithms included in this package are described in 
                   #' Watson, Hemming, and Girling (2023) <doi:10.1177/09622802231202379> and Watson (2023) <doi:10.1007/s11222-023-10280-w>.
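                   #' 
                   #' As a small numerical sketch (using an arbitrary 2x2 information matrix for illustration, not one produced by this package),
                   #' the objective can be evaluated directly in base R:
                   #' ```r
                   #' M <- matrix(c(2, 0.5, 0.5, 1), 2, 2)  # example information matrix
                   #' C <- c(0, 1)                           # target the second parameter
                   #' drop(crossprod(C, solve(M, C)))        # C'M^-1 C, the variance of interest
                   #' ```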
                   #' 
                   #' If the experimental conditions are correlated with one another, then one of three combinatorial algorithms can be used, see 
                   #' Watson and Pan, 2022 <doi:10.1007/s11222-023-10280-w>. The algorithms are: (i) local search, which starts from a random design of size m and then
                   #' makes the best swap between an experimental condition in and out of the design until no variance improving swap can be made; 
                   #' (ii) greedy search, which starts from a design of size p << n and then sequentially adds the best experimental condition until 
                   #' it generates a design of size m; (iii) reverse greedy search, which starts from the complete set of N experimental conditions and 
                   #' sequentially removes the worst experimental condition until it generates a design of size m. Note that only the local search has
                   #' provable bounds on the solution.
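                   #' 
                   #' A purely illustrative toy version of the local search (i), for a simple linear model where the criterion is
                   #' \eqn{C^TM^{-1}C} with \eqn{M = X^TX} (this is not the function used internally, which handles the full GLMM information matrix):
                   #' ```r
                   #' set.seed(1)
                   #' X <- cbind(1, rnorm(20), rbinom(20, 1, 0.5))  # toy design matrix
                   #' C <- c(0, 0, 1)                               # parameter of interest
                   #' m <- 8                                        # target design size
                   #' crit <- function(idx) drop(crossprod(C, solve(crossprod(X[idx, ]), C)))
                   #' idx <- sample(nrow(X), m)                     # random starting design
                   #' repeat {
                   #'   best <- tryCatch(crit(idx), error = function(e) Inf)
                   #'   swap <- NULL
                   #'   for (i in idx) for (j in setdiff(seq_len(nrow(X)), idx)) {
                   #'     cand <- c(setdiff(idx, i), j)
                   #'     v <- tryCatch(crit(cand), error = function(e) Inf)
                   #'     if (v < best) { best <- v; swap <- cand }
                   #'   }
                   #'   if (is.null(swap)) break else idx <- swap   # stop when no improving swap
                   #' }
                   #' ```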
                   #'  
                   #' If the experimental conditions are uncorrelated (but there is correlation between observations within the same
                   #' experimental condition) then optionally an alternative algorithm can be used to approximate the optimal design using a second-order 
                   #' cone program (see Sagnol, 2011 <doi:10.1016/j.jspi.2010.11.031> and Holland-Letz et al 2011 <doi:10.1111/j.1467-9868.2010.00757.x>). 
                   #' The approximate algorithm will return weights in [0,1] for each unique experimental condition representing
                   #' the "proportion of effort" to spend on each design condition. There are different ways to translate these weights into integer
                   #' values, which are returned; see \link[glmmrOptim]{apportion}. Use of the approximate optimal design algorithm can be disabled using `use_combin=TRUE`.
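                   #' 
                   #' For example, one simple largest-remainder rounding scheme (shown for illustration only; it is not necessarily the
                   #' scheme that `apportion` implements) is:
                   #' ```r
                   #' w <- c(0.45, 0.30, 0.25)  # hypothetical weights for three unique conditions
                   #' m <- 10                   # total number of experimental conditions required
                   #' n0  <- floor(m * w)       # guaranteed integer allocations
                   #' rem <- m * w - n0         # fractional remainders
                   #' extra <- order(rem, decreasing = TRUE)[seq_len(m - sum(n0))]
                   #' n0[extra] <- n0[extra] + 1
                   #' n0                        # integer allocations summing to m
                   #' ```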
                   #' 
                   #' A weighting algorithm for cases where the observations are correlated is also included. This algorithm determines the 
                   #' GLMM estimation weights that minimise the variance. The algorithm is described in Watson, Hemming, and Girling (2023) <doi:10.1177/09622802231202379>
                   #' along with the other algorithms in this package.
                   #' 
                   #' In some cases the optimal design will not be full rank with respect to the design matrix X of the design space. This will result
                   #' in a non-positive definite information matrix, and an error. The program will indicate which columns of X are likely "empty" in the optimal
                   #' design. The user can then optionally remove these columns in the algorithm using the `rm_cols` argument, which will delete the
                   #' specified columns and linked observations before starting the algorithm. 
                   #' 
                   #' The algorithm will also identify robust optimal designs if there are multiple designs in the design space. 
                   #' There are two options for robust optimisation. Both involve a weighted combination of the value of the function from each design, where the weights are specified 
                   #' by the `weights` field in the design space. The weights represent the prior probability or plausibility of each design. 
                   #' The weighted sum is either a sum of the absolute value of the c-optimal criterion or its log (e.g. see Dette, 1993 <doi:10.1214/aos/1176349149>).
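                   #' 
                   #' As a sketch with hypothetical criterion values (not output from this package):
                   #' ```r
                   #' v <- c(0.12, 0.20)  # c-optimal criterion value from each of two designs
                   #' w <- c(0.5, 0.5)    # prior design weights
                   #' sum(w * v)          # weighted sum of the criterion values
                   #' sum(w * log(v))     # or, the weighted sum of their logs
                   #' ```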
                   #' 
                   #' @param m A positive integer specifying the number of experimental conditions to include.
                   #' @param C Either a vector or a list of vectors of the same length as the number of designs, see Details.
                   #' @param attenuate_pars Logical indicating whether to attenuate the marginal expectation in non-linear models.
                   #' @param V0 Optional. If a Bayesian c-optimality problem then this should be a list of prior covariance matrices for the model parameters
                   #' the same length as the number of designs.
                   #' @param rm_cols Optional. A list of vectors indicating columns of X to remove from each design, see Details.
                   #' @param keep Logical indicating whether to "keep" the optimal design in the linked design objects and remove any experimental
                   #' conditions and columns that are not part of the optimal design. Irreversible, so that these observations will be lost from the 
                   #' linked design objects. Defaults to FALSE.
                   #' @param verbose Logical indicating whether to reported detailed output on the progress of the algorithm. Default is TRUE.
                   #' @param algo A vector of integers indicating the algorithm(s) to use. 1 = local search, 2 = greedy search, 3 = reverse greedy search.
                   #' Declaring `algo = 1` for example will use the local search. Providing a vector such as `c(3,1)` will execute the algorithms in order,
                   #' so this would run a reverse greedy search followed by a local search. Note that many combinations will be redundant. For example, running
                   #' a greedy search after a local search will not have any effect. One can also use a general weights algorithm called the 'girling' algorithm,
                   #' setting `algo="girling"`.
                   #' @param use_combin Logical. If the experimental conditions are uncorrelated and this option is TRUE, then the combinatorial (hill climbing) 
                   #' algorithm will be used; if FALSE, a fast approximate alternative will be used. See Details.
                   #' @param robust_log Logical. If TRUE and there are multiple designs in the design space then the robust criterion will be a sum of the log
                   #' of the c-optimality criterion weighted by the study weights, and if FALSE then it will be a weighted sum of the absolute value.
                   #' @param kr Logical. Whether to use the Kenward-Roger small sample bias corrected variance matrix for the fixed effect parameters. We do not 
                   #' recommend using this as it can produce some strange results and its behaviour is not well understood.
                   #' @param p Optional. Positive integer specifying the size of the starting design for the greedy algorithm.
                   #' @param tol Optional scalar specifying the termination tolerance of the Girling algorithm.
                   #' @return A named list. If using the weighting method then the list contains the optimal experimental weights and a 
                   #' list of exact designs of size `m`, see \link[glmmrOptim]{apportion}. If using a combinatorial algorithm then 
                   #' the list contains the rows in the optimal design, the indexes of the experimental conditions in the optimal design,
                   #' the variance from this design, and the total number of function evaluations. Optionally the linked designs are also modified (see option `keep`).
                   #' @examples
                   #' \dontshow{
                   #' glmmrBase::setParallel(FALSE) # for the CRAN check
                   #' setParallelOptim(FALSE)
                   #' }
                   #' df <- nelder(~(cl(6)*t(5)) > ind(5))
                   #' df$int <- 0
                   #' df[df$t >= df$cl, 'int'] <- 1
                   #' des <- Model$new(
                   #'   formula = ~ factor(t) + int - 1 + (1|gr(cl)),
                   #'   covariance = c(0.05),
                   #'   mean = c(rep(0,5),0.6),
                   #'   data=df,
                   #'   family=gaussian(),
                   #'   var_par = 1
                   #' )
                   #' ds <- DesignSpace$new(des)
                   #' 
                   #' #find the optimal design of size 30 individuals using reverse greedy search
                   #' # change algo=1 for local search, and algo = 2 for greedy search
                   #' opt2 <- ds$optimal(30,C=list(c(rep(0,5),1)),algo=3)
                   #' 
                   #' #let the experimental condition be the cluster
                   #' # these experimental conditions are independent of one another
                   #' ds <- DesignSpace$new(des,experimental_condition = df$cl)
                   #' # now find the optimal 4 clusters to include
                   #' # approximately, finding the weights for each condition
                   #' opt <- ds$optimal(4,C=list(c(rep(0,5),1)))
                   #' # or use the local search algorithm
                   #' opt <- ds$optimal(4,C=list(c(rep(0,5),1)),use_combin = TRUE,algo=1)
                   #' 
                   #' #robust optimisation using two designs
                   #' des2 <- Model$new(
                   #'   formula = ~ factor(t) + int - 1 + (1|gr(cl)*ar1(t)),
                   #'   covariance = c(0.05,0.8),
                   #'   mean = c(rep(0,5),0.6),
                   #'   data=df,
                   #'   family=gaussian(),
                   #'   var_par = 1
                   #' )
                   #' ds <- DesignSpace$new(des,des2)
                   #' #weighted average assuming equal weights using local search
                   #' \donttest{
                   #' opt <- ds$optimal(30,C=list(c(rep(0,5),1),c(rep(0,5),1)),algo=1)
                   #' }
                   optimal = function(m,
                                      C,
                                      attenuate_pars = FALSE,
                                      V0=NULL,
                                      rm_cols=NULL,
                                      keep=FALSE,
                                      verbose=TRUE,
                                      algo = c(1),
                                      use_combin=FALSE,
                                      robust_log = FALSE,
                                      kr = FALSE,
                                      p,
                                      tol = 1e-8){
                     #checks and balances
                     if(keep&verbose)message("linked design objects will be overwritten with the new design")
                     if(!is.null(V0) && length(V0)!=self$n()[[1]])stop("V0 not equal to number of designs")
                     if(is(C,"list")){
                       if(length(C)!=self$n()[[1]])stop("C not equal to number of designs")
                       for(i in 1:length(C)){
                         if(length(C[[i]]) != ncol(private$designs[[i]]$mean$X))stop("C wrong length")
                       }
                     } else if(is(C,"numeric")){
                       if(self$n()[[1]]>1)stop("C not equal to number of designs")
                       if(length(C) != ncol(private$designs[[1]]$mean$X))stop("C wrong length")
                     }
                     
                     if(is(algo,"character") && algo!="girling")stop("Unrecognised algorithm")
                     if(is(algo,"character") && algo=="girling" && self$n()[[1]] > 1)stop("Girling algorithm only available for single design space.")
                     if(is(algo,"numeric") && any(!algo%in%1:3))stop("Combinatorial algorithm(s) must be a combination of 1, 2, and 3")
                     
                     ## update designs if attenuated parameters
                     if(attenuate_pars && packageVersion('glmmrBase') < '0.2.5'){
                       warning("Linear predictor attenuation requires glmmrBase version 0.2.5 or higher.")
                     } else {
                       for(i in 1:self$n()[[1]]){
                         private$designs[[i]]$use_attenuation(attenuate_pars)
                       }
                     }
                     
                     # dispatch to correct algorithm
                     # check if the experimental conditions are correlated or not
                     #loop through each sigma
                     if(verbose)message("Checking experimental condition correlations...")
                     if(length(self$experimental_condition)!=private$designs[[1]]$n())stop("experimental condition not the same length as design")
                     uncorr <- TRUE
                     unique_exp_cond <- unique(self$experimental_condition)
                     for(i in 1:self$n()[[1]]){
                       for(j in unique_exp_cond){
                         if(packageVersion('glmmrBase') < '0.3.0'){
                           S <- private$designs[[i]]$Sigma
                         } else {
                           S <- private$designs[[i]]$Sigma()
                         }
                         uncorr <- all(S[which(self$experimental_condition==j),which(self$experimental_condition!=j)]==0)
                         if(!uncorr)break
                       }
                       if(!uncorr)break
                     }
                     
                     ## need to detect if the experimental conditions are duplicated
                     ## can update this but currently only provides a warning to the user
                     if(uncorr&!use_combin){
                       datahashes <- c()
                       for(j in unique_exp_cond){
                         datalist <- list()
                         for(k in 1:self$n()[[1]]){
                            if(packageVersion('glmmrBase') < '0.3.0'){
                              S <- private$designs[[k]]$Sigma
                              datalist[[k]] <- list(private$designs[[k]]$mean_function$X[self$experimental_condition==j,],
                                                    S[self$experimental_condition==j,self$experimental_condition==j])
                            } else {
                              S <- private$designs[[k]]$Sigma()
                              datalist[[k]] <- list(private$designs[[k]]$mean$X[self$experimental_condition==j,],
                                                    S[self$experimental_condition==j,self$experimental_condition==j])
                            }
                           
                         }
                         datahashes <- c(datahashes, digest::digest(datalist))
                       }
                       
                       if(any(duplicated(datahashes))){
                         unique_hash <- unique(datahashes)
                         n_unique_hash <- length(unique_hash)
                         datahashes <- match(datahashes,unique_hash)
                          message(paste0("Duplicated experimental conditions in the design space: ",n_unique_hash,
                                         " unique experimental conditions, which are uncorrelated. use_combin=FALSE so weights will be calculated for each unique experimental condition; the sum of weights for each condition will be reported below."))
                       }
                     }
                     
                     # make C a list
                     if(!is(C,"list")){
                       C_list <- list()
                       for(i in 1:self$n()[[1]]){
                         C_list[[i]] <- matrix(C,ncol=1)
                       }
                     } else {
                       C_list <- C
                     }
                     
                     # message user about algorithm
                     if(verbose&uncorr&!use_combin)message("Experimental conditions uncorrelated, using second-order cone program")
                     if(verbose&uncorr&use_combin)message("Experimental conditions uncorrelated, but using combinatorial algorithm")
                     if(verbose&!uncorr& is(algo,"numeric"))message("Experimental conditions correlated, using combinatorial search algorithms")
                     if(verbose&!uncorr& is(algo,"character"))message("Experimental conditions correlated, using Girling algorithm")
                     
                     # first is the second order cone program
                     if(uncorr&!use_combin){
                       # this returns the experimental designs to keep
                       if(self$n()[1] > 1)stop("An approximate weighting method is not currently available with a robust criterion. Please use combinatorial algorithms.")
                       idx_out <- private$socp_roptimal(C_list,m)
                       idx_out <- drop(idx_out)
                       if(verbose)cat("Weights for each experimental condition in the optimal design:\n", idx_out)
                       if(any(duplicated(datahashes))){
                         agg_weights <- aggregate(idx_out,list(datahashes),sum)
                         if(verbose)cat("\nSum of weights for unique experimental conditions:\n",agg_weights$x)
                         idx_out <- list(weights = idx_out, unique_weights = agg_weights$x)
                         outlist <- list(weights = idx_out,
                                         designs = apportion(idx_out$unique_weights,m))
                       } else {
                         outlist <- list(weights = idx_out,
                                         designs = apportion(idx_out,m))
                       }
                       
                       return(invisible(outlist))
                     } else {
                       # combinatorial or girling algorithm
                       #initialise from random starting index
                       if(packageVersion('glmmrBase') < '0.3.0'){
                         N <- private$designs[[1]]$mean_function$n()
                       } else {
                         N <- private$designs[[1]]$mean$n()
                       }
                       # set up all the data into the correct format
                       # in future versions, I will refactor this so that it just takes a model pointer and does it all in the C++ class
                       X_list <- private$genXlist()
                       Z_list <- private$genZlist()
                       D_list <- private$genDlist()
                       weights <- self$weights
                       if(!is.null(rm_cols))
                       {
                         if(!is(rm_cols,"list"))stop("rm_cols should be a list")
                         zero_idx <- c()
                         idx_original <- 1:nrow(X_list[[1]])
                         # find all the entries with non-zero values of the given columns in each design
                         for(i in 1:length(rm_cols))
                         {
                           if(!is.null(rm_cols[[i]])){
                             for(j in 1:length(rm_cols[[i]]))
                             {
                               zero_idx <- c(zero_idx,which(X_list[[i]][,rm_cols[[i]][j]]!=0))
                             }
                           }
                         }
                         zero_idx <- sort(unique(zero_idx))
                         expcond <- self$experimental_condition[-zero_idx]
                         uexpcond <- unique(self$experimental_condition[-zero_idx])
                         ncond <- length(uexpcond)
                         idx_original <- idx_original[-zero_idx]
                         if(verbose)message(paste0("removing ",length(zero_idx)," observations"))
                         #update the matrices
                          for(i in 1:length(rm_cols))
                          {
                            X_list[[i]] <- X_list[[i]][-zero_idx,-rm_cols[[i]]]
                            C_list[[i]] <- C_list[[i]][-rm_cols[[i]]]
                            Z_list[[i]] <- Z_list[[i]][-zero_idx,] # drop the removed observations from the random effects design matrix
                          }
                         N <- nrow(X_list[[1]])
                       } else {
                         expcond <- self$experimental_condition
                         uexpcond <- unique(self$experimental_condition)
                       }
                       ncond <- length(uexpcond)
                        # hash the [X Z] rows of each experimental condition to identify
                        # duplicated conditions; only unique conditions need to be enumerated
                        XZ <- cbind(X_list[[1]],Z_list[[1]])
                        XZ.hash <- vapply(unique(expcond),
                                          function(i)digest::digest(XZ[expcond==i,,drop=FALSE]),
                                          character(1))
                        row.hash <- as.numeric(factor(XZ.hash,levels=unique(XZ.hash)))
                        ridx.nodup <- which(!duplicated(row.hash))
                        idx.nodup <- which(expcond %in% unique(expcond)[ridx.nodup])
                        n.uni.obs <- length(idx.nodup)
                       
                       if(is(algo,"character")){
                         if(packageVersion('glmmrBase') < '0.11.0'){
                            # build a model pointer for the Girling algorithm (older glmmrBase API)
                           if(gsub(" ","",private$designs[[1]]$mean$formula) != gsub(" ","",private$designs[[1]]$covariance$formula)){
                             form <- paste0(private$designs[[1]]$mean$formula,"+",private$designs[[1]]$covariance$formula)
                           } else {
                             form <- gsub(" ","",private$designs[[1]]$mean$formula)
                           }
                           
                           modptr <- glmmrBase:::Model__new_w_pars(form,
                                                                   as.matrix(private$designs[[1]]$mean$data[idx.nodup,]),
                                                                   colnames(private$designs[[1]]$mean$data),
                                                                   tolower(private$designs[[1]]$family[[1]]),
                                                                   private$designs[[1]]$family[[2]],
                                                                   private$designs[[1]]$mean$parameters,
                                                                   private$designs[[1]]$covariance$parameters)
                           glmmrBase:::Model__set_var_par(modptr,private$designs[[1]]$var_par)
                           glmmrBase:::Model__update_W(modptr)
                         }
                         
                          totalN <- if(missing(m)) nrow(private$designs[[1]]$mean$X) else m
                         if(packageVersion('glmmrBase') < '0.4.5'){
                           w <- glmmrBase:::girling_algorithm(modptr,
                                                              totalN,
                                                              sigma_sq_ = private$designs[[1]]$var_par,
                                                              C_ = C_list[[1]],
                                                              tol_ = tol)
                          } else if(packageVersion('glmmrBase') < '0.11.0'){
                           w <- glmmrBase:::girling_algorithm(modptr,
                                                              totalN,
                                                              C_ = C_list[[1]],
                                                              tol_ = tol)
                         } else {
                           w <- glmmrBase:::girling_algorithm(private$designs[[1]]$.__enclos_env__$private$ptr,
                                                              totalN,
                                                              C_ = C_list[[1]],
                                                              tol_ = tol)
                         }
                         
                         rtndata <- private$designs[[1]]$mean$data[idx.nodup,]
                         rtndata$weight <- w
                         return(invisible(rtndata))
                       } else {
                         #combinatorial algorithms
                         
                          # prepare matrices: restrict each design to the unique observations and
                          # collect the GLM weights (diagonal of W) for each design
                          w_diag <- matrix(0,ncol=length(X_list),nrow=n.uni.obs)
                          for(i in seq_along(X_list)){
                            X_list[[i]] <- X_list[[i]][idx.nodup,]
                            Z_list[[i]] <- Z_list[[i]][idx.nodup,]
                           if(packageVersion('glmmrBase') < '0.3.0'){
                              message("Support for glmmrBase < 0.3.0 will be removed in a future version of this package.")
                             if(is.null(rm_cols)){
                               w_diag[,i] <- Matrix::diag(private$designs[[i]]$.__enclos_env__$private$W)[idx.nodup]
                             } else {
                               w_diag[,i] <- Matrix::diag(private$designs[[i]]$.__enclos_env__$private$W)[-zero_idx][idx.nodup]
                             }
                           } else if(packageVersion('glmmrBase') < '0.4.2'){
                             if(is.null(rm_cols)){
                               w_diag[,i] <- Matrix::diag(private$designs[[i]]$w_matrix())[idx.nodup]
                             } else {
                               w_diag[,i] <- Matrix::diag(private$designs[[i]]$w_matrix())[-zero_idx][idx.nodup]
                             }
                           } else {
                             if(is.null(rm_cols)){
                               w_diag[,i] <- 1/Matrix::drop(private$designs[[i]]$w_matrix())[idx.nodup]
                             } else {
                               w_diag[,i] <- 1/Matrix::drop(private$designs[[i]]$w_matrix())[-zero_idx][idx.nodup]
                             }
                           }
                         }
                         max_obs <- unname(table(row.hash))
                         expcond.id <- as.numeric(factor(expcond[idx.nodup],levels = unique(expcond[idx.nodup])))
                         if(algo[1]==1){
                           idx_in <- sort(sample(row.hash,m,replace=FALSE))
                         } else if(algo[1]==2){
                            # find a starting design of p experimental conditions whose
                            # information matrix is positive definite
                            ispd <- FALSE
                            while(!ispd){
                              idx_in <- sort(sample(unique(row.hash),p,replace=FALSE))
                              M <- crossprod(X_list[[1]][expcond.id%in%idx_in,])
                              # a successful Cholesky factorisation implies positive definiteness
                              cM <- suppressWarnings(tryCatch(chol(M),error=function(e)NULL))
                              if(!is.null(cM))ispd <- TRUE
                            }
                         } else if(algo[1]==3){
                           idx_in <- row.hash
                         }
                          # V0 holds a prior covariance matrix for each design when using the
                          # Bayesian criterion; otherwise a 1x1 placeholder is substituted
                          bayes <- FALSE
                          if(!is.null(V0)){
                            bayes <- TRUE
                            for(i in seq_along(X_list)){
                              if(dim(V0[[i]])[1] != ncol(X_list[[i]]))stop(paste0("V0 has the wrong dimension for design ",i))
                            }
                          } else {
                            V0 <- list()
                            for(i in seq_along(X_list)){
                              V0[[i]] <- matrix(1)
                            }
                          }
                         dataptr <- CreateOptimData(C_list = C_list, 
                                                    X_list = X_list,
                                                    Z_list = Z_list,
                                                    D_list = D_list,
                                                    w_diag = w_diag,
                                                    V0_list = V0,
                                                    max_obs = max_obs,
                                                    weights = weights, 
                                                    exp_cond = expcond.id)
                         derivptr <- CreateDerivatives()
                         if(kr){
                            message("The Kenward-Roger correction in these algorithms is experimental and may produce unstable or nonsensical results.")
                           for(i in 1:length(private$designs)){
                             is_gaussian <- private$designs[[i]]$family[[1]] == "gaussian"
                             AddDesignDerivatives(dptr_ = derivptr,mptr_ = private$designs[[i]]$covariance$.__enclos_env__$private$ptr, is_gaussian)
                           }
                         }
                         ptr <- CreateOptim(dataptr,
                                            derivptr,
                                            idx_in = idx_in, 
                                            n=m,
                                            nmax = N+10,
                                            robust_log = robust_log,
                                            trace=verbose,
                                            kr = kr,
                                            uncorr=FALSE,
                                            bayes=bayes)
                         
                         out_list <- FindOptimumDesign(dptr_ = ptr,type_ = algo)
                         idx_out <- drop(out_list[["idx_in"]] )
                         idx_out_exp <- sort(idx_out)
                         rows_in <- c()
                          for(i in seq_along(idx_out_exp)){
                            uni.hash <- which(row.hash == idx_out_exp[i])
                            if(length(uni.hash)==1){
                              idx_out_exp[i] <- uni.hash
                            } else {
                              # take the first matching condition not already selected; seq_len(i-1)
                              # returns an empty index set when i == 1, unlike 1:(i-1)
                              idx_out_exp[i] <- uni.hash[!uni.hash %in% idx_out_exp[seq_len(i-1)]][1]
                            }
                            rows_in <- c(rows_in, which(expcond == idx_out_exp[i]))
                          }
                         if(!is.null(rm_cols)){
                           rows_in <- idx_original[rows_in]
                         } 
                          if(keep){
                            for(i in 1:self$n()[[1]]){
                              private$designs[[i]]$subset_rows(rows_in)
                              # use col_idx rather than shadowing base::ncol
                              if(packageVersion('glmmrBase') < '0.3.0'){
                                col_idx <- 1:ncol(private$designs[[i]]$mean_function$X)
                              } else {
                                col_idx <- 1:ncol(private$designs[[i]]$mean$X)
                              }
                              if(!is.null(rm_cols))private$designs[[i]]$subset_cols(col_idx[-rm_cols[[i]]])
                              private$designs[[i]]$check(verbose=FALSE)
                            }
                          }
                         return(invisible(list(rows = sort(rows_in), exp.cond = sort(idx_out_exp), val = out_list$best_val_vec,
                                               func_calls = out_list$func_calls, mat_ops = out_list$mat_ops)))
                       }
                     }
                   },
                    #' @description 
                    #' Returns a linked design
                    #' @param i Index of the design to return
                    #' @return The \link[glmmrBase]{Model} object stored at position `i` in the design space
                    #' @examples 
                    #' #See examples for constructing the class
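                    #' #A minimal sketch, assuming d1 and d2 are previously constructed
                    #' #\link[glmmrBase]{Model} objects (hypothetical here; see the
                    #' #class-level examples for their construction):
                    #' \dontrun{
                    #' ds <- DesignSpace$new(d1,d2)
                    #' ds$show(1) #returns the first linked design, i.e. d1
                    #' }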
                   show = function(i){
                     return(private$designs[[i]])
                   }
                 ),
                 private = list(
                   designs = list(),
                   genXlist = function(){
                     X_list <- list()
                     for(i in 1:self$n()[[1]]){
                       if(packageVersion('glmmrBase') < '0.3.0'){
                         X_list[[i]] <- as.matrix(private$designs[[i]]$mean_function$X)
                       } else {
                         X_list[[i]] <- as.matrix(private$designs[[i]]$mean$X)
                       }
                     }
                     return(X_list)
                   },
                   genSlist = function(){
                     S_list <- list()
                     for(i in 1:self$n()[[1]]){
                       if(packageVersion('glmmrBase') < '0.3.0'){
                         S_list[[i]] <- as.matrix(private$designs[[i]]$Sigma)
                       } else {
                         S_list[[i]] <- as.matrix(private$designs[[i]]$Sigma())
                       }
                     }
                     return(S_list)
                   },
                   genZlist = function(){
                     Z_list <- list()
                     for(i in 1:self$n()[[1]]){
                       Z_list[[i]] <- as.matrix(private$designs[[i]]$covariance$Z)
                     }
                     return(Z_list)
                   },
                   genDlist = function(){
                     D_list <- list()
                     for(i in 1:self$n()[[1]]){
                       D_list[[i]] <- as.matrix(private$designs[[i]]$covariance$D)
                     }
                     return(D_list)
                   },
                    # solves a second-order cone program for the c-optimal weights of
                    # each experimental condition across the robust design space
                    socp_roptimal = function(C,
                                             m){
                     X <- private$genXlist()
                     sig <- private$genSlist()
                     weights <- self$weights
                     exp_cond <- self$experimental_condition
                     
                     n_r <- length(sig)
                     constr <- list()
                     n <- nrow(X[[1]])
                     n_ex <- unique(exp_cond)
                     mu <- CVXR::Variable(length(n_ex))
                     z <- CVXR::Variable(n*n_r)
                     
                     for(i in 1:n_r){
                       constr[[i]] <- t(X[[i]])%*%Matrix::t(Matrix::chol(Matrix::solve(sig[[i]])))%*%z[c(1:n + n*(i-1))] == C[[i]]
                     }
                     
                      for(i in n_ex){
                        # build the constraint directly as a CVXR expression: the weighted
                        # sum of per-design norms for this condition is bounded by mu[i]
                        cons_expr <- weights[1]*CVXR::p_norm(z[which(exp_cond==i)])
                        if(n_r > 1){
                          for(j in 1:(n_r-1)){
                            cons_expr <- cons_expr + weights[j+1]*CVXR::p_norm(z[which(exp_cond==i)+j*n])
                          }
                        }
                        constr[[(length(constr)+1)]] <- cons_expr <= mu[i]
                      }
                     obj <- sum(mu)
                     prob <- CVXR::Problem(CVXR::Minimize(obj),constr)
                     stopifnot(CVXR::is_dcp(prob))
                      res <- CVXR::psolve(prob)
                      # solution values are retrieved from the CVXR result object
                      weights <- res$getValue(mu)
                      # return the normalised weight of each experimental condition
                      weights/sum(weights)
                   }
                 )
)
