R/fullmatch.R

Defines functions fullmatch.matrix fullmatch.numeric fullmatch.default fullmatch setTryRecovery

Documented in fullmatch setTryRecovery

#' (Internal) Sets up option to try recovery in \code{fullmatch}.
#'
#' @return NULL
#' @keywords internal
setTryRecovery <- function() {
  options("fullmatch_try_recovery" = TRUE)
}


#' Optimal full matching
#'
#' Given two groups, such as a treatment and a control group, and a method of
#' creating a treatment-by-control discrepancy matrix indicating desirability and
#' permissibility of potential matches (or optionally an already created such
#' discrepancy matrix), create optimal full matches of members of the groups.
#' Optionally, incorporate restrictions on matched sets' ratios of treatment to
#' control units.
#'
#' If passing an already created discrepancy matrix, finite entries indicate
#' permissible matches, with smaller discrepancies indicating more desirable
#' matches.  The matrix must have row and column names.
#'
#' If it is desirable to create the discrepancies matrix beforehand (for example,
#' if planning on running several different matching schemes), consider using
#' \code{\link{match_on}} to generate the distances. This generic function has
#' several useful methods for handling propensity score models, computing
#' Mahalanobis distances (and other arbitrary distances), and using user supplied
#' functions. These distances can also be combined with those generated by
#' \code{\link{exactMatch}} and \code{\link{caliper}} to create very nuanced
#' matching specifications.
#'
#' The value of \code{tol} can have a substantial effect on computation time;
#' with smaller values, computation takes longer.  Not every tolerance can be
#' met, and how small a tolerance is too small varies with the machine and with
#' the details of the problem.  If \code{fullmatch} can't guarantee that the
#' tolerance is as small as the given value of argument \code{tol}, then
#' matching proceeds but a warning is issued.
#'
#' By default, \code{fullmatch} will attempt, if the given constraints are
#' infeasible, to find a feasible problem using the same constraints.  This
#' will almost surely involve using a more restrictive \code{omit.fraction} or
#' \code{mean.controls}. (This will never automatically omit treatment units.)
#' Note that this does not guarantee that the returned match has the least
#' possible number of omitted subjects, it only gives a match that is feasible
#' within the given constraints. It may often be possible to loosen the
#' \code{omit.fraction} or \code{mean.controls} constraint and still find a
#' feasible match. The auto recovery is controlled by
#' \code{options("fullmatch_try_recovery")}.
#'
#' In full matching problems permitting many-one matches (\code{min.controls}
#' less than 1), the number of controls contributing to matches can exceed
#' what was requested by setting a value of \code{mean.controls} or
#' \code{omit.fraction}.  I.e., in this setting \code{mean.controls} sets
#' the minimum ratio of number of controls to number of treatments placed
#' into matched sets.
#'
#' If the program detects that (what it thinks is) a large problem,
#' a warning is issued. Unless you have an older computer, there's a good
#' chance that you can handle larger problems (at the cost of increased
#' computation time). To check the large problem threshold, use
#' \code{\link{getMaxProblemSize}}; to re-set it, use
#' \code{\link{setMaxProblemSize}}.
#'
#' @param x Any valid input to \code{match_on}. \code{fullmatch} will use
#' \code{x} and any optional arguments to generate a distance before performing
#' the matching.
#'
#' If \code{x} is a numeric vector, there must also be passed a vector \code{z}
#' indicating grouping. Both vectors must be named.
#'
#' Alternatively, a precomputed distance may be entered. A matrix of
#' non-negative discrepancies, each indicating the permissibility and
#' desirability of matching the unit corresponding to its row (a 'treatment') to
#' the unit corresponding to its column (a 'control'); or, better, a distance
#' specification as produced by \code{\link{match_on}}.
#'
#' @param min.controls The minimum ratio of controls to treatments that is to
#' be permitted within a matched set: should be non-negative and finite.  If
#' \code{min.controls} is not a whole number, the reciprocal of a whole number,
#' or zero, then it is rounded \emph{down} to the nearest whole number or
#' reciprocal of a whole number.
#'
#' When matching within subclasses (such as those created by
#' \code{\link{exactMatch}}), \code{min.controls} may be a named numeric vector
#' separately specifying the minimum permissible ratio of controls to treatments
#' for each subclass.  The names of this vector should include names of all
#' subproblems \code{distance}.
#'
#' @param max.controls The maximum ratio of controls to treatments that is
#' to be permitted within a matched set: should be positive and numeric.
#' If \code{max.controls} is not a whole number, the reciprocal of a
#' whole number, or \code{Inf}, then it is rounded \emph{up} to the
#' nearest whole number or reciprocal of a whole number.
#'
#' When matching within subclasses (such as those created by
#' \code{\link{exactMatch}}), \code{max.controls} may be a named numeric vector
#' separately specifying the maximum permissible ratio of controls to treatments
#' in each subclass.
#'
#' @param omit.fraction Optionally, specify what fraction of controls or treated
#' subjects are to be rejected.  If \code{omit.fraction} is a positive fraction
#' less than one, then \code{fullmatch} leaves up to that fraction of the control
#' reservoir unmatched.  If \code{omit.fraction} is a negative number greater
#' than -1, then \code{fullmatch} leaves up to |\code{omit.fraction}| of the
#' treated group unmatched.  Positive values are only accepted if
#' \code{max.controls} >= 1; negative values, only if \code{min.controls} <= 1.
#' If neither \code{omit.fraction} or \code{mean.controls} are specified, then
#' only those treated and control subjects without permissible matches among the
#' control and treated subjects, respectively, are omitted.
#'
#' When matching within subclasses (such as those created by
#' \code{\link{exactMatch}}), \code{omit.fraction} specifies the fraction of
#' controls to be rejected in each subproblem, a parameter that can be made to
#' differ by subclass by setting \code{omit.fraction} equal to a named numeric
#' vector of fractions.
#'
#' At most one of \code{mean.controls} and \code{omit.fraction} can be non-\code{NULL}.
#'
#' @param mean.controls Optionally, specify the average number of controls per
#' treatment to be matched. Must be no less than than \code{min.controls} and no
#' greater than the either \code{max.controls} or the ratio of total number of
#' controls versus total number of treated. Some controls will likely not be
#' matched to ensure meeting this value. If neither \code{omit.fraction} or
#' \code{mean.controls} are specified, then only those treated and control
#' subjects without permissible matches among the control and treated subjects,
#' respectively, are omitted.
#'
#' When matching within subclasses (such as those created by
#' \code{\link{exactMatch}}), \code{mean.controls} specifies the average number of
#' controls per treatment per subproblem, a parameter that can be made to
#' differ by subclass by setting \code{mean.controls} equal to a named numeric
#' vector.
#'
#' At most one of \code{mean.controls} and \code{omit.fraction} can be non-\code{NULL}.
#'
#' @param tol Because of internal rounding, \code{fullmatch} may
#' solve a slightly different matching problem than the one
#' specified, in which the match generated by
#' \code{fullmatch} may not coincide with an optimal solution of
#' the specified problem.  \code{tol} times the number of subjects
#' to be matched specifies the extent to
#' which \code{fullmatch}'s output is permitted to differ from an
#' optimal solution to the original problem, as measured by the
#' sum of discrepancies for all treatments and controls placed
#' into the same matched sets.
#'
#' @param data Optional \code{data.frame} or \code{vector} to use to get order
#' of the final matching factor. If a \code{data.frame}, the \code{rownames}
#' are used. If a vector, the \code{names} are first tried, otherwise the contents
#' is considered to be a character vector of names. Useful to pass if you want to
#' combine a match (using, e.g., \code{cbind}) with the data that were used to
#' generate it (for example, in a propensity score matching).
#'
#' @param solver Choose which solver to use. Currently implemented are RELAX-IV
#'   and LEMON. Default of \code{""}, a blank string, will use RELAX-IV if the
#'   \strong{rrelaxiv} package is installed, otherwise will use LEMON.
#'
#' To explicitly use RELAX-IV, pass string "RELAX-IV".
#'
#' To use LEMON, pass string "LEMON". Optionally, to specify which algorithm
#' LEMON will use, pass the function \link{LEMON} with argument for the
#' algorithm name, "CycleCancelling", "CapacityScaling", "CostScaling", and
#' "NetworkSimplex". See this site for details on their differences:
#' \url{https://lemon.cs.elte.hu/pub/doc/latest/a00606.html}. CycleCancelling is
#' the default.
#'
#' The CycleCancelling algorithm seems to produce results most closely
#' resembling those of optmatch versions prior to 1.0. We have observed the
#' other LEMON algorithms to produce different results when the
#' \code{mean.controls} is unspecified, or specified in such a way as to produce
#' an infeasible matching problem. When using a LEMON algorithm other than
#' CycleCancelling, we recommend setting the \code{fullmatch_try_recovery}
#' option to \code{FALSE}.
#'
#' @param ... Additional arguments, passed to \code{match_on} (e.g. \code{within})
#' or to specific methods.
#'
#' @return A \code{\link{optmatch}} object (\code{factor}) indicating matched groups.
#'
#' @references
#'  Hansen, B.B. and Klopfer, S.O. (2006), \sQuote{ Optimal full matching and related designs via network flows},
#'  \emph{Journal of Computational and Graphical Statistics}, \bold{15}, 609--627.
#'
#'  Hansen, B.B. (2004), \sQuote{Full Matching in an Observational Study
#'  of Coaching for the SAT}, \emph{Journal of the American
#'  Statistical Association}, \bold{99}, 609--618.
#'
#'  Rosenbaum, P. (1991), \sQuote{A Characterization of Optimal Designs for Observational
#'  Studies}, \emph{Journal of the Royal Statistical Society, Series B},
#'  \bold{53}, 597--610.
#'
#' @example inst/examples/fullmatch.R
#' @keywords nonparametric optimize
#' @export
fullmatch <- function(x,
                      min.controls = 0,
                      max.controls = Inf,
                      omit.fraction = NULL,
                      mean.controls = NULL,
                      tol = .001,
                      data = NULL,
                      solver = "",
                      ...) {

  # if x does not exist then print helpful error msg
  x_str <- deparse(substitute(x))
  if (length(x_str)>1) x_str <- paste(x_str, collapse="")
  data_str <- deparse(substitute(data))
  tryCatch(x, error = function(e) {
    stop(missing_x_msg(x_str, data_str, ...))})

  if (is.null(data)) {
    if (is(x, "InfinitySparseMatrix") |
        is(x, "matrix") |
        is(x, "optmatch.dlist") )
    warning("Without 'data' argument the order of the match is not guaranteed
    to be the same as your original data.")
  }
  if (is(x, "optmatch.dlist")) {
    warning("The use of 'optmatch.dlist' objects created by 'mdist()' is deprecated.\nPlease use 'match_on()' instead.")
  }
  UseMethod("fullmatch")
}

#' @export
fullmatch.default <- function(x,
                              min.controls = 0,
                              max.controls = Inf,
                              omit.fraction = NULL,
                              mean.controls = NULL,
                              tol = .001,
                              data = NULL,
                              solver = "",
                              within = NULL,
                              ...) {

  if (!inherits(x, gsub("match_on.","",methods("match_on")))) {
    stop("Invalid input, must be a potential argument to match_on")
  }

  mfd <- if (!is.null(data)) {
    model.frame(data, na.action=na.pass)
  } else {
    if (inherits(x, "function")) {
      stop("A data argument must be given when passing a function")
    }
    model.frame(x, na.action=na.pass)
  }
  if (!is(mfd, "data.frame")) {
    stop("Please pass data argument")
  }
  m <- match_on(x, within=within, data=mfd, ...)
  out <- fullmatch(m,
                   min.controls=min.controls,
                   max.controls=max.controls,
                   omit.fraction=omit.fraction,
                   mean.controls=mean.controls,
                   tol=tol,
                   data=mfd,
                   solver=solver,
                   ...)
  attr(out, "call") <- match.call()
  out
}

#' @export
fullmatch.numeric <- function(x,
                              min.controls = 0,
                              max.controls = Inf,
                              omit.fraction = NULL,
                              mean.controls = NULL,
                              tol = .001,
                              data = NULL,
                              solver = "",
                              z,
                              within = NULL,
                              ...) {

  m <- match_on(x, within=within, z=z, ...)
  out <- fullmatch(m,
                   min.controls=min.controls,
                   max.controls=max.controls,
                   omit.fraction=omit.fraction,
                   mean.controls=mean.controls,
                   tol=tol,
                   data=data,
                   solver=solver,
                   ...)
  attr(out, "call") <- match.call()
  out
}

#' @export
fullmatch.matrix <- function(x,
                             min.controls = 0,
                             max.controls = Inf,
                             omit.fraction = NULL,
                             mean.controls = NULL,
                             tol = .001,
                             data = NULL,
                             solver = "",
                             within = NULL,
                             hint,
                             ...) {

  hint  <- if (missing(hint)) NULL else nodeinfo(hint)

  ### Checking Input ###

  # this will throw an error if not valid
  validDistanceSpecification(x)

  # note: we might want to move these checks to validDistSpec
  dnms <- dimnames(x)
  if (is.null(dnms) | is.null(dnms[[1]]) | is.null(dnms[[2]])) {
    stop("argument \'x\' must have dimnames")
  }

  if (any(duplicated(unlist(dnms)))){
    stop("dimnames of argument \'x\' contain duplicates")
  }

  if (!is.null(within)) warning("Ignoring non-null 'within' argument.  When using 'fullmatch' with\n pre-formed distances, please combine them using '+'.")

  nmtrt <- dnms[[1]]
  nmctl <- dnms[[2]]

  # note: this next _should_ be unnecessary, the objects should do this
  # but better safe than sorry
  if (!isTRUE(all.equal(dim(x), c(length(nmtrt), length(nmctl))))) {
    stop("argument \'x\' dimensions do not match row and column names")
  }

  if (!is.numeric(min.controls)) {
    stop("argument \'min.controls\' must be numeric")
  }
  if (!is.numeric(max.controls)) {
    stop("argument \'max.controls\' must be numeric")
  }
  if (!is.null(omit.fraction)) {
    # A vector of all NA's is logical, not numeric, so the first condition is needed.
    if (all(is.na(omit.fraction))) {
      omit.fraction <- NULL
    } else if (any(abs(omit.fraction) > 1, na.rm = TRUE) | !is.numeric(omit.fraction)) {
      stop("omit.fraction must be NULL or numeric between -1 and 1")
    }
  }
  if (!is.null(mean.controls)) {
    if (all(is.na(mean.controls))) {
      mean.controls <- NULL
    } else if (any(mean.controls <= 0, na.rm = TRUE) | !is.numeric(mean.controls)) {
      stop("mean.controls must be NULL or numeric greater than 0")
    }
  }

  if (!is.null(omit.fraction) & !is.null(mean.controls)) {
    stop("omit.fraction and mean.controls cannot both be specified")
  }

  # Issue #56: Checking for sane input in data
  if (!is.null(data)) {
    if (!is.vector(data)) {
      dnames <- rownames(data)
    } else {
      dnames <- names(data)
    }
    if (any(!unlist(dimnames(x)) %in% dnames)) {
      stop("Some elements of the distance matrix are not found in the data argument.")
    }
  }

  # problems is guaranteed to be a list of DistanceSpecifictions
  # it may only have 1 entry
  problems <- findSubproblems(x)

  # the number of problems should match the argument lengths for
  # min, max, and omit

  np <- length(problems)
  if (np>1 & is.null(names(problems)))
      stop("Subproblems should have names.")
  subproblemids  <- names(problems)
  if (is.null(subproblemids)) subproblemids  <- character(1L)

    if (is.null(hint)) { hints  <- rep(list(NULL), np)
    } else {
        hints  <- split(hint, hint[['groups']],
                        drop=TRUE # drops levels of hint$groups that aren't represented in hint
                        )
        nohint  <- setdiff(subproblemids, names(hints))
        hints  <- hints[match(subproblemids, names(hints), 0L)]
        if (length(hints)>0) for (ii in 1L:length(hints)) hints[[ii]]  <- new("NodeInfo", hints[[ii]])
        if (length(nohint))
        {
            nullhint  <- rep(list(NULL), length(nohint))
            names(nullhint)  <- nohint
            hints  <- c(hints, nullhint)
            if (length(nohint)==np) warning("Hint lacks information about subproblems of this problem; ignoring.")
            }
        hints  <- hints[match(subproblemids, names(hints))]
            }

  if (length(min.controls) > 1 & np != length(min.controls)) {
      if (is.null(names(min.controls)))
          stop("\'min.controls\' longer than 1 should have names (that match names of subproblems/exact matching categories).")
      min.controls  <- min.controls[match(subproblemids, names(min.controls), nomatch=0)]
      if (length(min.controls)<np)
      stop(paste("\'min.controls\' arg of length>1 lacks entries for ",
              np-length(min.controls), " of ", np, " subproblems", sep = ""))
  }
  if (length(max.controls) > 1 & np != length(max.controls)) {
      if (is.null(names(max.controls)))
          stop("\'max.controls\' longer than 1 should have names (that match names of subproblems/exact matching categories).")
      max.controls  <- max.controls[match(subproblemids, names(max.controls), nomatch=0)]
      if (length(max.controls)<np)
      stop(paste("\'max.controls\' arg of length>1 lacks entries for ",
              np-length(max.controls), " of ", np, " subproblems", sep = ""))
  }
  if (!is.null(omit.fraction) & length(omit.fraction) > 1 & np !=
    length(omit.fraction)) {
      if (is.null(names(omit.fraction)))
          stop("\'omit.fraction\' longer than 1 should have names (that match names of subproblems/exact matching categories).")
      omit.fraction  <- omit.fraction[match(subproblemids, names(omit.fraction), nomatch=0)]
      if (length(omit.fraction)<np)
      stop(paste("\'omit.fraction\' arg of length>1 lacks entries for ",
              np-length(omit.fraction), " of ", np, " subproblems", sep = ""))
  }
  if (!is.null(mean.controls) & length(mean.controls) > 1 & np !=
    length(mean.controls)) {
      if (is.null(names(mean.controls)))
          stop("\'mean.controls\' longer than 1 should have names (that match names of subproblems/exact matching categories).")
      mean.controls  <- mean.controls[match(subproblemids, names(mean.controls), nomatch=0)]
      if (length(mean.controls)<np)
      stop(paste("\'mean.controls\' arg of length>1 lacks entries for ",
              np-length(mean.controls), " of ", np, " subproblems", sep = ""))
  }

  # reset the arguments to be the right length if they are not
  if (length(min.controls) == 1) {
    min.controls <- rep(min.controls, np)
  }
  if (length(max.controls) == 1) {
    max.controls <- rep(max.controls, np)
  }

  if (is.null(omit.fraction)) {
    omit.fraction <- NA
  }
  if (length(omit.fraction) == 1) {
    omit.fraction <- rep(omit.fraction, np)
  }

  if (is.null(mean.controls)) {
    mean.controls <- NA
  }
  if (length(mean.controls) == 1) {
    mean.controls <- rep(mean.controls, np)
  }

  if (any(mean.controls + .Machine$double.eps^0.5 < min.controls, na.rm=TRUE)) {
    stop("mean.controls cannot be smaller than min.controls")
  }

  if (any(mean.controls - .Machine$double.eps^0.5> max.controls, na.rm=TRUE)) {
    stop("mean.controls cannot be larger than max.controls")
  }

  if (any(!is.na(mean.controls))) {
    if (any(mean.controls > lapply(problems, function(p) {x <- subdim(p)[[1]] ;  x[2]/x[1]}), na.rm=TRUE)) {
      stop("mean.controls cannot be larger than the ratio of number of controls to treatments")
    }
  }

  if (any(omit.fraction > 0 & max.controls <= .5, na.rm=TRUE)) {
      stop("positive \'omit.fraction\' with \'max.controls\' <= 1/2 not permitted")
  }

  if (any(omit.fraction < 0 & min.controls >= 2, na.rm=TRUE)) {
      stop("negative \'omit.fraction\' with \'min.controls\' >= 2 not permitted")
  }

  # checks solver and evaluates LEMON() if neccessary
  solver <- handleSolver(solver)

  user.input.mean.controls <- FALSE

  if (any(!is.na(mean.controls) & is.na(omit.fraction))) {
    user.input.mean.controls <- TRUE
    omit.fraction <- 1 - mapply(function(x,y) {z <- subdim(y)[[1]] ; x*z[1]/z[2]}, mean.controls, problems)
  }

  total.n <- sum(dim(x))

  TOL <- tol * total.n

  # a helper to handle a single matching problem. all args required.
  # input error checking happens in the public fullmatch function.
  .fullmatch <- function(d, mnctl, mxctl, omf, hint = NULL, solver) {

    # if the subproblem is completely empty, short circuit
    if (length(d) == 0 || all(is.infinite(d))) {
      x <- dim(d)
      cells.a <- rep(NA, x[1])
      cells.b <- rep(NA, x[2])
      names(cells.a) <- rownames(d)
      names(cells.b) <- colnames(d)
      tmp <- list(cells = c(cells.a, cells.b), err = -1)
      return(tmp)
    }

    ncol <- dim(d)[2]
    nrow <- dim(d)[1]

    tol.frac <-
       if (total.n > 2 * np) {
           (nrow + ncol - 2)/(total.n - 2 * np)
                      } else 1

    # if omf is specified (i.e. not NA), see if is non-negative
    # if omf is not specified, check to see if mxctl is > .5
    if (switch(1 + is.na(omf), omf >= 0,  mxctl > .5)) {
      maxc <- min(mxctl, ncol)
      minc <- max(mnctl, 1/nrow)
      omf.calc <- omf
      flipped  <- FALSE
    } else {
      maxc <- min(1/mnctl, ncol)
      minc <- max(1/mxctl, 1/nrow)
      omf.calc <- -1 * omf
      d <- t(d)
      flipped  <- TRUE
    }

    ## (I'd like to do the following higher in the call stack, and also more
    ##  informatively, obviating subsequent needs for max.cpt, min.cpt,
    ##  omit.fraction etc. Keeping it here for fear of mischief with flipped subproblems.)
    if (is.null(hint)) hint  <- nodes_shell_fmatch(rownames(d), colnames(d))

    temp <- solve_reg_fm_prob(node_info = hint,
                        distspec = d,
                        max.cpt = maxc,
                        min.cpt = minc,
                        tolerance = TOL * tol.frac,
                        solver = solver,
                        omit.fraction = if(!is.na(omf)) { omf.calc }# passes NULL for NA
                        )
      if (!is.null(temp$MCFSolution))
          temp$MCFSolution@subproblems[1L,"flipped"]  <- flipped

    return(temp)
  }

  # a second helper function, that will attempt graceful recovery in situations where the match
  # is infeasible with the given max.controls
  .fullmatch.with.recovery <- function(d.r, mnctl.r, mxctl.r, omf.r, hint.r = NULL, solver) {

    # if the subproblem isn't clearly infeasible, try to get a match
    if (mxctl.r * dim(d.r)[1] >= prod(dim(d.r)[2], 1-omf.r, na.rm=TRUE)) {
      tmp <- .fullmatch(d.r, mnctl.r, mxctl.r, omf.r, hint.r, solver)
      if (!all(is.na(tmp[1]$cells))) {
        # subproblem is feasible with given constraints, no need to recover
        new.omit.fraction <<- c(new.omit.fraction, omf.r)
        return(tmp)
      }
    }
    # if max.control is in [1, Inf), and we're infeasible
    if(is.finite(mxctl.r) & mxctl.r >= 1) {
      # Re-solve with no max.control
      tmp2 <- list(.fullmatch(d.r, mnctl.r, Inf, omf.r, hint.r, solver))
      tmp2.optmatch <- makeOptmatch(d.r, tmp2, match.call(), data)
      trial.ss <- stratumStructure(tmp2.optmatch)
      treats <- as.numeric(unlist(lapply(strsplit(names(trial.ss), ":"),"[",1)))
      ctrls <- as.numeric(unlist(lapply(strsplit(names(trial.ss), ":"),"[",2)))
      num.controls <- sum((pmin(ctrls, mxctl.r)*trial.ss)[treats > 0])
      if(num.controls == 0) {
        # infeasible anyways
        if (!exists("tmp")) {
          tmp <- .fullmatch(d.r, mnctl.r, mxctl.r, omf.r, hint.r, solver)
        }
        new.omit.fraction <<- c(new.omit.fraction, omf.r)
        return(tmp)
      }
      new.omf.r <- 1 - num.controls/dim(d.r)[2]

      # feasible with the new omit fraction
      new.omit.fraction <<- c(new.omit.fraction, new.omf.r)
      return(.fullmatch(d.r, mnctl.r, mxctl.r, new.omf.r, hint.r, solver))
    } else {
      # subproblem is infeasible, but we can't try to fix because no max.controls
      if (!exists("tmp")) {
        tmp <- .fullmatch(d.r, mnctl.r, mxctl.r, omf.r, hint.r, solver)
      }

      new.omit.fraction <<- c(new.omit.fraction, omf.r)
      return(tmp)
    }
  }

  # In case we need to try and recover from infeasible, save the new.omit.fraction's used for output to user
  new.omit.fraction <- numeric(0)

  if (is.null(options()$fullmatch_try_recovery)) {
    warning("The flag fullmatch_try_recovery is unset, setting to TRUE")
    setTryRecovery()
  }

  if (options()$fullmatch_try_recovery) {
    solutions <- mapply(.fullmatch.with.recovery, problems, min.controls, max.controls, omit.fraction, hints, solver, SIMPLIFY = FALSE)
  } else {
    solutions <- mapply(.fullmatch, problems, min.controls, max.controls, omit.fraction, hints, solver, SIMPLIFY = FALSE)
  }

  mout <- makeOptmatch(x, solutions, match.call(), data)

  names(min.controls) <- names(problems)
  names(max.controls) <- names(problems)
  attr(mout, "min.controls") <- min.controls
  attr(mout, "max.controls") <- max.controls

  # length(new.omit.fraction) will be strictly positive if we ever entered .fullmatch.with.recovery
  if(length(new.omit.fraction) > 0) {
    out.omit.fraction <- new.omit.fraction
  } else {
    out.omit.fraction <- omit.fraction
  }
  out.mean.controls <- mapply(function(x,y) (1 - x)*y[2]/y[1], out.omit.fraction, subdim(x))

  names(out.mean.controls) <- names(problems)
  names(out.omit.fraction) <- names(problems)

  if(user.input.mean.controls) {
    attr(mout, "mean.controls") <- out.mean.controls
  } else {
    attr(mout, "omit.fraction") <- out.omit.fraction
  }

  if(length(new.omit.fraction) > 0 &
     !identical(new.omit.fraction, omit.fraction) &
     !all(is.na(new.omit.fraction)) &
     getOption("optmatch_verbose_messaging", FALSE)) {
    if(!any(is.na(new.omit.fraction)) & all(new.omit.fraction == 1)) {
      # If we never got a feasible subproblem
      warning("The problem appears infeasible with the given constraints.")
    } else {
      warning("The problem is infeasible with the given constraints; some units were omitted to allow a match.")
    }
  }

  # save hash of distance
  attr(mout, "hashed.distance") <- disthash <- hash_dist(x)

  attr(mout, "call") <- match.call()

    ## assemble MCF material
    mcfsolutions  <- rep(list(NULL), np)
    names(mcfsolutions)  <- subproblemids
    for (ii in 1L:np) {
      if (!is.null(solutions[[ii]]$MCFSolution))
      {
        mcfsolutions[[ii]]  <- solutions[[ii]]$MCFSolution
        mcfsolutions[[ii]]@subproblems[1L,"hashed_dist"]  <- disthash
        thesubprob  <- subproblemids[ii]
        mcfsolutions[[ii]]@subproblems[1L,"groups"]  <- thesubprob
        mcfsolutions[[ii]]@nodes[,"groups"]  <- factor(thesubprob)
        if (nrow(mcfsolutions[[ii]]@arcs@matches) > 0) {
          mcfsolutions[[ii]]@arcs@matches[,"groups"]  <- factor(thesubprob)
        }
        mcfsolutions[[ii]]@arcs@bookkeeping[,"groups"]  <- factor(thesubprob)
        bookkeeping_nodes  <- c('(_Sink_)', '(_End_)')
        for (bn in bookkeeping_nodes) {
          nlabs <- node.labels(mcfsolutions[[ii]])
          nlabs[nlabs==bn]  <- paste0(bn, thesubprob)
          node.labels(mcfsolutions[[ii]])  <-  nlabs
        }
      }
    }
  mcfsolutions  <- mcfsolutions[!vapply(mcfsolutions, is.null, logical(1))]

  attr(mout, "MCFSolutions")  <- if (length(mcfsolutions)==0) {
    NULL
  } else {
    names(mcfsolutions)[1]  <- "x"
    ##b/c in next line `c()` needs to dispatch on an `x` argument
    do.call("c", mcfsolutions)
  }

  # save solver information
  attr(mout, "solver") <- solver

  if (all(subproblemSuccess(mout) == FALSE)) {
    warning(paste("Matching failed.",
                  "(Restrictions impossible to meet?)\n",
                  "Enter ?matchfailed for more info."))
  } else if (any(subproblemSuccess(mout) == FALSE)) {
    warning(paste("At least one subproblem matching failed.\n",
                  "(Restrictions impossible to meet?)\n",
                  "Enter ?matchfailed for more info."))
  }

  return(mout)
}


#' @export
fullmatch.optmatch.dlist <- fullmatch.matrix
#' @export
fullmatch.InfinitySparseMatrix <- fullmatch.matrix
#' @export
fullmatch.BlockedInfinitySparseMatrix <- fullmatch.matrix

#' @aliases fullmatch
#' @rdname fullmatch
#' @export
full <- fullmatch

Try the optmatch package in your browser

Any scripts or data that you put into this service are public.

optmatch documentation built on Sept. 19, 2024, 9:06 a.m.