R/metapred.R
In metamisc: Meta-Analysis of Diagnosis and Prognosis Research Studies

Documented in fitted.metapred gen metapred perf se subset.metapred tau tau2 variances

# TODO 1: Thomas' ideas
# 2019-2-2: It would be nice to store some default performance estimates resulting from IECV in a "mm_perf" object (as as done in
#           'ccalc()' and 'oecalc'). It would then be easy to plot IECV results for these performance measures. 
#           Some suggested measures are: calibration slope, calibration-in-the-large and c-statistic. These performance
#           measures are fairly generic and should work for most glm-type models, including survival ones.
# @Valentijn: perhaps better to combine param perfFUN, genFUN and selFUN into one parameter with 3 or 4 distinct options
# @Valentijn: I suggest to omit intercept recalibration. The intercept issue can be addressed directly by specifying distinct 
# error functions that do or do not account for mis-calibration in intercept term

# TODO 2: metapred objects
# Add subset.metapred functionality for one.stage models.
# perfFUN = "cal.int" does not work for estFUN = "logistf", as logistf cannot take intercept only models
#     possible fix: use logistfirth intstead, and fix intercept-only models for that function.

# Bug when no family is specified for calibration slope:
# mp.slo <- metapred(DVTipd.reordered, strata = "cluster", perfFUN = "cal.slope", max.steps = 0)
# Error in pred.recal(p = p, y = y, estFUN = estFUN, family = family, which = "slope") : 
#   argument "family" is missing, with no default 

# add more output for metapred
# tol, aka tolerance for stopping.
# metapred.summaries docs (see glm.summaries {stats}), including subset
# add stepnumbers to print.metapred or print.fit
# Write tests for fitted.whatever
# add docs for fitted.whatever.
# check what happens for recalibrated mp.cv in fitted.mp.cv
# add rownames to data, to prevent error/mismatch in fitted.metapred/fitted.mp.cv # Maybe already fixed by as.data.frame()
# add the unchanged model to the next step's model list. Model comparison will be easier. Subsetting will be easier: this allows
#   a 'chain' of "changed" predictors to be added, thereby making the output clearer. 
# change "changed" for global models as well.
# add a new stratified.fit class, which contains mp.stratified.fit (or vice versa), such that subset(fit,
# select = "stratified") can work as intended. Right now, mp.stratified.fit  is a list that may only contain
# mp.stratum.fit objects.

# TODO 3: General
# add function: areTRUE. st.i == cl may otherwise lead to problems, if st.i has NA. Replaced with %in%. Now solved?
# variances for intercept recalibration.
# perf method
# penalties
# performance measurement, current is Brier for binomial
# add more options to penalty step
# Write test for new centerCovs function.

# TODO 4: Maybe later
# One-stage MA: predStep: is  type = response correct?
# centering: now only centers numeric variables. Should it also work for dummies of categorical variables?

# TODO 5: DONE:
# recal.int + interaction gives error. FIXED.
# add fitted.metapred(). DONE.
# categorical variables Done!
# Add summary.metapred. DONE.
# Added is.metapred()
# Added more robust tests for stepwise.
# Automatically remove observations with missing values.
# @Valentijn: I would center covariates by default, this often helps to improve generalizability and also speeds estimation
#   Valentijn: Done.
# @Valentijn: You can change default of meta.method to DL. This is a lot faster and has limited implications on estimated means.
#   Valentijn: Done.
# , cl.name in modelStep
# Remove Reduce() from perfStep, and make perfStep() compatible with multiple data sets (for cvFUN = fixed, or cvFUN = bs)
# predict.metapred: allow to generate predictions if y is not specified. 
#                   Currently, an error it thrown if setting dvt=NA in the example. This error traces back to the use 
#                   of model.matrix, which calls the function model.frame() that fails.
# Fixed by removing the outcome from the formula in the predict method.
# If max.steps and scope = formula are supplied, no stopping reason is given. SHould be fixed now (3 april 2019)

# TODO 6:

# predict.metapred: Allow so specify study for generating study-specific predictions 
# (as our IECV approach will yield a study-specific intercept term and regression coefficients)
# If no study is specified in the data frame, then predictions are based on the pooled coefficients.
# Change listwise deletion of missing data: add options to turn on/off, only remove observations with missing data within
# formula / scope.. 
# metapred currently cannot handle -1 in scope I think.

###### Outline
### Top: high-level functions
#
# metapred
# 
# mp.fit
# 
# mp.step + mp.global
# 
# mp.cv
# 
# mp.cv.val
# mp.cv.dev	+ mp.cv.recal
# 
# mp.cv.meta.fit + mp.recal.meta.fit
# 
# mp.meta.fit
#
# mp.stratified.fit
#
# mp.stratum.fit
#
### bottom : low level functions
###### 

#' Generalized Stepwise Regression for Prediction Models in Clustered Data
#'
#' Generalized stepwise regression for obtaining a prediction model that is validated 
#' with (stepwise) internal-external cross-validation, in or to obtain adequate 
#' performance across data sets. Requires data from individuals in multiple studies.
#' 
#' @author Valentijn de Jong <Valentijn.M.T.de.Jong@gmail.com>
#' 
#' @references Debray TPA, Moons KGM, Ahmed I, Koffijberg H, Riley RD. A framework for developing, implementing, 
#' and evaluating clinical prediction models in an individual participant data meta-analysis. 
#' \emph{Stat Med}. 2013;32(18):3158-80. 
#'
#' @param data data.frame containing the data. Note that \code{metapred} removes observations with missing data \emph{listwise}
#' for all variables in \code{formula} and \code{scope}, to ensure that the same data is used in each model in each step. The
#' outcome variable should be \code{numeric} or coercible to numeric by as.numeric().
#' @param strata Character to specify the name of the strata (e.g. studies or clusters) variable
#' @param formula \code{formula} of the first model to be evaluated. \code{metapred} will start at \code{formula} and update it
#' using terms of \code{scope}. Defaults to full main effects model, where the first column in \code{data} is assumed to be
#' the outcome and all remaining columns (except \code{strata}) predictors. See \link[stats]{formula} for formulas in general.
#' @param estFUN Function for estimating the model in the first stage. Currently "lm", "glm" and "logistfirth" are supported.
#' @param scope \code{formula}. The difference between \code{formula} and \code{scope} defines the range of models examined in the 
#' stepwise search. Defaults to NULL, which leads to the intercept-only model. If \code{scope} is not nested in \code{formula}, 
#' this implies backwards selection will be applied (default). If \code{scope} is nested in \code{formula}, this implies forward 
#' selection will be applied. If equal, no stepwise selection is applied. 
#' @param retest Logical. Should added (removed) terms be retested for removal (addition)? \code{TRUE} implies bi-directional 
#' stepwise search.
#' @param max.steps Integer. Maximum number of steps (additions or removals of terms) to take. Defaults to 1000, which is
#' essentially as many as it takes. 0 implies no stepwise selection.
#' @param center logical. Should numeric predictors be centered around the cluster mean?
#' @param recal.int Logical. Should the intercept be recalibrated in each validation?
#' @param cvFUN Cross-validation method, on the study (i.e. cluster or stratum) level. "l1o" for leave-one-out cross-validation 
#' (default). "bootstrap" for bootstrap. Or "fixed", for one or more data sets which are only used for validation. A user written 
#' function may be supplied as well.
#' @param cv.k Parameter for cvFUN. For \code{cvFUN="bootstrap"}, this is the number of bootstraps. For \code{cvFUN="fixed"}, 
#' this is a vector of the indices of the (sorted) data sets. Not used for \code{cvFUN="l1o"}.
#' @param metaFUN Function for computing the meta-analytic coefficient estimates in two-stage MA. 
#' By default, \link[metafor]{rma.uni}, from the metafor package is used. Default settings are univariate random effects, 
#' estimated with "DL". Method can be passed trough the \code{meta.method} argument.
#' @param meta.method Name of method for meta-analysis. Default is "DL". For more options see \link[metafor]{rma.uni}.
#' @param predFUN Function for predicting new values. Defaults to the predicted probability of the outcome, using the link 
#' function of \code{glm()} or \code{lm()}.
#' @param perfFUN Function for computing the performance of the prediction models. Default: mean squared error 
#' (\code{perfFUN="mse"}).Other options are \code{"var.e"} (variance of prediction error), \code{"auc"} (area under the curve),
#' \code{"cal.int"} (calibration intercept), and \code{"cal.slope"} (multiplicative calibration slope) and \code{"cal.add.slope"}
#' (additive calibration slope).
#' @param genFUN Function or \code{list} of named functions for computing generalizability of the performance. 
#' Default: (absolute) mean (\code{genFUN="abs.mean"}). Choose \code{coef.var} for the coefficient of variation. If a \code{list},
#' only the first is used for model selection.
#' @param selFUN Function for selecting the best method. Default: lowest value for \code{genFUN}. Should be set to
#' "which.max" if high values for \code{genFUN} indicate a good model.
#' @param ... To pass arguments to estFUN (e.g. family = "binomial"), or to other FUNctions.
#'
#' @return A list of class \code{metapred}, containing the final model in \code{global.model}, and the stepwise
#' tree of estimates of the coefficients, performance measures, generalizability measures in \code{stepwise}.
#' 
#' @details Use \link{subset.metapred} to obtain an individual prediction model from a \code{metapred} object.
#' 
#'  Note that \code{formula.changes} is currently unordered; it does not represent the order of changes in the stepwise 
#'  procedure.
#'  
#'  \code{metapred} is still under development, use with care.
#' 
#' @examples 
#' data(DVTipd)
#' 
#'\dontrun{
#' # Explore heterogeneity in intercept and assocation of 'ddimdich'
#' glmer(dvt ~ 0 + cluster + (ddimdich|study), family = binomial(), data = DVTipd)
#'}
#' 
#' # Scope
#' f <- dvt ~ histdvt + ddimdich + sex + notraum
#' 
#' # Internal-external cross-validation of a pre-specified model 'f'
#' fit <- metapred(DVTipd, strata = "study", formula = f, scope = f, family = binomial)
#' fit
#' 
#' # Let's try to simplify model 'f' in order to improve its external validity
#' metapred(DVTipd, strata = "study", formula = f, family = binomial)
#' 
#' # We can also try to build a generalizable model from scratch
#' 
#'\dontrun{
#' # Some additional examples:
#' metapred(DVTipd, strata = "study", formula = dvt ~ 1, scope = f, family = binomial) # Forwards
#' metapred(DVTipd, strata = "study", formula = f, scope = f, family = binomial) # no selection
#' metapred(DVTipd, strata = "study", formula = f, max.steps = 0, family = binomial) # no selection
#' metapred(DVTipd, strata = "study", formula = f, recal.int = TRUE, family = binomial)
#' metapred(DVTipd, strata = "study", formula = f, meta.method = "REML", family = binomial)
#'}
#' # By default, metapred assumes the first column is the outcome.
#' newdat <- data.frame(dvt=0, histdvt=0, ddimdich=0, sex=1, notraum=0)
#' fitted <- predict(fit, newdata = newdat)
#' 
#' @references 
#' de Jong VMT, Moons KGM, Eijkemans MJC, Riley RD, Debray TPA. Developing more generalizable prediction models from pooled 
#' studies and large clustered data sets. \emph{Stat Med}. 2021;40(15):3533--59.
#' 
#' Riley RD, Tierney JF, Stewart LA. Individual participant data meta-analysis: a handbook for healthcare research. 
#' Hoboken, NJ: Wiley; 2021. ISBN: 978-1-119-33372-2.
#' 
#' Schmid CH, Stijnen T, White IR. Handbook of meta-analysis. First edition. Boca Raton: Taylor and Francis; 2020. ISBN: 978-1-315-11940-3.
#'   
#' @seealso  \code{\link{forest.metapred}}  for generating a forest plot of prediction model performance
#' @import stats
#'
#' @importFrom stats formula var
#'
#' @export
metapred <- function(data, strata, formula, estFUN = "glm", scope = NULL, retest = FALSE, max.steps = 1000, 
                     center = FALSE, recal.int = FALSE, cvFUN = NULL, cv.k = NULL,  # tol = 0,
                     metaFUN = NULL, meta.method = NULL, predFUN = NULL, perfFUN = NULL, genFUN = NULL,
                     selFUN = "which.min",
                     ...) {
  call <- match.call()
  
  # Formula
  if (missing(formula) || is.null(formula)) formula <- stats::formula(data[ , -which(colnames(data) == strata)])  
  if (is.null(scope)) scope <- f2iof(formula)
  updates <- getFormulaDiffAsChar(formula, scope)
  
  # Data
  data <- get_all_vars(formula = addg2f(formula, scope, terms = strata), data = data) # drop unnecessary vars
  data <- droplevels(remove.na.obs(as.data.frame(data)))  # drop observations with missings in remaining vars
  
  if (is.factor(data[, f2o(formula)]))
    data[ , f2o(formula)] <- factor_as_binary(data[ , f2o(formula)])
  
  # One- vs two-stage, stratification and centering
  if(is.null(two.stage <- list(...)$two.stage) ) two.stage <- TRUE
  strata.i <- as.vector(data[, strata])
  strata.u <- sort(unique(strata.i))
  if (center)
    data <- centerCovs(data = data, y.name = f2o(formula), cluster.name = strata)
  
  # Functions
  if (is.null(cvFUN))   cvFUN   <- l1o
  if (is.null(metaFUN)) metaFUN <- urma
  if (is.null(perfFUN)) perfFUN <- "mse"
  if (is.null(genFUN))  genFUN  <- abs.mean
  if (is.null(meta.method)) meta.method <- "DL"
  # Change to "-" when perfFUN <- R2 or some other measure for which greater = better.
  
  estFUN.name <- estFUN
  estFUN  <- get.function(estFUN)
  cvFUN   <- get.function(cvFUN)
  # perfFUN <- get(perfFUN)
  # genFUN  <- get(genFUN) # now happens in mp.cv.val
  selFUN  <- get.function(selFUN)
  metaFUN <- get.function(metaFUN)
  
  # genFUN.add <- dots[["genFUN.add"]] 
  # dots[["genFUN.add"]] <- NULL
  
  # Folds
  folds <- cvFUN(strata.u, k = cv.k)
  if (!isTRUE(length(folds[["dev"]]) > 0) || !isTRUE(length(folds[["dev"]][[1]]) > 0))
    stop("At least 1 cluster must be used for development.")
  
  # Fitting
  fit <- mp.fit(formula = formula, data = data, remaining.changes = updates, st.i = strata.i, st.u = strata.u, folds = folds,
                recal.int = recal.int, retest = retest, max.steps = max.steps, tol = 0,
                estFUN = estFUN, metaFUN = metaFUN, meta.method = meta.method, predFUN = predFUN, perfFUN = perfFUN,
                genFUN = genFUN, selFUN = selFUN, ...)
  
  # mp.args <- c(list(formula = formula, data = data, remaining.changes = updates, st.i = strata.i, st.u = strata.u, folds = folds,
  #                   recal.int = recal.int, retest = retest, max.steps = max.steps, tol = 0, 
  #                   estFUN = estFUN, metaFUN = metaFUN, meta.method = meta.method, predFUN = predFUN, perfFUN = perfFUN,
  #                   genFUN = genFUN, selFUN = selFUN), dots)
  # 
  # fit <- do.call(mp.fit, args = mp.args ) 
  
  predFUN <- getPredictMethod(fit$stepwise$s0$cv$unchanged, two.stage = two.stage, predFUN = predFUN)
  formula.final <- fit$global.model$formula
  
  out <- c(fit, list(call = call, strata = strata, data = data, folds = folds, # add nobs and strata.nobs
                     formula.start = formula, scope = scope, formula = formula.final,
                     formula.changes = getFormulaDiffAsChar(formula.final, formula), 
                     # NOTE: formula.changes is currently unordered!
                     options = list(cv.k = cv.k, meta.method = meta.method, recal.int = recal.int,
                                    center = center, max.steps = max.steps, retest = retest, two.stage = two.stage), # add: tol
                     FUN = list(cvFUN = cvFUN, predFUN = predFUN, perfFUN = get.function(perfFUN), metaFUN = metaFUN, genFUN = genFUN, 
                                selFUN = selFUN, estFUN = estFUN, estFUN.name = estFUN.name)))
  class(out) <- c("metapred")
  return(out)
}

# #' The function \code{summary} can be used to obtain an extensive summary of the stepwise process and the final model. 
# #' \code{subset(x, select = "best.cv")} can be used to obtain a summary of the model with best generalizability in the
# #' cross-validation, whereas \code{subset(x, select = "global")} gives a summary of the global model (i.e. fitted on all 
# #' strata).

# For prediction of newdata. 
# Object metapred object
# newdata data.frame, defaults to NULL, which implies the fitted data
# strata character, name of strata variable in newdata. Defaults to name in fitted object.
# type character. Type of prediction. This is intended to override the default of glm and lm.
# ("response" or "link" possible; "terms" not implemented)
# recal.int logical. Recalibrate the intercept before prediction? Defaults to same as development for metapred object,
# center logical, Center covariates, before prediction? Defaults to same as development of metapred object,
# ... For compatibility only.
#' @author Valentijn de Jong
#' @importFrom stats predict
#' @method predict   metapred
#' @export
predict.metapred <- function(object, newdata = NULL, strata = NULL, type = "response", 
                             recal.int = NULL, center = NULL, ...) {
  if (is.null(newdata))
    newdata <- object[["data"]]
  if (is.null(strata))
    strata <- object$strata
  if (is.null(center))
    center <- object$options$center
  if (center)
    newdata <- centerCovs(data = newdata, y.name = f2o(formula(object)), cluster.name = strata)
  if (is.null(recal.int))
    recal.int <- object$options$recal.int
  if (isTRUE(recal.int))
    object <- recalibrate(object = object, newdata = newdata)
  
  object$FUN$predFUN(object = object, newdata = newdata, type = type, ...)[, 1] # [, 1] such that a vector is returned.
}

#' Extract Model Fitted Values
#' 
#' Extract the fitted values of a \code{metapred} object. By default returns fitted values of the model in the 
#' cross-validation procedure.
#' 
#' Function still under development, use with caution.
#' 
#' Only returns type = "response".
#' 
#' @author Valentijn de Jong
#' @importFrom stats fitted
#' @method fitted   metapred
#' @param object object of class metapred 
#' @param select character. Select fitted values from "cv" (default) or from "global" model.
#' @param step character or numeric. Name or number of step to select if \code{select} = "cv". Defaults to best step.
#' @param model character or numeric. Name or number of model to select if \code{select} = "cv". Defaults to
#' best model.
#' @param as.stratified logical. \code{select} = "cv" determines whether returned predictions are stratified in a list 
#' (\code{TRUE}, default) or in their original order (\code{FALSE}).
#' @param type character. Type of fitted value.
#' @param ... For compatibility only.
#' @export
fitted.metapred <- function(object, select = "cv", step = NULL, model = NULL, 
                            as.stratified = TRUE, type = "response", ...) {
  if (isTRUE(select == "cv")) {
    ftd <- fitted(subset.metapred(x = object, select = select, step = step, model = model, type = type, ...))
    if (as.stratified)
      return(ftd)
    ftd.v <- Reduce(rbind, ftd) #as vector
    return(ftd.v[match(rownames(ftd.v), rownames(object[["data"]])) ] ) # return to original ordering.
  }
  
  if (isTRUE(select == "global"))
    return(predict.metapred(object = object, newdata = NULL, type = type, ...))
  stop("select must equal 'cv' or 'global'.")
}

# #' Only returns type = "response".
# #' @author Valentijn de Jong
# #' @importFrom stats residuals
# #' @method residuals   metapred
# #' @export
# residuals.metapred <- function(object, select = "cv", step = NULL, model = NULL, as.stratified = TRUE, ...) {
#   y <- object$data[ , f2o(formula(object))]
#   ftd <- fitted.metapred(object = object, select = select, step = step, 
#                        model = model, as.stratified = as.stratified, ...)
#   if (as.stratified) {
#     ftd <- fitted(mp, as.stratified = TRUE)
#     y <- mp$data[ , f2o(formula(mp))]
#     
#     st.i <- mp[["data"]][[mp[["strata"]]]]
#     # now somehow sort them into a list with similar dim and dimnames as ftd, using st.i
#     
#     stop("To be implemented")
#   } else {
#     y - ftd
#   }
# }

# #' Extract the regression coefficients
# #' 
# #' The \code{coef} function extracts the estimated coefficients of the final model of objects of class \code{"metapred"}.
# #' @param object A fitted \code{metapred} object
# #' @param \ldots Optional arguments (currently not supported).
# #' 
# #' @method coef metapred
# #' @export
#' @author Valentijn de Jong
#' @method coef   metapred
#' @importFrom stats coef
#' @export
coef.metapred <- function(object, stratified = FALSE, ...) {
  if (stratified) 
    return(coef(object$global.model$stratified.fit))
  return(coef(object$global.model))
}

# Note returned formula is for the final model!
#' @author Valentijn de Jong
#' @importFrom stats formula
#' @method formula   metapred
#' @export
formula.metapred <- function(x, ...)
  x$global.model$formula

# Returns family or NULL
#' @author Valentijn de Jong
#' @importFrom stats family
#' @method family   metapred
#' @export
family.metapred <- function(object, ...) {
  if (!is.null((f <- object$stepwise$s0$cv[[1]]$family)))
    f
  else if (!is.null(f <- object$family))
    f
  else
    NULL
}

#' @author Valentijn de Jong
#' @method print   metapred
#' @export
print.metapred <- function(x, ...) {
  cat("Call: ")                       
  print(x$call) # cat cannot handle a call
  cat("\n") 
  print.mp.fit(x)
}

#' @author Valentijn de Jong
#' @method summary   metapred
#' @export
summary.metapred <- function(object, ...) {
  cat("Call: ")                       
  print(object$call) # cat cannot handle a call
  cat("\n") 
  summary.mp.fit(object)
}

# Test whether object is meta.pred
#' @author Valentijn de Jong
#' @importFrom methods is
#' @method is   metapred
#' @export
is.metapred <- function(object)
  inherits(object, "metapred")

# Implementation of the subset method
# #' @author Valentijn de Jong
# #' @method subset   metapred
# #' @export
# old.subset.metapred <- function(x, select = "best.cv", ...) {
#   if (identical(select, "best.cv")) 
#     return(mp.step.get.best(x[["stepwise"]][[x[["best.step"]]]]))
#   if (identical(select, "global"))
#     return(x[["global.model"]])
#   stop("select must equal 'best.cv' or 'global'.")
# }

#' Subsetting metapred fits
#' 
#' Return a model from the cross-validation procedure or the final 'global' model. Caution: This function is 
#' still under development.
#' 
#' @author Valentijn de Jong
#' 
#' @param x metapred object
#' @param select Which type of model to select: "cv" (default), "global", or (experimental) "stratified", 
#' or "stratum".
#' @param step  Which step should be selected? Defaults to the best step. 
#' numeric is converted to name of the step: 0 for an unchanged model, 1 for the first change... 
#' @param model Which model change should be selected? NULL (default, best change) or character name of variable
#' or (integer) index of model change. 
#' @param stratum Experimental. Stratum to return if select = "stratum".
#' @param add Logical. Add data, options and functions to the resulting object? Defaults to \code{TRUE}. 
#' Experimental.
#' @param ... For compatibility only.
#' 
#' @examples 
#' data(DVTipd)
#' DVTipd$cluster <- letters[1:4] # Add a fictional clustering to the data.
#' mp <- metapred(DVTipd, strata = "cluster", formula = dvt ~ histdvt + ddimdich, family = binomial)
#' subset(mp) # best cross-validated model
#' subset(mp, select = "global") # Final model fitted on all strata.
#' subset(mp, step = 1) # The best model of step 1
#' subset(mp, step = 1, model = "histdvt") # The model in which histdvt was removed, in step 1.
#' 
#' @return An object of class \code{mp.cv} for select = "cv" and an object of class \code{mp.global} for select = "global". 
#' In both cases, additional data is added to the resulting object, thereby making it suitable for further methods.
#' @export
subset.metapred <- function(x, select = "cv", step = NULL, model = NULL, stratum = NULL, add = TRUE, ...) {
  # Global model
  if (identical(select, "global"))
    out <- x[["global.model"]]
  else {
    # Same for cv, stratified and stratum:
    # Step
    if (is.null(step))
      step <- x$best.step
    if (is.numeric(step))
      step <- getStepName(step)
    
    # Model
    if (is.null(model))
      model <- x[["stepwise"]][[step]][["best.change"]]
    
    # Select fit:
    if (identical(select, "cv") || identical(select, "iecv"))                        # cv / iecv fit
      out <- x[["stepwise"]][[step]][["cv"]][[model]]
    else if (identical(select, "stratified") || identical(select, "stratified.fit")) # stratified fit
      return(x[["stepwise"]][[step]][["cv"]][[model]][["stratified.fit"]])
    else if (identical(select, "stratum")    || identical(select, "strata"))         # stratum fit
      out <- x[["stepwise"]][[step]][["cv"]][[model]][["stratified.fit"]][[stratum]]
    else
      stop("select must equal 'cv', 'global', 'stratified' or 'stratum'.")
  }   
  
  # This is the real reason for why this function exists. To add this stuff to a mp.cv or mp.global object.
  # Normally those do not have this data, for memory/performance reasons.
  # In hindsight: This is actually a very useful function, as it makes finding a model much easier.
  if (add) {
    out$data    <- x$data
    out$strata  <- x$strata
    out$folds   <- x$folds
    out$FUN     <- x$FUN
    out$options <- x$options
  }
  
  out
}

# Perform fit a model using cross-validation, for metapred
# formula formula to start with
# data data.frame, containing dev and val data
# remaining.changes predictor terms to add or remove
# st.i numeric vector, cluster indicators
# st.u unique values in st.i
# folds list, folds as in utils
# recal.int logical, recalibrate intercept?

# retest logical, should removed (added) predictors be added (removed) again?
# max.steps numeric, maximum number of steps (predictor additions/removals) to be taken
# tol numeric, tolerance, minimum change in generalizability to accept a different model # To be implemented

# estFUN function, used for obtaining predict method
# predFUN function, user supplied predict method, overrides estFUN's predict()
# perfFUN function, function for computing performance, defaults to mse = mean square error
# genFUN function, function for computing generalizability, defaults to absolute mean.
# ... options for predictMethod, perfFUN, and genFUN.
# Returns object of class mp.cv.val, which is a validated mp.cv.dev
mp.fit <- function(formula, data, remaining.changes, st.i, st.u, folds, recal.int = FALSE, 
                   retest = FALSE, max.steps = 3, tol = 0,
                   estFUN = glm, metaFUN = urma, meta.method = "DL", predFUN = NULL, 
                   perfFUN = mse, genFUN = abs.mean, selFUN = which.min,
                   two.stage = TRUE, ...) {
  out <- steps <- list()
  
  ## Step 0
  # As remaining.changes = c("") yields unchanged formula.
  step.count <- 0
  steps[[getStepName(step.count)]] <- mp.step(formula = formula, data = data, remaining.changes = c(""), 
                                              st.i = st.i, st.u = st.u, folds = folds, recal.int = recal.int, 
                                              retest = FALSE, two.stage = two.stage,
                                              estFUN = estFUN, metaFUN = metaFUN, meta.method = meta.method, predFUN = predFUN, 
                                              perfFUN = perfFUN, genFUN = genFUN, selFUN = selFUN, ...)
  steps[[getStepName(step.count)]][["step.count"]] <- step.count
  out[["best.step"]] <- getStepName(step.count)
  out[["stop.reason"]] <- "no changes were requested."
  current.model <- mp.step.get.best(steps[[1]])
  current.model[["remaining.changes"]] <- remaining.changes
  gen.diff <- Inf # TBI
  
  if (!identical(length(remaining.changes), 0L))
    repeat {
      ## Loop management
      if (isTRUE(gen.diff <= tol)) { # TBI
        out[["stop.reason"]] <- "improvement <= tolerance."
        break
      }
      if (isTRUE(length(current.model[["remaining.changes"]]) <= 0)) {
        out[["stop.reason"]] <- "all changes were implemented."
        break
      }
      if (isTRUE(step.count >= max.steps)) {
        out[["stop.reason"]] <- "max.steps was reached."
        break
      }
      step.count <- step.count + 1
      
      ## Perform a step
      steps[[getStepName(step.count)]] <- mp.step(formula = current.model[["formula"]], data = data,
                                                  remaining.changes = current.model[["remaining.changes"]],
                                                  st.i = st.i, st.u = st.u, folds = folds, recal.int = recal.int,
                                                  retest = retest, two.stage = two.stage,
                                                  estFUN = estFUN, metaFUN = metaFUN, meta.method = meta.method, predFUN = predFUN,
                                                  perfFUN = perfFUN, genFUN = genFUN, selFUN = selFUN, ...)
      steps[[getStepName(step.count)]][["step.count"]] <- step.count
      ## Model selection
      # This step
      best.new.model <- mp.step.get.best(steps[[getStepName(step.count)]])
      
      # Overall
      if (mp.which.best.change(list(current.model, best.new.model), selFUN = selFUN) <= 1) { # TBI: 0 for gen.diff < tol
        out[["stop.reason"]] <- "no improvement was possible."
        out[["best.step"]] <- getStepName(step.count - 1)
        break
      } else {
        current.model <- best.new.model
        out[["best.step"]] <- getStepName(step.count)
      }
    }
  
  # Return a global model and the stepwise list
  if (is.null(two.stage) || isTRUE(two.stage)) {
    out[["global.model"]] <- mp.global.2st(current.model, metaFUN = metaFUN, meta.method = meta.method)
    out[["nobs.strata"]] <- sapply(out[["global.model"]][["stratified.fit"]], nobs)
  } else {
    out[["global.model"]] <- mp.global.1st(formula = formula(current.model), data = data,
                                           estFUN = estFUN, ...) # strata = strata, to be added for random effects.
  }
  
  out[["stepwise"]] <- steps
  out[["step.count"]] <- step.count
  out[["nobs"]] <- sum(out[["nobs.strata"]])
  class(out) <- "mp.fit"
  
  out
}

#' @author Valentijn de Jong
#' @method print   mp.fit
#' @export
print.mp.fit <- function(x, ...) {
  for (i in seq_along(x[["stepwise"]])) {
    if (i == 1) {
      cat("Started with model:\n")
      print(x$stepwise[[i]][["start.formula"]])
    } else if (i > 2) {
      cat("\n")
      cat("Continued with model:\n")
      print(x$stepwise[[i]][["start.formula"]])
    }
    print.mp.step(x$stepwise[[i]], show.f = FALSE)
  }
  
  cat("\n")
  cat("Cross-validation stopped after", x[["step.count"]], "steps, as", x[["stop.reason"]])
  cat(" Final model:\n")
  print(x[["global.model"]])
}

#' @author Valentijn de Jong
#' @method summary   mp.fit
#' @export
summary.mp.fit <- function(object, ...) {
  for (s in seq_along(object[["stepwise"]]))
    summary.mp.step(object[["stepwise"]][[s]])
  cat("\n")
  cat("Cross-validation stopped after", object[["step.count"]], "steps, as", object[["stop.reason"]])
  cat(" Final model:\n")
  print(object[["global.model"]])
}

# "Method" for selecting best mp.cv from a mp.cv object
# step mp.step
# Returns mp.cv object, with best value for genFUN.
# mp.step.best.change <- function(step, ...)
#   step[["cv"]][[which(sapply(step[["cv"]], `[[`, "best.change"))]]

# Select best model
mp.which.best.change <- function(cvs, selFUN = which.min, ...)
  selFUN(sapply(cvs, `[[`, "gen"))

mp.step.get.best <- function(step, selFUN = which.min, ...) 
  step[["cv"]][[mp.which.best.change(step[["cv"]], selFUN = selFUN, ...)]]

mp.step.get.change <- function(step, ...)
  mp.step.get.best(step)[["changed"]]

# Perform one step in the fitting process of mp.fit
# formula formula to start with
# data data.frame, containing dev and val data
# remaining.changes predictor terms to add or remove
# st.i numeric vector, cluster indicators
# st.u unique values in st.i
# folds list, folds as in utils
# recal.int logical, recalibrate intercept?
# two.stage logical, is it a two stage model? For future use.
# estFUN function, used for obtaining predict method
# predFUN function, user supplied predict method, overrides estFUN
# perfFUN function, function for computing performance, defaults to mse = mean square error
# genFUN function, function for computing generalizability, defaults to absolute mean.
# ... options for predictMethod, perfFUN, and genFUN.
# Returns object of class mp.cv.val, which is a validated mp.cv.dev
mp.step <- function(formula, data, remaining.changes, st.i, st.u, folds, recal.int = FALSE, 
                    two.stage = TRUE, retest = FALSE,
                    estFUN = glm, metaFUN = urma, meta.method = "DL", predFUN = NULL, 
                    perfFUN = mse, genFUN = abs.mean, selFUN = which.min, ...) {
  cv <- out <- list()
  out[["start.formula"]] <- formula
  
  for (fc in seq_along(remaining.changes) )
  {
    change <- remaining.changes[fc]
    
    # Produce formula for changes and no changes:
    if (identical(remaining.changes, "")) {
      name <- "unchanged"
      new.formula <- formula
    } else {
      name <- as.character(change)
      new.formula <- updateFormula(formula, change) 
    }
    
    # Run
    # change.name is now saved here and used elsewhere. To be simplified..
    cv[[name]] <- mp.cv(formula = new.formula, data = data, st.i = st.i, st.u = st.u,
                        folds = folds, recal.int = recal.int, two.stage = two.stage,
                        estFUN = estFUN, metaFUN = metaFUN, meta.method = meta.method,
                        predFUN = predFUN, perfFUN = perfFUN, genFUN = genFUN, change = change, ...) 
    # Save changes
    cv[[name]][["remaining.changes"]] <- if (retest) remaining.changes else remaining.changes[-fc]
    # cv[[name]][["changed"]] <- change
  }
  
  out[["best.change"]] <- mp.which.best.change(cv, selFUN = selFUN)
  out[["cv"]] <- cv
  
  class(out) <- "mp.step"
  out
}

#' @author Valentijn de Jong
#' @method summary   mp.step
#' @export
summary.mp.step <- function(object, ...) {
  cv <- object[["cv"]]
  
  # Model comparison is handled by print.mp.step()
  if (!is.null(object[["step.count"]]))
    cat("\n", "Step ", object[["step.count"]], ". ", sep = "")
  if (identical(cv[[1]][["changed"]], "")) {
    # cat("\n")
    cat("Tested 1 model in this step. ") 
  } else {
    # cat("\n")
    cat("Tested", length(cv), "models in this step. ")
    print.mp.step(object)
    cat("\n")
  }
  
  # Model coefficients are handled here, using print.mp.cv()
  for (i in seq_along(cv)) {
    ch <- cv[[i]][["changed"]]
    if (identical(ch, "")) ch <- "nothing"
    cat("Changing", ch, "yields:\n")
    print.mp.cv(cv[[i]])
    if (isTRUE(i == object[["best.change"]][[1]]))
      cat("This model has best generalizability in this step.")
    cat("\n")
  }
}

#' @author Valentijn de Jong
#' @method print   mp.step
#' @export
print.mp.step <- function(x, show.f = TRUE, ...) {
  if (show.f) {
    cat("Starting with model:\n")
    print(x[["start.formula"]])
  }
  cat("\n")
  cat("Generalizability:\n")
  print(data.frame(lapply(x[["cv"]], `[[`, "gen.all")))
}

# Turn a cross-validated model into a full or 'global' model
# mp.cv.dev object of class mp.cv.dev
# metaFUN function for meta-analysis
# meta.method option for metaFUN
# ... optional arguments for metaFUN
mp.global.2st <- function(cv.dev, metaFUN = urma, meta.method = "DL") {
  
  out <- c(cv.dev, mp.meta.fit(stratified.fit = cv.dev[["stratified.fit"]], 
                               metaFUN = metaFUN, meta.method = meta.method) )
  
  class(out) <- c("mp.global", class(out))
  out
}

mp.global.1st <- function(formula, data, strata = NULL, estFUN = "glm", ...) {
  # TODO for glmer / lmer: add random intercepts to formula here.
  dots <- list(...)
  dots$two.stage <- NULL
  args <- c(list(formula = formula, data = data), dots)
  fit <- do.call(estFUN, args)
  out <- mp.stratum.fit(fit)
  out$formula <- formula
  out[["nobs.strata"]] <- nobs(out) # To be removed??
  class(out) <- c("mp.global", "mp.stratum.fit", class(out)) # "mp.global.1st" ?! or not?
  out
}

# #' @author Valentijn de Jong
# #' @method family   mp.global
# #' @export
# family.mp.global <- function(object, ...)
#   object$family

# Fitted? Predict?

#' @author Valentijn de Jong
#' @method predict   mp.global
#' @export
predict.mp.global <- function(object, newdata = NULL, strata = NULL, type = "response", 
                              recal.int = NULL, center = NULL, ...) 
  predict.metapred(object = object, newdata = newdata, strata = strata, type = type, 
                   recal.int = recal.int, center = center, ...)

#' @author Valentijn de Jong
#' @method print   mp.global
#' @export
print.mp.global <- function(x, ...) {
  cat("Meta-analytic model of prediction models estimated in", x$n.clusters, "strata. Coefficients:" , "\n")
  print(x[["coefficients"]])
}

#' @author Valentijn de Jong
#' @method summary   mp.global
#' @export
summary.mp.global <- function(object, ...) {
  print.mp.global(object, ...)
  cat("\n")
  print.mp.cv(object, ...)
}

# Make a new meta-model and cross-validate it, for metapred
# formula formula
# data data.frame
# st.i vector, cluster indicators. char or numeric
# st.u unique values in st.i
# recal.int logical, recalibrate intercept?
# estFUN function, for estimating, e.g. glm
# metaFUN function, for producing meta model
# meta.method character, option for metaFUN
# predFUN function, user supplied predict method, overrides estFUN
# perfFUN function, function for computing performance, defaults to mse = mean square error
# genFUN function, function for computing generalizability, defaults to absolute mean.
# ... options for predictMethod, perfFUN, and genFUN.
# Returns object of classes mp.cv, mp.cv.val, mp.cv.dev, which is a list of meta-analytic models developed on dev folds,
# and a validated on val folds 
mp.cv <- function(formula, data, st.i, st.u, folds, recal.int = FALSE, two.stage = TRUE,
                  estFUN = glm, metaFUN = urma, meta.method = "DL", predFUN = NULL, 
                  perfFUN = mse, genFUN = abs.mean, change = NULL, ...) {
  out <- mp.cv.dev(formula = formula, data = data, st.i = st.i, st.u = st.u, folds = folds, two.stage = two.stage,
                   estFUN = estFUN, metaFUN = metaFUN, meta.method = meta.method, change = change, ...)
  
  out <- mp.cv.val(cv.dev = out, data = data, st.i = st.i, folds = folds, recal.int = recal.int, two.stage = two.stage,
                   estFUN = estFUN, predFUN = predFUN, perfFUN = perfFUN, genFUN = genFUN, ...)
  
  class(out) <- c("mp.cv", class(out))
  out
}

# Obtain fitted values of a (\code{subset}ted from metapred) mp.cv object
# object mp.cv
# returns list of fitted values, with length equal to number of strata. 
#' @author Valentijn de Jong
#' @method fitted   mp.cv
#' @export
fitted.mp.cv <- function(object, ...) {
  data <- object[["data"]]
  if (is.null(data))
    stop("Use subset(metapred()) to obtain the cv.")
  strata <- object$strata
  center <- object$options$center
  if (center)
    data <- centerCovs(data = data, y.name = f2o(formula(object)), cluster.name = strata)
  
  fitted.mp.cv.dev(object, data = data, folds = object$folds, st.i = data[[object$strata]], ...)
}

# fitted.mp.cv.val <- function(object, newdata, folds, st.i, recal.int = FALSE, predFUN = NULL, ...) 
#   fitted.mp.cv.dev(object = object, newdata = newdata, folds = folds, st.i = st.i, predFUN = predFUN)

#' @author Valentijn de Jong
#' @method print   mp.cv
#' @export
print.mp.cv <- function(x, ...) {
  print.mp.cv.dev(x, ...)
  cat("\n")
  print.mp.cv.val(x, ...)
}

# Validate a mp.cv.dev
# cv.dev, mp.cv.dev
# data data.frame, containing old and new data
# st.i numeric vector, cluster indicators
# folds list, folds as in utils
# recal.int logical, recalibrate intercept?
# two.stage logical, is it a two stage model?
# estFUN function, used for obtaining predict method
# predFUN function, user supplied predict method, overrides estFUN
# perfFUN function, function for calculating performance, defaults to mse = mean square error
# genFUN function, function for calculating generalizability, defaults to absolute mean.
# add.perfFUN list of functions, additional performance functions. Evaluated but not used for selection.
# add.genFUn list of functions, additional generalizability functions. Evaluated but not used for selection.
# ... options for predictMethod, perfFUN, and genFUN.
# Returns object of class mp.cv.val, which is a validated mp.cv.dev
mp.cv.val <- function(cv.dev, data, st.i, folds, recal.int = FALSE, two.stage = TRUE, 
                      estFUN = glm, predFUN = NULL, perfFUN = mse, 
                      genFUN = abs.mean, plot = F, ...) {
  dots <- list(...)
  pfn <- if (is.character(perfFUN)) perfFUN else "Performance"
  cv.dev[["perf.name"]] <- pfn # To be removed!??!!?
  # perfFUN <- match.fun(perfFUN)
  
  # Recalibrate?
  if (isTRUE(recal.int))
    cv.dev <- mp.cv.recal(cv.dev = cv.dev, newdata = data, estFUN = estFUN, folds = folds)
  
  # Predict outcome
  p <- fitted.mp.cv.dev(object = cv.dev, data = data, folds = folds, st.i = st.i, predFUN = predFUN, two.stage = two.stage)
  cv.dev[["nobs.val"]] <- sapply(p, length)
  
  # Necessary for performance computation
  outcome <- f2o(formula(cv.dev))
  
  # Performance
  if (!is.list(perfFUN)) 
    perfFUN <- list(perfFUN)
  
  # Names of Performance measures
  if (identical(length(names(perfFUN)), length(perfFUN))) {
    perf.names <- names(perfFUN)
  } else if (all(sapply(perfFUN, is.character))) {
    perf.names <- unlist(perfFUN)
  } else
    perf.names <- as.character(seq_along(perfFUN))
  cv.dev[["perf.names"]] <- perf.names
  
  # Multiple performance measures may be calculated.
  perfcalc <- function(perfFUN, cv.dev, folds, outcome, st.i, data, estFUN, p) {
    perfFUN <- match.fun(perfFUN)
    
    perf.full <- perf.str <- list()
    
    for (i in seq_len(cv.dev[["n.cv"]])) {
      perf.str[[length(perf.str) + 1]]   <- getclName(folds[["val"]][[i]])
      perf.full[[length(perf.full) + 1]] <- perfFUN(p[[i]], data[folds[["val"]][[i]] == st.i, outcome], data = data,
                                                    fit = cv.dev[["cv"]][[i]], estFUN = estFUN, ...)
    }
    names(perf.full) <- apply(as.matrix(folds[["val"]]), 1, getclName)
    
    out <- data.frame(val.strata = unlist(perf.str), 
                      estimate = unlist(sapply(perf.full, `[[`, 1)), 
                      se = NA, var = NA, ci.lb = NA, ci.ub = NA, 
                      measure = pfn,
                      n = unlist(cv.dev[["nobs.val"]]),
                      class = unlist(lapply(lapply(perf.full, class), '[[', 1)))
    tryCatch(
      out[["var"]] <- unlist(lapply(perf.full, variances)),  
      error = function(e) print(paste("Skipping variance estimation for", class(perf.full[[1]])[[1]], sep = " ")))
    tryCatch(
      out[["se"]] <- sqrt(out[["var"]]),
      error = function(e) print(paste("Skipping se estimation for", class(perf.full[[1]])[[1]], sep = " ")))
    tryCatch(
      out[, c("ci.lb", "ci.ub")] <- unlist(t(sapply(perf.full, get_confint))),  
      error = function(e) print(paste("Skipping ci estimation for", class(perf.full[[1]])[[1]], sep = " ")))
    
    row.names(out) <- names(cv.dev[["cv"]])
    out[, "estimate"]  <- as.numeric(out[, "estimate"]) 
    class(out) <- c("perf", class(out))
    out # cv.dev[["perf"]] = out
  }
  
  perf.all <- list()
  for (fun.id in seq_along(perfFUN)) # Single brackets intended!
    perf.all[[fun.id]] <- perfcalc(perfFUN[[fun.id]], cv.dev = cv.dev, folds = folds, outcome = outcome, 
                                   st.i = st.i, data = data, estFUN = estFUN, p = p)
  
  names(perf.all) <- perf.names
  cv.dev[["perf.all"]] <- perf.all   # Future compatibility  
  cv.dev[["perf"]] <- perf.all[[1]]  # Backwards compatibility
  
  # Generalizibility
  if (!is.list(genFUN)) 
    genFUN <- list(genFUN)
  
  # Names of generalizability measures
  if (identical(length(names(genFUN)), length(genFUN))) {
    gen.names <- names(genFUN)
  } else if (all(sapply(genFUN, is.character))) {
    gen.names <- unlist(genFUN)
  } else
    gen.names <- as.character(seq_along(genFUN))
  
  # Computation of generalizability
  gen.all <- rep(NA, length(genFUN))
  
  for (fun.id in seq_along(genFUN)) { # Single brackets intended!
    genfun <- match.fun(genFUN[[fun.id]])
    args <- c(list(object = cv.dev[["perf"]],
                   coef = coef(cv.dev[["stratified.fit"]]),
                   title = paste("Model change: ~", cv.dev[["changed"]]),
                   xlab = as.character(pfn)
    ), dots)
    if (!is.null(fam <- cv.dev$family))
      args$family <- fam
    if (two.stage)
      args$coef.se <- se(cv.dev[["stratified.fit"]]) # is compatible with two.stage = T, but not with = F
    
    gv <- do.call(genfun, args = args )
    
    # gv <- genfun(cv.dev[["perf"]], coef = coef(cv.dev[["stratified.fit"]]), coef.se = se(cv.dev[["stratified.fit"]]),
    # title = paste("Model change: ~", cv.dev[["changed"]]), xlab = as.character(pfn), ...)
    # As 'foo[[bar]] <- NULL' is not allowed # Also, the rest of the code does not expect a plot yet.
    gen.all[[fun.id]] <- if (is.null(gv) || !is.numeric(gv) || !identical(length(gv), 1L)) NaN else gv 
  }
  
  names(gen.all) <- gen.names
  cv.dev[["gen.all"]] <- gen.all
  cv.dev[["gen"]] <- gen.all[[1]]
  
  class(cv.dev) <- c("mp.cv.val", class(cv.dev))
  if (plot)
    plot.mp.cv.val(cv.dev, NA, ...)
  cv.dev
}

#' @author Valentijn de Jong
#' @method print   mp.cv.val
#' @export
print.mp.cv.val <- function(x, ...) {
  if (!is.null(x[["perf"]])) {
    cat("Cross-validation at stratum level yields the following performance: \n")
    print(x[["perf"]][ , 1:7])
  }
  if (!is.null(x[["gen.all"]]))
    cat("\n")
  cat("Generalizability:\n")
  print(x[["gen.all"]])
  cat("\n")
}

# Make a new meta-model to be cross-validated for metapred
# formula formula
# data data.frame
# estFUN function for estimation, e.g glm
# st.i vector, cluster indicators. char or numeric
# st.u unique values in st.i
# folds list, folds as in utils
# two.stage logical, is it a two stage method?
# estFUN function, for estimating, e.g. glm
# metaFUN function, for producing meta model
# meta.method character, option for metaFUN
# Returns mp.cv.dev, which is a list of meta-analytic models developed on dev folds
mp.cv.dev <- function(formula, data, st.i, st.u, folds, 
                      estFUN = glm, metaFUN = urma, meta.method = "DL", change = NULL, two.stage = TRUE, ...) {
  out <- list(...) # possibly contains family
  if (!is.null(out$family)) {
    if (is.character(out$family)) # Ensure that family is the output of family(). Taken directly from glm.
      out$family <- get(out$family, mode = "function", envir = parent.frame())
    if (is.function(out$family)) 
      out$family <- out$family()
    if (is.null(out$family$family)) {
      print(family)
      stop("'family' not recognized")
    }
  }
  out[["formula"]] <- formula
  out[["changed"]] <- change
  
  if (two.stage) {
    ## Cluster part 
    out[["stratified.fit"]] <- mp.stratified.fit(formula = formula, data = data, st.i = st.i, st.u = st.u, 
                                                 estFUN = estFUN, ...)  
    out[["n.clusters"]] <- length(out[["stratified.fit"]])
    
    ## cv meta-part
    out[["cv"]] <- mp.cv.meta.fit(stratified.fit = out[["stratified.fit"]], folds = folds, 
                                  metaFUN = metaFUN, meta.method = meta.method)
  } else {
    out[["cv"]] <- mp.1st.fit(formula = formula, data = data, st.i = st.i, st.u = st.u, folds = folds,
                              estFUN = estFUN, ...)
  }
  
  out[["n.cv"]] <- length(out[["cv"]])
  class(out) <- c("mp.cv.dev") 
  out
}

# For some reason I have to manually force it to find the right S3 methods.
#' @author Valentijn de Jong
#' @method print   mp.cv.dev
#' @export
print.mp.cv.dev <- function(x, ...) {
  if (!is.null(x[["stratified.fit"]]))
    print.mp.stratified.fit(x[["stratified.fit"]]) 
  cat("\n")
  print.mp.cv.meta.fit(x[["cv"]])
}

# Obtain fitted values of a mp.cv.dev object
# object mp.cv.dev object.
# Returns list of predicted values, with length equal to number of strata.
#' @author Valentijn de Jong
#' @method fitted   mp.cv.dev
#' @export
fitted.mp.cv.dev <- function(object, data, folds, st.i, predFUN = NULL, two.stage = TRUE, ...) {
  predictMethod <- getPredictMethod(fit = object, two.stage = two.stage, predFUN = predFUN, ...)
  p <- list()
  
  for (i in seq_len(object[["n.cv"]]))
    p[[i]] <- predictMethod(object = object, 
                            newdata = data[folds[["val"]][[i]] == st.i, ], 
                            type = "response",
                            b = coef(object[["cv"]][[i]]),
                            f = formula(object), 
                            two.stage = two.stage, 
                            ...)
  # If some other fold method than l1o is applied, this may otherwise cause an error.
  # better method TBI.
  if (all(unlist(lapply(folds$val, length)) == 1))
    names(p) <- unlist(folds$val)
  p
}

#' @author Valentijn de Jong
#' @method family   mp.cv.dev
#' @export
family.mp.cv.dev <- function(object, ...)
  object$family

# Estimate a one-stage or non-stratified model on the develoment (!) strata.
mp.1st.fit <- function(formula, data, st.i, st.u, folds, estFUN, ...) {
  out <- list()
  
  for (fo in seq_along(folds[["dev.i"]])) {
    fit <- estFUN(formula = formula, 
                  data = data[st.i %in% folds[["dev"]][[fo]], ],
                  ...)
    out[[getclName(folds[["dev"]][[fo]])]] <- mp.stratum.fit(fit) 
  }
  
  class(out) <- c("mp.1st.fit", class(out))
  out
}

# Recalibrate a mp.meta.fit
# mp.meta.fit mp.meta.fit
# newdata # ne data.frame, containing only the validation sample (unlike this example!)
# formula, original formula
# estFUN estimation function, e.g. glm
# Returns same object with updated coefficients.
mp.recal.meta.fit <- function(meta.fit, formula, newdata, estFUN, ...) {
  meta.fit[["formula"]] <- formula
  meta.fit[["orig coef"]] <- coef(meta.fit)
  meta.fit[["coefficients"]] <- recalibrate(meta.fit, newdata = newdata, estFUN = estFUN, ...)[["coefficients"]]
  meta.fit[["formula"]] <- NULL ### necessary ??
  meta.fit
}

# Recalibrate a mp.cv.dev
# mp.cv.dev mp.cv.dev
# newdata data.frame
# formula, original formula
# estFUN estimation function, e.g. glm
# folds folds list, as in utils.
# Returns same object with updated coefficients.
mp.cv.recal <- function(cv.dev, newdata, folds, estFUN) {
  for (i in seq_along(cv.dev[["cv"]]))
    cv.dev[["cv"]][[i]] <- mp.recal.meta.fit(meta.fit = cv.dev[["cv"]][[i]], 
                                             formula = cv.dev[["formula"]],
                                             newdata = newdata[folds[["val.i"]][[i]] == i, ], 
                                             estFUN = estFUN, 
                                             family = if (!is.null(cv.dev$family)) cv.dev$family else NULL)
  cv.dev
}

# Make new cv.models for mp.meta.fit for metapred
# This function selects the right fold for development, and calls the fitting function.
# stratified.fit list of mp.stratum.fit objects
# folds list of fold divisions, as given by l1o, or bootstrap in the utils.
# metaFUN function for estimating meta-analytic models, e.g. urma (in utils)
# meta.method options for metaFUN
# Returns an object of class mp.cv.meta.fit, which is a list of meta-analytic prediction models
mp.cv.meta.fit <- function(stratified.fit, folds, metaFUN = urma, meta.method = "DL") {
  out <- list()
  
  for (fo in seq_along(folds[["dev.i"]]))
    out[[getclName(folds[["dev"]][[fo]])]] <- mp.meta.fit(stratified.fit = stratified.fit[folds[["dev.i"]][[fo]]],
                                                          metaFUN = metaFUN, meta.method = meta.method)
  
  class(out) <- c("mp.cv.meta.fit", stratified.fit[[1]]$stratum.class)
  out
}

#' @author Valentijn de Jong
#' @method coef   mp.cv.meta.fit
#' @export
coef.mp.cv.meta.fit <- function(object, ...) 
  t(as.data.frame(lapply(object, `[[`, "coefficients"), check.names = FALSE))

#' @author Valentijn de Jong
#' @method print   mp.cv.meta.fit
#' @export
print.mp.cv.meta.fit <- function(x, ...) {
  cat("Meta-analytic models, estimated in", length(x), "fold combinations. Coefficients: \n")
  print(coef(x))
  
  if (!is.null(x[[1]][["orig coef"]])) {
    cat("\n")
    cat("Original coefficients before recalibration: \n")
    print(t(as.data.frame(lapply(x, `[[`, "orig coef"), check.names = FALSE) ))
  }
}

# Make new meta model (i.e. model fitted on multiple clusters) for ?? for metapred
# stratified.fit mp.stratified.fit
# metaFUN function for estimating meta-analytic models, e.g. urma (this file)
# meta.method options for metaFUN
# Returns object of class mp.cv.model, which is a meta-analytic prediction model
mp.meta.fit <- function(stratified.fit, metaFUN = urma, meta.method = "DL") {
  out <- list()
  
  b <- coef.mp.stratified.fit(stratified.fit) # again i have to point it to the right method.
  variances <- variances.mp.stratified.fit(stratified.fit)
  vcov <- vcov.mp.stratified.fit(stratified.fit)
  
  meta <- metaFUN(coefficients = b, variances = variances, vcov = vcov, method = meta.method) 
  
  out[["coefficients"]] <- meta$coefficients
  out[["variances"]]    <- meta$variances
  out[["se"]]           <- meta$se
  out[["ci.lb"]]        <- meta$ci.lb
  out[["ci.ub"]]        <- meta$ci.ub
  out[["tau"]]          <- meta$tau
  out[["tau2"]]         <- meta$tau2
  out[["se.tau2"]]      <- meta$se.tau2
  out[["pi"]]           <- c(meta$pi.lb, meta$pi.ub)
  out[["pi.lb"]]        <- meta$pi.lb
  out[["pi.ub"]]        <- meta$pi.ub
  out[["nobs.strata"]]  <- sapply(stratified.fit, nobs)
  out[["nobs"]]         <- sum(out[["nobs.strata"]])
  
  class(out) <- "mp.meta.fit"
  out
}

#' @author Valentijn de Jong
#' @method print   mp.meta.fit
#' @export
print.mp.meta.fit <- function(x, ...) {
  cat("Coefficients: ", "\n")
  print(coef(x))
}

# Make new stratified models for mp.meta.fit for metapred
# formula formula
# data data.frame
# st.i vector, strata indicators. char or numeric
# st.u unique values in st.i
# estFUN function for estimation, e.g glm
# ... other, e.g. family for glm.
# Returns mp.stratified.fit
mp.stratified.fit <- function(formula, data, st.i, st.u, estFUN, ...) {
  out <- list()
  
  # for (st.this in st.u)
  #   out[[getclName(st.this)]] <- mp.stratum.fit(estFUN(formula = formula, data = data[st.i == st.this, ], ...) )
  
  # Replace with: (?)
  # for (st.this in sort(unique(unlist(folds[["dev"]]) )) )
  for (st.this in st.u)
    out[[getclName(st.this)]] <- mp.stratum.fit(estFUN(formula = formula, data = data[st.i %in% st.this, ], ...) )
  
  class(out) <- "mp.stratified.fit"
  out
}

#' @author Valentijn de Jong
#' @method coef   mp.stratified.fit
#' @export
coef.mp.stratified.fit <- function(object, ...)
  as.data.frame(t(as.data.frame(lapply(object, `[[`, "coefficients", drop = FALSE))))

#' @author Valentijn de Jong
#' @method vcov   mp.stratified.fit
#' @export
vcov.mp.stratified.fit <- function(object, ...) {
  l <- lapply(object, vcov)
  array(unlist(l), dim = c(nrow(l[[1L]]), ncol(l[[1L]]), length(l)), 
        dimnames = list(rownames(l[[1L]]), colnames(l[[1L]]), names(l)))
}

#' @author Valentijn de Jong
#' @method vcov   mp.stratum.fit
#' @export
vcov.mp.stratum.fit <- function(object, ...)
  object$vcov

#' @author Valentijn de Jong
#' @method print   mp.stratified.fit
#' @export
print.mp.stratified.fit <- function(x, ...) {
  cat("Prediction models estimated in", length(x), "strata. Coefficients:\n")
  print(coef(x))
}

# Make a new mp.stratum.fit for mp.stratified.fit for metapred.
# fit model fit object, e.g. glm object
# Returns mp.stratum.fit
mp.stratum.fit <- function(fit) {
  out <- list()
  out[["coefficients"]] <- getCoefs(fit)
  out[["variances"]]    <- getVars(fit)
  # out[["covar"]]        <- getCoVars(fit)
  out[["vcov"]]         <- getCoVars(fit)
  out[["nobs"]]         <- nobs(fit, use.fallback = TRUE)
  out[["stratum.class"]]<- class(fit)
  
  class(out) <- "mp.stratum.fit"
  out
}

#' @author Valentijn de Jong
#' @method print   mp.stratum.fit
#' @export
print.mp.stratum.fit <- function(x, ...) {
  cat("Coefficients: ", "\n")
  print(coef(x))
}

# Too many args that might not be available for this method. I have abandoned it.
# predict.mp.stratum.fit <- function(object, newdata = NULL, family = NULL, 
# formula = NULL, method = NULL, type = "response") {
#   if (!is.null(family))
#     object$family <- family
#   if (is.null(formula))
#     stop("predict.mp.stratum.fit needs a formula")
#   else
#     object$formula <- formula
#   if (is.null(method))
#     stop("predict.mp.stratum.fit needs a prediction method")
# 
#   method(object = object, newdata = newdata, type = type)
# }

# As I would have had to implement it a million times:
family.default <- function(object, ...) 
  object$family

#' Standard errors and variances
#' 
#' Obtain standard errors or variances of a model fit
#' 
#' @aliases variances se tau tau2
#' 
#' @author Valentijn de Jong
#' 
#' @usage se(object, ...)
#' variances(object, ...)
#' tau(object, ...)
#' tau2(object, ...)
#' 
#' @param object A model fit object
#' @param ... other arguments
#' 
#' @return For \code{se} the standard errors of \code{object}, and for 
#' \code{variances} the variances. For \code{tau} the heterogeneity of the coefficients.
#' @export
se <- function(object, ...)
  UseMethod("se", object)

#' @export
se.default <- function(object, ...)
  sqrt(variances(object, ...))

#' @export
variances <- function(object, ...)
  UseMethod("variances", object)

#' @export
variances.default <- function(object, ...) {
  if (!is.null(v <- object$var)) # Non-exact match intented :)
    return(v)
  return(diag(vcov(object, ...))) # nrow and ncol are optional arguments! Ignore warning.
}

#' @export
variances.mp.perf <- function(object, ...)
  object$variances

#' @export
# variances.metapred <- function(object, ...)
#   variances(object[["global.model"]])

variances.metapred <- function(object, select = "global", ...)
  variances(subset(object, select = select, ...))

#' @export
variances.mp.cv <- function(object, select = "cv", ...) 
  variances(object[[select, exact = FALSE]])

#' @export
variances.mp.stratified.fit <- function(object, ...)
  t(as.data.frame(lapply(object, variances), check.names = FALSE))

#' @export
variances.mp.1st.fit <- function(object, ...) # Experimental
  t(as.data.frame(lapply(object, variances), check.names = FALSE))

#' @export
variances.mp.cv.meta.fit <- function(object, ...)
  variances.mp.stratified.fit(object, ...)

#' @export
variances.auc <- function(object, ...) 
  pROC::var(object)

#' @export
tau2 <- function(object, ...)
  tau(object, ...)^2

#' @export
tau <- function(object, ...)
  UseMethod("tau")

#' @export
tau.metapred <- function(object, ...)
  tau(subset(object, ...), ...)

#' @export
tau.default <- function(object, ...)
  object$tau

#' @author Valentijn de Jong
#' @method ci   listofperf
#' @importFrom pROC ci
#' @export
ci.listofperf <- function(object, ...) {
  # object <- list(...)$object  
  if (inherits(object[[1]], "recal")) {
    z <- lapply(object, ci.recal)
    return(data.frame(
      theta       = sapply(z, `[[`, 2),
      theta.ci.lb = sapply(z, `[[`, 1),
      theta.ci.ub = sapply(z, `[[`, 3)
    ))
  }
  if (inherits(object[[1]], "lm")) {
    z <- lapply(object, confint)
    return(data.frame(theta       = sapply(object, `[[`, 1),
                      theta.ci.lb = sapply(z, `[[`, 1),
                      theta.ci.ub = sapply(z, `[[`, 2)
    ))
  }
  if (inherits(object[[1]], "auc")) {
    z <- lapply(object, pROC::ci)
    return(data.frame(
      theta       = sapply(z, `[[`, 2),
      theta.ci.lb = sapply(z, `[[`, 1),
      theta.ci.ub = sapply(z, `[[`, 3)
    ))
  }
  if (inherits(object[[1]], "mse")) {
    z <- lapply(object, ci.mse)
    return(data.frame(
      theta       = sapply(z, `[[`, 2),
      theta.ci.lb = sapply(z, `[[`, 1),
      theta.ci.ub = sapply(z, `[[`, 3)
    ))
  }
  stop(paste("ci.listofperf does not recognize objects of class", class(object)))
}
# confint.listofperf <- function(object, ...) {
#   if (inherits(object[[1]], "lm")) {
#     z <- lapply(object, confint, parm = parm, level = level)
#     return(data.frame(theta       = sapply(object, `[[`, 1),
#                       theta.ci.lb = sapply(z, `[[`, 1),
#                       theta.ci.ub = sapply(z, `[[`, 2)
#                       ))
#   }
#   if (inherits(object[[1]], "auc")) {
#     z <- lapply(object, pROC::ci)
#     return(data.frame(
#       theta       = sapply(z, `[[`, 2),
#       theta.ci.lb = sapply(z, `[[`, 1),
#       theta.ci.ub = sapply(z, `[[`, 3)
#     ))
#   }
#   stop("confint.listofperf does not recognize the object.")
# }

# #' @importFrom pROC ci
# #' @export
# confint.auc <- function(object, parm, level, ...) {
#   
# }

#' @export
ci.recal <- function(object, conf = .95, ...) { # To be implemented in some other class. Temporarily for lm.
  ses <- se(object, ...)
  coefs <- coef(object, ...)
  z <- qt(1 - (1 - conf)/2, df = object$df.residual) # z = t distributed
  data.frame("ci.lb" = coefs - z * ses, "estimate" = coefs, "ci.ub" = coefs + z * ses)
}

#' @export
ci.mse <- function(object, conf = .95, ...) {
  ses <- se(object, ...)
  est <- object[["estimate"]]
  z <- qt(1 - (1 - conf)/2, df = object$n - 1) # z = t distributed
  data.frame("ci.lb" = est - z * ses, "estimate" = est, "ci.ub" = est + z * ses)
}


#' Generalizability estimates
#' 
#' Obtain generalizability estimates from a model fit.
#' 
#' @aliases gen generalizability
#' 
#' @author Valentijn de Jong
#' 
#' @usage gen(object, ...)
#' generalizability(object, ...)
#' 
#' @param object A model fit object, either a \link{metapred} or \code{subset(metapred)} object.
#' @param ... By default, the final model is selected. This parameter allows other arguments to be passed 
#' to \link{subset.metapred} such that the generalizability estimates of other steps/models may be
#' returned.. 
#' 
#' @export
gen <- function(object, ...) 
  UseMethod("gen")

#' @export
generalizability <- gen

#' @export
gen.metapred <- function(object, genFUN = 1, ...) 
  gen(subset(object, ...), genFUN = genFUN, ...)

#' @export
gen.mp.cv.val <- function(object, genFUN = 1, ...)
  object$gen.all[[genFUN]]


#' Performance estimates
#' 
#' Obtain performance estimates from a model fit.
#' 
#' @aliases perf performance 
#' 
#' @author Valentijn de Jong
#' 
#' @usage perf(object, ...)
#' performance(object, ...)
#' 
#' @param object A model fit object, either a \link{metapred} or \code{subset(metapred)} object.
#' @param ... By default, the final model is selected. This parameter allows other arguments to be
#' passed to \link{subset.metapred} such that the performance estimates of other steps/models may be
#' returned.. 
#' 
#' @export
perf <- function(object, ...) 
  UseMethod("perf")

#' @export
performance <- perf

#' @export
perf.metapred <- function(object, perfFUN = 1, ...) 
  perf(subset(object, ...), perfFUN = perfFUN, ...)

#' @export
perf.mp.cv.val <- function(object, perfFUN = 1, ...)
  object[["perf.all"]][[perfFUN]]
Any scripts or data that you put into this service are public.
metamisc documentation built on Sept. 25, 2022, 5:05 p.m.
rdrr.io home R language documentation Run R code online
CRAN packages Bioconductor packages R-Forge packages GitHub packages
Note that we can't provide technical support on individual packages. You should contact the package authors for that.
metamisc
Meta-Analysis of Diagnosis and Prognosis Research Studies

R/metapred.R
In metamisc: Meta-Analysis of Diagnosis and Prognosis Research Studies

Documented in fitted.metapred gen metapred perf se subset.metapred tau tau2 variances

Try the metamisc package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

metamisc Meta-Analysis of Diagnosis and Prognosis Research Studies

R/metapred.R In metamisc: Meta-Analysis of Diagnosis and Prognosis Research Studies

Documented in fitted.metapred gen metapred perf se subset.metapred tau tau2 variances

Try the metamisc package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

metamisc
Meta-Analysis of Diagnosis and Prognosis Research Studies

R/metapred.R
In metamisc: Meta-Analysis of Diagnosis and Prognosis Research Studies