R/mln.mean.sd.R

Defines functions mln.mean.sd

Documented in mln.mean.sd

#' Method for unknown non-normal distributions (MLN) approach for estimating the sample mean and standard deviation
#'
#' This function applies the Method for Unknown Non-Normal Distributions (MLN) approach to estimate the sample mean and standard deviation from a study that presents one of the following sets of summary statistics: \itemize{
#' \item S1: median, minimum and maximum values, and sample size
#' \item S2: median, first and third quartiles, and sample size
#' \item S3: median, minimum and maximum values, first and third quartiles, and sample size
#'  }
#'
#' Like the Box-Cox method of McGrath et al. (2020), the MLN method of Cai et al. (2021) assumes that the underlying distribution is normal after applying a suitable Box-Cox transformation with power parameter \eqn{\lambda}. Specifically, the MLN method consists of the following steps, outlined below.
#'
#' First, a maximum likelihood approach is used to estimate the power parameter \eqn{\lambda}, where the methods of Luo et al. (2016) and Wan et al. (2014) are applied to estimate the mean and standard deviation of the distribution of the transformed data. Then, a second round estimate of the mean and standard deviation of the distribution of the transformed data is obtained by maximum likelihood estimation conditional on the estimated power parameter. Finally, the inverse transformation is applied to estimate the sample mean and standard deviation of the original, untransformed data.
#'
#' @param min.val numeric value giving the sample minimum.
#' @param q1.val numeric value giving the sample first quartile.
#' @param med.val numeric value giving the sample median.
#' @param q3.val numeric value giving the sample third quartile.
#' @param max.val numeric value giving the sample maximum.
#' @param n numeric value giving the sample size.
#'
#' @return A object of class \code{mln.mean.sd}. The object is a list with the following components:
#' \item{est.mean}{Estimated sample mean.}
#' \item{est.sd}{Estimated sample standard deviation.}
#' \item{location}{Estimated mean of the Box-Cox transformed data.}
#' \item{scale}{Estimated standard deviation of the Box-Cox transformed data.}
#' \item{shape}{Estimated transformation parameter \eqn{\lambda}.}
#' \item{bc.norm.rvs}{The random variables generated by the Box-Cox (or, equivalently, power-normal) distribution during the Monte Carlo simulation.}
#' \item{...}{Some additional elements.}
#'
#' The results are printed with the \code{\link{print.mln.mean.sd}} function.
#'
#' @examples
#' ## Generate S2 summary data
#' set.seed(1)
#' n <- 100
#' x <- stats::rlnorm(n, 2.5, 1)
#' quants <- stats::quantile(x, probs = c(0.25, 0.5, 0.75))
#' obs.mean <- mean(x)
#' obs.sd <- stats::sd(x)
#'
#' ## Estimate the sample mean and standard deviation using the MLN method
#' mln.mean.sd(q1.val = quants[1], med.val = quants[2], q3.val = quants[3],
#'     n = n)
#'
#' @references Cai S., Zhou J., and Pan J. (2021). Estimating the sample mean and standard deviation from order statistics and sample size in meta-analysis. \emph{Statistical Methods in Medical Research}. \strong{30}(12):2701-2719.
#' @references McGrath S., Zhao X., Steele R., Thombs B.D., Benedetti A., and the DEPRESsion Screening Data (DEPRESSD) Collaboration. (2020). Estimating the sample mean and standard deviation from commonly reported quantiles in meta-analysis. \emph{Statistical Methods in Medical Research}. \strong{29}(9):2520-2537.
#' @references Box G.E.P., and D.R. Cox. (1964). An analysis of transformations. \emph{Journal of the Royal Statistical Society Series B}. \strong{26}(2):211-52.
#' @references Luo D., Wan X., Liu J., and Tong T. (2016). Optimally estimating the sample mean from the sample size, median, mid-range, and/or mid-quartile range. \emph{Statistical Methods in Medical Research}. \strong{27}(6):1785-805
#' @references Wan X., Wang W., Liu J., and Tong T. (2014). Estimating the sample mean and standard deviation from the sample size, median, range and/or interquartile range. \emph{BMC Medical Research Methodology}. \strong{14}:135.
#' @export

mln.mean.sd <- function(min.val, q1.val, med.val, q3.val, max.val, n) {

  args <- as.list(environment())

  scenario <- get.scenario(min.val = min.val, q1.val = q1.val, med.val = med.val,
                           q3.val = q3.val, max.val = max.val)
  check_errors(min.val = min.val, q1.val = q1.val, med.val = med.val,
               q3.val = q3.val, max.val = max.val, n = n, scenario = scenario)

  h <- floor(0.25 * n + 1)
  j <- floor(0.5 * n + 1)
  k <- floor(0.75 * n + 1)

  boxcoxtrans <- function(lambda, x) {
    if (lambda == 0) {
      return(log(x))
    } else {
      return((x^lambda - 1) / lambda)
    }
  }

  logL.lambda <- function(lambda){
    if (scenario == 'S1'){
      min.val.lambda <- boxcoxtrans(lambda, min.val)
      med.val.lambda <- boxcoxtrans(lambda, med.val)
      max.val.lambda <- boxcoxtrans(lambda, max.val)
      mu <- metaBLUE::Luo.mean(X = c(min.val.lambda, med.val.lambda, max.val.lambda), n = n, type = scenario)$muhat
      sigma <- metaBLUE::Wan.std(X = c(min.val.lambda, med.val.lambda, max.val.lambda), n = n, type = scenario)$sigmahat
      logL <- sum(stats::dnorm(c(min.val.lambda, med.val.lambda, max.val.lambda), mu, sigma, log = TRUE)) +
        (j - 2) * log(stats::pnorm(med.val.lambda, mu, sigma)-stats::pnorm(min.val.lambda, mu, sigma)) +
        (n - j - 1) * log(stats::pnorm(max.val.lambda, mu, sigma) - stats::pnorm(med.val.lambda, mu, sigma))
    } else if (scenario == 'S2'){
      q1.val.lambda <- boxcoxtrans(lambda, q1.val)
      med.val.lambda <- boxcoxtrans(lambda, med.val)
      q3.val.lambda <- boxcoxtrans(lambda, q3.val)
      mu <- metaBLUE::Luo.mean(X = c(q1.val.lambda, med.val.lambda, q3.val.lambda), n = n, type = scenario)$muhat
      sigma <- metaBLUE::Wan.std(X = c(q1.val.lambda, med.val.lambda, q3.val.lambda), n = n, type = scenario)$sigmahat
      logL <- sum(stats::dnorm(c(q1.val.lambda, med.val.lambda, q3.val.lambda), mu, sigma, log = TRUE)) +
        (h - 1) * stats::pnorm(q1.val.lambda, mu, sigma, log.p = TRUE) + (j - h - 1) * log(stats::pnorm(med.val.lambda, mu, sigma)-stats::pnorm(q1.val.lambda, mu, sigma)) +
        (k - j - 1) * log(stats::pnorm(q3.val.lambda, mu, sigma) - stats::pnorm(med.val.lambda, mu, sigma)) + (n - k) * stats::pnorm(q3.val.lambda, mu, sigma, lower.tail = FALSE, log.p = TRUE)
    } else if (scenario == 'S3'){
      min.val.lambda <- boxcoxtrans(lambda, min.val)
      q1.val.lambda <- boxcoxtrans(lambda, q1.val)
      med.val.lambda <- boxcoxtrans(lambda, med.val)
      q3.val.lambda <- boxcoxtrans(lambda, q3.val)
      max.val.lambda <- boxcoxtrans(lambda, max.val)
      mu <- metaBLUE::Luo.mean(X = c(min.val.lambda, q1.val.lambda, med.val.lambda, q3.val.lambda, max.val.lambda), n = n, type = scenario)$muhat
      sigma <- metaBLUE::Wan.std(X = c(min.val.lambda, q1.val.lambda, med.val.lambda, q3.val.lambda, max.val.lambda), n = n, type = scenario)$sigmahat
      logL <- sum(stats::dnorm(c(min.val.lambda, q1.val.lambda, med.val.lambda, q3.val.lambda, max.val.lambda), mu, sigma, log = TRUE)) +
        (h - 2) * log(stats::pnorm(q1.val.lambda, mu, sigma) - stats::pnorm(min.val.lambda, mu, sigma)) + (j - h - 1) * log(stats::pnorm(med.val.lambda, mu, sigma) - stats::pnorm(q1.val.lambda, mu, sigma)) +
        (k - j - 1)*log(stats::pnorm(q3.val.lambda, mu, sigma) - stats::pnorm(med.val.lambda, mu, sigma)) + (n - k - 1) * log(stats::pnorm(max.val.lambda, mu, sigma) - stats::pnorm(q3.val.lambda, mu, sigma))
    }
    return(logL)
  }

  #the MLE of lambda
  opt <- tryCatch({suppressWarnings(stats::optimize(f = logL.lambda, interval = c(0, 10),
                                                    tol = 0.001, maximum = TRUE))},
                  error = NULL)
  if (is.null(opt)) {
    stop("Optimization algorithm for finding lambda did not converge.")
  }
  lambda.hat <-round.lambda(opt$maximum)

  if (scenario == 'S1'){
    min.val.lambda <- boxcoxtrans(lambda.hat, min.val)
    med.val.lambda <- boxcoxtrans(lambda.hat, med.val)
    max.val.lambda <- boxcoxtrans(lambda.hat, max.val)
  } else if (scenario == 'S2'){
    q1.val.lambda <- boxcoxtrans(lambda.hat, q1.val)
    med.val.lambda <- boxcoxtrans(lambda.hat, med.val)
    q3.val.lambda <- boxcoxtrans(lambda.hat, q3.val)
  } else if (scenario == 'S3'){
    min.val.lambda <- boxcoxtrans(lambda.hat, min.val)
    q1.val.lambda <- boxcoxtrans(lambda.hat, q1.val)
    med.val.lambda <- boxcoxtrans(lambda.hat, med.val)
    q3.val.lambda <- boxcoxtrans(lambda.hat, q3.val)
    max.val.lambda <- boxcoxtrans(lambda.hat, max.val)
  }


  #-------------------------------------------------------------------------------
  #estimating mu and sigma after Box-Cox transformation
  if (scenario == 'S1'){
    quants <- c(min.val.lambda, med.val.lambda, max.val.lambda)
  } else if (scenario == 'S2'){
    quants <- c(q1.val.lambda, med.val.lambda, q3.val.lambda)
  } else if (scenario == 'S3'){
    quants <- c(min.val.lambda, q1.val.lambda, med.val.lambda, q3.val.lambda, max.val.lambda)
  }
  mean.LW <- metaBLUE::Luo.mean(X = quants, n = n, type = scenario)$muhat
  sd.LW <- metaBLUE::Wan.std(X = quants, n = n, type = scenario)$sigmahat

  logL <- function(theta){
    mu <- theta[1]
    sigma <- theta[2]
    if (scenario == 'S1'){
      logL <- sum(stats::dnorm(c(min.val.lambda, med.val.lambda, max.val.lambda), mu, sigma, log = TRUE)) +
        (j - 2) * log(stats::pnorm(med.val.lambda, mu, sigma)-stats::pnorm(min.val.lambda, mu, sigma)) +
        (n - j - 1) * log(stats::pnorm(max.val.lambda, mu, sigma) - stats::pnorm(med.val.lambda, mu, sigma))
    } else if (scenario == 'S2'){
      logL <- sum(stats::dnorm(c(q1.val.lambda, med.val.lambda, q3.val.lambda), mu, sigma, log = TRUE)) +
        (h - 1) * stats::pnorm(q1.val.lambda, mu, sigma, log.p = TRUE) + (j - h - 1) * log(stats::pnorm(med.val.lambda, mu, sigma)-stats::pnorm(q1.val.lambda, mu, sigma)) +
        (k - j - 1) * log(stats::pnorm(q3.val.lambda, mu, sigma) - stats::pnorm(med.val.lambda, mu, sigma)) + (n - k) * stats::pnorm(q3.val.lambda, mu, sigma, lower.tail = FALSE, log.p = TRUE)
    } else if (scenario == 'S3'){
      logL <- sum(stats::dnorm(c(min.val.lambda, q1.val.lambda, med.val.lambda, q3.val.lambda, max.val.lambda), mu, sigma, log = TRUE)) +
        (h - 2) * log(stats::pnorm(q1.val.lambda, mu, sigma) - stats::pnorm(min.val.lambda, mu, sigma)) + (j - h - 1) * log(stats::pnorm(med.val.lambda, mu, sigma) - stats::pnorm(q1.val.lambda, mu, sigma)) +
        (k - j - 1)*log(stats::pnorm(q3.val.lambda, mu, sigma) - stats::pnorm(med.val.lambda, mu, sigma)) + (n - k - 1) * log(stats::pnorm(max.val.lambda, mu, sigma) - stats::pnorm(q3.val.lambda, mu, sigma))
    }
    return(-logL)
  }

  est.MLE <- tryCatch({suppressWarnings(stats::optim(par = c(mean.LW, sd.LW),
                                                     fn = logL))},
                      error = NULL)
  if (is.null(est.MLE)) {
    stop("Optimization algorithm for finding the MLE of mu and sigma did not converge.")
  }
  mu.lambda <- est.MLE$par[1]
  sigma.lambda <- est.MLE$par[2]

  data.lambda <- stats::rnorm(10000, mu.lambda, sigma.lambda)
  if (lambda.hat != 0){
    data.lambda <- data.lambda[(data.lambda > - 1 / lambda.hat) & (data.lambda < 2 * mu.lambda + 1 / lambda.hat)]
  }
  if (lambda.hat == 0){
    mln.norm.rvs <- exp(data.lambda)
  } else {
    mln.norm.rvs <- (lambda.hat * data.lambda + 1)^(1 / lambda.hat)
  }

  output <- list(est.mean = mean(mln.norm.rvs),
                 est.sd = stats::sd(mln.norm.rvs),
                 location = mu.lambda, scale = sigma.lambda,
                 shape = lambda.hat, mln.norm.rvs = mln.norm.rvs,
                 args = args, scenario = scenario)
  class(output) <- "mln.mean.sd"
  return(output)
}

Try the estmeansd package in your browser

Any scripts or data that you put into this service are public.

estmeansd documentation built on June 19, 2022, 1:05 a.m.