R/hp.R
In BayesMortalityPlus: Bayesian Mortality Modelling

Documented in hp

#' @title Bayesian Heligman-Pollard curve for mortality table graduation
#'
#' @description
#'
#' This function fits the Heligman-Pollard (HP) model following a Bayesian framework using
#' Markov chain Monte Carlo techniques to sample the posterior distribution.
#' Three model options are available: The Binomial and the Poisson models consider nine parameters,
#' whereas the Log-Normal model considers eight parameters for modelling the HP curve.
#'
#' @usage
#' hp(x, Ex, Dx, model = c("binomial", "lognormal", "poisson"),
#'  M = NULL, bn = NULL, thin = 10, m = rep(NA, 8), v = rep(NA, 8),
#'  inits = NULL, K = NULL, sigma2 = NULL, prop.control = NULL,
#'  reduced_model = FALSE)
#'
#' @param x Numeric vector of the ages.
#' @param Ex Numeric vector with the exposure by age.
#' @param Dx Numeric vector with the death counts by age.
#' @param model Character string specifying the model to be adopted. The options are: "binomial", "lognormal" or "poisson".
#' @param M Positive integer indicating the number of iterations of the MCMC run. The default value is 50000 iterations. For the reduced model, the default is 30000 iterations.
#' @param bn Non-negative integer indicating the number of iteration to be discarded as the burn-in period. The default value is half of the M value, rounded.
#' @param thin Positive integer specifying the period for saving samples. The default value is 10.
#' @param m Numeric vector with the mean of the prior distributions for (A, B, C, D, E, F, G, H).
#' @param v Numeric vector with the variance of the prior distributions for (A, B, C, D, E, F, G, H).
#' @param inits Numeric vector with the initial values for the parameters (A, B, C, D, E, F, G, H).
#' @param K Number that specifies the extra parameter 'K' value for binomial and poisson models. It is considered only if model = "binomial" or model = "poisson". The default value is the optimal value.
#' @param sigma2 Positive number that specifies initial value of sigma2 in lognormal model. It is considered only if model = "lognormal".
#' @param prop.control Positive number that is considered for tuning the acceptance rate of MCMC.
#' @param reduced_model Logical value which determines if reduced model should be addressed. If 'TRUE' (default), the first term of the HP curve (infant mortality) is not considered.
#'
#' @details
#' The binomial model assumes that Dx, the death count for the age x, has a binomial distribution:
#' Bin(Ex, qx), where qx is probability of death in age x. The poisson model assumes that Dx has a
#' Poisson distribution: Po(Ex*qx). Both models consider the nine parameters
#' HP curve, that was proposed by Heligman and Pollard (1980):
#'
#'  \eqn{HP_x = A^{(x+B)^C} + De^{(-E(log(x/F))^2)} + GH^x/(1+KGH^x)}
#'
#'  \eqn{qx = 1-e^{(-HP_x)}}
#'
#' This approximation ensures that qx, which is a probability, is in the correct range.
#'
#' The Log-Normal model assumes that the log odds of death qx/(1-qx) has Normal distribution
#' with a constant variance for all the ages. This model was proposed by Dellaportas et al.(2001)
#' and they consider the eighth parameters HP curve as follows:
#'
#'  \eqn{log(qx/(1-qx)) = log(A^{(x+B)^C} + De^{-E(log(x/F))^2} + GH^x) + \epsilon_x},
#'
#' where \eqn{\epsilon_x} has independent distributions Normal(0, sigma2) for all ages. More details
#' of this model are available in Dellaportas, Smith e Stavropoulos (2001).
#'
#' The reduced model does not consider the first term of the HP curve: \eqn{A^{(x+B)^C}}. In this
#' case, A, B and C are fixed as zero.
#'
#' All parameters, with the exception of the extra parameter K of the Binomial and the Poisson models
#' that is the estimated optimal value, are estimated by the MCMC methods. Gibbs sampling for sigma2 and
#' Metropolis-Hastings for parameters A, B, C, D, E, F, G and H. Informative prior distributions
#' should help the method to converge quickly.
#'
#' @return A `HP` class object.
#' \item{summary}{A table with summaries measures of the parameters.}
#' \item{post.samples}{A list with the chains generated by MCMC.}
#' \item{data}{A table with the data considered in fitted model.}
#' \item{info}{A list with some informations of the fitted model like prior distributions mean and variance, initial values.}
#'
#' @references Dellaportas, P., Smith, A. F., and Stavropoulos, P. (2001). “Bayesian Analysis of Mortality Data.” \emph{Journal of the Royal Statistical Society: Series A (Statistics in Society)} 164 (2). Wiley Online Library: 275–291.
#' @references Heligman, Larry, and John H Pollard. (1980). “The Age Pattern of Mortality.” \emph{Journal of the Institute of Actuaries (1886-1994)} 107 (1). JSTOR: 49–80.
#'
#' @examples
#' ## Importing mortality data from the USA available on the Human Mortality Database (HMD):
#' data(USA)
#'
#' ## Selecting the exposure and death count of the 2010 male population ranging from 0 to 90 years old
#' USA2010 = USA[USA$Year == 2010,]
#' x = 0:90
#' Ex = USA2010$Ex.Male[x+1]
#' Dx = USA2010$Dx.Male[x+1]
#'
#' ## Fitting binomial model
#' \donttest{fit = hp(x = x, Ex = Ex, Dx = Dx)
#' print(fit)
#' fit$summary
#' fit$info
#' }
#' ## Fitting lognormal model
#' ## Specifying number of iterations, burn-in, thinning and the initial value of sigma2
#' fit = hp(x = x, Ex = Ex, Dx = Dx, model = "lognormal",
#'          M = 1000, bn = 0, thin = 10, sigma2 = 0.1)
#' summary(fit)
#'
#' ## Fitting poisson model
#' ## Specifying the prior distribution parameters for B and E and fixing K as 0.
#' \donttest{fit = hp(x = x, Ex = Ex, Dx = Dx, model = "poisson",
#'          m = c(NA, 0.08, NA, NA, 9, NA, NA, NA),
#'          v = c(NA, 1e-4, NA, NA, 0.1, NA, NA, NA), K = 0)
#' summary(fit)
#' }
#'
#' ## Using other functions available in the package:
#' ## plotting (See "?plot.HP" in the BayesMortalityPlus package for more options):
#' plot(fit)
#'
#' ## qx estimation (See "?fitted.HP" in the BayesMortalityPlus package for more options):
#' fitted(fit)
#'
#' ## chain's plot (See "?plot_chain" for more options):
#' plot_chain(fit)
#'
#'@seealso [fitted.HP()], [plot.HP()], [print.HP()] and [summary.HP()] for `HP` methods to native R functions [fitted()],
#'[plot()], [print()] and [summary()].
#'
#'[expectancy.HP()] and [Heatmap.HP()] for `HP` methods to compute and visualise the truncated life expectancy
#'via [expectancy()] and [Heatmap()] functions.
#'
#'[hp_close()] for close methods to expand the life tables and [hp_mix()] for mixing life tables.
#'
#'[plot_chain()] to visualise the markov chains, respectively.
#'
#'
#' @include hp_binomial.R
#' @include hp_lognormal.R
#' @include hp_poisson.R
#' @include hp_binomial_reduced.R
#' @include hp_lognormal_reduced.R
#' @include hp_poisson_reduced.R
#' @include fun_aux.R
#'
#' @importFrom MASS mvrnorm
#' @import progress
#' @import stats
#'
#' @export
hp <- function(x, Ex, Dx, model = c("binomial", "lognormal", "poisson"), M = NULL, bn = NULL,
               thin = 10, m = rep(NA_real_, 8), v = rep(NA_real_, 8), inits = NULL, K = NULL, sigma2 = NULL,
               prop.control = NULL, reduced_model = FALSE){

  ##############################################################################################
  ### Validation
  model = match.arg(model)

  x <- unique(trunc(x)); tam_x = length(x); tam_ex = length(Ex); tam_dx = length(Dx)
  if(tam_x != tam_ex || tam_x != tam_dx || tam_ex != tam_dx){ stop("x, Ex, Dx must be the same length.") }
  if(sum(x < 0, na.rm = T) > 0 || sum(Ex < 0, na.rm = T) > 0 || sum(Dx < 0, na.rm = T) > 0){ stop("x, Ex, Dx must be nonnegative numbers.") }

  x0 <- min(x, na.rm = T)
  if(x0 >= 15){ if(!reduced_model) warning("Lower age >= 15. We recommend to use reduced_model = TRUE.", immediate. = T) }

  if(is.null(M)){
    if(reduced_model){
      M = 30000
    }else{
      M = 50000
    }
  }

  if(is.null(bn)){
    bn = round(M/2)
  }

  if(bn > M){ stop("The total number of iterations (M) must be greater than the burnin period (bn).") }

  M = as.integer(trunc(M)); bn = as.integer(trunc(bn)); thin = trunc(thin)
  if(M < 1 || bn < 0 || thin < 1){ stop("M, bn and thin must be positive numbers.") }

  #### priori validation

  ## Means
  m_default <- c(1/51, 0.5, 2/12, 1/51, 3/0.5, 25, 1/101, 100/90)
  if(length(m) != 8) {stop("Length of m must be equal to 8.") }

  m <- ifelse(is.na(m), m_default, m)

  lower = c(0, 0, 0, 0, 0, 15, 0, 0)
  upper = c(1, 1, 1, 1, Inf, 110, 1, Inf)

  if(any(m[c(1:5,7,8)] <= 0) || any(m[c(1:4,7)] >= 1) || m[6] <= 15 || m[6] >= 110){param = LETTERS[1:8]; stop(sprintf("The means are not in the correct interval for the parameters.", paste(param[which((m <= lower || m >= upper))], collapse = ", ")))}

  ## Var
  v_default <- c(50/((51^2)*(52)), 1/12, 20/((12^2)*13), 50/((51^2)*(52)), 3/0.25, 100, 100/((101^2)*(102)), 1/81)
  if(length(v) != 8) {stop("Length of v must be equal to 8.") }

  v <- ifelse(is.na(v), v_default, v)

  if(any(v <= 0)){ stop("At least one of the variance is invalid. All variance must be positives.") }
  if(any(v[c(1:4,7)] >= m[c(1:4,7)]*(1-m[c(1:4,7)]))) {{param = LETTERS[c(1:4,7)]; stop(sprintf("Variance of the parameter(s) %s too large.", paste(param[which(v[c(1:4,7)] >= m[c(1:4,7)]*(1-m[c(1:4,7)]))], collapse = ", ")))}}

  ## prop.control
  prop.control <- ifelse(is.null(prop.control), 1, prop.control)
  if(prop.control <= 0){ stop("prop.control must be a positive number.") }

  ##############################################################################################
  ### MCMC

  if(model == "binomial"){
    if(reduced_model){
      mcmc = hp_binomial_red(x = x, Ex = Ex, Dx = Dx, M = M, bn = bn, thin = thin, m = m, v = v,
                             inits = inits, prop.control = prop.control, K = K)
    }else{
      mcmc = hp_binomial(x = x, Ex = Ex, Dx = Dx, M = M, bn = bn, thin = thin, m = m, v = v,
                         inits = inits, prop.control = prop.control, K = K)
    }
  }else if(model == "lognormal"){
    if(reduced_model){
      mcmc = hp_lognormal_red(x = x, Ex = Ex, Dx = Dx, M = M, bn = bn, thin = thin, m = m, v = v,
                              inits = inits, prop.control = prop.control, sigma2 = sigma2)
    }else{
      mcmc = hp_lognormal(x = x, Ex = Ex, Dx = Dx, M = M, bn = bn, thin = thin, m = m, v = v,
                          inits = inits, prop.control = prop.control, sigma2 = sigma2)
    }
  }else if(model == "poisson"){
    if(reduced_model){
      mcmc = hp_poisson_red(x = x, Ex = Ex, Dx = Dx, M = M, bn = bn, thin = thin, m = m, v = v,
                            inits = inits, prop.control = prop.control, K = K)
    }else{
      mcmc = hp_poisson(x = x, Ex = Ex, Dx = Dx, M = M, bn = bn, thin = thin, m = m, v = v,
                        inits = inits, prop.control = prop.control, K = K)
    }
  }else{stop("Invalid model.")
  }

  ##############################################################################################
  ### house keeping

  param = c("A", "B", "C", "D", "E", "F", "G", "H", "K")
  if(model == "lognormal"){ param = param[-9] }

  #### data
  Qx = 1-exp(-Dx/Ex)
  data = data.frame(x = x, Ex = Ex, Dx = Dx, qx = Qx)

  #### final samples
  mcmc_theta = mcmc$theta.post
  sigma2 = mcmc$sigma2

  #### summary
  accept = 100*t(data.frame(
    a = mcmc$cont[1]/M,
    b = mcmc$cont[2]/M,
    c = mcmc$cont[3]/M,
    d = mcmc$cont[4]/M,
    e = mcmc$cont[5]/M,
    f = mcmc$cont[6]/M,
    g = mcmc$cont[7]/M,
    h = mcmc$cont[8]/M,
    k = mcmc$cont[9]/M
  ))
  if(model == "lognormal"){ accept = accept[-9,] }

  resumo = data.frame(
    mean = apply(mcmc_theta, 2, mean),
    sd = apply(mcmc_theta, 2, sd),
    t(apply(mcmc_theta, 2, quantile, probs = c(0.025, 0.50, 0.975))),
    "Accept %" = round(accept, 1),
    row.names = param
  )
  colnames(resumo) <- c("mean", "sd", "2.5%", "50.0%", "97.5%", "Accept %")

  #### prioris
  prior.dist <- t(data.frame(means = m, vars = v)); colnames(prior.dist) = param[-9]

  ### returns
  return(structure(list(summary = round(resumo, 6),
                        post.samples = list(mcmc_theta = mcmc_theta, sigma2 = sigma2),
                        data = data,
                        info = list(model = model,
                                    reduced = reduced_model,
                                    inits = mcmc$inits,
                                    prior.dist = prior.dist,
                                    prop.control = prop.control)),
                   class = "HP"))
}