grbkw: Gradient of the Negative Log-Likelihood for the BKw...

View source: R/RcppExports.R

grbkwR Documentation

Gradient of the Negative Log-Likelihood for the BKw Distribution

Description

Computes the gradient vector (vector of first partial derivatives) of the negative log-likelihood function for the Beta-Kumaraswamy (BKw) distribution with parameters alpha (\alpha), beta (\beta), gamma (\gamma), and delta (\delta). This distribution is the special case of the Generalized Kumaraswamy (GKw) distribution where \lambda = 1. The gradient is typically used in optimization algorithms for maximum likelihood estimation.

Computes the gradient vector (vector of first partial derivatives) of the negative log-likelihood function for the Beta-Kumaraswamy (BKw) distribution with parameters alpha (\alpha), beta (\beta), gamma (\gamma), and delta (\delta). This distribution is the special case of the Generalized Kumaraswamy (GKw) distribution where \lambda = 1. The gradient is typically used in optimization algorithms for maximum likelihood estimation.

Usage

grbkw(par, data)

Arguments

par

A numeric vector of length 4 containing the distribution parameters in the order: alpha (\alpha > 0), beta (\beta > 0), gamma (\gamma > 0), delta (\delta \ge 0).

data

A numeric vector of observations. All values must be strictly between 0 and 1 (exclusive).

Details

The components of the gradient vector of the negative log-likelihood (-\nabla \ell(\theta | \mathbf{x})) for the BKw (\lambda=1) model are:

-\frac{\partial \ell}{\partial \alpha} = -\frac{n}{\alpha} - \sum_{i=1}^{n}\ln(x_i) + \sum_{i=1}^{n}\left[x_i^{\alpha} \ln(x_i) \left(\frac{\beta(\delta+1)-1}{v_i} - \frac{(\gamma-1) \beta v_i^{\beta-1}}{w_i}\right)\right]

-\frac{\partial \ell}{\partial \beta} = -\frac{n}{\beta} - (\delta+1)\sum_{i=1}^{n}\ln(v_i) + \sum_{i=1}^{n}\left[\frac{(\gamma-1) v_i^{\beta} \ln(v_i)}{w_i}\right]

-\frac{\partial \ell}{\partial \gamma} = n[\psi(\gamma) - \psi(\gamma+\delta+1)] - \sum_{i=1}^{n}\ln(w_i)

-\frac{\partial \ell}{\partial \delta} = n[\psi(\delta+1) - \psi(\gamma+\delta+1)] - \beta\sum_{i=1}^{n}\ln(v_i)

where:

  • v_i = 1 - x_i^{\alpha}

  • w_i = 1 - v_i^{\beta} = 1 - (1-x_i^{\alpha})^{\beta}

  • \psi(\cdot) is the digamma function (digamma).

These formulas represent the derivatives of -\ell(\theta), consistent with minimizing the negative log-likelihood. They correspond to the general GKw gradient (grgkw) components for \alpha, \beta, \gamma, \delta evaluated at \lambda=1. Note that the component for \lambda is omitted. Numerical stability is maintained through careful implementation.

The components of the gradient vector of the negative log-likelihood (-\nabla \ell(\theta | \mathbf{x})) for the BKw (\lambda=1) model are:

-\frac{\partial \ell}{\partial \alpha} = -\frac{n}{\alpha} - \sum_{i=1}^{n}\ln(x_i) + \sum_{i=1}^{n}\left[x_i^{\alpha} \ln(x_i) \left(\frac{\beta(\delta+1)-1}{v_i} - \frac{(\gamma-1) \beta v_i^{\beta-1}}{w_i}\right)\right]

-\frac{\partial \ell}{\partial \beta} = -\frac{n}{\beta} - (\delta+1)\sum_{i=1}^{n}\ln(v_i) + \sum_{i=1}^{n}\left[\frac{(\gamma-1) v_i^{\beta} \ln(v_i)}{w_i}\right]

-\frac{\partial \ell}{\partial \gamma} = n[\psi(\gamma) - \psi(\gamma+\delta+1)] - \sum_{i=1}^{n}\ln(w_i)

-\frac{\partial \ell}{\partial \delta} = n[\psi(\delta+1) - \psi(\gamma+\delta+1)] - \beta\sum_{i=1}^{n}\ln(v_i)

where:

  • v_i = 1 - x_i^{\alpha}

  • w_i = 1 - v_i^{\beta} = 1 - (1-x_i^{\alpha})^{\beta}

  • \psi(\cdot) is the digamma function (digamma).

These formulas represent the derivatives of -\ell(\theta), consistent with minimizing the negative log-likelihood. They correspond to the general GKw gradient (grgkw) components for \alpha, \beta, \gamma, \delta evaluated at \lambda=1. Note that the component for \lambda is omitted. Numerical stability is maintained through careful implementation.

Value

Returns a numeric vector of length 4 containing the partial derivatives of the negative log-likelihood function -\ell(\theta | \mathbf{x}) with respect to each parameter: (-\partial \ell/\partial \alpha, -\partial \ell/\partial \beta, -\partial \ell/\partial \gamma, -\partial \ell/\partial \delta). Returns a vector of NaN if any parameter values are invalid according to their constraints, or if any value in data is not in the interval (0, 1).

Returns a numeric vector of length 4 containing the partial derivatives of the negative log-likelihood function -\ell(\theta | \mathbf{x}) with respect to each parameter: (-\partial \ell/\partial \alpha, -\partial \ell/\partial \beta, -\partial \ell/\partial \gamma, -\partial \ell/\partial \delta). Returns a vector of NaN if any parameter values are invalid according to their constraints, or if any value in data is not in the interval (0, 1).

Author(s)

Lopes, J. E.

References

Cordeiro, G. M., & de Castro, M. (2011). A new family of generalized distributions. Journal of Statistical Computation and Simulation,

Kumaraswamy, P. (1980). A generalized probability density function for double-bounded random processes. Journal of Hydrology, 46(1-2), 79-88.

(Note: Specific gradient formulas might be derived or sourced from additional references).

Cordeiro, G. M., & de Castro, M. (2011). A new family of generalized distributions. Journal of Statistical Computation and Simulation,

Kumaraswamy, P. (1980). A generalized probability density function for double-bounded random processes. Journal of Hydrology, 46(1-2), 79-88.

(Note: Specific gradient formulas might be derived or sourced from additional references).

See Also

grgkw (parent distribution gradient), llbkw (negative log-likelihood for BKw), hsbkw (Hessian for BKw, if available), dbkw (density for BKw), optim, grad (for numerical gradient comparison), digamma.

grgkw (parent distribution gradient), llbkw (negative log-likelihood for BKw), hsbkw (Hessian for BKw, if available), dbkw (density for BKw), optim, grad (for numerical gradient comparison), digamma.

Examples


# Assuming existence of rbkw, llbkw, grbkw, hsbkw functions for BKw

# Generate sample data
set.seed(123)
true_par_bkw <- c(alpha = 2, beta = 3, gamma = 1, delta = 0.5)
if (exists("rbkw")) {
  sample_data_bkw <- rbkw(100, alpha = true_par_bkw[1], beta = true_par_bkw[2],
                         gamma = true_par_bkw[3], delta = true_par_bkw[4])
} else {
  sample_data_bkw <- rgkw(100, alpha = true_par_bkw[1], beta = true_par_bkw[2],
                         gamma = true_par_bkw[3], delta = true_par_bkw[4], lambda = 1)
}
hist(sample_data_bkw, breaks = 20, main = "BKw(2, 3, 1, 0.5) Sample")

# --- Find MLE estimates ---
start_par_bkw <- c(1.5, 2.5, 0.8, 0.3)
mle_result_bkw <- stats::optim(par = start_par_bkw,
                               fn = llbkw,
                               gr = grbkw, # Use analytical gradient for BKw
                               method = "BFGS",
                               hessian = TRUE,
                               data = sample_data_bkw)

# --- Compare analytical gradient to numerical gradient ---
if (mle_result_bkw$convergence == 0 &&
    requireNamespace("numDeriv", quietly = TRUE)) {

  mle_par_bkw <- mle_result_bkw$par
  cat("\nComparing Gradients for BKw at MLE estimates:\n")

  # Numerical gradient of llbkw
  num_grad_bkw <- numDeriv::grad(func = llbkw, x = mle_par_bkw, data = sample_data_bkw)

  # Analytical gradient from grbkw
  ana_grad_bkw <- grbkw(par = mle_par_bkw, data = sample_data_bkw)

  cat("Numerical Gradient (BKw):\n")
  print(num_grad_bkw)
  cat("Analytical Gradient (BKw):\n")
  print(ana_grad_bkw)

  # Check differences
  cat("Max absolute difference between BKw gradients:\n")
  print(max(abs(num_grad_bkw - ana_grad_bkw)))

} else {
  cat("\nSkipping BKw gradient comparison.\n")
}

# --- Optional: Compare with relevant components of GKw gradient ---
# Requires grgkw function
if (mle_result_bkw$convergence == 0 && exists("grgkw")) {
  # Create 5-param vector for grgkw (insert lambda=1)
  mle_par_gkw_equiv <- c(mle_par_bkw[1:4], lambda = 1.0)
  ana_grad_gkw <- grgkw(par = mle_par_gkw_equiv, data = sample_data_bkw)
  # Extract components corresponding to alpha, beta, gamma, delta
  ana_grad_gkw_subset <- ana_grad_gkw[c(1, 2, 3, 4)]

  cat("\nComparison with relevant components of GKw gradient:\n")
  cat("Max absolute difference:\n")
  print(max(abs(ana_grad_bkw - ana_grad_gkw_subset))) # Should be very small
}




# Assuming existence of rbkw, llbkw, grbkw, hsbkw functions for BKw

# Generate sample data
set.seed(123)
true_par_bkw <- c(alpha = 2, beta = 3, gamma = 1, delta = 0.5)
if (exists("rbkw")) {
  sample_data_bkw <- rbkw(100, alpha = true_par_bkw[1], beta = true_par_bkw[2],
                         gamma = true_par_bkw[3], delta = true_par_bkw[4])
} else {
  sample_data_bkw <- rgkw(100, alpha = true_par_bkw[1], beta = true_par_bkw[2],
                         gamma = true_par_bkw[3], delta = true_par_bkw[4], lambda = 1)
}
hist(sample_data_bkw, breaks = 20, main = "BKw(2, 3, 1, 0.5) Sample")

# --- Find MLE estimates ---
start_par_bkw <- c(1.5, 2.5, 0.8, 0.3)
mle_result_bkw <- stats::optim(par = start_par_bkw,
                               fn = llbkw,
                               gr = grbkw, # Use analytical gradient for BKw
                               method = "BFGS",
                               hessian = TRUE,
                               data = sample_data_bkw)

# --- Compare analytical gradient to numerical gradient ---
if (mle_result_bkw$convergence == 0 &&
    requireNamespace("numDeriv", quietly = TRUE)) {

  mle_par_bkw <- mle_result_bkw$par
  cat("\nComparing Gradients for BKw at MLE estimates:\n")

  # Numerical gradient of llbkw
  num_grad_bkw <- numDeriv::grad(func = llbkw, x = mle_par_bkw, data = sample_data_bkw)

  # Analytical gradient from grbkw
  ana_grad_bkw <- grbkw(par = mle_par_bkw, data = sample_data_bkw)

  cat("Numerical Gradient (BKw):\n")
  print(num_grad_bkw)
  cat("Analytical Gradient (BKw):\n")
  print(ana_grad_bkw)

  # Check differences
  cat("Max absolute difference between BKw gradients:\n")
  print(max(abs(num_grad_bkw - ana_grad_bkw)))

} else {
  cat("\nSkipping BKw gradient comparison.\n")
}

# --- Optional: Compare with relevant components of GKw gradient ---
# Requires grgkw function
if (mle_result_bkw$convergence == 0 && exists("grgkw")) {
  # Create 5-param vector for grgkw (insert lambda=1)
  mle_par_gkw_equiv <- c(mle_par_bkw[1:4], lambda = 1.0)
  ana_grad_gkw <- grgkw(par = mle_par_gkw_equiv, data = sample_data_bkw)
  # Extract components corresponding to alpha, beta, gamma, delta
  ana_grad_gkw_subset <- ana_grad_gkw[c(1, 2, 3, 4)]

  cat("\nComparison with relevant components of GKw gradient:\n")
  cat("Max absolute difference:\n")
  print(max(abs(ana_grad_bkw - ana_grad_gkw_subset))) # Should be very small
}




gkwreg documentation built on April 16, 2025, 1:10 a.m.