SemiEstimate Examples

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(SemiEstimate)

Introduction to Implicit Profiling

Preliminaries

Assume we have a convex objective function $\mathcal{L}(\theta, \lambda)$, where $\theta$ is the parametric component with fixed dimension and $\lambda$ is the parameter of a finite-dimensional approximation to the nonparametric component, whose dimension may grow with the sample size. Further assume $\theta$ and $\lambda$ are bundled in this objective function, i.e., $\theta$ and $\lambda$ cannot be clearly separated. Typical examples include the semiparametric transformation model and the semiparametric GARCH-in-mean model, which are discussed in detail in the later sections. It is worth noting that $\lambda$ can also be a smooth function of infinite dimension, in which case one can apply semiparametric methods to estimate $\lambda$.

To estimate $\theta$ and $\lambda$, let $\Psi(\theta, \lambda)$ and $\Phi(\theta, \lambda)$ denote the estimating equations with respect to $\theta$ and $\lambda$, respectively. Specifically, $\Psi(\theta, \lambda)$ is the derivative of $\mathcal{L}(\theta, \lambda)$ with respect to $\theta$. When $\lambda$ is a nuisance parameter, $\Phi(\theta, \lambda)$ is the derivative of $\mathcal{L}(\theta, \lambda)$ with respect to $\lambda$. When $\lambda$ is a smooth function, $\Phi(\theta, \lambda)$ denotes the semiparametric estimation formula, such as a kernel smoothing or spline-based estimator. Then, to estimate $\theta$ and $\lambda$, we need to solve: $$\Psi(\theta,\lambda) = \mathbf{0}$$ $$\Phi(\theta, \lambda) = \mathbf{0}.$$

Let $\mathbf{G}(\theta, \lambda) = (\Psi(\theta, \lambda)^{\top}, \Phi(\theta, \lambda)^{\top})^{\top}$. The entire updating algorithm (e.g., the Newton-Raphson method) solves $\mathbf{G}(\theta, \lambda)=\mathbf{0}$ jointly. Let $\beta = (\theta^{\top}, \lambda^{\top})^{\top}$. Then the updating formula is $\beta^{(k+1)} = \beta^{(k)} - (\partial \mathbf{G}(\beta^{(k)})/\partial \beta)^{-1}\mathbf{G}(\beta^{(k)})$. However, due to the bundled relationship between $\theta$ and $\lambda$, the two parameters are often difficult to separate, which results in a dense Hessian matrix $\partial \mathbf{G}/\partial \beta$. Consequently, computing the inverse of this Hessian is expensive, which makes the entire updating algorithm computationally very inefficient.

To solve this problem, many recursive updating methods have been applied \citep{nawata1994estimation,terzija2003improved,jiang2020recursive}. Basically, the recursive method breaks the connection between $\theta$ and $\lambda$ in the Hessian matrix. In other words, it considers the second-order derivative of the objective function with respect to each parameter separately. Specifically, the updating formulas for the recursive method are: $$\lambda^{(k+1)} = \lambda^{(k)} - \left(\frac{\partial \Phi(\theta^{(k)}, \lambda^{(k)})}{ \partial \lambda}\right)^{-1}\Phi(\theta^{(k)}, \lambda^{(k)})$$ $$\theta^{(k+1)} = \theta^{(k)} - \left(\frac{\partial \Psi(\theta^{(k)}, \lambda^{(k+1)})}{ \partial \theta}\right)^{-1}\Psi(\theta^{(k)}, \lambda^{(k+1)}).$$

Although the update for $\lambda$ is still a sub-problem with growing dimension, the sub-problem Hessian $\partial \Phi/\partial \lambda$ is often sparse by the design of the finite-dimensional approximation of the nonparametric component --- different elements of $\lambda$ usually correspond to the values of the nonparametric component at different locations. Consequently, inverting $\partial \Phi/\partial \lambda$ can be much faster than inverting $\partial \mathbf{G}/\partial \beta$. However, the updating formulas above iterate each parameter without considering the interaction with the other parameter. In other words, they approximate the true second-order derivative $\partial \mathbf{G}/\partial \beta$ by setting the off-diagonal blocks to zero. For example, when updating $\theta$, the derivative $\partial \Psi/\partial \theta$ treats $\lambda$ as a constant and does not take into account how $\lambda$ responds to $\theta$. When $\theta$ and $\lambda$ are strongly correlated, the recursive method therefore loses information and results in sub-optimal update directions.
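
In block form, the recursive method replaces the full Hessian by its block-diagonal part: $$\frac{\partial \mathbf{G}}{\partial \beta} = \begin{pmatrix} \partial \Psi/\partial \theta & \partial \Psi/\partial \lambda \\ \partial \Phi/\partial \theta & \partial \Phi/\partial \lambda \end{pmatrix} \approx \begin{pmatrix} \partial \Psi/\partial \theta & \mathbf{0} \\ \mathbf{0} & \partial \Phi/\partial \lambda \end{pmatrix}.$$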

Implicit Profiling Algorithm

The entire updating method is computationally expensive, while the recursive method yields sub-optimal update directions. To address both issues, we propose an implicit profiling (IP) method. It is notable that $\theta$ is the only parameter of interest, so we focus on the efficient estimation of $\theta$. Recall that, in the recursive method, the updating formula for $\theta$ loses information by treating $\lambda$ as a constant, which makes the estimation of $\theta$ inefficient. In the implicit profiling method, we instead regard $\lambda$ as a function of $\theta$, denoted $\lambda(\theta)$. Then the second-order derivative of the objective function with respect to $\theta$ can be derived as follows:

$$\frac{d \Psi(\theta, \lambda(\theta))}{d \theta} = \frac{\partial \Psi(\theta, \lambda(\theta))}{\partial \theta} + \frac{\partial \Psi(\theta, \lambda(\theta))}{\partial\lambda}\frac{\partial\lambda(\theta)}{\partial\theta},$$

where the derivative relationship $\partial\lambda/\partial\theta$ can be obtained by implicitly differentiating the identity $\Phi(\theta, \lambda(\theta)) = \mathbf{0}$ with respect to $\theta$. We refer to the quantity above as the implicit profiling Hessian matrix of $\theta$, which determines the updating direction of $\theta$ in the implicit profiling method. Based on this Hessian, we can update $\theta$ and $\lambda$ iteratively using the following formulas:

$$\lambda^{(k+1)} = \lambda^{(k)} - \left(\frac{\partial \Phi(\theta^{(k)}, \lambda^{(k)})}{\partial \lambda}\right)^{-1}\Phi(\theta^{(k)}, \lambda^{(k)})$$ $$\theta^{(k+1)} = \theta^{(k)} -\left( \frac{\partial \Psi(\theta^{(k)}, \lambda^{(k+1)})}{\partial \theta} + \frac{\partial \Psi(\theta^{(k)}, \lambda^{(k+1)})}{\partial \lambda}\frac{\partial\lambda^{(k+1)}}{\partial\theta}\right)^{-1}\Psi(\theta^{(k)}, \lambda^{(k+1)})$$
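
Concretely, implicit differentiation of $\Phi(\theta, \lambda(\theta)) = \mathbf{0}$ gives $$\frac{\partial \Phi}{\partial \theta} + \frac{\partial \Phi}{\partial \lambda}\frac{\partial \lambda}{\partial \theta} = \mathbf{0} \quad\Longrightarrow\quad \frac{\partial \lambda}{\partial \theta} = -\left(\frac{\partial \Phi}{\partial \lambda}\right)^{-1}\frac{\partial \Phi}{\partial \theta},$$ which is exactly the linear system solved in Step 5 of the algorithm below.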

The complete algorithm of the implicit profiling method is presented below.

The Implicit Profiling Algorithm

  1. Initialize $\theta^{(0)}$;
  2. Solve $\lambda^{(0)}$ from the equation $\Phi(\theta^{(0)}, \lambda^{(0)}) = \mathbf{0}$;
  3. REPEAT UNTIL Convergence
  4. Update $\lambda$ from $$\lambda^{(k+1)} = \lambda^{(k)} - \left(\frac{\partial \Phi(\theta^{(k)}, \lambda^{(k)})}{\partial \lambda}\right)^{-1}\Phi(\theta^{(k)}, \lambda^{(k)});$$
  5. Solve the implicit gradient $\mathbf{d}^{(k+1)} = \partial\lambda^{(k+1)}(\theta^{(k)})/\partial\theta$ from $$\frac{d\Phi(\theta^{(k)},\lambda^{(k+1)})}{d\theta} = \frac{\partial\Phi(\theta^{(k)}, \lambda^{(k+1)})}{\partial \theta} + \frac{\partial \Phi(\theta^{(k)}, \lambda^{(k+1)})}{\partial\lambda}\mathbf{d}^{(k+1)} = \mathbf{0};$$
  6. Compute the implicit profiling Hessian: $$\mathbb{H}^{(k+1)} = \frac{\partial \Psi(\theta^{(k)}, \lambda^{(k+1)})}{\partial \theta} + \frac{\partial \Psi(\theta^{(k)}, \lambda^{(k+1)})}{\partial \lambda}\mathbf{d}^{(k+1)};$$
  7. Update $\theta$ from $$\theta^{(k+1)} = \theta^{(k)} - (\mathbb{H}^{(k+1)})^{-1}\Psi(\theta^{(k)}, \lambda^{(k+1)}).$$

For the initialization of $\lambda$, it is recommended to solve the equation $\Phi(\theta^{(0)}, \lambda^{(0)}) = \mathbf{0}$, which helps improve the convergence speed. However, this calculation may itself be computationally expensive; if the cost is unacceptable, one can instead choose a random initial value. In each iteration, we first update $\lambda$ to obtain $\lambda^{(k+1)}$. We then calculate the implicit profiling Hessian matrix of $\theta$ using the newly updated $\lambda^{(k+1)}$ and obtain the updated value $\theta^{(k+1)}$. The iteration is repeated until convergence, which yields the final estimates of $\theta$ and $\lambda$.
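
To make the steps concrete, below is a minimal sketch of one IP pass in base R. It is an illustration of the algorithm above rather than the package's internal implementation, and it assumes the estimating equations and their four partial derivatives are supplied as functions:

ip_step <- function(theta, lambda, Psi, Phi,
                    dPsi_dtheta, dPsi_dlambda, dPhi_dtheta, dPhi_dlambda) {
        # Step 4: lambda update
        lambda <- lambda - solve(dPhi_dlambda(theta, lambda), Phi(theta, lambda))
        # Step 5: implicit gradient d, solving dPhi/dtheta + dPhi/dlambda %*% d = 0
        d <- -solve(dPhi_dlambda(theta, lambda), dPhi_dtheta(theta, lambda))
        # Step 6: implicit profiling Hessian of theta
        H <- dPsi_dtheta(theta, lambda) + dPsi_dlambda(theta, lambda) %*% d
        # Step 7: theta update
        theta <- theta - solve(H, Psi(theta, lambda))
        list(theta = theta, lambda = lambda)
}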

It is notable that the implicit profiling method accounts for the interaction between $\theta$ and $\lambda$ by treating $\lambda$ as a function of $\theta$. Consequently, the resulting estimator of $\theta$ should be equal to the estimate obtained by the entire updating method. We summarize this finding in the following two propositions.

Proposition 1. Assume the objective function $\mathcal{L}$ is strictly convex. Then the convergence point of the implicit profiling method and that of the Newton-Raphson method are identical.

Proposition 2. For any local quadratic problem $Q$, the implicit profiling method reaches its minimum within two steps.

The detailed proofs of the two propositions are given in Appendices A.1 and A.2, respectively. Propositions 1 and 2 establish that the implicit profiling method shares the theoretical properties of the Newton-Raphson method. By Proposition 1, implicit profiling converges only at the minimum of the convex loss. By Proposition 2, convergence is guaranteed whenever the Newton-Raphson method converges, and the number of iterations taken before convergence is comparable to that of the Newton-Raphson method. In the experiments below, we find that this smaller number of iterations is the driving factor behind the implicit profiling method's run-time advantage over other iterative methods. In many cases, a single iteration of implicit profiling can also be faster than a single Newton-Raphson iteration, because the dependence structure of $\theta$, $\lambda$ and the loss function enables implicit profiling to avoid the full Hessian calculation and the global value search of the Newton-Raphson method. Together with the control on the number of iterations, the implicit profiling method is computationally more efficient than the Newton-Raphson method.
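
As a quick numerical check of Proposition 2, the quadratic toy loss $z = \theta^2 + \lambda^2 + \alpha\theta\lambda$ from the next section can be minimized with hand-coded IP updates; starting from an arbitrary point, both components reach exactly zero after two iterations (a minimal sketch with $\alpha = 1$):

alpha <- 1
theta <- 5
lambda <- -3 # arbitrary starting point
for (k in 1:2) {
        lambda <- lambda - (2 * lambda + alpha * theta) / 2 # lambda step
        d <- -alpha / 2 # implicit gradient d
        H <- 2 + alpha * d # implicit profiling Hessian, (4 - alpha^2) / 2
        theta <- theta - (2 * theta + alpha * lambda) / H # theta step
}
c(theta = theta, lambda = lambda) # both are exactly zero after two iterations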

Toy Example

The following toy example compares the average number of iterative steps consumed by the Newton-Raphson method, the naive iterative method, and the implicit profiling method.

Code

# =====================================
# ======= z = x^2 + y^2 + alpha xy=====
# =====================================
# test with a simple analytic Jacobian


# estimating equations (partial derivatives of z), with the Jacobian supplied analytically below
Phi_fn <- function(theta, lambda, alpha) 2 * theta + alpha * lambda
Psi_fn <- function(theta, lambda, alpha) 2 * lambda + alpha * theta
res <- semislv(1, 1, Phi_fn, Psi_fn, method = "implicit", alpha = 1)
jac <- list(
  Phi_der_theta_fn = function(theta, lambda, alpha) 2,
  Phi_der_lambda_fn = function(theta, lambda, alpha) alpha,
  Psi_der_theta_fn = function(theta, lambda, alpha) alpha,
  Psi_der_lambda_fn = function(theta, lambda, alpha) 2
)
res <- semislv(1, 1, Phi_fn, Psi_fn, jac = jac, method = "implicit", alpha = 1)
res <- semislv(1, 1, Phi_fn, Psi_fn, jac = jac, method = "iterative", alpha = 1)
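
The returned object stores the solution and the iteration history. A quick look at a fit (the field names here are inferred from the accessors used by get_fit_from_raw() below):

# inspect the fit; field names follow get_fit_from_raw() below
res$parameters$theta # estimate of theta
res$parameters$lambda # estimate of lambda
res$step # number of iterations used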

Newton <- function(beta0, alpha) {
        H_GS <- matrix(c(2, alpha, alpha, 2), nrow = 2, ncol = 2) # Hessian Matrix
        beta_GS <- beta0
        step_GS <- 0
        series <- NULL
        t0 <- Sys.time()
        f <- function(x, y) {
                return(c(2 * x + alpha * y, 2 * y + alpha * x))
        } # the target function

        while (TRUE) {
                bscore <- f(beta_GS[1], beta_GS[2])
                if (all(abs(bscore) < 1e-7)) break
                beta_GS <- beta_GS - solve(H_GS, bscore) # NR iteration
                step_GS <- step_GS + 1
                series <- rbind(series, beta_GS)
        }

        run_time_GS <- Sys.time() - t0
        return(list(
                beta = beta_GS, step = step_GS, run_time = run_time_GS,
                series = series
        ))
}
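
For reference, a single Newton-Raphson run from an arbitrary start:

# baseline Newton-Raphson run on the toy problem (alpha = 1)
nr <- Newton(c(5, -3), alpha = 1)
nr$beta # (0, 0), the minimizer of z; exact after one step on this quadratic
nr$step # number of iterations used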

get_fit_from_raw <- function(raw_fit) {
        # collect estimates, step count, run time and the iterate path from a semislv fit
        fit <- list()
        fit$beta <- c(raw_fit$parameters$theta, raw_fit$parameters$lambda)
        fit$step <- raw_fit$step
        fit$run_time <- raw_fit$run.time
        series <- c(raw_fit$iterspace$initials$theta, raw_fit$iterspace$initials$lambda)
        purrr::walk(raw_fit$res_path, function(x) series <<- rbind(series, c(x$parameters$theta, x$parameters$lambda)))
        fit$series <- series
        fit
}


run_Ip <- function(intermediates, theta, lambda, alpha) {
        x <- theta
        y <- lambda
        yscore <- 2 * y + alpha * x
        intermediates$y_delta <- -yscore / 2 # IP lambda update (Hessian of Phi in lambda is 2)
        y <- y + intermediates$y_delta
        xscore <- 2 * x + alpha * y
        intermediates$x_delta <- -2 * xscore / (4 - alpha^2) # IP theta update: profiling Hessian is (4 - alpha^2) / 2
        intermediates
}

theta_delta_Ip <- function(intermediates) {
        intermediates$theta_delta <- intermediates$x_delta
        intermediates
}

lambda_delta_Ip <- function(intermediates) {
        intermediates$lambda_delta <- intermediates$y_delta
        intermediates
}

run_It <- function(intermediates, theta, lambda, alpha) {
        x <- theta
        y <- lambda
        yscore <- 2 * y + alpha * x
        intermediates$y_delta <- -yscore / 2 # naive lambda update
        y <- y + intermediates$y_delta
        xscore <- 2 * x + alpha * y
        intermediates$x_delta <- -xscore / 2 # naive theta update: ignores the cross term in alpha
        intermediates
}

theta_delta_It <- function(intermediates) {
        intermediates$theta_delta <- intermediates$x_delta
        intermediates
}

lambda_delta_It <- function(intermediates) {
        intermediates$lambda_delta <- intermediates$y_delta
        intermediates
}
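
These callbacks follow the same `diy` interface used in the risk-prediction example later in this vignette. As a hypothetical illustration (not run here; the argument names run_time, theta_delta and lambda_delta mirror the semislv() calls in that section), they could be wired in as:

# hypothetical diy call, mirroring the semislv() usage in the risk-prediction
# section below (assumed interface; not run here)
# res_ip <- semislv(
#         theta = 1, lambda = 1, Phi_fn = Phi_fn, Psi_fn = Psi_fn,
#         method = "implicit", diy = TRUE, alpha = 1,
#         run_time = run_Ip, theta_delta = theta_delta_Ip,
#         lambda_delta = lambda_delta_Ip,
#         control = list(max_iter = 100, tol = 1e-7)
# )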

Simulation

j <- 1
step_all <- list()
series_all <- list()
direction_all <- list()
for (k in c(0, 0.2, 0.4, 0.6, 0.8, 1.0, 1.2, 1.4, 1.6, 1.8)) {
        step <- list()
        series <- list()
        direction <- list()
        n_points <- 10
        theta <- seq(0, 2 * base::pi, length.out = n_points)
        alpha <- k
        for (i in 1:10) {
                C <- i^2
                x <- (sqrt(C * (1 - alpha / 2)) * cos(theta) + sqrt(C * (1 + alpha / 2)) * sin(theta)) / sqrt(2 - alpha^2 / 2)
                y <- (sqrt(C * (1 - alpha / 2)) * cos(theta) - sqrt(C * (1 + alpha / 2)) * sin(theta)) / sqrt(2 - alpha^2 / 2)
                sub_step <- matrix(nrow = 3, ncol = n_points)
                sub_series <- list()
                k1 <- list()
                k2 <- list()
                k3 <- list()
                sub_direction <- list()
                for (ii in 1:n_points) {
                        beta0 <- c(x[ii], y[ii])
                        Newton_fit <- Newton(beta0, alpha)
                        Ip_raw_fit <- semislv(theta = beta0[1], lambda = beta0[2], Phi_fn, Psi_fn, jac = jac, method = "implicit", alpha = alpha, control = list(max_iter = 100, tol = 1e-7))
                        Ip_fit <- get_fit_from_raw(Ip_raw_fit)
                        It_raw_fit <- semislv(theta = beta0[1], lambda = beta0[2], Phi_fn, Psi_fn, jac = jac, method = "iterative", alpha = alpha, control = list(max_iter = 100, tol = 1e-7))
                        It_fit <- get_fit_from_raw(It_raw_fit)
                        sub_step[, ii] <- c(Newton_fit$step, It_fit$step, Ip_fit$step)
                        k1[[ii]] <- Newton_fit$series
                        k2[[ii]] <- It_fit$series
                        k3[[ii]] <- Ip_fit$series
                        sub_direction[[ii]] <- It_fit$direction
                }
                step[[i]] <- sub_step
                sub_series[["Newton"]] <- k1
                sub_series[["It"]] <- k2
                sub_series[["Ip"]] <- k3
                series[[i]] <- sub_series
                direction[[i]] <- sub_direction
        }
        step_all[[j]] <- step
        series_all[[j]] <- series
        direction_all[[j]] <- direction
        j <- j + 1
}

Output

Newton_Raphson <- NULL
IT_step <- NULL
IP_step <- NULL
for (i in 1:10) {
        k <- step_all[[i]]
        s <- NULL
        for (j in 1:length(k)) {
                s <- cbind(s, k[[j]])
        }
        tmp <- apply(s, 1, mean)
        print(tmp)
        Newton_Raphson <- c(Newton_Raphson, tmp[1])
        IT_step <- c(IT_step, tmp[2])
        IP_step <- c(IP_step, tmp[3])
}

k <- step_all[[9]]
k <- lapply(k, apply, 1, mean)
s <- NULL
for (i in 1:10) {
        s <- rbind(s, k[[i]])
}
print(s)
Newton_Raphson <- s[, 1]
IT_step <- s[, 2]
IP_step <- s[, 3]

Risk Prediction Model

Code

library(SemiEstimate)
library(nleqslv)
library(MASS) # for mvrnorm used in the data generator below
# Data generating process for the semiparametric transformation model
sim.gen.STM <- function(n, p, beta0, sigmaZ, Nperturb = 0) {
  Z <- mvrnorm(n, rep(0, p), sigmaZ^2 * (0.2 + 0.8 * diag(1, p)))
  u <- runif(n)
  T <- exp((log(u / (1 - u)) - Z %*% beta0) / 3) * 4 # h^-1 (g^-1(u) - beta'Z)
  C <- runif(n, 0, 12)
  delta <- (T <= C)

  return(list(
    delta = delta, C = C,
    Z = Z
  ))
}
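
As a quick check, one simulated dataset from this generator (MASS supplies mvrnorm):

# one simulated dataset from the generator above
set.seed(1)
dat <- sim.gen.STM(n = 200, p = 3, beta0 = c(0.7, -0.5, 0.3), sigmaZ = 1)
str(dat) # delta = 1{T <= C}, observation times C, covariates Z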

f = function(parameters, delta, Z, KC, N, p)
{ 
  # parameters contain beta and h
  beta = parameters[1:p]
  h = parameters[-(1:p)]
  pi = function(x) return(1/(1+exp(-x)))

  line = Z %*% beta
  # line is a N*1 vector

  reg = outer(h, line, "+")[,,1]
  reg_pi = pi(reg)
  # reg is a N*N matrix

  dif = as.vector(delta) - t(reg_pi)
  f1 = Z * diag(dif)
  f1 = apply(f1, 2, sum)
  f2 = KC * dif
  f2 = apply(f2, 2, sum)

  return(c(f1, f2))
}

global <- function(init, f, delta, Z, KC, N, p) {
  t0 <- Sys.time()
  out <- nleqslv(init, f,
                          delta = delta, Z = Z, KC = KC,
                          N = N, p = p, method = "Newton", global = "none"
  )
  run.time <- Sys.time() - t0
  return(list(Model = out, run.time = run.time))
}

expit <- function(d) return(1 / (1 + exp(-d)))

pi <- function(x) 1 / (1 + exp(-x)) # logistic link; note this masks the constant base::pi

Psi_fn <- function(theta, lambda, delta, Z, KC, N, p) {
  line <- Z %*% theta
  reg <- outer(lambda, line, "+")[, , 1]
  reg_pi <- pi(reg)
  dif <- as.vector(delta) - t(reg_pi)
  apply(Z * diag(dif), 2, sum)
}

Phi_fn <- function(theta, lambda, delta, Z, KC, N, p) {
  line <- Z %*% theta
  reg <- outer(lambda, line, "+")[, , 1]
  reg_pi <- pi(reg)
  dif <- as.vector(delta) - t(reg_pi)
  apply(KC * dif, 2, sum)
}
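
The kernel weight matrix and the cross-products consumed by these equations are constructed exactly as in the commented simulations further below; a sketch reusing dat from the example above:

# kernel weight matrix and cross-products used by Psi_fn / Phi_fn
h <- sd(dat$C) / (sum(dat$delta))^0.25 # rule-of-thumb bandwidth
KC <- dnorm(as.matrix(dist(dat$C / h, diag = TRUE, upper = TRUE))) / h
KCd <- drop(KC %*% dat$delta)
Zd <- drop(t(dat$Z) %*% dat$delta)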

# ===========================================
# ================ Implicit =================
# ===========================================

run_time.ip <- function(intermediates, theta, lambda, delta, Z, KC, Zd, KCd) {
  beta <- theta
  hC <- lambda
  expit <- function(d) {
    return(1 / (1 + exp(-d)))
  }
  lp <- drop(Z %*% beta)
  gij <- expit(outer(hC, lp, "+"))
  tmp <- KC * gij
  wZbar <- tmp * (1 - gij)
  hscore <- apply(tmp, 1, sum) - KCd
  hHess <- apply(wZbar, 1, sum)

  dhC <- hscore / hHess
  dhC <- sign(dhC) * pmin(abs(dhC), 1)
  intermediates$dhC <- dhC
  hC <- hC - dhC
  Zbar <- (wZbar %*% Z) / hHess

  gi <- expit(hC + lp)
  bscore <- drop(t(Z) %*% (delta - gi))

  bHess <- t(gi * (1 - gi) * Z) %*% (Zbar - Z)
  dinit <- solve(bHess, bscore)

  intermediates$dinit <- dinit
  intermediates
}

theta_delta.ip <- function(intermediates) {
  intermediates$theta_delta <- -intermediates$dinit
  intermediates
}

lambda_delta.ip <- function(intermediates) {
  intermediates$lambda_delta <- -intermediates$dhC
  intermediates
}


# ======================================
# ============= Iterative ==============
# ======================================

run_time.it <- function(intermediates, lambda, theta, Z, KC, Zd, KCd, n) {
  beta <- theta
  hC <- lambda
  expit <- function(d) {
    return(1 / (1 + exp(-d)))
  }

  f_beta <- function(beta, hC, gi) {
    temp_pi <- t(Z) %*% gi
    return(Zd - temp_pi)
  }

  jacf_beta <- function(beta, hC, gi) {
    bHess <- t(gi * (1 - gi) * Z) %*% Z
    return(-bHess)
  }

  f_hC <- function(hC, beta, gij, temp) {
    return(apply(temp, 1, sum) - KCd)
  }

  jacf_hC <- function(hC, beta, gij, temp) {
    wZbar <- temp * (1 - gij)
    hscore <- apply(temp, 1, sum) - KCd
    hHess <- apply(wZbar, 1, sum)
    hHess <- diag(hHess)
    return(hHess)
  }
  intermediates$gi <- expit(hC + drop(Z %*% beta))
  intermediates$temp_beta <- nleqslv(beta, f_beta,
                                              jac = jacf_beta,
                                              hC = hC, gi = intermediates$gi, method = "Newton",
                                              global = "none", control = list(maxit = 1)
  )
  intermediates$gij <- matrix(0, nrow = n, ncol = n)
  intermediates$gij[1:n, ] <- expit(outer(hC, Z %*% intermediates$temp_beta$x, "+"))
  intermediates$temp <- KC * intermediates$gij
  intermediates$temp_hC <- nleqslv(hC, f_hC,
                                            jac = jacf_hC,
                                            beta = intermediates$temp_beta$x, gij = intermediates$gij, temp = intermediates$temp, method = "Newton",
                                            global = "none", control = list(maxit = 1)
  )
  intermediates
}
theta_delta.it <- function(intermediates, theta) {
  intermediates$theta_delta <- intermediates$temp_beta$x - theta
  intermediates
}


lambda_delta.it <- function(intermediates, lambda) {
  intermediates$lambda_delta <- intermediates$temp_hC$x - lambda
  intermediates
}

Simulation

$N = 500$

#nrep <- 100 # 100
#N <- 500
#p <- 10
#beta0 <- c(0.7, 0.7, 0.7, -0.5, -0.5, -0.5, 0.3, 0.3, 0.3, 0)
#sigmaZ <- 1
#set.seed(20210806)
#
#compare <- list()
#time <- matrix(nrow = nrep, ncol = 3)
#step <- matrix(nrow = nrep, ncol = 3)
#mse_global <- 0
#mse_IP <- 0
#mse_IT <- 0
#
#for (i in 1:nrep) {
#  dat <- sim.gen.STM(N, p, beta0, sigmaZ) ## don't change
#  h <- sd(dat$C) / (sum(dat$delta))^0.25
#  KC <- dnorm(as.matrix(dist(dat$C / h, diag = T, upper = T))) / h
#  KCd <- drop(KC %*% dat$delta)
#  theta0 <- rep(0, ncol(dat$Z))
#  n <- nrow(dat$Z)
#  Zd <- drop(t(dat$Z) %*% dat$delta)
#  lambda0 <- rep(0, n)
#  ## place my initial value
#  out_global <- global(rep(0, N + p), f, dat$delta, dat$Z, KC, N, p)
#  ## return beta, runtime, step
#  intermediates.ip <- list(hC = lambda0, beta = theta0)
#  intermediates.it <- list(hC = lambda0, beta = theta0)
#  out_ip <- semislv(theta = theta0, lambda = lambda0, Phi_fn = Phi_fn, Psi_fn = Psi_fn, method = "implicit", diy = TRUE, Z = dat$Z, delta = dat$delta, KC = KC, n = n, KCd = KCd, Zd = Zd, run_time = run_time.ip, theta_delta = theta_delta.ip, lambda_delta = lambda_delta.ip, intermediates = intermediates.ip, control = list(max_iter = 100, tol = 1e-7))

#  out_it <- semislv(theta = theta0, lambda = lambda0, Phi_fn = Phi_fn, Psi_fn = Psi_fn, method = "iterative", diy = TRUE, Z = dat$Z, delta = dat$delta, KC = KC, n = n, KCd = KCd, Zd = Zd, run_time = run_time.it, theta_delta = theta_delta.it, lambda_delta = lambda_delta.it, intermediates = intermediates.it, control = list(max_iter = 100, tol = 1e-7))

#  mse_global <- mse_global + sum((out_global$Model$x[1:p] - beta0)^2)
#  mse_IP <- mse_IP + sum((out_ip$theta - beta0)^2)
#  mse_IT <- mse_IT + sum((out_it$theta - beta0)^2)
#  time[i, ] <- c(out_global$run.time, out_ip$run.time, out_it$run.time) # out_global$run.time,
#  step[i, ] <- c(out_global$Model$iter, out_ip$step, out_it$step) # out_global$Model$iter,
#  compare[[i]] <- rbind(out_global$Model$x[1:p], out_ip$theta, out_it$theta) # out_global$Model$x[1:p],
  ## cat(
  ##         "step", i, sum(abs(out_global$Model$x[1:p] - out_ip$theta)),
  ##         sum(abs(out_global$Model$x[1:p] - out_it$theta)), "\n"
  ## )
#  cat("iter", i, "\n")
#}
# apply(time, 2, mean)
# apply(step, 2, mean)
# sqrt(mse_global <- mse_global / nrep)
# sqrt(mse_IP <- mse_IP / nrep)
# sqrt(mse_IT <- mse_IT / nrep)

$N = 1000$

# N <- 1000
# time2 <- matrix(nrow = nrep, ncol = 3)
# step2 <- matrix(nrow = nrep, ncol = 3)
# compare2 <- list()
# mse_global2 <- 0
# mse_IP2 <- 0
# mse_IT2 <- 0
# for (i in 1:nrep) {
#   dat <- sim.gen.STM(N, p, beta0, sigmaZ) ## don't change
#   h <- sd(dat$C) / (sum(dat$delta))^0.25
#   KC <- dnorm(as.matrix(dist(dat$C / h, diag = T, upper = T))) / h
#   KCd <- drop(KC %*% dat$delta)
#   theta0 <- rep(0, ncol(dat$Z))
#   n <- nrow(dat$Z)
#   Zd <- drop(t(dat$Z) %*% dat$delta)
#   lambda0 <- rep(0, n)
#   ## place my initial value
#   out_global <- global(rep(0, N + p), f, dat$delta, dat$Z, KC, N, p)
#   ## return beta, runtime, step
#   
#   intermediates.ip <- list(hC = lambda0, beta = theta0)
#   
#   intermediates.it <- list(hC = lambda0, beta = theta0)
#   out_ip <- semislv(theta = theta0, lambda = lambda0, Phi_fn = Phi_fn, Psi_fn = Psi_fn, method = "implicit", diy = TRUE, Z = dat$Z, delta = dat$delta, KC = KC, n = n, KCd = KCd, Zd = Zd, run_time = run_time.ip, theta_delta = theta_delta.ip, lambda_delta = lambda_delta.ip, intermediates = intermediates.ip, control = list(max_iter = 100, tol = 1e-7))
#   out_it <- semislv(theta = theta0, lambda = lambda0, Phi_fn = Phi_fn, Psi_fn = Psi_fn, method = "iterative", diy = TRUE, Z = dat$Z, delta = dat$delta, KC = KC, n = n, KCd = KCd, Zd = Zd, run_time = run_time.it, theta_delta = theta_delta.it, lambda_delta = lambda_delta.it, intermediates = intermediates.it, control = list(max_iter = 100, tol = 1e-7))
#   mse_global2 <- mse_global2 + sum((out_global$Model$x[1:p] - beta0)^2)
#   mse_IP2 <- mse_IP2 + sum((out_ip$theta - beta0)^2)
#   mse_IT2 <- mse_IT2 + sum((out_it$theta - beta0)^2)
#   time2[i, ] <- c(out_global$run.time, out_ip$run.time, out_it$run.time) # out_global$run.time,
#   step2[i, ] <- c(out_global$Model$iter, out_ip$step, out_it$step) # out_global$Model$iter,
#   compare2[[i]] <- rbind(out_global$Model$x[1:p], out_ip$theta, out_it$theta) # out_global$Model$x[1:p],
  ## cat(
  ##         "step", i, sum(abs(out_global$Model$x[1:p] - out_ip$theta)),
  ##         sum(abs(out_global$Model$x[1:p] - out_it$theta)), "\n"
  ## )
#   cat("iter", i, "\n")
# }
# apply(time2, 2, mean)
# apply(step2, 2, mean)
# sqrt(mse_global2 <- mse_global2 / nrep)
# sqrt(mse_IP2 <- mse_IP2 / nrep)
# sqrt(mse_IT2 <- mse_IT2 / nrep)

GARCH-M

Code

library(SemiEstimate)
library(splines2)
library(BB)

# ===========================
# ====== Back Fitting =======
# ===========================

# sigma estimation
series_cal <- function(y, init, sigma_1) {
  N <- length(y)
  sigma <- vector(length = N)
  sigma[1] <- sigma_1
  for (i in 2:N) {
    sigma[i] <- init[1] + init[2] * y[i - 1]^2 + init[3] * sigma[i - 1]
  }
  return(sigma)
}

spline_matrix <- function(sigma, knots, n_partitions) {
  m <- cbind(sigma, sigma^2)
  for (i in 1:n_partitions) {
    k <- sigma - knots[i]
    k[which(k < 0)] <- 0
    m <- cbind(m, k^2)
  }
  return(m)
}

# QML
M <- function(init_est, y, epsilon, sigma_1) {
  sigma <- series_cal(y, init_est, sigma_1)

  k1 <- -1 / 2 * sum(log(sigma))
  k2 <- -1 / 2 * sum(epsilon^2 / sigma)
  return(-(k1 + k2))
}

# Backfitting
bf <- function(y, init = rep(1, 3),
               sigma_1 = var(y), tol = 1e-5, maxiter = 20, lower = 1e-3, upper = 1,
               judge_k = F) {
  key <- init
  t1 <- Sys.time()
  iter <- 0
  step <- 1
  N <- length(y)
  n_partitions <- floor(N^(3 / 20))

  judge <- TRUE 
  judge_covergence <- TRUE 

  while (judge) {
    sigma <- series_cal(y, init, sigma_1)

    k <- range(sigma)
    knots <- seq(k[1], k[2], length.out = n_partitions + 2)
    knots <- knots[c(-1, -length(knots))]

    sigma_m <- bSpline(sigma, knots = knots, degree = 2)
    eta_out <- lm(y ~ sigma_m)
    eta <- predict(eta_out)

    epsilon <- y - eta


    init_out <- BBoptim(init, M,
                        y = y, sigma_1 = sigma_1, epsilon = epsilon, lower = lower,
                        upper = upper,
                        control = list(maxit = 1500, gtol = tol, ftol = tol^2)
    )

    if (init_out$convergence > 0) judge_covergence <- FALSE
    if (judge_k & (init_out$iter > 500)) {
      judge_covergence <- FALSE
    }


    if (max(abs(init_out$par - init)) < tol) judge <- FALSE

    cat(step, init - init_out$par, init_out$convergence, "\n")

    init <- init_out$par
    iter <- iter + init_out$iter
    step <- step + 1
    if (step > maxiter) judge <- FALSE
  }
  if (step > maxiter) judge_covergence <- FALSE
  sigma <- series_cal(y, init, sigma_1)
  run.time <- Sys.time() - t1
  result <- list()
  result$beta <- init
  result$eta <- eta
  result$sigma <- sigma
  result$run.time <- run.time
  result$step <- step
  result$iter <- iter
  result$judge_covergence <- judge_covergence

  return(result)
}

# =================================
# =========== SP-MBP ==============
# =================================


# series_cal() and spline_matrix() are identical to the definitions in the
# back-fitting section above and are reused here


# QML
M_sp <- function(init_est, y, epsilon, sigma_1, Psi2, n_partitions) {
  sigma <- series_cal(y, init_est, sigma_1)
  k1 <- -1 / 2 * sum(log(sigma))
  k2 <- -1 / 2 * sum(epsilon^2 / sigma)

  k <- range(sigma)
  knots <- seq(k[1], k[2], length.out = n_partitions + 2)
  knots <- knots[c(-1, -length(knots))]
  sigma_m <- bSpline(sigma, knots = knots, degree = 2)

  eta_out <- lm(y ~ sigma_m)
  eta <- predict(eta_out)

  k3 <- Psi2 %*% init_est

  return(-(k1 + k2 + k3))
}

# Psi_2
Psi_2_B <- function(y, init, sigma, epsilon, knots) {
  init_dsigma <- rep(0, 3)
  dsigma <- matrix(0, nrow = length(y), ncol = 3)
  dsigma[1, ] <- init_dsigma

  for (i in 2:length(sigma)) {
    dsigma[i, ] <- c(1, y[i - 1]^2, sigma[i - 1]) + init[3] * dsigma[(i - 1), ]
  }

  sigma_d <- dbs(sigma, knots = knots, degree = 2)
  eta_d <- lm(y ~ sigma_d)
  eta_d <- predict(eta_d)
  eta_d <- eta_d * dsigma

  output <- apply(epsilon / sigma * eta_d, 2, sum)

  return(output)
}

# SPMBP
spmbp_B <- function(y, init = rep(1, 3),
                    sigma_1 = var(y), tol = 1e-5, maxiter = 20, lower = 1e-3, upper = 1,
                    judge_k = F) {
  key <- init
  t1 <- Sys.time()
  iter <- 0
  step <- 1
  N <- length(y)
  n_partitions <- floor(N^(3 / 20))

  judge <- TRUE
  judge_covergence <- TRUE

  while (judge) {
    sigma <- series_cal(y, init, sigma_1)


    k <- range(sigma)
    knots <- seq(k[1], k[2], length.out = n_partitions + 2)
    knots <- knots[c(-1, -length(knots))]
    sigma_m <- bSpline(sigma, knots = knots, degree = 2)

    eta_out <- lm(y ~ sigma_m)
    eta <- predict(eta_out)

    epsilon <- y - eta

    Psi2 <- Psi_2_B(y, init, sigma, epsilon, knots)

    init_out <- BBoptim(init, M_sp,
                        y = y, sigma_1 = sigma_1, epsilon = epsilon,
                        n_partitions = n_partitions,
                        Psi2 = Psi2, lower = 1e-3, upper = upper,
                        control = list(maxit = 1500, gtol = tol, ftol = tol^2)
    )

    if (init_out$convergence > 0) judge_covergence <- FALSE
    if (judge_k & (init_out$iter > 500)) {
      judge_covergence <- FALSE
    }

    if (max(abs(init_out$par - init)) < tol) judge <- FALSE

    cat(step, init - init_out$par, init_out$convergence, "\n")
    init <- init_out$par

    step <- step + 1
    iter <- iter + init_out$iter
    if (step > maxiter) judge <- FALSE
  }
  if (step > maxiter) judge_covergence <- FALSE

  sigma <- series_cal(y, init, sigma_1)
  run.time <- Sys.time() - t1
  result <- list()
  result$beta <- init
  result$eta <- eta
  result$sigma <- sigma
  result$run.time <- run.time
  result$step <- step
  result$iter <- iter
  result$judge_covergence <- judge_covergence

  return(result)
}


# ===========================
# ====== IP-GARCH ===========
# ===========================

# QML
M_ip_BB <- function(init_est, y, sigma_1, n_partitions) {
  sigma <- series_cal(y, init_est, sigma_1)
  k <- range(sigma)
  knots <- seq(k[1], k[2], length.out = n_partitions + 2)
  knots <- knots[c(-1, -length(knots))]
  sigma_m <- bSpline(sigma, knots = knots, degree = 2)

  eta_out <- lm(y ~ sigma_m)
  eta <- predict(eta_out)

  epsilon <- y - eta

  k1 <- -1 / 2 * sum(log(sigma))
  k2 <- -1 / 2 * sum(epsilon^2 / sigma)

  return(-(k1 + k2))
}

IP_GARCH_BB <- function(y, init = rep(1, 3),
                        sigma_1 = var(y), tol = 1e-5, maxiter = 20, lower = 1e-3, upper = 1,
                        judge_k = F) {
  key <- init
  t1 <- Sys.time()
  iter <- 0
  step <- 1
  N <- length(y)
  n_partitions <- floor(N^(3 / 20))

  judge <- TRUE
  judge_covergence <- TRUE

  init_out <- BBoptim(init, M_ip_BB,
                      y = y, sigma_1 = sigma_1,
                      n_partitions = n_partitions, lower = 1e-3, upper = upper,
                      control = list(
                        maxit = 1500, ftol = tol^2,
                        gtol = tol
                      )
  )



  if (init_out$convergence > 0) judge_covergence <- FALSE
  if (judge_k & init_out$iter > 500) {
    judge_covergence <- FALSE
  }

  if (sum((init_out$par - init)^2) < tol) judge <- FALSE

  cat(step, init - init_out$par, init_out$convergence, "\n")
  init <- init_out$par

  step <- step + 1
  iter <- iter + init_out$iter
  if (step > maxiter) judge <- FALSE

  if (step > maxiter) judge_covergence <- FALSE

  sigma <- series_cal(y, init, sigma_1)
  run.time <- Sys.time() - t1
  result <- list()
  result$beta <- init
  # result$eta = eta
  result$sigma <- sigma
  result$run.time <- run.time
  result$step <- step
  result$iter <- iter
  result$judge_covergence <- judge_covergence

  return(result)
}



# diy-interface version for semislv(); this overrides the standalone IP_GARCH_BB above
IP_GARCH_BB <- function(intermediates, data, theta) {
  tol <- 1e-5
  y <- data
  init <- theta
  sigma_1 <- var(y)
  upper <- 1
  intermediates$theta_1 <- init
  # estimate sigma
  series_cal <- function(y, init, sigma_1) {
    N <- length(y)
    sigma <- vector(length = N)
    sigma[1] <- sigma_1
    for (i in 2:N) {
      sigma[i] <- init[1] + init[2] * y[i - 1]^2 + init[3] * sigma[i - 1]
    }
    return(sigma)
  }

  # build the spline matrix
  spline_matrix <- function(sigma, knots, n_partitions) {
    m <- cbind(sigma, sigma^2)
    for (i in 1:n_partitions) {
      k <- sigma - knots[i]
      k[which(k < 0)] <- 0
      m <- cbind(m, k^2)
    }
    return(m)
  }

  # Calculate the Psi
  M <- function(init_est, y, epsilon, sigma_1) {
    sigma <- series_cal(y, init_est, sigma_1)
    k1 <- -1 / 2 * sum(log(sigma))
    k2 <- -1 / 2 * sum(epsilon^2 / sigma)
    return(-(k1 + k2))
  }

  M_ip_BB <- function(init_est, y, sigma_1, n_partitions) {
    sigma <- series_cal(y, init_est, sigma_1)
    k <- range(sigma)
    knots <- seq(k[1], k[2], length.out = n_partitions + 2)
    knots <- knots[c(-1, -length(knots))]
    sigma_m <- bSpline(sigma, knots = knots, degree = 2)

    eta_out <- lm(y ~ sigma_m)
    eta <- predict(eta_out)

    epsilon <- y - eta

    k1 <- -1 / 2 * sum(log(sigma))
    k2 <- -1 / 2 * sum(epsilon^2 / sigma)

    return(-(k1 + k2))
  }
  N <- length(y)
  n_partitions <- floor(N^(3 / 20))
  print("---init----")
  print(init)

  init_out <- BBoptim(init, M_ip_BB,
                          y = y, sigma_1 = sigma_1,
                          n_partitions = n_partitions, lower = 1e-3, upper = upper,
                          control = list(
                            maxit = 1500, ftol = tol^2,
                            gtol = tol
                          )
  )

  intermediates$theta <- init_out$par

  sigma <- series_cal(y, init_out$par, sigma_1) # recompute sigma at the updated theta
  intermediates$sigma_delta <- sigma - intermediates$sigma

  intermediates$iter <- init_out$iter
  intermediates
}

get_result_from_raw <- function(raw_fit) {
  result <- list()
  result$beta <- raw_fit$theta
  result$sigma <- raw_fit$parameters$lambda
  result$run.time <- raw_fit$run.time
  result$iter <- raw_fit$iterspace$jac_like$intermediates$iter
  result$judge_covergence <- result$iter < 500
  result
}

theta_delta <- function(intermediates) {
  intermediates$theta_delta <- intermediates$theta - intermediates$theta_1
  intermediates
}

lambda_delta <- function(intermediates) {
  intermediates$lambda_delta <- intermediates$sigma_delta
  intermediates
}


# Model A: y depends on sigma through sigma + 0.5 * sin(10 * sigma)
series_gen <- function(N, y1, init, sigma1) {
  y <- vector(length = N)
  sigma <- vector(length = N)
  y[1] <- y1
  sigma[1] <- sigma1

  for (i in 2:N) {
    sigma[i] <- init[1] + init[2] * y[i - 1]^2 + init[3] * sigma[i - 1]
    y[i] <- sigma[i] + 0.5 * sin(10 * sigma[i]) + sqrt(sigma[i]) * rnorm(1)
  }

  return(y)
}

# Model B: y depends on sigma through 0.5 * sigma + 0.1 * sin(0.5 + 20 * sigma)
series_gen2 <- function(N, y1, init, sigma1) {
  y <- vector(length = N)
  sigma <- vector(length = N)
  y[1] <- y1
  sigma[1] <- sigma1

  for (i in 2:N) {
    sigma[i] <- init[1] + init[2] * y[i - 1]^2 + init[3] * sigma[i - 1]
    y[i] <- 0.5 * sigma[i] + 0.1 * sin(0.5 + 20 * sigma[i]) + sqrt(sigma[i]) * rnorm(1)
  }

  return(y)
}
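
Either generator can be exercised directly; the parameter values below are those used in the commented simulations that follow:

# one simulated series from model A
set.seed(1234)
y <- series_gen(N = 500, y1 = 0, init = c(0.01, 0.1, 0.68), sigma1 = 0.1)
head(y)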

Simulation

(N = 500, Model A)

# set.seed(1234)
# N <- 500
# init <- c(0.01, 0.1, 0.68)
# sigma_1 <- 0.1
# 
# resultA_bf1 <- matrix(nrow = 3, ncol = 1000)
# timeA_bf1 <- vector(length = 1000)
# iterA_bf1 <- vector(length = 1000)
# judge_test <- NULL
# resultA_sp1 <- matrix(nrow = 3, ncol = 1000)
# timeA_sp1 <- vector(length = 1000)
# iterA_sp1 <- vector(length = N)
# resultA_ip1 <- matrix(nrow = 3, ncol = 1000)
# timeA_ip1 <- vector(length = 1000)
# iterA_ip1 <- vector(length = 1000)
# i <- 1
# while (i <= 1000) {
#   judge <- TRUE
#   while (judge) {
#     y <- series_gen(N, 0, init, 0.1)
#     if (all(!is.na(y))) judge <- FALSE
#   }
#   bf.fit <- bf(y, init = init, sigma_1 = 0.1, judge_k = T)
#   spmbp.fit <- spmbp_B(y, init = init, sigma_1 = 0.1, judge_k = T)
#   ##   ip.fit = IP_GARCH_BB(y, init = init, sigma_1 = 0.1, judge_k = T)
#   theta0 <- c(0, 0, 0)
#   lambda0 <- rep(0, N)
#   ip_raw <- try(semislv(theta = init, lambda = lambda0, Phi_fn = Phi_fn, Psi_fn = Psi_fn, method = "implicit", diy = TRUE, data = y, IP_GARCH_BB = IP_GARCH_BB, theta_delta = theta_delta, lambda_delta = lambda_delta, control = list(max_iter = 1, tol = 1e-5)))
#   ip.fit <- get_result_from_raw(ip_raw)
#   if (class(ip_raw) != "try-error" & ip.fit$judge_covergence & bf.fit$judge_covergence & spmbp.fit$judge_covergence) {
#     resultA_bf1[, i] <- bf.fit$beta
#     timeA_bf1[i] <- bf.fit$run.time
#     iterA_bf1[i] <- bf.fit$iter
#     resultA_sp1[, i] <- spmbp.fit$beta
#     timeA_sp1[i] <- spmbp.fit$run.time
#     iterA_sp1[i] <- spmbp.fit$iter
#     resultA_ip1[, i] <- ip.fit$beta
#     timeA_ip1[i] <- ip.fit$run.time
#     iterA_ip1[i] <- ip.fit$iter
#     i <- i + 1
#     cat("Time", i, "\n")
#   }
# }
# resultA_bf1 <- resultA_bf1[, which(!is.na(resultA_bf1[1, ]))]
# resultA_sp1 <- resultA_sp1[, which(!is.na(resultA_sp1[1, ]))]
# resultA_ip1 <- resultA_ip1[, which(!is.na(resultA_ip1[1, ]))]
# 
# tb_A_it1 <- matrix(nrow = 4, ncol = 3)
# tb_A_it1[1, ] <- apply(resultA_bf1 - init, 1, mean)
# tb_A_it1[2, ] <- apply(resultA_bf1, 1, sd)
# tb_A_it1[3, ] <- apply(abs(resultA_bf1 - init), 1, mean)
# tb_A_it1[4, ] <- sqrt(apply((resultA_bf1 - init)^2, 1, mean))
# 
# rownames(tb_A_it1) <- c("BIAS", "SD", "MAE", "RMSE")
# print(tb_A_it1)
# 
# tb_A_spmbp1 <- matrix(nrow = 4, ncol = 3)
# tb_A_spmbp1[1, ] <- apply(resultA_sp1 - init, 1, mean)
# tb_A_spmbp1[2, ] <- apply(resultA_sp1, 1, sd)
# tb_A_spmbp1[3, ] <- apply(abs(resultA_sp1 - init), 1, mean)
# tb_A_spmbp1[4, ] <- sqrt(apply((resultA_sp1 - init)^2, 1, mean))
# 
# rownames(tb_A_spmbp1) <- c("BIAS", "SD", "MAE", "RMSE")
# print(tb_A_spmbp1)
# 
# tb_A_ip1 <- matrix(nrow = 4, ncol = 3)
# tb_A_ip1[1, ] <- apply(resultA_ip1 - init, 1, mean)
# tb_A_ip1[2, ] <- apply(resultA_ip1, 1, sd)
# tb_A_ip1[3, ] <- apply(abs(resultA_ip1 - init), 1, mean)
# tb_A_ip1[4, ] <- sqrt(apply((resultA_ip1 - init)^2, 1, mean))
# 
# rownames(tb_A_ip1) <- c("BIAS", "SD", "MAE", "RMSE")
# print(tb_A_ip1)
# 
# print(mean(timeA_bf1))
# print(mean(timeA_sp1))
# print(mean(timeA_ip1)) # in order: back-fitting, SP-MBP and IP
# print(mean(iterA_bf1))
# print(mean(iterA_sp1))
# print(mean(iterA_ip1))

(N = 1000, Model A)

# set.seed(1234)
# N <- 1000
# init <- c(0.01, 0.1, 0.68)
# sigma_1 <- 0.1
# 
# resultA_bf <- matrix(nrow = 3, ncol = 1000)
# timeA_bf <- vector(length = 1000)
# iterA_bf <- vector(length = 1000)
# judge_test <- NULL
# resultA_sp <- matrix(nrow = 3, ncol = 1000)
# timeA_sp <- vector(length = 1000)
# iterA_sp <- vector(length = N)
# resultA_ip <- matrix(nrow = 3, ncol = 1000)
# timeA_ip <- vector(length = 1000)
# iterA_ip <- vector(length = 1000)
# i <- 1
# while (i <= 1000) {
#   judge <- TRUE
#   while (judge) {
#     y <- series_gen(N, 0, init, 0.1)
#     if (all(!is.na(y))) judge <- FALSE
#   }
#   bf.fit <- bf(y, init = init, sigma_1 = 0.1, judge_k = T)
#   spmbp.fit <- spmbp_B(y, init = init, sigma_1 = 0.1, judge_k = T)
#   ##   ip.fit = IP_GARCH_BB(y, init = init, sigma_1 = 0.1, judge_k = T)
#   theta0 <- c(0, 0, 0)
#   lambda0 <- rep(0, N)
#   ip_raw <- try(semislv(theta = init, lambda = lambda0, Phi_fn = Phi_fn, Psi_fn = Psi_fn, method = "implicit", diy = TRUE, data = y, IP_GARCH_BB = IP_GARCH_BB, theta_delta = theta_delta, lambda_delta = lambda_delta, control = list(max_iter = 1, tol = 1e-5)))
#   ip.fit <- get_result_from_raw(ip_raw)
#   if (class(ip_raw) != "try-error" & ip.fit$judge_covergence & bf.fit$judge_covergence & spmbp.fit$judge_covergence) {
#     resultA_bf[, i] <- bf.fit$beta
#     timeA_bf[i] <- bf.fit$run.time
#     iterA_bf[i] <- bf.fit$iter
#     resultA_sp[, i] <- spmbp.fit$beta
#     timeA_sp[i] <- spmbp.fit$run.time
#     iterA_sp[i] <- spmbp.fit$iter
#     resultA_ip[, i] <- ip.fit$beta
#     timeA_ip[i] <- ip.fit$run.time
#     iterA_ip[i] <- ip.fit$iter
#     i <- i + 1
#     cat("Time", i, "\n")
#   }
# }
# resultA_bf <- resultA_bf[, which(!is.na(resultA_bf[1, ]))]
# resultA_sp <- resultA_sp[, which(!is.na(resultA_sp[1, ]))]
# resultA_ip <- resultA_ip[, which(!is.na(resultA_ip[1, ]))]
# 
# tb_A_it <- matrix(nrow = 4, ncol = 3)
# tb_A_it[1, ] <- apply(resultA_bf - init, 1, mean)
# tb_A_it[2, ] <- apply(resultA_bf, 1, sd)
# tb_A_it[3, ] <- apply(abs(resultA_bf - init), 1, mean)
# tb_A_it[4, ] <- sqrt(apply((resultA_bf - init)^2, 1, mean))
# 
# rownames(tb_A_it) <- c("BIAS", "SD", "MAE", "RMSE")
# print(tb_A_it)
# 
# tb_A_spmbp <- matrix(nrow = 4, ncol = 3)
# tb_A_spmbp[1, ] <- apply(resultA_sp - init, 1, mean)
# tb_A_spmbp[2, ] <- apply(resultA_sp, 1, sd)
# tb_A_spmbp[3, ] <- apply(abs(resultA_sp - init), 1, mean)
# tb_A_spmbp[4, ] <- sqrt(apply((resultA_sp - init)^2, 1, mean))
# 
# rownames(tb_A_spmbp) <- c("BIAS", "SD", "MAE", "RMSE")
# print(tb_A_spmbp)
# 
# tb_A_ip <- matrix(nrow = 4, ncol = 3)
# tb_A_ip[1, ] <- apply(resultA_ip - init, 1, mean)
# tb_A_ip[2, ] <- apply(resultA_ip, 1, sd)
# tb_A_ip[3, ] <- apply(abs(resultA_ip - init), 1, mean)
# tb_A_ip[4, ] <- sqrt(apply((resultA_ip - init)^2, 1, mean))
# 
# rownames(tb_A_ip) <- c("BIAS", "SD", "MAE", "RMSE")
# print(tb_A_ip)
# 
# print(mean(timeA_bf))
# print(mean(timeA_sp))
# print(mean(timeA_ip)) # in order: back-fitting, SP-MBP and IP
# print(mean(iterA_bf))
# print(mean(iterA_sp))
# print(mean(iterA_ip))

(N = 500, Model B)

# set.seed(1234)
# N <- 500
# init <- c(0.01, 0.1, 0.8)
# sigma_1 <- 0.1
# 
# resultB_bf1 <- matrix(nrow = 3, ncol = 1000)
# timeB_bf1 <- vector(length = 1000)
# iterB_bf1 <- vector(length = 1000)
# judge_test <- NULL
# resultB_sp1 <- matrix(nrow = 3, ncol = 1000)
# timeB_sp1 <- vector(length = 1000)
# iterB_sp1 <- vector(length = N)
# resultB_ip1 <- matrix(nrow = 3, ncol = 1000)
# timeB_ip1 <- vector(length = 1000)
# iterB_ip1 <- vector(length = 1000)
# i <- 1
# while (i <= 1000) {
#   judge <- TRUE
#   while (judge) {
#     y <- series_gen2(N, 0, init, 0.1)
#     if (all(!is.na(y))) judge <- FALSE
#   }
#   bf.fit <- bf(y, init = init, sigma_1 = 0.1, judge_k = T)
#   spmbp.fit <- spmbp_B(y, init = init, sigma_1 = 0.1, judge_k = T)
#   ##   ip.fit = IP_GARCH_BB(y, init = init, sigma_1 = 0.1, judge_k = T)
#   theta0 <- c(0, 0, 0)
#   lambda0 <- rep(0, N)
#   ip_raw <- try(semislv(theta = init, lambda = lambda0, Phi_fn = Phi_fn, Psi_fn = Psi_fn, method = "implicit", diy = TRUE, data = y, IP_GARCH_BB = IP_GARCH_BB, theta_delta = theta_delta, lambda_delta = lambda_delta, control = list(max_iter = 1, tol = 1e-5)))
#   ip.fit <- get_result_from_raw(ip_raw)
#   if (class(ip_raw) != "try-error" & ip.fit$judge_covergence & bf.fit$judge_covergence & spmbp.fit$judge_covergence) {
#     resultB_bf1[, i] <- bf.fit$beta
#     timeB_bf1[i] <- bf.fit$run.time
#     iterB_bf1[i] <- bf.fit$iter
#     resultB_sp1[, i] <- spmbp.fit$beta
#     timeB_sp1[i] <- spmbp.fit$run.time
#     iterB_sp1[i] <- spmbp.fit$iter
#     resultB_ip1[, i] <- ip.fit$beta
#     timeB_ip1[i] <- ip.fit$run.time
#     iterB_ip1[i] <- ip.fit$iter
#     i <- i + 1
#     cat("Time", i, "\n")
#   }
# }
# resultB_bf1 <- resultB_bf1[, which(!is.na(resultB_bf1[1, ]))]
# resultB_sp1 <- resultB_sp1[, which(!is.na(resultB_sp1[1, ]))]
# resultB_ip1 <- resultB_ip1[, which(!is.na(resultB_ip1[1, ]))]
# 
# tb_B_it1 <- matrix(nrow = 4, ncol = 3)
# tb_B_it1[1, ] <- apply(resultB_bf1 - init, 1, mean)
# tb_B_it1[2, ] <- apply(resultB_bf1, 1, sd)
# tb_B_it1[3, ] <- apply(abs(resultB_bf1 - init), 1, mean)
# tb_B_it1[4, ] <- sqrt(apply((resultB_bf1 - init)^2, 1, mean))
# 
# rownames(tb_B_it1) <- c("BIAS", "SD", "MAE", "RMSE")
# print(tb_B_it1)
# 
# tb_B_spmbp1 <- matrix(nrow = 4, ncol = 3)
# tb_B_spmbp1[1, ] <- apply(resultB_sp1 - init, 1, mean)
# tb_B_spmbp1[2, ] <- apply(resultB_sp1, 1, sd)
# tb_B_spmbp1[3, ] <- apply(abs(resultB_sp1 - init), 1, mean)
# tb_B_spmbp1[4, ] <- sqrt(apply((resultB_sp1 - init)^2, 1, mean))
# 
# rownames(tb_B_spmbp1) <- c("BIAS", "SD", "MAE", "RMSE")
# print(tb_B_spmbp1)
# 
# tb_B_ip1 <- matrix(nrow = 4, ncol = 3)
# tb_B_ip1[1, ] <- apply(resultB_ip1 - init, 1, mean)
# tb_B_ip1[2, ] <- apply(resultB_ip1, 1, sd)
# tb_B_ip1[3, ] <- apply(abs(resultB_ip1 - init), 1, mean)
# tb_B_ip1[4, ] <- sqrt(apply((resultB_ip1 - init)^2, 1, mean))
# 
# rownames(tb_B_ip1) <- c("BIAS", "SD", "MAE", "RMSE")
# print(tb_B_ip1)
# 
# print(mean(timeB_bf1))
# print(mean(timeB_sp1))
# print(mean(timeB_ip1)) 
# print(mean(iterB_bf1))
# print(mean(iterB_sp1))
# print(mean(iterB_ip1))

(N = 1000, Model B)

# set.seed(1234)
# 
# N <- 1000
# init <- c(0.01, 0.1, 0.8)
# sigma_1 <- 0.1
# 
# resultB_bf2 <- matrix(nrow = 3, ncol = 1000)
# timeB_bf2 <- vector(length = 1000)
# iterB_bf2 <- vector(length = 1000)
# judge_test <- NULL
# resultB_sp2 <- matrix(nrow = 3, ncol = 1000)
# timeB_sp2 <- vector(length = 1000)
# iterB_sp2 <- vector(length = N)
# resultB_ip2 <- matrix(nrow = 3, ncol = 1000)
# timeB_ip2 <- vector(length = 1000)
# iterB_ip2 <- vector(length = 1000)
# i <- 1
# while (i <= 1000) {
#   judge <- TRUE
#   while (judge) {
#     y <- series_gen2(N, 0, init, 0.1)
#     if (all(!is.na(y))) judge <- FALSE
#   }
#   bf.fit <- bf(y, init = init, sigma_1 = 0.1, judge_k = T)
#   spmbp.fit <- spmbp_B(y, init = init, sigma_1 = 0.1, judge_k = T)
#   ##   ip.fit = IP_GARCH_BB(y, init = init, sigma_1 = 0.1, judge_k = T)
#   theta0 <- c(0, 0, 0)
#   lambda0 <- rep(0, N)
#   ip_raw <- try(semislv(theta = init, lambda = lambda0, Phi_fn = Phi_fn, Psi_fn = Psi_fn, method = "implicit", diy = TRUE, data = y, IP_GARCH_BB = IP_GARCH_BB, theta_delta = theta_delta, lambda_delta = lambda_delta, control = list(max_iter = 1, tol = 1e-5)))
#   ip.fit <- get_result_from_raw(ip_raw)
#   if (class(ip_raw) != "try-error" & ip.fit$judge_covergence & bf.fit$judge_covergence & spmbp.fit$judge_covergence) {
#     resultB_bf2[, i] <- bf.fit$beta
#     timeB_bf2[i] <- bf.fit$run.time
#     iterB_bf2[i] <- bf.fit$iter
#     resultB_sp2[, i] <- spmbp.fit$beta
#     timeB_sp2[i] <- spmbp.fit$run.time
#     iterB_sp2[i] <- spmbp.fit$iter
#     resultB_ip2[, i] <- ip.fit$beta
#     timeB_ip2[i] <- ip.fit$run.time
#     iterB_ip2[i] <- ip.fit$iter
#     i <- i + 1
#     cat("Time", i, "\n")
#   }
# }
# resultB_bf2 <- resultB_bf2[, which(!is.na(resultB_bf2[1, ]))]
# resultB_sp2 <- resultB_sp2[, which(!is.na(resultB_sp2[1, ]))]
# resultB_ip2 <- resultB_ip2[, which(!is.na(resultB_ip2[1, ]))]
# 
# tb_B_it2 <- matrix(nrow = 4, ncol = 3)
# tb_B_it2[1, ] <- apply(resultB_bf2 - init, 1, mean)
# tb_B_it2[2, ] <- apply(resultB_bf2, 1, sd)
# tb_B_it2[3, ] <- apply(abs(resultB_bf2 - init), 1, mean)
# tb_B_it2[4, ] <- sqrt(apply((resultB_bf2 - init)^2, 1, mean))
# 
# rownames(tb_B_it2) <- c("BIAS", "SD", "MAE", "RMSE")
# print(tb_B_it2)
# 
# tb_B_spmbp2 <- matrix(nrow = 4, ncol = 3)
# tb_B_spmbp2[1, ] <- apply(resultB_sp2 - init, 1, mean)
# tb_B_spmbp2[2, ] <- apply(resultB_sp2, 1, sd)
# tb_B_spmbp2[3, ] <- apply(abs(resultB_sp2 - init), 1, mean)
# tb_B_spmbp2[4, ] <- sqrt(apply((resultB_sp2 - init)^2, 1, mean))
# 
# rownames(tb_B_spmbp2) <- c("BIAS", "SD", "MAE", "RMSE")
# print(tb_B_spmbp2)
# 
# tb_B_ip2 <- matrix(nrow = 4, ncol = 3)
# tb_B_ip2[1, ] <- apply(resultB_ip2 - init, 1, mean)
# tb_B_ip2[2, ] <- apply(resultB_ip2, 1, sd)
# tb_B_ip2[3, ] <- apply(abs(resultB_ip2 - init), 1, mean)
# tb_B_ip2[4, ] <- sqrt(apply((resultB_ip2 - init)^2, 1, mean))
# 
# rownames(tb_B_ip2) <- c("BIAS", "SD", "MAE", "RMSE")
# print(tb_B_ip2)
# 
# print(mean(timeB_bf2))
# print(mean(timeB_sp2))
# print(mean(timeB_ip2)) # in order: back-fitting, SP-MBP and IP
# print(mean(iterB_bf2))
# print(mean(iterB_sp2))
# print(mean(iterB_ip2))

