R/fa.sglfit.R

Defines functions fa.sglfit

Documented in fa.sglfit

#' Estimates factor-augmented sparse MIDAS regression model
#' 
#' @description 
#' Estimates factor-augmented sparse MIDAS regression model
#' 
#' The function first computes factors on MIDAS-weighted regressors \code{x}. 
#' Then the function runs \link{cv.sglfit} to fit the model. The output can be passed to \link{predict.fa.sglfit} to obtain predictions.
#' 
#' @details
#' \ifelse{html}{\out{The factor-augmented sparse MIDAS regression is estimated by applying a sg-LASSO penalty on sparse component. The sequence of linear regression models implied by &lambda; vector is fit by block coordinate-descent. The objective function is  <br><br> ||y - &iota;&alpha; - f&phi; - x&beta;||<sup>2</sup><sub>T</sub> + 2&lambda;  &Omega;<sub>&gamma;</sub>(&beta;), <br> where &iota;&#8712;R<sup>T</sup>enter> and ||u||<sup>2</sup><sub>T</sub>=&#60;u,u&#62;/T is the empirical inner product. The penalty function &Omega;<sub>&gamma;</sub>(.) is applied on  &beta; coefficients and is <br><br> &Omega;<sub>&gamma;</sub>(&beta;) = &gamma; |&beta;|<sub>1</sub> + (1-&gamma;)|&beta;|<sub>2,1</sub>, <br> a convex combination of LASSO and group LASSO penalty functions.}}{The cross-validation is run for sg-LASSO linear model. The sequence of linear regression models implied by \eqn{\lambda} vector is fit by block coordinate-descent. The objective function is \deqn{\|y-\iota\alpha - f\phi - x\beta\|^2_{T} + 2\lambda \Omega_\gamma(\beta),} where \eqn{\iota\in R^T} and \eqn{\|u\|^2_T = \langle u,u \rangle / T} is the empirical inner product. The penalty function \eqn{\Omega_\gamma(.)} is applied on \eqn{\beta} coefficients and is \deqn{\Omega_\gamma(\beta) = \gamma |\beta|_1 + (1-\gamma)|\beta|_{2,1},} a convex combination of LASSO and group LASSO penalty functions.}     
#' @usage 
#' fa.sglfit(x, y, f = NULL, K = NULL, gamma = 1.0, gindex = 1:nvars, 
#'   ...)
#' @param x T by p data matrix, where T and p respectively denote the sample size and the number of regressors.
#' @param y T by 1 response variable.
#' @param f T by K latent factors. If null, the factors are estimated by using Principal Component Analysis (PCA). In this case, the number of optimal factors is selected by appyling eigenvalue ratio estimator of Ahn and Horenstein (2013).
#' @param K number of factors. If \code{f} is not provided, the number of factors is estimated by applying eigenvalue ratio estimator of Ahn and Horenstein (2013) in case \code{K=NULL}. Else, first K Principal Components are used.
#' @param gamma sg-LASSO mixing parameter. \ifelse{html}{\out{&gamma;}}{\eqn{\gamma}} = 1 gives LASSO solution and \ifelse{html}{\out{&gamma;}}{\eqn{\gamma}} = 0 gives group LASSO solution.
#' @param gindex p by 1 vector indicating group membership of each covariate.
#' @param ... Other arguments that can be passed to \link{cv.sglfit}.
#' @return fa.sglfit object.
#' @examples
#' set.seed(1)
#' x = matrix(rnorm(100 * 20), 100, 20)
#' beta = c(5,4,3,2,1,rep(0, times = 15))
#' f = matrix(rnorm(100 * 2), 100, 2)
#' g = c(0.5, 0.5)
#' y = x%*%beta + f%*%g + rnorm(100)
#' gindex = sort(rep(1:4,times=5))
#' fa.sglfit(x = x, y = y, f = f, gindex = gindex, gamma = 0.5,  
#'            standardize = FALSE, intercept = FALSE)
#' @export fa.sglfit
fa.sglfit <- function(x, y, f = NULL, K = NULL, gamma = 1.0, gindex = 1:nvars, ...) {
  y <- drop(y)
  x <- as.matrix(x)
  np <- dim(x)
  nobs <- as.integer(np[1])
  nvars <- as.integer(np[2])
  
  if (is.null(f)){
    rmax <- 10
    S <- svd(x, nu = rmax, nv = rmax)
    if (is.null(K)){
      storesvd <- rep(0, rmax-1)
      for (j in 1:(rmax-1)){
        storesvd[j] <- S$d[j]/S$d[j+1]
      }
      rhat <- which.max(storesvd)
      f <- S$u[,1:rhat]
      K <- rhat
    } else {
      f <- S$u[,1:K]
    }
  }
  f <- as.matrix(f)
  Proj <- diag(nobs) - f %*% solve(t(f) %*% f) %*% t(f)
  fitsequence <-
    sglfit(
      Proj %*% x,
      Proj %*% y,
      gamma = gamma,
      gindex = gindex,
      ...
    )
  lambda <- fitsequence$lambda
  obj <-
    cv.sglfit(
      Proj %*% x,
      Proj %*% y,
      gamma = gamma,
      gindex = gindex,
      lambda = lambda,
      ...
    )
  
  b0 <- obj$cv.fit$lam.min$b0
  beta <- obj$cv.fit$lam.min$beta
  xb <- b0 + x%*%beta
  gamma <- coefficients(lm(y-xb ~ -1+f))
  obj$cv.fit$lam.min$beta <- c(obj$cv.fit$lam.min$beta, gamma)
  
  b0 <- obj$cv.fit$lam.1se$b0
  beta <- obj$cv.fit$lam.1se$beta
  xb <- b0 + x%*%beta
  gamma <- coefficients(lm(y-xb ~ -1+f))
  obj$cv.fit$lam.1se$beta <- c(obj$cv.fit$lam.1se$beta, gamma)
  
  class(obj) <- "fa.sglfit"
  obj
} 

Try the midasml package in your browser

Any scripts or data that you put into this service are public.

midasml documentation built on Nov. 5, 2025, 5:44 p.m.