FastGaSP-package: Fast and Exact Computation of Gaussian Stochastic Process

FastGaSP-packageR Documentation

Fast and Exact Computation of Gaussian Stochastic Process

Description

Implements fast and exact computation of Gaussian stochastic process with the Matern kernel using forward filtering and backward smoothing algorithm. It includes efficient implementations of the inverse Kalman filter, with applications such as estimating particle interaction functions. These tools support models with or without noise. Additionally, the package offers algorithms for fast parameter estimation in latent factor models, where the factor loading matrix is orthogonal, and latent processes are modeled by Gaussian processes. See the references: 1) Mengyang Gu and Yanxun Xu (2020), Journal of Computational and Graphical Statistics; 2) Xinyi Fang and Mengyang Gu (2024), <doi:10.48550/arXiv.2407.10089>; 3) Mengyang Gu and Weining Shen (2020), Journal of Machine Learning Research; 4) Yizi Lin, Xubo Liu, Paul Segall and Mengyang Gu (2025), <doi:10.48550/arXiv.2501.01324>.

Details

The DESCRIPTION file: This package was not yet installed at build time.
Index: This package was not yet installed at build time.
Fast computational algorithms for Gaussian stochastic process with Matern kernels by the forward filtering and backward smoothing algorithm.

Author(s)

Mengyang Gu [aut, cre], Xinyi Fang [aut], Yizi Lin [aut]

Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>

References

Hartikainen, J. and Sarkka, S. (2010). Kalman filtering and smoothing solutions to temporal gaussian process regression models, Machine Learning for Signal Processing (MLSP), 2010 IEEE International Workshop, 379-384.

M. Gu and Y. Xu (2020), Nonseparable Gaussian stochastic process: a unified view and computational strategy, Fast Nonseparable Gaussian Stochastic Process With Application to Methylation Level Interpolation, Journal of Computational and Graphical Statistics, 29, 250-260.

M. Gu and W. Shen (2020), Generalized probabilistic principal component analysis of correlated data, Journal of Machine Learning Research, 21, 13-1.

M. Gu, X. Wang and J.O. Berger (2018), Robust Gaussian Stochastic Process Emulation, Annals of Statistics, 46, 3038-3066.

See Also

FastGaSP

Examples


library(FastGaSP)

#------------------------------------------------------------------------------
# Example 1 : fast computation algorithm for noisy data
#------------------------------------------------------------------------------

y_R<-function(x){
  sin(2*pi*x)
}

###let's test for 2000 observations
set.seed(1)
num_obs=2000
input=runif(num_obs)
output=y_R(input)+rnorm(num_obs,mean=0,sd=0.1)

##constucting the fgasp.model
fgasp.model=fgasp(input, output)

##range and noise-variance ratio (nugget) parameters 
param=c( log(1),log(.02))
## the log lik
log_lik(param,fgasp.model)
##time cost to compute the likelihood
time_cost=system.time(log_lik(param,fgasp.model))
time_cost[1]

##consider a nonparametric regression setting 
##estimate the parameter by maximum likelihood estimation

est_all<-optim(c(log(1),log(.02)),log_lik,object=fgasp.model,method="L-BFGS-B",
              control = list(fnscale=-1))  

##estimated log inverse range parameter and log nugget
est_all$par

##estimate variance 
est.var=Get_log_det_S2(est_all$par,fgasp.model@have_noise,fgasp.model@delta_x,
                          fgasp.model@output,fgasp.model@kernel_type)[[2]]/fgasp.model@num_obs
est.var

###1. Do some interpolation test 
num_test=5000
testing_input=runif(num_test) ##there are the input where you don't have observations
pred.model=predict(param=est_all$par,object=fgasp.model,testing_input=testing_input)

lb=pred.model@mean+qnorm(0.025)*sqrt(pred.model@var)
ub=pred.model@mean+qnorm(0.975)*sqrt(pred.model@var)

## calculate lb for the mean function
pred.model2=predict(param=est_all$par,object=fgasp.model,testing_input=testing_input,var_data=FALSE)
lb_mean_funct=pred.model2@mean+qnorm(0.025)*sqrt(pred.model2@var)
ub_mean_funct=pred.model2@mean+qnorm(0.975)*sqrt(pred.model2@var)

## plot the prediction
min_val=min(lb,output)
max_val=max(ub,output)

plot(pred.model@testing_input,pred.model@mean,type='l',col='blue',
     ylim=c(min_val,max_val),
     xlab='x',ylab='y')
polygon(c(pred.model@testing_input,rev(pred.model@testing_input)),
        c(lb,rev(ub)),col = "grey80", border = FALSE)
lines(pred.model@testing_input,pred.model@mean,type='l',col='blue')
lines(pred.model@testing_input,y_R(pred.model@testing_input),type='l',col='black')
lines(pred.model2@testing_input,lb_mean_funct,col='blue',lty=2)
lines(pred.model2@testing_input,ub_mean_funct,col='blue',lty=2)
lines(input,output,type='p',pch=16,col='black',cex=0.4) #one can plot data

legend("bottomleft", legend=c("predictive mean","95% predictive interval","truth"),
       col=c("blue","blue","black"), lty=c(1,2,1), cex=.8)

#--------------------------------------------------------------
# Example 2: example that one does not have a noise in the data
#--------------------------------------------------------------
## Here is a function in the Sobolev Space with order 3
y_R<-function(x){
  j_seq=seq(1,200,1)
  record_y_R=0
  for(i_j in 1:200){
    record_y_R=record_y_R+2*j_seq[i_j]^{-2*3}*sin(j_seq[i_j])*cos(pi*(j_seq[i_j]-0.5)*x)

  }
  record_y_R
}


##generate some data without noise
num_obs=50
input=seq(0,1,1/(num_obs-1))

output=y_R(input)


##constucting the fgasp.model
fgasp.model=fgasp(input, output,have_noise=FALSE)

##range and noise-variance ratio (nugget) parameters 
param=c( log(1))
## the log lik
log_lik(param,fgasp.model)



#if one does not have noise one may need to give a lower bound or use a penalty 
#(e.g. induced by a prior) to make the estimation more robust
est_all<-optimize(log_lik,interval=c(0,10),maximum=TRUE,fgasp.model)
  
##Do some interpolation test for comparison
num_test=1000
testing_input=runif(num_test) ##there are the input where you don't have observations

pred.model=predict(param=est_all$maximum,object=fgasp.model,testing_input=testing_input)



#This is the 95 posterior credible interval for the outcomes which contain the estimated 
#variance of the noise
#sometimes there are numerical instability is one does not have noise or error
lb=pred.model@mean+qnorm(0.025)*sqrt(abs(pred.model@var))
ub=pred.model@mean+qnorm(0.975)*sqrt(abs(pred.model@var))

## plot the prediction
min_val=min(lb,output)
max_val=max(ub,output)

plot(pred.model@testing_input,pred.model@mean,type='l',col='blue',
     ylim=c(min_val,max_val),
     xlab='x',ylab='y')
polygon( c(pred.model@testing_input,rev(pred.model@testing_input)),
         c(lb,rev(ub)),col = "grey80", border = FALSE)
lines(pred.model@testing_input,pred.model@mean,type='l',col='blue')
lines(pred.model@testing_input,y_R(pred.model@testing_input),type='l',col='black')
lines(input,output,type='p',pch=16,col='black')
legend("bottomleft", legend=c("predictive mean","95% predictive interval","truth"),
       col=c("blue","blue","black"), lty=c(1,2,1), cex=.8)


##mean square error for all inputs
mean((pred.model@mean- y_R(pred.model@testing_input))^2)


FastGaSP documentation built on April 4, 2025, 5:16 a.m.