# binaryGP_fit: Binary Gaussian Process (with/without time-series) In binaryGP: Fit and Predict a Gaussian Process Model with (Time-Series) Binary Response

## Description

The function fits Gaussian process models with binary response. The models can also include time-series term if a time-series binary response is observed. The estimation methods are based on PQL/PQPL and REML (for mean function and correlation parameter, respectively).

## Usage

 1 2 3 4 binaryGP_fit(X, Y, R = 0, L = 0, corr = list(type = "exponential", power = 2), nugget = 1e-10, constantMean = FALSE, orthogonalGP = FALSE, converge.tol = 1e-10, maxit = 15, maxit.PQPL = 20, maxit.REML = 100, xtol_rel = 1e-10, standardize = FALSE, verbose = TRUE) 

## Arguments

 X a design matrix with dimension n by d. Y a response matrix with dimension n by T. The values in the matrix are 0 or 1. If no time-series, T = 1. If time-series is included, i.e., T > 1, the first column is the response vector of time 1, and second column is the response vector of time 2, and so on. R a positive integer specifying the order of autoregression only if time-series is included. The default is 1. L a positive integer specifying the order of interaction between X and previous Y only if time-series is included. The value couldn't nbe larger than R. The default is 1. corr a list of parameters for the specifing the correlation to be used. Either exponential correlation function or Matern correlation function can be used. See corr_matrix for details. nugget a positive value to use for the nugget. If NULL, a nugget will be estimated. The default is 1e-10. constantMean logical. TRUE indicates that the Gaussian process will have a constant mean, plus time-series structure if R or T is greater than one; otherwise the mean function will be a linear regression in X, plus an intercept term and time-series structure. orthogonalGP logical. TRUE indicates that the orthogonal Gaussian process will be used. Only available when corr is list(type = "exponential", power = 2). converge.tol convergence tolerance. It converges when relative difference with respect to η_t is smaller than the tolerance. The default is 1e-10. maxit a positive integer specifying the maximum number of iterations for estimation to be performed before the estimates are convergent. The default is 15. maxit.PQPL a positive integer specifying the maximum number of iterations for PQL/PQPL estimation to be performed before the estimates are convergent. The default is 20. maxit.REML a positive integer specifying the maximum number of iterations in lbfgs for REML estimation to be performed before the estimates are convergent. The default is 100. xtol_rel a postive value specifying the convergence tolerance for lbfgs. The default is 1e-10. standardize logical. If TRUE, each column of X will be standardized into [0,1]. The default is FALSE. verbose logical. If TRUE, additional diagnostics are printed. The default is TRUE.

## Details

Consider the model

logit(p_t(x))=η_t(x)=∑^R_{r=1}\varphi_ry_{t-r}(\mathbf{x})+α_0+\mathbf{x}'\boldsymbol{α}+∑^L_{l=1}\boldsymbol{γ}_l\textbf{x}y_{t-l}(\mathbf{x})+Z_t(\mathbf{x}),

where p_t(x)=Pr(y_t(x)=1) and Z_t(\cdot) is a Gaussian process with zero mean, variance σ^2, and correlation function R_{\boldsymbol{θ}}(\cdot,\cdot) with unknown correlation parameters \boldsymbol{θ}. The power exponential correlation function is used and defined by

R_{\boldsymbol{θ}}(\mathbf{x}_i,\mathbf{x}_j)=\exp\{-∑^{d}_{l=1}\frac{(x_{il}-x_{jl})^p}{θ_l} \},

where p is given by power. The parameters in the mean functions including \varphi_r,α_0,\boldsymbol{α},\boldsymbol{γ}_l are estimated by PQL/PQPL, depending on whether the mean function has the time-series structure. The parameters in the Gaussian process including \boldsymbol{θ} and σ^2 are estimated by REML. If orthogonalGP is TRUE, then orthogonal covariance function (Plumlee and Joseph, 2016) is employed. More details can be seen in Sung et al. (unpublished).

## Value

 alpha_hat a vector of coefficient estimates for the linear relationship with X. varphi_hat a vector of autoregression coefficient estimates. gamma_hat a vector of the interaction effect estimates. theta_hat a vector of correlation parameters. sigma_hat a value of sigma estimate (standard deviation). nugget_hat if nugget is NULL, then return a nugget estimate, otherwise return nugget. orthogonalGP orthogonalGP. X data X. Y data Y. R order of autoregression. L order of interaction between X and previous Y. corr a list of parameters for the specifing the correlation to be used. Model.mat model matrix. eta_hat eta_hat. standardize standardize. mean.x mean of each column of X. Only available when standardize=TRUE, otherwise mean.x=NULL. scale.x max(x)-min(x) of each column of X. Only available when standardize=TRUE, otherwise scale.x=NULL.

## Author(s)

Chih-Li Sung <iamdfchile@gmail.com>

predict.binaryGP for prediction of the binary Gaussian process, print.binaryGP for the list of estimation results, and summary.binaryGP for summary of significance results.
  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 library(binaryGP) ##### Testing function: cos(x1 + x2) * exp(x1*x2) with TT sequences ##### ##### Thanks to Sonja Surjanovic and Derek Bingham, Simon Fraser University ##### test_function <- function(X, TT) { x1 <- X[,1] x2 <- X[,2] eta_1 <- cos(x1 + x2) * exp(x1*x2) p_1 <- exp(eta_1)/(1+exp(eta_1)) y_1 <- rep(NA, length(p_1)) for(i in 1:length(p_1)) y_1[i] <- rbinom(1,1,p_1[i]) Y <- y_1 P <- p_1 if(TT > 1){ for(tt in 2:TT){ eta_2 <- 0.3 * y_1 + eta_1 p_2 <- exp(eta_2)/(1+exp(eta_2)) y_2 <- rep(NA, length(p_2)) for(i in 1:length(p_2)) y_2[i] <- rbinom(1,1,p_2[i]) Y <- cbind(Y, y_2) P <- cbind(P, p_2) y_1 <- y_2 } } return(list(Y = Y, P = P)) } set.seed(1) n <- 30 n.test <- 10 d <- 2 X <- matrix(runif(d * n), ncol = d) ##### without time-series ##### Y <- test_function(X, 1)$Y ## Y is a vector binaryGP.model <- binaryGP_fit(X = X, Y = Y) print(binaryGP.model) # print estimation results summary(binaryGP.model) # significance results ##### with time-series, lag 1 ##### Y <- test_function(X, 10)$Y ## Y is a matrix with 10 columns binaryGP.model <- binaryGP_fit(X = X, Y = Y, R = 1) print(binaryGP.model) # print estimation results summary(binaryGP.model) # significance results