GPModel: Create a 'GPModel' object
In gpboost: Combining Tree-Boosting with Gaussian Process and Mixed Effects Models

GPModel

R Documentation

Create a `GPModel` object

Description

Create a `GPModel`, which contains a Gaussian process and / or a mixed effects model with grouped random effects.

Usage

GPModel(likelihood = "gaussian", group_data = NULL,
  group_rand_coef_data = NULL, ind_effect_group_rand_coef = NULL,
  drop_intercept_group_rand_effect = NULL, gp_coords = NULL,
  gp_rand_coef_data = NULL, cov_function = "matern", cov_fct_shape = 1.5,
  gp_approx = "none", num_parallel_threads = NULL, GPU_use = FALSE,
  matrix_inversion_method = "default", weights = NULL,
  likelihood_learning_rate = 1, cov_fct_taper_range = 1,
  cov_fct_taper_shape = 1, num_neighbors = NULL,
  vecchia_ordering = "random", ind_points_selection = "kmeans++",
  num_ind_points = NULL, cover_tree_radius = 1, seed = 0L,
  cluster_ids = NULL, likelihood_additional_param = NULL, num_data = NULL,
  free_raw_data = FALSE, vecchia_approx = NULL, vecchia_pred_type = NULL,
  num_neighbors_pred = NULL)

Arguments

`likelihood`	A `string` specifying the likelihood function (distribution) of the response variable. Available options:
- "gaussian"
- "bernoulli_logit": Bernoulli likelihood with a logit link function for binary classification. Aliases: "binary", "binary_logit"
- "bernoulli_probit": Bernoulli likelihood with a probit link function for binary classification. Aliases: "binary_probit"
- "quasi_bernoulli_logit": quasi-Bernoulli likelihood with a logit link function for y in [0,1]. Aliases: "quasi_binary", "quasi_binary_logit"
- "quasi_bernoulli_probit": quasi-Bernoulli likelihood with a probit link function for y in [0,1]. Aliases: "quasi_binary_probit"
- "binomial_logit": Binomial likelihood with a logit link function. The response variable `y` needs to contain proportions of successes / trials, and the `weights` parameter needs to contain the numbers of trials. Aliases: "binomial"
- "binomial_probit": Binomial likelihood with a probit link function. The response variable `y` needs to contain proportions of successes / trials, and the `weights` parameter needs to contain the numbers of trials
- "beta_binomial": Beta-binomial likelihood with a logit link function. The response variable `y` needs to contain proportions of successes / trials, and the `weights` parameter needs to contain the numbers of trials. Aliases: "betabinomial", "beta-binomial"
- "poisson": Poisson likelihood with a log link function
- "negative_binomial": Negative binomial likelihood with a log link function (aka "nbinom2", "negative_binomial_2"). With this parametrization, the variance is mu * (mu + r) / r, where mu = mean and r = shape
- "negative_binomial_1": Negative binomial 1 (aka "nbinom1") likelihood with a log link function. With this parametrization, the variance is mu * (1 + phi), where mu = mean and phi = dispersion
- "gamma": Gamma likelihood with a log link function
- "lognormal": Log-normal likelihood with a log link function
- "beta": Beta likelihood with a logit link function (parametrization of Ferrari and Cribari-Neto, 2004)
- "t": t-distribution (e.g., for robust regression)
- "t_fix_df": t-distribution with the degrees-of-freedom (df) held fixed and not estimated. The degrees-of-freedom (df) can be set via the `likelihood_additional_param` parameter. The default is df = 2
- "quantile_regression" / "asymmetric_laplace": Asymmetric Laplace likelihood for quantile regression (the two names are aliases). The quantile can be set via the `likelihood_additional_param` parameter. The default is quantile = 0.5
- "zero_inflated_gamma": Zero-inflated gamma likelihood. The log-transformed mean of the response variable equals the sum of fixed and random effects, E(y) = mu = exp(F(X) + Zb), and the rate parameter equals (1-p0) * gamma / mu, where p0 is the zero-inflation probability and gamma the shape parameter. I.e., the rate parameter depends on F(X) + Zb, and p0 and gamma are (univariate auxiliary) parameters that are estimated. Note that E(y) = mu above refers to the mean of the entire distribution and not just the positive part
- "zero_censored_power_transformed_normal": Likelihood of a censored and power-transformed normal variable for modeling data with a point mass at 0 and a continuous distribution for y > 0. The model used is Y = max(0,X)^lambda, X ~ N(mu, sigma^2), where mu = F(X) + Zb, and sigma and lambda are (auxiliary) parameters that are estimated. For more details on this model, see Sigrist et al. (2012, AOAS) "A dynamic nonstationary spatio-temporal model for short term prediction of precipitation"
- "zoctn": Zero-one censored transformed normal likelihood for modeling data in [0,1] with point masses at 0 and 1 and a continuous distribution on (0,1). The model used is Z ~ N(mu, sigma^2), W = max(min(Z,1),0), and Y = g(W), where g(x) = expit(a + b * logit(x)) for x in (0,1), mu = F(X) + Zb, and sigma, a, and b are (auxiliary) parameters that are estimated. For more details on this model, see Qiang and Sigrist (2026)
- "zero_one_censored_transformed_beta": Zero-one censored transformed beta likelihood for modeling data in [0,1] with point masses at 0 and 1 and a continuous distribution on (0,1). If T follows a beta distribution with mean mu = expit(F(X) + Zb) and precision phi, the observed response is obtained by applying the linear transformation Y = (1 + 2u) * T - u and censoring the result to [0,1]. The precision phi and shift u are (auxiliary) parameters that are estimated. For more details on this model, see Kosmidis and Zeileis (2025)
- "zero_one_censored_shifted_gamma": Zero-one censored shifted gamma likelihood for modeling data in [0,1] with point masses at 0 and 1 and a continuous distribution on (0,1). The model used is Y = min(max(Z - xi, 0), 1), where Z follows a gamma distribution with mean mu = exp(F(X) + Zb) and shape k. The shape k and shift xi are (auxiliary) parameters that are estimated. For more details on this model, see Sigrist and Stahel (2011)
- "gaussian_heteroscedastic": Gaussian likelihood where both the mean and the variance are related to fixed and random effects. This is currently only implemented for GPs with a 'vecchia' approximation
Note: the first lines in the likelihoods source file contain additional comments on the specific parametrizations used.
Note: other likelihoods can be implemented upon request.
`group_data`	A `vector` or `matrix` whose columns are categorical grouping variables. The elements are group levels that define grouped random effects. The elements of 'group_data' can be integer, double, or character. The number of columns corresponds to the number of grouped (intercept) random effects
`group_rand_coef_data`	A `vector` or `matrix` with numeric covariate data for grouped random coefficients
`ind_effect_group_rand_coef`	A `vector` with `integer` indices that indicate the corresponding categorical grouping variable (=columns) in 'group_data' for every covariate in 'group_rand_coef_data'. Counting starts at 1. The length of this index vector must equal the number of covariates in 'group_rand_coef_data'. For instance, c(1,1,2) means that the first two covariates (=first two columns) in 'group_rand_coef_data' have random coefficients corresponding to the first categorical grouping variable (=first column) in 'group_data', and the third covariate (=third column) in 'group_rand_coef_data' has a random coefficient corresponding to the second grouping variable (=second column) in 'group_data'
`drop_intercept_group_rand_effect`	A `vector` of type `logical` (boolean). Indicates whether intercept random effects are dropped (only for random coefficients). If drop_intercept_group_rand_effect[k] is TRUE, the intercept random effect number k is dropped / not included. Only random effects with random slopes can be dropped.
`gp_coords`	A `matrix` with numeric coordinates (= inputs / features) for defining Gaussian processes
`gp_rand_coef_data`	A `vector` or `matrix` with numeric covariate data for Gaussian process random coefficients
`cov_function`	A `string` specifying the covariance function for the Gaussian process. Available options:
- "matern": Matern covariance function with the smoothness specified by the `cov_fct_shape` parameter (using the parametrization of Rasmussen and Williams, 2006)
- "matern_estimate_shape": same as "matern" but the smoothness parameter is also estimated
- "matern_space_time": Spatio-temporal Matern covariance function with different range parameters for space and time. Note that the first column in `gp_coords` must correspond to the time dimension
- "space_time_gneiting": Spatio-temporal covariance function given in Eq. (16) of Gneiting (2002). Note that the first column in `gp_coords` must correspond to the time dimension. This covariance has seven parameters (in the following order: sigma2, a, c, alpha, nu, beta, delta) which are all estimated by default. You can disable the estimation of some of these parameters using the 'estimate_cov_par_index' argument of the `params` argument in either the `fit` function of a `gp_model` object or the `set_optim_params` function prior to estimation
- "matern_ard": anisotropic Matern covariance function with Automatic Relevance Determination (ARD), i.e., with a different range parameter for every coordinate of `gp_coords`
- "matern_ard_estimate_shape": same as "matern_ard" but the smoothness parameter is also estimated
- "exponential": Exponential covariance function (using the parametrization of Diggle and Ribeiro, 2007)
- "gaussian": Gaussian, aka squared exponential, covariance function (using the parametrization of Diggle and Ribeiro, 2007)
- "gaussian_ard": anisotropic Gaussian, aka squared exponential, covariance function with Automatic Relevance Determination (ARD), i.e., with a different range parameter for every coordinate of `gp_coords`
- "powered_exponential": powered exponential covariance function with the exponent specified by the `cov_fct_shape` parameter (using the parametrization of Diggle and Ribeiro, 2007)
- "wendland": Compactly supported Wendland covariance function (using the parametrization of Bevilacqua et al., 2019, AOS)
- "linear": linear covariance function. This corresponds to a Bayesian linear regression model with a Gaussian prior on the coefficients with a constant variance diagonal prior covariance, and the prior variance is estimated using empirical Bayes
- "hurst": Hurst covariance function cov(s, s') = (sigma2 / 2) * ( ||s||^(2H) + ||s'||^(2H) - ||s - s'||^(2H) ). For H = 0.5, this corresponds to Brownian motion (see the 'estimate_cov_par_index' argument)
- "hurst_ard": Hurst covariance function with Automatic Relevance Determination (ARD), i.e., with a different range parameter for every coordinate of `gp_coords` except for the first coordinate, which has a range parameter of 1 due to identifiability with the marginal variance: `cov(s, s') = (\sigma^2/2)\left[ \left(s_1^2 + \sum_{k=2}^d (s_k/l_k)^2\right)^H + \left({s'}_1^2 + \sum_{k=2}^d ({s'}_k/l_k)^2\right)^H - \left((s_1-{s'}_1)^2 + \sum_{k=2}^d ((s_k-{s'}_k)/l_k)^2\right)^H \right]`
`cov_fct_shape`	A `numeric` specifying the shape parameter of the covariance function (e.g., smoothness parameter for Matern and Wendland covariance). This parameter is irrelevant for some covariance functions such as the exponential or Gaussian
`gp_approx`	A `string` specifying the large data approximation for Gaussian processes. Available options:
- "none": No approximation
- "vecchia": Vecchia approximation; see Sigrist (2022, JMLR) for more details
- "full_scale_vecchia": Vecchia-inducing points full-scale (VIF) approximation; see Gyger, Furrer, and Sigrist (2025) for more details
- "tapering": The covariance function is multiplied by a compactly supported Wendland correlation function
- "fitc": Fully Independent Training Conditional approximation aka modified predictive process approximation; see Gyger, Furrer, and Sigrist (2024) for more details
- "full_scale_tapering": Full-scale approximation combining an inducing point / predictive process approximation with tapering on the residual process; see Gyger, Furrer, and Sigrist (2024) for more details
- "vecchia_latent": similar to "vecchia" but a Vecchia approximation is applied to the latent Gaussian process for likelihood == "gaussian". For likelihood != "gaussian", "vecchia" and "vecchia_latent" are equivalent
`num_parallel_threads`	An `integer` specifying the number of parallel threads for OMP. If num_parallel_threads = NULL, all available threads are used
`GPU_use`	A `boolean`. If TRUE, GPU acceleration will be used if supported
`matrix_inversion_method`	A `string` specifying the method used for inverting covariance matrices. Available options:
- "default": iterative methods where possible, otherwise Cholesky factorization
- "cholesky": Cholesky factorization
- "iterative": iterative methods. A combination of the conjugate gradient, the Lanczos algorithm, and other methods. This is currently only supported for the following cases:
  - grouped random effects with more than one level
  - likelihood != "gaussian" and gp_approx == "vecchia" (non-Gaussian likelihoods with a Vecchia-Laplace approximation)
  - likelihood != "gaussian" and gp_approx == "full_scale_vecchia" (non-Gaussian likelihoods with a VIF approximation)
  - likelihood == "gaussian" and gp_approx == "full_scale_tapering" (Gaussian likelihood with a full-scale tapering approximation)
`weights`	A `vector` with sample weights. Note that this affects both the random and fixed effects components.
`likelihood_learning_rate`	A `numeric` with a learning rate for the likelihood for generalized Bayesian inference (only non-Gaussian likelihoods)
`cov_fct_taper_range`	A `numeric` specifying the range parameter of the Wendland covariance function and Wendland correlation taper function. We follow the notation of Bevilacqua et al. (2019, AOS)
`cov_fct_taper_shape`	A `numeric` specifying the shape (=smoothness) parameter of the Wendland covariance function and Wendland correlation taper function. We follow the notation of Bevilacqua et al. (2019, AOS)
`num_neighbors`	An `integer` specifying the number of neighbors for the Vecchia and VIF approximations. Internal default values if NULL:
- 20 for gp_approx = "vecchia"
- 30 for gp_approx = "full_scale_vecchia"
Note: for prediction, the number of neighbors can be set through the 'num_neighbors_pred' parameter in the 'set_prediction_data' function. By default, num_neighbors_pred = 2 * num_neighbors. Further, the type of Vecchia approximation used for making predictions is set through the 'vecchia_pred_type' parameter in the 'set_prediction_data' function
`vecchia_ordering`	A `string` specifying the ordering used in the Vecchia approximation. Available options:
- "none": the default ordering in the data is used
- "random": a random ordering
- "time": ordering according to time (only for space-time models)
- "time_random_space": ordering according to time and randomly for all spatial points with the same time points (only for space-time models)
`ind_points_selection`	A `string` specifying the method for choosing inducing points. Available options:
- "kmeans++": the k-means++ algorithm
- "cover_tree": the cover tree algorithm
- "random": random selection from data points
`num_ind_points`	An `integer` specifying the number of inducing points / knots for FITC, full_scale_tapering, and VIF approximations. Internal default values if NULL:
- 500 for gp_approx = "fitc" and gp_approx = "full_scale_tapering"
- 200 for gp_approx = "full_scale_vecchia"
`cover_tree_radius`	A `numeric` specifying the radius (= "spatial resolution") for the cover tree algorithm
`seed`	An `integer` specifying the seed used for model creation (e.g., random ordering in Vecchia approximation)
`cluster_ids`	A `vector` with elements indicating independent realizations of random effects / Gaussian processes (same values = same process realization). The elements of 'cluster_ids' can be integer, double, or character.
`likelihood_additional_param`	A `numeric` specifying an additional parameter for the `likelihood` which cannot be estimated for this `likelihood` (e.g., degrees of freedom for `likelihood = "t_fix_df"`). This is not to be confused with any auxiliary parameters that can be estimated and accessed through the function `get_aux_pars` after estimation. Note that this `likelihood_additional_param` parameter is irrelevant for many likelihoods. If `likelihood_additional_param = NULL`, the following internal default values are used:
- df = 2 for likelihood = "t_fix_df"
- quantile = 0.5 for likelihood = "asymmetric_laplace"
`num_data`	A `numeric` with the number of samples. This is only used for iid models
`free_raw_data`	A `boolean`. If TRUE, the data (groups, coordinates, covariate data for random coefficients) is freed in R after initialization
`vecchia_approx`	Discontinued. Use the argument `gp_approx` instead
`vecchia_pred_type`	A `string` specifying the type of Vecchia approximation used for making predictions. This is discontinued here. Use the function 'set_prediction_data' to specify this
`num_neighbors_pred`	An `integer` specifying the number of neighbors for making predictions. This is discontinued here. Use the function 'set_prediction_data' to specify this
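
As a small numerical illustration of the two negative binomial parametrizations listed under `likelihood`, the following base R sketch evaluates both variance formulas. The helper names `nb2_var` and `nb1_var` are hypothetical and not part of gpboost; only the formulas themselves are taken from the argument description.

```r
# Hypothetical helpers (not part of gpboost) implementing the variance
# formulas stated in the description of the `likelihood` argument.
nb2_var <- function(mu, r) mu * (mu + r) / r   # "negative_binomial": r = shape
nb1_var <- function(mu, phi) mu * (1 + phi)    # "negative_binomial_1": phi = dispersion

mu <- c(0.5, 2, 10)
r <- 4

# The variance grows quadratically in mu for "negative_binomial" ...
v2 <- nb2_var(mu, r)
# ... and linearly in mu for "negative_binomial_1"
v1 <- nb1_var(mu, phi = 0.5)

# For a single mean value, both variances agree when phi = mu / r
stopifnot(isTRUE(all.equal(nb2_var(2, r), nb1_var(2, phi = 2 / r))))

print(data.frame(mu = mu, var_nb2 = v2, var_nb1 = v1))
```

The two likelihoods therefore imply different mean-variance relationships across observations, even though they can be matched at any single value of the mean.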

Value

A `GPModel` containing a Gaussian process and / or mixed effects model with grouped random effects

Author(s)

Fabio Sigrist

Examples

# See https://github.com/fabsig/GPBoost/tree/master/R-package for more examples

data(GPBoost_data, package = "gpboost")

#--------------------Grouped random effects model: single-level random effect----------------
gp_model <- GPModel(group_data = group_data[,1], likelihood="gaussian")

#--------------------Gaussian process model----------------
gp_model <- GPModel(gp_coords = coords, cov_function = "matern", cov_fct_shape = 1.5,
                    likelihood="gaussian")

#--------------------Combine Gaussian process with grouped random effects----------------
gp_model <- GPModel(group_data = group_data,
                    gp_coords = coords, cov_function = "matern", cov_fct_shape = 1.5,
                    likelihood="gaussian")
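
The examples above use only a few of the arguments. The following sketch shows how some of the others might be combined: grouped random coefficients via `ind_effect_group_rand_coef`, a non-Gaussian likelihood with a Vecchia approximation, and a fixed likelihood parameter via `likelihood_additional_param`. The simulated inputs (`group_sim`, `x_sim`, `coords_sim`) and the chosen argument values are assumptions made for illustration only; they are not taken from the package examples.

```r
library(gpboost)

set.seed(1)
n <- 200
# Simulated inputs (hypothetical, for illustration only)
group_sim <- cbind(sample(1:20, n, replace = TRUE),  # first grouping variable
                   sample(1:10, n, replace = TRUE))  # second grouping variable
x_sim <- matrix(rnorm(n * 3), ncol = 3)              # covariates with random slopes
coords_sim <- matrix(runif(n * 2), ncol = 2)         # two-dimensional coordinates

#--------------------Grouped random coefficients----------------
# Columns 1 and 2 of x_sim get random slopes for group_sim[, 1],
# column 3 gets a random slope for group_sim[, 2]
gp_model_rc <- GPModel(group_data = group_sim,
                       group_rand_coef_data = x_sim,
                       ind_effect_group_rand_coef = c(1, 1, 2),
                       likelihood = "gaussian")

#--------------------Binary likelihood with a Vecchia approximation----------------
gp_model_bin <- GPModel(gp_coords = coords_sim, cov_function = "matern",
                        cov_fct_shape = 1.5, gp_approx = "vecchia",
                        num_neighbors = 20, vecchia_ordering = "random",
                        likelihood = "bernoulli_logit")

#--------------------t likelihood with fixed degrees of freedom----------------
# df = 4 instead of the internal default df = 2
gp_model_t <- GPModel(group_data = group_sim[, 1], likelihood = "t_fix_df",
                      likelihood_additional_param = 4)
```

Each call only creates the model object; no response variable is needed at this stage.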

gpboost documentation built on June 25, 2026, 1:07 a.m.