predict.glmGamPoi: Predict 'link' or 'response' values for Gamma-Poisson GLMs
In const-ae/glmGamPoi: Fit a Gamma-Poisson Generalized Linear Model

predict.glmGamPoi

R Documentation

Predict 'link' or 'response' values for Gamma-Poisson GLMs

Description

Predict mu (i.e., type = "response") or log(mu) (i.e., type = "link") from a 'glmGamPoi' fit (created by glm_gp(...)) with the corresponding estimate of the standard error. If newdata is NULL, mu is returned for the original input data.

Usage

## S3 method for class 'glmGamPoi'
predict(
  object,
  newdata = NULL,
  type = c("link", "response"),
  se.fit = FALSE,
  offset = mean(object$Offset),
  on_disk = NULL,
  verbose = FALSE,
  ...
)

Arguments

`object`	a `glmGamPoi` fit object (produced by `glm_gp()`).
`newdata`	a specification of the new data for which the expression for each gene is predicted. `newdata` should be a data.frame if the original fit was specified with a `formula`, provide a `data.frame` with one column for each variable in the formula. For example, if `glm_gp(se, design = ~ age + batch + treatment)`, then the data.frame needs a `age`, `batch`, and `treatment` column that contain the same data types as the original fit. vector if the original fit was specified using a vector, you need to again provide a vector with the same format. matrix if `newdata` is a matrix, it is applied directly as `Mu <- exp(object$Beta %*% t(newdata) + object$offset_matrix)`. So make sure, that it is constructed correctly. NULL if `newdata` is `NULL`, the predicted values for the original input data are returned.
`type`	either 'link' or 'response'. The default is 'link', which returns the predicted values before the link function (`exp()`) is applied. Thus, the values can be positive and negative numbers. However, often the predicted values are easier to interpret after the link function is applied (i.e., type = "response"), because then the values are on the same scale as the original counts.
`se.fit`	boolean that indicates if in addition to the mean the standard error of the mean is returned.
`offset`	count models (in particular for sequencing experiments) usually have a sample specific size factor (`⁠offset = log(size factor)⁠`). It defines how big we expect the predicted results are. If `newdata` is `NULL`, the `offset` is ignored, because the `predict()` returns a result based on the pre-calculated `object$Mu`. If `newdata` is not `NULL`, by default the `offset` is `mean(object$Offset)`, which puts the in the same size as the average sample.
`on_disk`	a boolean that indicates if the results are `HDF5Matrix`'s from the `HDF5Array` package. If `newdata` is `NULL`, `on_disk` is ignored. Otherwise, if `on_disk = NULL`, the result is calculated on disk depending if `offset` is stored on disk.
`verbose`	a boolean that indicates if information about the individual steps are printed while predicting. Default: `FALSE`.
`...`	currently ignored.

Details

For se.fit = TRUE, the function sticks very close to the behavior of stats::predict.glm() for fits from MASS::glm.nb().

Note: If type = "link", the results are computed using the natural logarithm as the link function. This differs from the lfc estimate provided by test_de, which are on the log2 scale.

Value

If se.fit == FALSE, a matrix with dimensions ⁠nrow(object$data) x nrow(newdata)⁠.
If se.fit == TRUE, a list with three entries

fit: the predicted values as a matrix with dimensions ⁠nrow(object$data) x nrow(newdata)⁠. This is what would be returned if se.fit == FALSE.
se.fit: the associated standard errors for each fit. Also a matrix with dimensions ⁠nrow(object$data) x nrow(newdata)⁠.
residual.scale: Currently fixed to 1. In the future, this might become the values from object$overdispersion_shrinkage_list$ql_disp_shrunken.

Examples


 set.seed(1)
 # The simplest example
 y <- rnbinom(n = 10, mu = 3, size = 1/2.4)
 fit <- glm_gp(y, size_factors = FALSE)
 predict(fit, type = "response")
 predict(fit, type = "link", se.fit = TRUE)


 # Fitting a whole matrix
 model_matrix <- cbind(1, rnorm(5))
 true_Beta <- cbind(rnorm(n = 30), rnorm(n = 30, mean = 3))
 sf <- exp(rnorm(n = 5, mean = 0.7))
 model_matrix
 Y <- matrix(rnbinom(n = 30 * 5, mu = sf * exp(true_Beta %*% t(model_matrix)), size = 1/2.4),
             nrow = 30, ncol = 5)
 fit <- glm_gp(Y, design = model_matrix, size_factors = sf, verbose = TRUE)

 head(predict(fit, type = "response"))
 pred <- predict(fit, type = "link", se.fit = TRUE, verbose = TRUE)
 head(pred$fit)
 head(pred$se.fit)


 # Fitting a model with covariates
 data <- data.frame(fav_food = sample(c("apple", "banana", "cherry"), size = 50, replace = TRUE),
                    city = sample(c("heidelberg", "paris", "new york"), size = 50, replace = TRUE),
                    age = rnorm(n = 50, mean = 40, sd = 15))
 Y <- matrix(rnbinom(n = 4 * 50, mu = 3, size = 1/3.1), nrow = 4, ncol = 50)
 fit <- glm_gp(Y, design = ~ fav_food + city + age, col_data = data)
 predict(fit)[, 1:3]

 nd <- data.frame(fav_food = "banana", city = "paris", age = 29)
 predict(fit, newdata = nd)

 nd <- data.frame(fav_food = "banana", city = "paris", age = 29:40)
 predict(fit, newdata = nd, se.fit = TRUE, type = "response")

const-ae/glmGamPoi documentation built on April 5, 2025, 5:58 p.m.