sgdgmf.init: Initialize the parameters of a generalized matrix...
In sgdGMF: Estimation of Generalized Matrix Factorization Models via Stochastic Gradient Descent

sgdgmf.init

R Documentation

Initialize the parameters of a generalized matrix factorization model

Description

Provides four initialization methods for setting the initial values of a generalized matrix factorization (GMF) model identified by a glm family and a linear predictor of the form g(\mu) = \eta = X B^\top + A Z^\top + U V^\top, with bijective link function g(\cdot). See sgdgmf.fit for more details on the model specification.
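As a minimal sketch of the linear predictor above, the following base R snippet builds \eta from randomly generated matrices. The dimensions and values are hypothetical and chosen only so that every product is an n \times m matrix; nothing here calls the package itself.

```r
# Illustrative dimensions (hypothetical, not package defaults)
set.seed(1)
n = 6; m = 4; p = 2; q = 1; d = 3

X = matrix(rnorm(n * p), n, p)  # row-specific covariates, n x p
Z = matrix(rnorm(m * q), m, q)  # column-specific covariates, m x q
B = matrix(rnorm(m * p), m, p)  # column-specific coefficients, m x p
A = matrix(rnorm(n * q), n, q)  # row-specific coefficients, n x q
U = matrix(rnorm(n * d), n, d)  # latent factors, n x d
V = matrix(rnorm(m * d), m, d)  # loadings, m x d

# Linear predictor: eta = X B' + A Z' + U V'
eta = tcrossprod(X, B) + tcrossprod(A, Z) + tcrossprod(U, V)

# Mean on the response scale, here under a Poisson family with log link
fam = poisson()
mu = fam$linkinv(eta)

dim(eta)  # n rows, m columns
```

Applying `fam$linkfun` to `mu` recovers `eta`, which is what the bijectivity of g(\cdot) guarantees.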

Usage

sgdgmf.init(
  Y,
  X = NULL,
  Z = NULL,
  ncomp = 2,
  family = gaussian(),
  weights = NULL,
  offset = NULL,
  method = c("ols", "glm", "random", "values"),
  type = c("deviance", "pearson", "working", "link"),
  niter = 0,
  values = list(),
  verbose = FALSE,
  parallel = FALSE,
  nthreads = 1,
  savedata = TRUE
)

sgdgmf.init.ols(
  Y,
  X = NULL,
  Z = NULL,
  ncomp = 2,
  family = gaussian(),
  weights = NULL,
  offset = NULL,
  type = c("deviance", "pearson", "working", "link"),
  verbose = FALSE
)

sgdgmf.init.glm(
  Y,
  X = NULL,
  Z = NULL,
  ncomp = 2,
  family = gaussian(),
  weights = NULL,
  offset = NULL,
  type = c("deviance", "pearson", "working", "link"),
  verbose = FALSE,
  parallel = FALSE,
  nthreads = 1
)

sgdgmf.init.random(
  Y,
  X = NULL,
  Z = NULL,
  ncomp = 2,
  family = gaussian(),
  weights = NULL,
  offset = NULL,
  sigma = 1
)

sgdgmf.init.custom(
  Y,
  X = NULL,
  Z = NULL,
  ncomp = 2,
  family = gaussian(),
  values = list(),
  verbose = FALSE
)

Arguments

`Y`	matrix of responses (`n \times m`)
`X`	matrix of row-specific fixed effects (`n \times p`)
`Z`	matrix of column-specific fixed effects (`m \times q`)
`ncomp`	rank of the latent matrix factorization
`family`	a model family, as in the `glm` interface
`weights`	matrix of constant weights (`n \times m`)
`offset`	matrix of constant offset (`n \times m`)
`method`	initialization method to be used: `"ols"`, `"glm"`, `"random"` or `"values"` (see Details)
`type`	type of residuals to be used for initializing `U` via incomplete SVD (only if `method="ols"` or `method="glm"`)
`niter`	number of iterations to refine the initial estimate (only if `method="ols"` or `"svd"`)
`values`	a list of custom initial values for `B`, `A`, `U` and `V`
`verbose`	if `TRUE`, prints the status of the initialization process
`parallel`	if `TRUE`, allows for parallel computing using the `foreach` package (only if `method="glm"`)
`nthreads`	number of cores to be used in parallel (only if `parallel=TRUE` and `method="glm"`)
`savedata`	if `TRUE`, stores a copy of the input data

Details

If method = "ols", the initialization is performed by fitting a sequence of linear regressions followed by an SVD of the residuals. To account for the non-Gaussian distribution of the data, regression and decomposition are applied to the transformed response matrix Y_h = (g \circ h)(Y), where h(\cdot) is a function which prevents Y_h from taking infinite values. For instance, in the Binomial case h(y) = (1 - 2\epsilon) y + \epsilon, which maps [0, 1] into [\epsilon, 1 - \epsilon], while in the Poisson case h(y) = y + \epsilon, where \epsilon is a small positive constant, typically 0.1 or 0.01.
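The idea behind this method can be sketched in base R for Poisson data. This is an illustrative simplification, not the package's implementation: it assumes intercept-only X and Z, uses h(y) = y + \epsilon with \epsilon = 0.1, and takes the SVD of plain link-scale residuals rather than any of the residual types selectable through type.

```r
set.seed(1)
n = 50; m = 10; d = 2
Y = matrix(rpois(n * m, lambda = 3), n, m)  # simulated counts

fam = poisson()
eps = 0.1

# Transformed response: Y_h = g(h(Y)) with h(y) = y + eps
Yh = fam$linkfun(Y + eps)

# Intercept-only design matrices (assumption: no covariates supplied)
X = matrix(1, n, 1)
Z = matrix(1, m, 1)

# Least squares for B, one regression per column of Yh
B = t(qr.solve(X, Yh))        # m x 1
R1 = Yh - tcrossprod(X, B)

# Least squares for A, one regression per row of the residuals
A = t(qr.solve(Z, t(R1)))     # n x 1
R2 = R1 - tcrossprod(A, Z)

# Rank-d truncated SVD of the remaining residuals
s = svd(R2, nu = d, nv = d)
U = s$u %*% diag(s$d[1:d], d, d)  # n x d, scaled by the singular values
V = s$v                           # m x d

# Initial linear predictor and mean
eta = tcrossprod(X, B) + tcrossprod(A, Z) + tcrossprod(U, V)
mu = fam$linkinv(eta)
```

Without the shift by eps, any zero count would give log(0) = -Inf in Yh and the regressions would fail, which is the role h(\cdot) plays in the description above.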

If method = "glm", the initialization is performed by fitting a sequence of generalized linear models followed by an SVD of the residuals. In particular, to set \beta_j, we use independent GLM fits under the model y_j \sim X \beta_j. Similarly, to set \alpha_i, we fit the model y_i \sim Z \alpha_i + o_i, with offset o_i = B x_i. Then, we obtain U via SVD on the residuals. Finally, we obtain V via independent GLM fits under the model y_j \sim U v_j + o_j, with offset o_j = X \beta_j + A z_j.
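The four steps above can be sketched with stats::glm.fit on simulated Poisson data. This is a hedged illustration under stated assumptions, not the package's code: the data, the intercept-only Z, the use of Pearson residuals and the unscaled left singular vectors for U are all choices made here for brevity.

```r
set.seed(1)
n = 50; m = 8; d = 2
fam = poisson()

# Simulated inputs (hypothetical, for illustration only)
X = cbind(1, rnorm(n))                            # n x p, with p = 2
Z = matrix(1, m, 1)                               # m x q, with q = 1
B0 = cbind(rnorm(m, 1, 0.2), rnorm(m, 0, 0.3))    # coefficients used to simulate Y
Y = matrix(rpois(n * m, exp(tcrossprod(X, B0))), n, m)

# Step 1: one GLM per column, y_j ~ X beta_j
B = matrix(NA_real_, m, ncol(X))
for (j in seq_len(m)) B[j, ] = glm.fit(X, Y[, j], family = fam)$coefficients

# Step 2: one GLM per row, y_i ~ Z alpha_i + o_i, with offset o_i = B x_i
O = tcrossprod(X, B)
A = matrix(NA_real_, n, ncol(Z))
for (i in seq_len(n)) {
  A[i, ] = glm.fit(Z, Y[i, ], offset = O[i, ], family = fam)$coefficients
}

# Step 3: U from a rank-d SVD of the Pearson residuals
O = O + tcrossprod(A, Z)
mu = fam$linkinv(O)
R = (Y - mu) / sqrt(fam$variance(mu))
U = svd(R, nu = d, nv = 0)$u                      # n x d

# Step 4: one GLM per column, y_j ~ U v_j + o_j, with offset o_j = X beta_j + A z_j
V = matrix(NA_real_, m, d)
for (j in seq_len(m)) {
  V[j, ] = glm.fit(U, Y[, j], offset = O[, j], family = fam)$coefficients
}

eta = O + tcrossprod(U, V)
```

Because the column-wise and row-wise fits are mutually independent, each loop could be distributed across workers, which is presumably what the parallel and nthreads arguments control.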

Under both method = "ols" and method = "glm", the parameter type selects the type of residuals used for the SVD.

If method = "random", the initialization is performed using independent Gaussian random values for all the parameters in the model.

If method = "values", the initialization is performed using user-specified values provided as an input, which must have compatible dimensions.
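To make "compatible dimensions" concrete, the sketch below draws Gaussian values for B, A, U and V and checks that every term of the linear predictor is n \times m. The helper rmat, the intercept-only X and Z, and the reading of sigma as a standard deviation are assumptions made for illustration; the element names B, A, U and V follow the description of the values argument, but whether the package expects exactly these names is not confirmed here.

```r
set.seed(1)
n = 6; m = 4; p = 1; q = 1; d = 2
sigma = 1  # assumed here to be the standard deviation of the draws

# Hypothetical helper returning an nr x nc matrix of Gaussian draws
rmat = function(nr, nc) matrix(rnorm(nr * nc, sd = sigma), nr, nc)

values = list(
  B = rmat(m, p),  # column-specific coefficients, m x p
  A = rmat(n, q),  # row-specific coefficients, n x q
  U = rmat(n, d),  # factors, n x d
  V = rmat(m, d)   # loadings, m x d
)

# Compatibility check: every term of the linear predictor must be n x m
X = matrix(1, n, p)
Z = matrix(1, m, q)
eta = with(values, tcrossprod(X, B) + tcrossprod(A, Z) + tcrossprod(U, V))
stopifnot(all(dim(eta) == c(n, m)))
```

If any of the four matrices had the wrong shape, tcrossprod or the sum would stop with a non-conformable error, so this check can be run before passing a list to the initializer.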

Value

An initgmf object, namely a list, containing the initial estimates of the GMF parameters. In particular, the returned object collects the following information:

Y: response matrix (only if savedata=TRUE)
X: row-specific covariate matrix (only if savedata=TRUE)
Z: column-specific covariate matrix (only if savedata=TRUE)
B: the estimated column-specific coefficient matrix
A: the estimated row-specific coefficient matrix
U: the estimated factor matrix
V: the estimated loading matrix
phi: the estimated dispersion parameter
method: the selected estimation method
family: the model family
ncomp: rank of the latent matrix factorization
type: type of residuals used for the initialization of U
verbose: if TRUE, print the status of the initialization process
parallel: if TRUE, allows for parallel computing
nthreads: number of cores to be used in parallel
savedata: if TRUE, stores a copy of the input data

Examples

library(sgdGMF)

# Set the data dimensions
n = 100; m = 20; d = 5

# Generate data using Poisson, Binomial and Gamma models
data_pois = sim.gmf.data(n = n, m = m, ncomp = d, family = poisson())
data_bin = sim.gmf.data(n = n, m = m, ncomp = d, family = binomial())
data_gam = sim.gmf.data(n = n, m = m, ncomp = d, family = Gamma(link = "log"), dispersion = 0.25)

# Initialize the GMF parameters assuming 3 latent factors
init_pois = sgdgmf.init(data_pois$Y, ncomp = 3, family = poisson(), method = "ols")
init_bin = sgdgmf.init(data_bin$Y, ncomp = 3, family = binomial(), method = "ols")
init_gam = sgdgmf.init(data_gam$Y, ncomp = 3, family = Gamma(link = "log"), method = "ols")

# Get the fitted values on the response scale
mu_hat_pois = fitted(init_pois, type = "response")
mu_hat_bin = fitted(init_bin, type = "response")
mu_hat_gam = fitted(init_gam, type = "response")

# Compare the results
oldpar = par(no.readonly = TRUE)
par(mfrow = c(3,3), mar = c(1,1,3,1))
image(data_pois$Y, axes = FALSE, main = expression(Y[Pois]))
image(data_pois$mu, axes = FALSE, main = expression(mu[Pois]))
image(mu_hat_pois, axes = FALSE, main = expression(hat(mu)[Pois]))
image(data_bin$Y, axes = FALSE, main = expression(Y[Bin]))
image(data_bin$mu, axes = FALSE, main = expression(mu[Bin]))
image(mu_hat_bin, axes = FALSE, main = expression(hat(mu)[Bin]))
image(data_gam$Y, axes = FALSE, main = expression(Y[Gam]))
image(data_gam$mu, axes = FALSE, main = expression(mu[Gam]))
image(mu_hat_gam, axes = FALSE, main = expression(hat(mu)[Gam]))
par(oldpar)

sgdGMF documentation built on June 8, 2025, 12:05 p.m.
