sgdgmf.rank | R Documentation |
Select the number of significant principal components of a GMF model using eigenvalue-gap methods.
sgdgmf.rank(
  Y,
  X = NULL,
  Z = NULL,
  maxcomp = ncol(Y),
  family = gaussian(),
  weights = NULL,
  offset = NULL,
  method = c("onatski", "act", "oht"),
  type.reg = c("ols", "glm"),
  type.res = c("deviance", "pearson", "working", "link"),
  normalize = FALSE,
  maxiter = 10,
  parallel = FALSE,
  nthreads = 1,
  return.eta = FALSE,
  return.mu = FALSE,
  return.res = FALSE,
  return.cov = FALSE
)
Y |
matrix of responses ( |
X |
matrix of row-specific fixed effects ( |
Z |
matrix of column-specific fixed effects ( |
maxcomp |
maximum number of eigenvalues to compute |
family |
a family as in the |
weights |
matrix of optional weights ( |
offset |
matrix of optional offsets ( |
method |
rank selection method |
type.reg |
regression method to be used to profile out the covariate effects |
type.res |
residual type to be decomposed |
normalize |
if |
maxiter |
maximum number of iterations |
parallel |
if |
nthreads |
number of cores to be used in parallel (only if |
return.eta |
if |
return.mu |
if |
return.res |
if |
return.cov |
if |
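To make the general idea behind the arguments above concrete, here is a minimal base-R sketch of one rank-selection rule: hard thresholding of the singular values of a residual matrix, in the spirit of the "oht" option. It is an illustration under stated assumptions, not the package's implementation. The helper `oht_rank` is hypothetical, column centering stands in for profiling out covariate effects, and the threshold uses the polynomial approximation given by Gavish and Donoho (2014) for the unknown-noise case.

```r
# Hypothetical helper (not part of sgdGMF): estimate the rank of a residual
# matrix by hard-thresholding its singular values at omega(beta) * median.
oht_rank <- function(R) {
  beta <- min(dim(R)) / max(dim(R))  # aspect ratio in (0, 1]
  # Polynomial approximation to the optimal threshold coefficient
  omega <- 0.56 * beta^3 - 0.95 * beta^2 + 1.82 * beta + 1.43
  sv <- svd(R, nu = 0, nv = 0)$d     # singular values only
  tau <- omega * median(sv)          # data-driven threshold
  list(ncomp = sum(sv > tau), lambdas = sv^2, threshold = tau)
}

# Simulated Gaussian data: rank-3 signal plus unit-variance noise
set.seed(1)
n <- 100; m <- 20; d <- 3
U <- matrix(rnorm(n * d), n, d)
V <- matrix(rnorm(m * d, sd = 2), m, d)
Y <- U %*% t(V) + matrix(rnorm(n * m), n, m)

# Column centering as a simple stand-in for removing covariate effects
R <- scale(Y, center = TRUE, scale = FALSE)

fit <- oht_rank(R)
print(fit$ncomp)
```

The returned list mirrors the names ncomp and lambdas used in the documented output, purely for readability; the squared singular values play the role of the eigenvalues.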
A list containing the method used (method), the selected latent rank (ncomp), and the eigenvalues used to select the latent rank (lambdas). If requested through the corresponding return.* arguments, the list also provides the linear predictor (eta), the predicted mean matrix (mu), the residual matrix (res), and the implied residual covariance matrix (covmat).
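The optional outputs can be requested through the return.* flags listed in the usage. The following sketch assumes the sgdGMF package is installed and that the element names match those described above. The layout of each optional element is not stated here, so the code only inspects the structure of the result.

```r
# Run only if sgdGMF is available; element names follow the description above
if (requireNamespace("sgdGMF", quietly = TRUE)) {
  library(sgdGMF)

  # Simulate a small Poisson data set with 5 latent factors
  sim <- sim.gmf.data(n = 100, m = 20, ncomp = 5, family = poisson())

  # Request the residual matrix and the implied residual covariance as well
  out <- sgdgmf.rank(
    sim$Y,
    family = poisson(),
    normalize = TRUE,
    return.res = TRUE,
    return.cov = TRUE
  )

  # List the returned elements and the selected rank
  print(names(out))
  print(out$ncomp)
}
```

Keeping the return.* flags at their FALSE defaults avoids storing matrices that are not needed when only the selected rank is of interest.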
Onatski, A. (2010). Determining the number of factors from empirical distribution of eigenvalues. Review of Economics and Statistics, 92(4): 1004–1016.
Gavish, M. and Donoho, D.L. (2014). The optimal hard threshold for singular values is 4/sqrt(3). IEEE Transactions on Information Theory, 60(8): 5040–5053.
Fan, J., Guo, J. and Zheng, S. (2022). Estimating number of factors by adjusted eigenvalues thresholding. Journal of the American Statistical Association, 117(538): 852–861.
Wang, L. and Carvalho, L. (2023). Deviance matrix factorization. Electronic Journal of Statistics, 17(2): 3762–3810.
library(sgdGMF)
# Set the data dimensions
n = 100; m = 20; d = 5
# Generate data using Poisson, Binomial and Gamma models
data_pois = sim.gmf.data(n = n, m = m, ncomp = d, family = poisson())
data_bin = sim.gmf.data(n = n, m = m, ncomp = d, family = binomial())
data_gam = sim.gmf.data(n = n, m = m, ncomp = d, family = Gamma(link = "log"), dispersion = 0.25)
# Estimate the latent rank of each model with the default selection method
ncomp_pois = sgdgmf.rank(data_pois$Y, family = poisson(), normalize = TRUE)
ncomp_bin = sgdgmf.rank(data_bin$Y, family = binomial(), normalize = TRUE)
ncomp_gam = sgdgmf.rank(data_gam$Y, family = Gamma(link = "log"), normalize = TRUE)
# Get the selected number of components
print(paste("Poisson:", ncomp_pois$ncomp))
print(paste("Binomial:", ncomp_bin$ncomp))
print(paste("Gamma:", ncomp_gam$ncomp))
# Plot the screeplot used for the component determination
oldpar = par(no.readonly = TRUE)
par(mfrow = c(3,1))
barplot(ncomp_pois$lambdas, main = "Poisson screeplot")
barplot(ncomp_bin$lambdas, main = "Binomial screeplot")
barplot(ncomp_gam$lambdas, main = "Gamma screeplot")
par(oldpar)
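The example above relies on the default selection method. As a complementary sketch, the three options of the method argument can be compared on the same data. This assumes the sgdGMF package is installed and that each method accepts the same remaining arguments. The selected ranks need not agree, so no expected output is shown.

```r
# Run only if sgdGMF is available
if (requireNamespace("sgdGMF", quietly = TRUE)) {
  library(sgdGMF)

  # Simulate Poisson data with 5 latent factors
  sim <- sim.gmf.data(n = 100, m = 20, ncomp = 5, family = poisson())

  # Apply each rank selection method listed in the usage
  methods <- c("onatski", "act", "oht")
  ranks <- sapply(methods, function(meth) {
    sgdgmf.rank(sim$Y, family = poisson(), method = meth, normalize = TRUE)$ncomp
  })

  # Named vector of selected ranks, one entry per method
  print(ranks)
}
```

If the methods disagree, inspecting the lambdas element with a screeplot, as in the example above, can help decide which rank to carry forward.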