lmm | R Documentation |
lmm is used to fit linear mixed-effects models (LMM) based on summary-level data. The LMM parameters are estimated by restricted maximum likelihood (REML) with Fisher scoring (FS) gradient descent algorithm.
lmm(
XX,
XY,
ZX,
ZY,
ZZ,
Ynorm,
n,
d = ncol(ZZ),
theta0 = NULL,
method = c("REML", "ML"),
max.iter = 50,
epsilon = 1e-05,
output.cov = TRUE,
output.RE = FALSE
)
XX |
t(X)%*%X, X is a design matrix for fixed effects. |
XY |
t(Y%*%X), Y is a features-by-samples matrix of observed responses (genes-by-cells expression matrix for scRNA-seq). |
ZX |
t(Z)%*%X, Z = [Z1, ..., Zk], a design matrix for k random factors (variables). |
ZY |
t(Y%*%Z). |
ZZ |
t(Z)%*%Z. |
Ynorm |
Norms for features (each row in Y), that is, Ynorm = rowSums(Y*Y). |
n |
Numbers of samples (cells in scRNA-seq), nrow(X). |
d |
A vector of (m1,...,mk), mi = ncol(Zi), number of columns in Zi. m1 + ... + mk = ncol(Z), number of columns in Z. |
theta0 |
A vector of initial values of the variance components, (s1, ...,sk, s_(k+1)), si = sigma_i^2, the variance component of the i-th type random effects. s_(k+1) = sigma^2, the variance component of model residual error. |
method |
Either REML or ML with Fisher scoring (FS) iterative algorithm. |
max.iter |
The maximal number of iterations for the iterative algorithm. |
epsilon |
Positive convergence tolerance. If the absolute value of the first partial derivative of log likelihood is less than epsilon, the iterations converge. |
output.cov |
If TRUE, output the covariance matrices for the estimated coefficients, which are needed for testing contrasts. |
output.RE |
If TRUE, output the best linear unbiased prediction (BLUP) of the random effects. |
A list containing the following components:
dlogL First partial derivatives of log-likelihoods for each feature (gene).
logLik Either log maximum likelihood (ML) or log restricted maximum likelihood (REML) for each feature (gene). The log restricted maximum likelihood may have a constant difference from other computations. The constant difference is related to the fixed design matrix, X.
niter Nmbers of iterations for each feature (gene).
coef A matrix of estimated coefficients (fixed effects), each column corresponds to a feature (gene) and each row one covariate.
se A matrix of the standard errors of the estimated coefficients.
t A matrix of t-values for the fixed effects, equal to coef/se.
df Degrees of freedom.
p A matrix of two-sided p-values for the fixed effects.
cov A array of covariance matrices of the estimated coefficients (fixed effects).
theta A matrix of the estimated variance components, each column corresponds to a feature (gene) and each row one variance component. The last row is the variance component of the residual error.
se.theta Standard errors of the estimated theta.
RE A matrix of the best linear unbiased prediction (BLUP) of random effects.
#Generate data: X, Y, and Z.
set.seed(2024)
n <- 1e3
m <- 10
Y <- matrix(rnorm(m*n), m, n)
rownames(Y) <- paste0("Gene", 1:nrow(Y))
trt <- sample(c("A", "B"), n, replace = TRUE)
X <- model.matrix(~ 0 + trt)
q <- 20
sam <- rep(NA, n)
sam[trt == "A"] <- paste0("A", sample.int(q/2, sum(trt == "A"), replace = TRUE))
sam[trt == "B"] <- paste0("B", sample.int(q/2, sum(trt == "B"), replace = TRUE))
Z <- model.matrix(~ 0 + sam)
d <- ncol(Z)
#Fit LMM by summary-level data
#Compute and store the summary-level data:
n <- nrow(X)
XX <- t(X)%*%X
XY <- t(Y%*%X)
ZX <- t(Z)%*%X
ZY <- t(Y%*%Z)
ZZ <- t(Z)%*%Z
Ynorm <- rowSums(Y*Y)
fit <- lmm(XX, XY, ZX, ZY, ZZ, Ynorm = Ynorm, n = n, d = d)
str(fit)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.