| lmmfit | R Documentation |
lmmfit, a wrapper function of lmm, fits linear mixed-effects models (LMM) by sample-level data. The LMM parameters are estimated by either restricted maximum likelihood (REML) or maximum likelihood (ML) method with Fisher scoring (FS) gradient descent algorithm.
lmmfit(
Y,
X,
Z,
d = ncol(Z),
method = c("REML", "ML"),
nBlocks = ceiling((ncol(Y) * 1e-08) * nrow(Y)),
theta0 = NULL,
max.iter = 50,
epsilon = 1e-05,
output.cov = TRUE,
output.FIM = FALSE,
output.RE = FALSE,
output.SS = FALSE,
verbose = FALSE
)
Y |
a features-by-samples matrix of responses. For single-cell RNA sequencing (scRNA-seq) data, Y is a genes-by-cells gene expression matrix, whose entries can be raw counts, log-transformed values, or normalized values. If Y is a matrix of raw counts, it will be log-transformed as log2(1 + Y) for differential expression analysis. |
X |
a design matrix for fixed effects, with rows corresponding to the columns of Y. |
Z |
[Z1, ..., Zk] is a design matrix for k random components (factors), with rows corresponding to the columns of Y. |
d |
(d1,...,dk), where di = ncol(Zi), the number of columns in Zi. sum(d) = ncol(Z), the number of columns in Z. For the model with only one random component, d = ncol(Z). |
method |
a character string, either "REML" or "ML". Defaults to "REML". If ‘REML’, the model is fit using REML; otherwise, using ML. |
nBlocks |
the number of blocks into which large datasets are subdivided to reduce memory usage when computing summary statistics. The default value, nBlocks = ceiling((ncol(Y)*1e-08)*nrow(Y)), may not be adequate. If encountering the error: vector memory limit reached, you should increase the nBlocks value to avoid the issue. |
theta0 |
a vector of initial values of the variance components, (s1, ...,sk, s_(k+1)), si = sigma_i^2, the i-th variance component. s_(k+1) = sigma^2, the variance component of model residual error. |
max.iter |
the maximal number of iterations for the iterative algorithm. |
epsilon |
positive convergence tolerance. If the absolute value of the first partial derivative of log likelihood is less than epsilon, the iterations converge. |
output.cov |
a logical scalar. If TRUE, output an array of the covariance matrices for the estimated coefficients, cov.beta. |
output.FIM |
a logical scalar. If TRUE, output an array of the Fisher information matrices (FIM) for fixed effects, FIM.beta[,,i] = |
output.RE |
a logical scalar. If TRUE, output the best linear unbiased prediction (BLUP) of the random effects. |
output.SS |
a logical scalar. If TURE, output list(XX, XY, ZX, ZY, ZZ, Ynorm, n), a list of the summary-level data (summary statistics), defined by XX=t(X)%*%X, XY=t(Y%*%X), ZX=t(Z)%*%X, ZY=t(Z)%*%Y, ZZ=t(Z)%*%Z, Ynorm=rowSums(Y*Y), and n=nrow(X), which can also be computed using the sslmm function. |
verbose |
a logical scalar. If TRUE, print the number of features for which LMM fitting did not converge (abs(dlogL) > epsilon). |
A list containing the following components:
method |
the method, either REML or ML, for estimating the LMM parameters (fixed effects and variance components). |
epsilon |
convergence tolerance. |
dlogL |
first partial derivatives of log-likelihoods for each feature. |
logLik |
maximum log-likelihoods for ML method or maximum log-restricted-likelihood for REML method. |
niter |
numbers of iterations for each feature. |
coef |
a matrix of estimated coefficients (fixed effects or beta), each column corresponds to a feature and each row one covariate. |
se |
a matrix of standard errors of the estimated coefficients. |
t |
a matrix of t-values for the fixed effects, equal to coef/se. |
p |
a matrix of two-sided p-values for the t-tests of the fixed effects. |
df |
degrees of freedom for the t-statistics (values). |
cov |
a array of covariance matrices of the estimated coefficients (fixed effects). |
theta |
a matrix of the estimated variance components, each column corresponds to a feature and each row one variance component. The last row is the variance component of the residual error. |
se.theta |
standard errors of the estimated theta. |
FIM.beta |
the Fisher information matrices (FIM) for fixed effects. |
FIM.theta |
the Fisher information matrices (FIM) for variance components. |
RE |
a matrix of the best linear unbiased prediction (BLUP) of random effects. |
summary_data |
a list of the summary-level data (summary statistics). |
lmm
#Generate data: X, Y, and Z.
set.seed(2024)
n <- 1e3
m <- 10
Y <- matrix(rnorm(m*n), m, n)
rownames(Y) <- paste0("Gene", 1:nrow(Y))
trt <- sample(c("A", "B"), n, replace = TRUE)
X <- model.matrix(~ 0 + trt)
q <- 20
sam <- rep(NA, n)
sam[trt == "A"] <- paste0("A", sample.int(q/2, sum(trt == "A"), replace = TRUE))
sam[trt == "B"] <- paste0("B", sample.int(q/2, sum(trt == "B"), replace = TRUE))
Z <- model.matrix(~ 0 + sam)
d <- ncol(Z)
#Fit LMM by the cell-level data
fit <- lmmfit(Y, X, Z, d = d)
str(fit)
#Fit LMM by summary-level data
#Compute and store the summary-level data:
n <- nrow(X)
XX <- t(X)%*%X
XY <- t(Y%*%X)
ZX <- t(Z)%*%X
ZY <- t(Y%*%Z)
ZZ <- t(Z)%*%Z
Ynorm <- rowSums(Y*Y)
fitss <- lmm(XX, XY, ZX, ZY, ZZ, Ynorm = Ynorm, n = n, d = d)
identical(fit, fitss)
#Hypothesis testing
lmmtest(fit)
lmmtest(fit, index = 2)
lmmtest(fit, contrast = cbind("B-A" = c(-1, 1)))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.