linear_regres: Linear Regression
In EvaYiwenWang/PLSDAbatch: PLSDA-batch


Description

This function fits a linear regression (linear model or linear mixed model) to each microbial variable, with treatment and batch effects included as covariates. It returns p-values, p-values adjusted for multiple comparisons, and evaluation metrics of model quality.
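
To make the per-variable fitting concrete, here is a minimal base-R sketch of the general idea, not the PLSDAbatch implementation. The simulated data, the names `toy`, `otu1` to `otu3`, and the choice to extract the treatment coefficient's p-value are all assumptions made for illustration.

```r
# Illustrative sketch with simulated data; not the PLSDAbatch source code.
set.seed(1)
n <- 40
trt <- factor(rep(c("control", "treated"), each = n / 2))
batch <- factor(rep(c("b1", "b2"), times = n / 2))

# Hypothetical toy data: samples as rows, variables as columns.
toy <- data.frame(
  otu1 = rnorm(n) + 2 * (trt == "treated"),   # treatment effect
  otu2 = rnorm(n) + 1.5 * (batch == "b2"),    # batch effect only
  otu3 = rnorm(n)                             # noise only
)

# Fit one linear model per variable with treatment and batch as covariates,
# and keep the p-value of the treatment coefficient (assumed quantity of interest).
raw.p <- vapply(toy, function(y) {
  fit <- lm(y ~ trt + batch)
  summary(fit)$coefficients["trttreated", "Pr(>|t|)"]
}, numeric(1))

# Adjust the p-values for multiple comparisons across variables.
adj.p <- p.adjust(raw.p, method = "fdr")
print(cbind(raw.p, adj.p))
```

Extracting a single coefficient works here because the toy treatment has two levels; a treatment with more levels would need a test of the whole term instead.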

Usage

linear_regres(
    data,
    trt,
    batch.fix = NULL,
    batch.fix2 = NULL,
    batch.random = NULL,
    type = "linear model",
    p.adjust.method = "fdr"
)

Arguments

`data`	A data frame that contains the response variables for the linear regression. Samples as rows and variables as columns.
`trt`	A factor or a class vector for the treatment grouping information (categorical variable).
`batch.fix`	A factor or a class vector for the batch grouping information (categorical variable), treated as a fixed effect in the model.
`batch.fix2`	A factor or a class vector for the grouping information of a second batch factor (categorical variable), treated as a fixed effect in the model.
`batch.random`	A factor or a class vector for the batch grouping information (categorical variable), treated as a random effect in the model.
`type`	The type of model to be used for fitting, either "linear model" or "linear mixed model".
`p.adjust.method`	The method to be used for p-value adjustment, one of "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr" or "none".
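
The method names listed for `p.adjust.method` match those accepted by `p.adjust()` in the base R `stats` package, where "fdr" is an alias for "BH". The short sketch below, using made-up p-values, shows how the choice of method changes the adjusted values; it assumes nothing about how `linear_regres` applies the adjustment internally.

```r
# Hypothetical raw p-values, for illustration only.
raw.p <- c(0.001, 0.012, 0.030, 0.200)

adj.fdr  <- p.adjust(raw.p, method = "fdr")         # Benjamini-Hochberg
adj.bh   <- p.adjust(raw.p, method = "BH")          # same procedure as "fdr"
adj.bonf <- p.adjust(raw.p, method = "bonferroni")  # more conservative

print(rbind(raw = raw.p, fdr = adj.fdr, bonferroni = adj.bonf))
```

Bonferroni multiplies each p-value by the number of tests (capped at 1), so it is never less conservative than the "fdr" adjustment.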

Value

linear_regres returns a list that contains the following components:

`type`	The type of model used for fitting.
`model`	The fitted model object for each response variable.
`raw.p`	The p-values for each response variable.
`adj.p`	The adjusted p-values for each response variable.
`p.adjust.method`	The method used for p-value adjustment.
`R2`	The proportion of variation in the response variable that is explained by the predictor variables. A higher R2 indicates a better model. Results for 'linear model' only.
`adj.R2`	R2 adjusted for the number of predictor variables in the model. Results for 'linear model' only.
`cond.R2`	The proportion of variation in the response variable that is explained by the "complete" model with all covariates (fixed and random effects). Results for 'linear mixed model' only. Similar to `R2` in the linear model.
`marg.R2`	The proportion of variation in the response variable that is explained by the fixed effects only. Results for 'linear mixed model' only.
`RMSE`	The root mean squared error: the average error made by the model when predicting the outcome for an observation. A lower RMSE indicates a better model.
`RSE`	The residual standard error, also known as the model `sigma`: a variant of the RMSE adjusted for the number of predictors in the model. A lower RSE indicates a better model.
`AIC`	The Akaike information criterion, which penalises the inclusion of additional predictor variables in a model. A lower AIC indicates a better model.
`BIC`	The Bayesian information criterion, a variant of AIC with a stronger penalty for including additional variables in the model. A lower BIC indicates a better model.
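
As a rough guide to what the linear-model metrics above measure, the following base-R sketch computes them for a single simulated response. The data and object names are assumptions for illustration, and the code is not taken from the package, which may compute these values differently.

```r
# Illustrative sketch with simulated data; not the PLSDAbatch source code.
set.seed(2)
n <- 30
trt <- factor(rep(c("control", "treated"), each = n / 2))
batch <- factor(rep(c("b1", "b2", "b3"), times = n / 3))
y <- rnorm(n) + (trt == "treated") + as.numeric(batch)

fit <- lm(y ~ trt + batch)
res <- residuals(fit)

metrics <- c(
  R2     = summary(fit)$r.squared,      # proportion of variance explained
  adj.R2 = summary(fit)$adj.r.squared,  # R2 penalised for number of predictors
  RMSE   = sqrt(mean(res^2)),           # divides by the number of observations
  RSE    = sigma(fit),                  # divides by the residual degrees of freedom
  AIC    = AIC(fit),
  BIC    = BIC(fit)
)
print(round(metrics, 3))
```

RSE divides the residual sum of squares by the residual degrees of freedom rather than by the number of observations, which is why it is never smaller than RMSE for the same fit.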

Note

R2, adj.R2, cond.R2, marg.R2, RMSE, RSE, AIC and BIC are each reported for two models: (i) the full input model; (ii) a model without batch effects. Comparing the two helps to decide whether batch effects should be included in the model.
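
The comparison described in this note can be sketched in base R as follows. This is a minimal example on simulated data with a deliberately strong batch effect; the object names `fit.full`, `fit.nobatch` and `comparison` are hypothetical and do not correspond to the structure of the `linear_regres` output.

```r
# Illustrative sketch with simulated data; not the PLSDAbatch source code.
set.seed(3)
n <- 40
trt <- factor(rep(c("control", "treated"), each = n / 2))
batch <- factor(rep(c("b1", "b2"), times = n / 2))
y <- rnorm(n) + (trt == "treated") + 3 * (batch == "b2")  # strong batch effect

fit.full    <- lm(y ~ trt + batch)  # (i) full model
fit.nobatch <- lm(y ~ trt)          # (ii) model without batch effects

comparison <- data.frame(
  model  = c("full", "no.batch"),
  adj.R2 = c(summary(fit.full)$adj.r.squared, summary(fit.nobatch)$adj.r.squared),
  RSE    = c(sigma(fit.full), sigma(fit.nobatch)),
  AIC    = c(AIC(fit.full), AIC(fit.nobatch)),
  BIC    = c(BIC(fit.full), BIC(fit.nobatch))
)
print(comparison)
```

When the batch effect is real, the full model should show a higher adjusted R2 and lower RSE, AIC and BIC than the model without batch effects.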

Author(s)

Yiwen Wang, Kim-Anh Lê Cao

References

\insertRef{daniel2020performance}{PLSDAbatch}

Examples

library(PLSDAbatch) # for linear_regres() and the AD_data example data
library(TreeSummarizedExperiment) # for functions assays(), rowData()
data('AD_data')

# centered log ratio transformed data
ad.clr <- assays(AD_data$EgData)$Clr_value
ad.batch <- rowData(AD_data$EgData)$Y.bat # batch information
ad.trt <- rowData(AD_data$EgData)$Y.trt # treatment information
names(ad.batch) <- names(ad.trt) <- rownames(AD_data$EgData)

# fit a linear model to each variable, with treatment and batch as fixed effects
ad.lm <- linear_regres(data = ad.clr, trt = ad.trt,
                        batch.fix = ad.batch,
                        type = 'linear model')
ad.p.adj <- ad.lm$adj.p # adjusted p-values
head(ad.lm$AIC) # AIC of the models with and without batch effects

EvaYiwenWang/PLSDAbatch documentation built on Sept. 25, 2024, 8:54 p.m.