R/DoLmFit.R

#' @title Apply multivariable linear regression for each row of input data matrix.
#'
#' @description \code{DoLmFit} applies multivariable linear regression to regress gene/microRNA expression data on a phenotype of interest (poi), adjusting for potential confounding factors (cf).
#'
#' @param data A numeric matrix of normalized gene/microRNA expression values, with rows corresponding to genes/microRNAs and columns to samples.
#' @param pheno.m A data.frame or matrix with one row per sample and one column per phenotype variable. The first column must be the phenotype of interest, because the reported statistics are taken from its coefficient; the remaining columns are treated as confounding factors. A phenotype could be, for example, treatment/control, normal/cancer or smoker/non-smoker. Binary phenotypes should be encoded as 0/1 when passed to \code{DoLmFit}, for example Normal = 0, Cancer = 1.
#'
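#' @return A numeric matrix with one row per gene/microRNA, in the same order as the rows of \code{data}, and three columns: the t-statistic (\code{t}) and p-value (\code{p}) for the phenotype of interest, and the p-value adjusted by the Benjamini–Hochberg procedure (\code{q}).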
#'
#'
#' @importFrom stats lm
#' @importFrom stats p.adjust
#'
#' @export DoLmFit
#'
#' @examples
#' # prepare your normalized data matrix
#' data.m <- matrix(rnorm(120), nrow = 20, ncol = 6)
#'
#' # prepare the phenotype info (0-control; 1-treatment)
#' poi.v <- c(0, 0, 0, 1, 1, 1)
#' cf.v <- c(0, 1, 1, 2, 2, 2)
#'
#' # run function
#' DoLmFit(data = data.m, pheno.m = cbind(poi.v, cf.v))
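#'
#' # A minimal sketch, not part of the original example: the returned matrix
#' # keeps the row order of the input, so sort on the 'p' column if a ranked
#' # table is needed. The row names below are hypothetical.
#' rownames(data.m) <- paste0("gene", seq_len(nrow(data.m)))
#' res.m <- DoLmFit(data.m, cbind(poi.v, cf.v))
#' res.m[order(res.m[, "p"]), ]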



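DoLmFit <- function(data, pheno.m){

    # one row per gene/microRNA; columns: t-statistic, p-value, adjusted p-value
    res <- matrix(NA_real_, nrow = nrow(data), ncol = 3, dimnames = list(rownames(data), c('t', 'p', 'q')))
    # seq_len() avoids iterating over c(1, 0) when data has no rows
    for(i in seq_len(nrow(data))){
        # response: expression of feature i; predictors: all columns of pheno.m
        df <- data.frame(y = data[i, ], pheno.m)

        lm.o <- lm(y ~ ., data = df)
        # row 2 of the coefficient table is the first column of pheno.m (the poi);
        # columns 3:4 hold its t value and p-value
        res[i, c('t', 'p')] <- summary(lm.o)$coefficients[2, 3:4]
    }
    # 'fdr' is an alias for the Benjamini-Hochberg method in p.adjust()
    res[, 'q'] <- p.adjust(res[, 'p'], method = 'fdr')
    res
}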
YC3/mirNet documentation built on Sept. 3, 2020, 3:25 a.m.