fitRegMultiple: Fit models.

View source: R/fitting_functions_multiple_genes.R

fitRegMultiple    R Documentation

Fit models.

Description

Model fitting functions for the regsplice package.

Usage

fitRegMultiple(
  rs_results,
  rs_data,
  alpha = 1,
  lambda_choice = c("lambda.min", "lambda.1se"),
  seed = NULL,
  ...
)

fitNullMultiple(rs_results, rs_data, seed = NULL, ...)

fitFullMultiple(rs_results, rs_data, seed = NULL, ...)

Arguments

rs_results

RegspliceResults object, which will be used to store results. Initialized using the constructor function RegspliceResults(). See RegspliceResults for details.

rs_data

RegspliceData object. For RNA-seq read count data, this object has been pre-transformed with runVoom. It contains the counts and weights data matrices, and a vector of experimental conditions for each biological sample stored in colData. See RegspliceData for details.

alpha

Elastic net parameter alpha for glmnet model fitting functions. Must be between 0 (ridge regression) and 1 (lasso). Default is 1 (lasso). See glmnet documentation for more details.

lambda_choice

Selects which optimal lambda value to take from the cv.glmnet cross-validation fit. Choices are "lambda.min" (model with minimum cross-validated error) and "lambda.1se" (most regularized model with cross-validated error within one standard error of the minimum). Default is "lambda.min". See glmnet documentation for more details.

seed

Random seed (integer). Default is NULL. Provide an integer value to set the random seed for reproducible results.

...

Other arguments to pass to cv.glmnet, glmnet, or glm.
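
The roles of alpha and lambda_choice can be illustrated directly with cv.glmnet. The following is a minimal sketch on simulated data, not on regsplice objects; the variables x and y are made up for illustration, and the glmnet package is assumed to be installed.

```r
# Illustrative only: simulated predictors and response, not regsplice data
library(glmnet)

set.seed(123)
n <- 60
p <- 8
x <- matrix(rnorm(n * p), nrow = n, ncol = p)
y <- 2 * x[, 1] - 1.5 * x[, 2] + rnorm(n)

# alpha = 1 gives the lasso penalty; alpha = 0 would give ridge regression
cv_fit <- cv.glmnet(x, y, alpha = 1)

# The two candidate values that lambda_choice selects between
lambda_min <- cv_fit$lambda.min
lambda_1se <- cv_fit$lambda.1se

# Coefficients at each choice; "lambda.1se" is the more regularized model
coef_min <- coef(cv_fit, s = "lambda.min")
coef_1se <- coef(cv_fit, s = "lambda.1se")
```

Because lambda.1se is the largest lambda whose cross-validated error is within one standard error of the minimum, it is never smaller than lambda.min and tends to retain fewer non-zero coefficients.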

Details

There are three model fitting functions:

fitRegMultiple fits regularized models (lasso by default, with alpha = 1) containing an optimal subset of exon:condition interaction terms for each gene. The model fitting procedure penalizes the interaction terms only, so that the main effect terms for exons and samples are always included. This ensures that the null model is nested within the regularized model, allowing likelihood ratio tests to be calculated.

fitNullMultiple fits the null models, which do not contain any interaction terms.

fitFullMultiple fits full models, which contain all exon:condition interaction terms for each gene.

See createDesignMatrix for more details about the terms in each model.
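
One way to penalize only the interaction terms in glmnet is the penalty.factor argument, where a factor of 0 leaves a column unpenalized. The sketch below uses a small simulated design for a single hypothetical gene; the variable names and the hand-built design matrix are assumptions for illustration, and whether regsplice uses exactly this mechanism internally is not stated here.

```r
# Illustrative only: one hypothetical gene with 4 exons and 6 samples
library(glmnet)

set.seed(1)
n_exons <- 4
n_samples <- 6
exon <- factor(rep(seq_len(n_exons), times = n_samples))
samp <- factor(rep(seq_len(n_samples), each = n_exons))
condition <- rep(c(0, 1), each = 3)[as.integer(samp)]

# Main effects for exons and samples (intercept dropped; glmnet adds its own)
main <- model.matrix(~ exon + samp)[, -1]

# exon:condition interaction columns (first exon used as reference)
int <- model.matrix(~ 0 + exon)[, -1] * condition
colnames(int) <- paste0(colnames(int), ":condition")

x <- cbind(main, int)
y <- rnorm(nrow(x))

# Penalty factor 0 = never penalized (main effects); 1 = penalized (interactions)
pf <- c(rep(0, ncol(main)), rep(1, ncol(int)))
fit <- glmnet(x, y, alpha = 1, penalty.factor = pf)

# Coefficient paths: rows are terms, columns are lambda values
b <- as.matrix(coef(fit))
```

At the largest lambda on the path, the interaction coefficients are all zero while the unpenalized main effects remain in the model, which is the model with main effects only.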

The fitting functions fit models for all genes in the data set.

A random seed can be provided with the seed argument, to generate reproducible results.
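
The seed matters because cross-validation assigns observations to folds at random. A minimal sketch of the idea, assuming the seed is applied with set.seed() before the cross-validation call; run_cv is a hypothetical helper, not a regsplice function, and the data are simulated.

```r
# Illustrative only: run_cv() is a hypothetical helper, not part of regsplice
library(glmnet)

set.seed(42)
x <- matrix(rnorm(50 * 5), nrow = 50)
y <- x[, 1] + rnorm(50)

run_cv <- function(seed = NULL) {
  # Fix the random fold assignment only when a seed is supplied
  if (!is.null(seed)) set.seed(seed)
  cv.glmnet(x, y, alpha = 1)$lambda.min
}

# Same seed, same folds, same selected lambda
lambda_a <- run_cv(seed = 123)
lambda_b <- run_cv(seed = 123)
identical(lambda_a, lambda_b)
```

With seed = NULL, repeated calls may select different lambda values, and therefore different sets of interaction terms.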

If the rs_data object does not contain a weights matrix, all exon bins are weighted equally.
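
Equal weighting is equivalent to fitting without weights. The sketch below shows this with glm on simulated data; the data frame d and the weights vector w are made up for illustration and do not come from a RegspliceData object.

```r
# Illustrative only: simulated data, not regsplice objects
set.seed(7)
d <- data.frame(x = rnorm(20))
d$y <- 1 + 0.5 * d$x + rnorm(20)

# No weights versus equal weights: the fitted coefficients agree
fit_none <- glm(y ~ x, data = d)
fit_equal <- glm(y ~ x, data = d, weights = rep(1, nrow(d)))
all.equal(coef(fit_none), coef(fit_equal))

# Unequal precision weights give observations unequal influence on the fit
w <- runif(20, min = 0.5, max = 2)
fit_weighted <- glm(y ~ x, data = d, weights = w)
```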

Previous step: Initialize RegspliceResults object with initializeResults. Next step: Calculate likelihood ratio tests with LRTests.

Value

Returns a RegspliceResults object containing deviance and degrees of freedom of the fitted models. See RegspliceResults for details.
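
The stored deviance and degrees of freedom are the ingredients of a likelihood ratio test between nested models. The following sketch uses glm on simulated data for one hypothetical gene and assumes a known dispersion of 1, so that the deviance difference can be compared to a chi-squared distribution; with an unknown Gaussian variance the statistic would need to be scaled by a dispersion estimate. It is not the LRTests implementation.

```r
# Illustrative only: one hypothetical gene with 4 exons and 6 samples
set.seed(99)
exon <- factor(rep(1:4, times = 6))
samp <- factor(rep(1:6, each = 4))
condition <- rep(c(0, 1), each = 3)[as.integer(samp)]
y <- rnorm(24)  # simulated with unit variance

# exon:condition interaction columns (first exon used as reference)
int <- model.matrix(~ 0 + exon)[, -1] * condition

# Null model: main effects only; full model: adds all interaction terms
fit_null <- glm(y ~ exon + samp)
fit_full <- glm(y ~ exon + samp + int)

# Likelihood ratio statistic and degrees of freedom from nested fits
lr_stat <- deviance(fit_null) - deviance(fit_full)
df_test <- df.residual(fit_null) - df.residual(fit_full)

# p-value under the assumed unit dispersion
p_value <- pchisq(lr_stat, df = df_test, lower.tail = FALSE)
```

Here df_test equals the number of interaction columns added to the null model, which is why the null model must be nested in the model it is compared against.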

See Also

createDesignMatrix, RegspliceResults, initializeResults, LRTests

glmnet, cv.glmnet, glm

Examples

library(regsplice)

# Load the example counts table shipped with the package
file_counts <- system.file("extdata/vignette_counts.txt", package = "regsplice")
data <- read.table(file_counts, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
head(data)

# Extract counts, gene IDs, number of exons per gene, and sample conditions
counts <- data[, 2:7]
tbl_exons <- table(sapply(strsplit(data$exon, ":"), function(s) s[[1]]))
gene_IDs <- names(tbl_exons)
n_exons <- unname(tbl_exons)
condition <- rep(c("untreated", "treated"), each = 3)

# Create the RegspliceData object
rs_data <- RegspliceData(counts, gene_IDs, n_exons, condition)

# Filter, normalize, and transform the counts with voom
rs_data <- filterZeros(rs_data)
rs_data <- filterLowCounts(rs_data)
rs_data <- runNormalization(rs_data)
rs_data <- runVoom(rs_data)

# Initialize the results object
rs_results <- initializeResults(rs_data)

# Fit regularized, null, and full models for all genes
rs_results <- fitRegMultiple(rs_results, rs_data)
rs_results <- fitNullMultiple(rs_results, rs_data)
rs_results <- fitFullMultiple(rs_results, rs_data)


lmweber/regsplice documentation built on March 19, 2024, 1:45 p.m.