glm_bas: Conduct bayesian model selection, inference, and...


View source: R/glm_bas.R

Description

glm_bas performs logistic regression on previously generated features. It reports logistic regression statistics after Bayesian model selection, and it can plot the logit estimates, the Wald test / likelihood ratio test statistics, and the goodness of fit (deviance) across samples.
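The statistics named above can be illustrated with base R alone. The following is a minimal sketch on simulated data, not the internals of glm_bas: the feature names x1 and x2 and the simulated response are assumptions made for illustration, and no Bayesian model selection is performed here.

```r
set.seed(42)
n <- 200

# Hypothetical features and a simulated 0/1 response that depends on x1 only.
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
dat$y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * dat$x1))

# Full logistic model and an intercept-only model for comparison.
fit  <- glm(y ~ x1 + x2, data = dat, family = binomial())
null <- glm(y ~ 1, data = dat, family = binomial())

# Logit estimates with Wald z statistics and p-values.
coef_table <- summary(fit)$coefficients

# Likelihood ratio test of the full model against the intercept-only model.
lrt <- anova(null, fit, test = "Chisq")

# Goodness of fit: residual deviance of the full model.
dev <- deviance(fit)

print(coef_table)
print(lrt)
print(dev)
```

The likelihood ratio statistic is the drop in deviance between the two nested models, which is why the deviance is a natural goodness-of-fit summary to compare across samples.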

Usage

glm_bas(se, beta_prior = robust(), model_prior = beta.binomial(1, 1),
  MCMC_iterations = 1e+05, decision_method = "BPM", top = NULL,
  save_dir = "LogisticModel",
  sample_names_coldata = colnames(colData(se))[1],
  group_list = group_list_default)

Arguments

se

A SummarizedExperiment object containing the matrix of response variables. Each row should represent one modification site, and each column should represent a sample or condition. The assay entries are integer values of 1 or 0, with 1 indicating the positive class and 0 the negative class used in logistic regression; uncertain values should be set to NA.

The sample names should be defined by the colnames of the SummarizedExperiment. Alternatively, they can be taken from the first column of colData, or from a user-specified column of colData given by the argument sample_names_coldata.

The features or design matrix should be included in the mcols of the SummarizedExperiment object.
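As an illustration of the 1 / 0 / NA encoding described for the assay, here is a minimal base R sketch. The call labels ("pos", "neg", "unsure") and the site and sample names are hypothetical; wrapping the resulting matrix in a SummarizedExperiment is left out.

```r
# Hypothetical per-site, per-sample calls: rows are modification sites,
# columns are samples. These labels are made up for illustration.
calls <- matrix(
  c("pos", "neg",    "unsure",
    "neg", "neg",    "pos",
    "pos", "unsure", "pos"),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("site", 1:3), c("sampleA", "sampleB", "sampleC"))
)

# Start from NA so anything not explicitly positive or negative stays uncertain.
resp <- matrix(NA_integer_, nrow = nrow(calls), ncol = ncol(calls),
               dimnames = dimnames(calls))
resp[calls == "pos"] <- 1L  # positive class
resp[calls == "neg"] <- 0L  # negative class

print(resp)
```

Initializing with NA rather than 0 avoids silently treating an uncertain site as a negative example.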

beta_prior

Prior distribution for the regression coefficients; the default is robust(). See bas.glm.

model_prior

Family of the prior distribution on the models, default setting is beta.binomial(1,1); see bas.glm

MCMC_iterations

Number of models to sample; the default is 100000 and the maximum is 1e8.

decision_method

Decision method used in the Bayesian model selection; the default is 'BPM'. See predict.bas.

top

The number of top models used for BMA related decision making, default setting is NULL (BMA over all models); see predict.bas

save_dir

The path where the statistics and the diagrams of the logistic models are saved; the default is "LogisticModel".

sample_names_coldata

Name of the column in colData that holds the sample labels; defaults to the first column of colData.

group_list

Optional. A list indicating the grouping of features in the output diagrams; defaults to group_list_default.

Details

glm_bas builds a logistic regression model on the features and targets defined by the annotated SummarizedExperiment object returned by predictors.annot. Model selection is conducted with the Bayesian model selection method implemented in the BAS package.

Value

Folders are created under the directory specified by save_dir, and the reports and diagrams are saved within them.

See Also

Use predictors.annot to annotate features.

Examples

library(SummarizedExperiment)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
library(BSgenome.Hsapiens.UCSC.hg19)
library(fitCons.UCSC.hg19)
library(phastCons100way.UCSC.hg19)

Feature_List_hg19 = list(
  HNRNPC_eCLIP = eCLIP_HNRNPC_gr,
  YTHDC1_TREW = YTHDC1_TREW_gr,
  YTHDF1_TREW = YTHDF1_TREW_gr,
  YTHDF2_TREW = YTHDF2_TREW_gr,
  miR_targeted_genes = miR_targeted_genes_grl,
  #miRanda = miRanda_hg19_gr,
  TargetScan = TargetScan_hg19_gr,
  Verified_miRtargets = verified_targets_gr
)


SE_features_added <- predictors.annot(se = SE_example,
                                     txdb = TxDb.Hsapiens.UCSC.hg19.knownGene,
                                     bsgnm = Hsapiens,
                                     fc = fitCons.UCSC.hg19,
                                     pc = phastCons100way.UCSC.hg19,
                                     struct_hybridize = Struc_hg19,
                                     feature_lst = Feature_List_hg19,
                                     HK_genes_list = HK_hg19_eids)


glm_bas(
  SE_features_added,
  MCMC_iterations = 10000,
  decision_method = "BPM",
  save_dir = "LogisticModel_x",
  sample_names_coldata = "ID"
)

#To Do:
#1. Add an option called "no model selection" to quickly infer the effect sizes.
#1.5 Allow the response variable to be TRUE or FALSE for the logistic model.
#2. Isolate the meta plot functions from the original ones.
#3. Add a choice of Poisson GLM and Gaussian OLM.

ZhenWei10/m6ALogisticModel documentation built on May 17, 2019, 10:11 p.m.