glm_bas: Conduct Bayesian model selection, inference, and...
In ZhenWei10/m6ALogisticModel: Linear model analysis on RNA modification data.

Description Usage Arguments Details Value See Also Examples

glm_bas performs logistic regression analysis on previously generated features. It reports logistic regression statistics after Bayesian model selection, and it can plot diagrams of logit estimates, Wald test / likelihood ratio test statistics, and the goodness of fit (deviance) across samples.

glm_bas(se, beta_prior = robust(), model_prior = beta.binomial(1, 1),
  MCMC_iterations = 1e+05, decision_method = "BPM", top = NULL,
  save_dir = "LogisticModel",
  sample_names_coldata = colnames(colData(se))[1],
  group_list = group_list_default)

`se`	A `SummarizedExperiment` object containing the matrix of response variables. Each row should represent one modification site, and each column should represent a sample or condition. The entries of `assay` are integer values of 1 or 0, with 1 indicating the positive class and 0 indicating the negative class used in logistic regression; uncertain values should be set to `NA`. The sample names should be defined by the `colnames` of the `SummarizedExperiment`. Alternatively, they can be defined by the first column of the `colData`, or by a user-defined column of `colData` given in the argument `sample_names_coldata`. The features (design matrix) should be included in the `mcols` of the `SummarizedExperiment` object.
`beta_prior`	Prior distribution for the regression coefficients, default setting is `robust()`; see `bas.glm`
`model_prior`	Family of the prior distribution on the models, default setting is beta.binomial(1,1); see `bas.glm`
`MCMC_iterations`	Number of models to sample, default setting is 100000; the maximum number is 1e8.
`decision_method`	Decision method used in the Bayesian model selection, default setting is 'BPM'; see `predict.bas`
`top`	The number of top models used for BMA related decision making, default setting is NULL (BMA over all models); see `predict.bas`
`save_dir`	The path under which the statistics and the diagrams of the logistic models are saved; the default path is "LogisticModel".
`sample_names_coldata`	Name of the column in `colData` that holds the sample labels; by default the first column of `colData` is used.
`group_list`	Optional, a `list` indicating the grouping of features in the output diagrams, by default it uses `group_list_default`.
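
The layout that `se` is described as needing can be illustrated with a toy object. This is only a sketch: the feature names `long_exon` and `conservation`, the sample names, and all values are made up for illustration, and in practice the features would come from `predictors.annot`, which presumably also needs genomic coordinates that this toy object lacks.

```r
library(SummarizedExperiment)

# Response matrix: one row per modification site, one column per sample.
# 1 = positive class, 0 = negative class, NA = uncertain.
response <- matrix(
  as.integer(c(1, 0, NA, 1, 0, 1,
               0, 0, 1, NA, 1, 1)),
  nrow = 6,
  dimnames = list(NULL, c("sample_A", "sample_B"))
)

# Hypothetical site-level features (the design matrix), stored in mcols().
features <- DataFrame(
  long_exon    = c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE),
  conservation = c(0.91, 0.12, 0.55, 0.78, 0.30, 0.64)
)

se_toy <- SummarizedExperiment(
  assays  = list(response),
  rowData = features,
  colData = DataFrame(ID = colnames(response),
                      row.names = colnames(response))
)

assay(se_toy)   # the 0/1/NA response matrix
mcols(se_toy)   # the features
colData(se_toy) # the "ID" column could be passed as sample_names_coldata
```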

glm_bas builds a logistic regression model on the features and targets defined by the annotated SummarizedExperiment object returned by predictors.annot. Model selection is conducted with the Bayesian model selection method defined by the package BAS.
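
To give a feel for the kind of fit that BAS performs, here is a minimal sketch that calls `bas.glm` directly on simulated data. It is not taken from the `glm_bas` source: the assumption is that `beta_prior`, `model_prior`, and `MCMC_iterations` are passed on to the `betaprior`, `modelprior`, and `MCMC.iterations` arguments of `bas.glm`. The data frame `toy` and its columns are invented for illustration.

```r
library(BAS)

set.seed(42)
n <- 200

# Simulated predictors; only x1 and x2 truly affect the response.
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
toy$y <- rbinom(n, 1, plogis(1.5 * toy$x1 - 1 * toy$x2))

# Bayesian model selection for a logistic regression, using the same
# prior settings that glm_bas lists as its defaults.
fit <- bas.glm(
  y ~ .,
  data            = toy,
  family          = binomial(link = "logit"),
  betaprior       = robust(),
  modelprior      = beta.binomial(1, 1),
  method          = "MCMC",
  MCMC.iterations = 10000
)

# Posterior inclusion probabilities and the top sampled models.
summary(fit)
```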

Folders are created under the directory specified by save_dir, and the reports and the diagrams are saved within them.
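
Since the names of the folders and files are not listed here, one simple way to see what a run produced is to list the output directory afterwards. The sketch below assumes `glm_bas` was run with the default `save_dir`; it makes no assumption about what the files are called.

```r
# Assumed: glm_bas() has already been run with save_dir = "LogisticModel".
save_dir <- "LogisticModel"

# Collect every file under save_dir, or an empty vector if nothing was written.
saved <- if (dir.exists(save_dir)) {
  list.files(save_dir, recursive = TRUE)
} else {
  character(0)
}

print(saved)
```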

Use predictors.annot to annotate features.

library(m6ALogisticModel)
library(SummarizedExperiment)
library(TxDb.Hsapiens.UCSC.hg19.knownGene)
library(BSgenome.Hsapiens.UCSC.hg19)
library(fitCons.UCSC.hg19)
library(phastCons100way.UCSC.hg19)

Feature_List_hg19 <- list(
  HNRNPC_eCLIP = eCLIP_HNRNPC_gr,
  YTHDC1_TREW = YTHDC1_TREW_gr,
  YTHDF1_TREW = YTHDF1_TREW_gr,
  YTHDF2_TREW = YTHDF2_TREW_gr,
  miR_targeted_genes = miR_targeted_genes_grl,
  # miRanda = miRanda_hg19_gr,
  TargetScan = TargetScan_hg19_gr,
  Verified_miRtargets = verified_targets_gr
)


SE_features_added <- predictors.annot(se = SE_example,
                                     txdb = TxDb.Hsapiens.UCSC.hg19.knownGene,
                                     bsgnm = Hsapiens,
                                     fc = fitCons.UCSC.hg19,
                                     pc = phastCons100way.UCSC.hg19,
                                     struct_hybridize = Struc_hg19,
                                     feature_lst = Feature_List_hg19,
                                     HK_genes_list = HK_hg19_eids)


glm_bas(
  SE_features_added,
  MCMC_iterations = 10000,
  decision_method = "BPM",
  save_dir = "LogisticModel_x",
  sample_names_coldata = "ID"
)

# To do:
# 1. Add a "no model selection" option to quickly infer the effect sizes.
# 1.5. Allow the response variable to be TRUE or FALSE for the logistic model.
# 2. Isolate the meta plot functions from the original ones.
# 3. Add a choice of Poisson GLM and Gaussian OLM.
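
Until to-do item 1.5 is addressed, a logical response matrix can be converted to the 1/0 integer coding that `se` is described as requiring before the `SummarizedExperiment` is built. This is a sketch in base R only; the matrix `response_lgl` and its sample names are made up for illustration.

```r
# Hypothetical logical response matrix: TRUE = positive, FALSE = negative,
# NA = uncertain.
response_lgl <- matrix(
  c(TRUE, FALSE, NA, TRUE, FALSE, TRUE),
  nrow = 3,
  dimnames = list(NULL, c("sample_A", "sample_B"))
)

# Changing the storage mode maps TRUE to 1L and FALSE to 0L,
# and keeps NA values and dimnames intact.
response_int <- response_lgl
storage.mode(response_int) <- "integer"

response_int
```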