bpr_diff_eval_predict_wrap: Predict differential gene expression from differential...

Description

bpr_diff_eval_predict_wrap is a function that wraps all the subroutines needed to predict differential gene expression levels. First, it optimizes the parameters of the basis functions in order to learn the methylation profiles for the control and the treatment samples. Next, the two learned methylation profiles are concatenated so that the coefficients of both profiles are kept. Finally, the learned parameters / coefficients of the basis functions are used as input features for a regression model that predicts the corresponding differential (log2 fold-change) gene expression levels.
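The three steps described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the helpers rbf_design, fit_profile and sim_region are hypothetical, the probit link, the RBF form and its centres are assumptions, and the response is random placeholder data, so the fitted model carries no biological meaning.

```r
set.seed(7)

# Assumed basis: an intercept plus Gaussian RBFs at the given centres.
rbf_design <- function(loc, centres, gamma = 4) {
  H <- sapply(centres, function(mu) exp(-gamma * (loc - mu)^2))
  cbind(1, matrix(H, nrow = length(loc)))
}

# Step 1: learn the profile coefficients of one region (L x 3 matrix)
# by minimising a binomial negative log-likelihood (probit link assumed).
fit_profile <- function(region, centres) {
  H <- rbf_design(region[, 1], centres)
  nll <- function(w) {
    p <- pnorm(as.vector(H %*% w))
    p <- pmin(pmax(p, 1e-10), 1 - 1e-10)  # keep log-probabilities finite
    -sum(dbinom(region[, 3], size = region[, 2], prob = p, log = TRUE))
  }
  optim(rep(0, ncol(H)), nll, method = "CG", control = list(maxit = 100))$par
}

# Simulated region: locations, total reads, methylated reads.
sim_region <- function(L = 12) {
  loc   <- sort(runif(L, -1, 1))
  total <- sample(10:30, L, replace = TRUE)
  meth  <- rbinom(L, total, pnorm(rnorm(1) + rnorm(1) * loc))
  cbind(loc, total, meth)
}

N <- 30
centres <- c(-0.5, 0.5)
x <- list(control   = replicate(N, sim_region(), simplify = FALSE),
          treatment = replicate(N, sim_region(), simplify = FALSE))

# Step 2: concatenate control and treatment coefficients per region.
W_ctrl <- t(sapply(x$control, fit_profile, centres = centres))
W_trt  <- t(sapply(x$treatment, fit_profile, centres = centres))
features <- data.frame(cbind(W_ctrl, W_trt))

# Step 3: regress the (placeholder) log2 fold-change on the features.
features$y <- rnorm(N)
train_ind <- 1:21
model <- lm(y ~ ., data = features[train_ind, ])
pred <- predict(model, newdata = features[-train_ind, ])
```

Here lm stands in for any of the regression models the function supports; the split into 21 training and 9 test regions mirrors the default train_perc of 0.7.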

Usage

bpr_diff_eval_predict_wrap(formula = NULL, x, y, model_name = "svm",
  w = NULL, basis = NULL, train_ind = NULL, train_perc = 0.7,
  fit_feature = "RMSE", cpg_dens_feat = TRUE, opt_method = "CG",
  interm_features = TRUE, opt_itnmax = 100, is_parallel = TRUE,
  no_cores = NULL, is_summary = TRUE)

Arguments

formula

An object of class formula; see the lm function for examples. If NULL, the simple linear regression model is used.

x

The binomial distributed observations. A list containing two lists, one for the control and one for the treatment samples. Each of the two lists has N elements, where each element is an L x 3 matrix of observations: the 1st column contains the locations, and the 2nd and 3rd columns contain the total reads and the number of successes at the corresponding locations, respectively. See process_haib_caltech_wrap for one possible way to obtain this data structure.
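As a rough illustration of the structure described for x, the following sketch builds a synthetic input by hand. The helper make_region is hypothetical, the list names control and treatment are an assumption (the source does not say whether names are required), and scaling the locations to [-1, 1] is likewise assumed.

```r
set.seed(1)

# Hypothetical helper: one region as an L x 3 matrix with columns
# location, total reads, and number of successes (methylated reads).
make_region <- function(L) {
  loc   <- sort(runif(L, min = -1, max = 1))    # assumed location scaling
  total <- sample(5:30, L, replace = TRUE)      # total reads per location
  meth  <- rbinom(L, size = total, prob = 0.5)  # successes, never above total
  cbind(loc, total, meth)
}

N <- 4  # number of regions per condition
x <- list(
  control   = lapply(seq_len(N), function(i) make_region(sample(8:15, 1))),
  treatment = lapply(seq_len(N), function(i) make_region(sample(8:15, 1)))
)
str(x$control[[1]])
```

Note that L may differ from region to region, while N is the same for both conditions.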

y

Corresponding gene expression data. A list containing two vectors for control and treatment samples.
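For orientation, a y of this shape and one common way to derive a log2 fold-change from it might look as follows. The list names, the example values, the assumption that the expression values are not already log-transformed, and the pseudo-count of 1 are all illustrative; the source does not state how the function computes the fold-change internally.

```r
# One expression value per region, for each condition (made-up numbers).
y <- list(
  control   = c(12.0, 3.5, 40.2, 0.8),
  treatment = c(24.0, 3.5, 10.1, 3.1)
)

pseudo <- 1  # assumed pseudo-count to avoid division by zero
log2_fc <- log2((y$treatment + pseudo) / (y$control + pseudo))
log2_fc
```

Positive values indicate higher expression in the treatment samples, negative values higher expression in the control samples.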

model_name

A string denoting the regression model. Currently, available models are: "svm", "randomForest", "rlm", "mars" and "lm".

w

Optional vector of initial parameter / coefficient values.

basis

Optional basis function object, default is an 'rbf' object, see create_rbf_object.

train_ind

Optional vector containing the indices for the train set.

train_perc

Optional parameter defining the proportion of the dataset to be used as the training set, given as a fraction (default 0.7); the remainder is used as the test set.
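A training index vector suitable for train_ind could be drawn by hand as in the sketch below. This only illustrates the idea of a 70/30 split; how the function itself samples indices when train_ind is NULL is not described in the source.

```r
set.seed(42)

N <- 10            # number of regions in the dataset
train_perc <- 0.7  # fraction used for training

# Random subset of region indices for training; the rest form the test set.
train_ind <- sort(sample(N, size = round(train_perc * N)))
test_ind  <- setdiff(seq_len(N), train_ind)
```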

fit_feature

Whether to return an additional feature describing how well the profile fits the methylation data. Either NULL to ignore this feature, or one of the following: 1) "RMSE" to return the fit of the profile using the RMSE as the measure of error, or 2) "NLL" to return the fit of the profile using the Negative Log Likelihood as the measure of error.
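The two error measures can be sketched for a single region as follows. The fitted probabilities p are made-up values, and the exact definitions used inside the package may differ; here RMSE compares the observed methylation level with the fitted one, and NLL is the negative binomial log-likelihood of the observed counts.

```r
# One region: locations, total reads, methylated reads (made-up data).
obs <- cbind(loc   = c(-0.8, -0.2, 0.3, 0.9),
             total = c(10, 20, 15, 8),
             meth  = c(2, 10, 12, 7))

# Hypothetical fitted methylation probability at each location.
p <- c(0.25, 0.5, 0.75, 0.85)

# RMSE between observed methylation levels and the fitted profile.
rmse <- sqrt(mean((obs[, "meth"] / obs[, "total"] - p)^2))

# Negative log-likelihood of the counts under a binomial model.
nll <- -sum(dbinom(obs[, "meth"], size = obs[, "total"], prob = p, log = TRUE))
```

Lower values of either measure indicate a profile that follows the data more closely.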

cpg_dens_feat

Logical, whether to return an additional feature for the CpG density across the promoter region.

opt_method

The optimization method to be used. See optim for possible methods. Default is "CG".

interm_features

Logical, create intermediate features.

opt_itnmax

Optional argument giving the maximum number of iterations for the corresponding method. See optim for details.

is_parallel

Logical, indicating if code should be run in parallel.

no_cores

Number of cores to be used, default is max_no_cores - 2.
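The stated default can be approximated with the parallel package that ships with R. The fallback for an undetectable core count and the lower bound of one core are additions made for this sketch, not documented behaviour of the function.

```r
library(parallel)

max_no_cores <- detectCores()               # may be NA on some platforms
if (is.na(max_no_cores)) max_no_cores <- 1L

# Leave two cores free, but never go below one (assumed safeguard).
no_cores <- max(1L, max_no_cores - 2L)
```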

is_summary

Logical, print the summary statistics.

Value

A 'bpr_diff_predict' object which, in addition to the input parameters, consists of the following variables:

Author(s)

C.A.Kapourani C.A.Kapourani@ed.ac.uk

See Also

bpr_optimize, create_basis, eval_functions, train_model_gex, predict_model_gex


andreaskapou/BPRMeth-devel documentation built on May 12, 2019, 3:32 a.m.