mobster_fit | R Documentation |
This function fits a mixture of Beta distributions with an optional power-law Pareto
Type-I tail. The function performs model selection over the different mixtures that
the user specifies with the input parameters. The function returns a list of all fits computed
(objects of class dbpmm), the best fit, a table with the results of the fits, and a
variable that specifies which score has been used for model selection. The fitting of each model
also runs the function choose_clusters, which implements a simple heuristic to filter
out small clusters from the fit output.
Note: You can also use the "auto setup" functionality that, with one keyword, loads preset parameter values in order to implement different analyses.
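To make the model described above concrete, the following base-R sketch evaluates the density of a mixture of Beta components plus a Pareto Type-I tail. The helper names (`dpareto1`, `mixture_density`) and all parameter values are hypothetical and chosen only for illustration; the package's own parameterization, and any truncation it applies to the tail, may differ.

```r
# Pareto Type-I density (illustrative helper, not part of the package):
# shape * scale^shape / x^(shape + 1) for x >= scale, and 0 otherwise.
dpareto1 <- function(x, shape, scale) {
  ifelse(x >= scale, shape * scale^shape / x^(shape + 1), 0)
}

# Density of a mixture with one tail and length(a) Beta components.
# `weights` holds the mixing proportions, tail first, and should sum to 1.
mixture_density <- function(x, weights, shape, scale, a, b) {
  dens <- weights[1] * dpareto1(x, shape, scale)
  for (k in seq_along(a)) {
    dens <- dens + weights[k + 1] * dbeta(x, a[k], b[k])
  }
  dens
}

# Evaluate on a grid of allele frequencies (made-up parameter values).
vaf <- seq(0.01, 0.99, by = 0.01)
d <- mixture_density(vaf, weights = c(0.3, 0.5, 0.2),
                     shape = 1.5, scale = 0.05,
                     a = c(50, 10), b = c(50, 40))
```

Setting the first weight to zero gives a pure Beta mixture, which corresponds to a fit without tail.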
mobster_fit(
x,
K = 1:3,
samples = 5,
init = "peaks",
tail = c(TRUE, FALSE),
epsilon = 1e-10,
maxIter = 250,
fit.type = "MM",
seed = 12345,
model.selection = "reICL",
trace = FALSE,
parallel = TRUE,
pi_cutoff = 0.02,
N_cutoff = 10,
auto_setup = NULL,
silent = FALSE,
description = "My MOBSTER model"
)
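As a rough guide to the cost of the default call, the sketch below enumerates the model configurations implied by the default `K`, `tail` and `samples` values. It assumes that every combination of `K` and `tail` counts as one configuration and that each is attempted `samples` times; the package's internal bookkeeping may be organized differently.

```r
# Default values taken from the usage above.
K <- 1:3
tail <- c(TRUE, FALSE)
samples <- 5

# One row per attempted fit: every (K, tail) pair, repeated `samples` times.
configs <- expand.grid(K = K, tail = tail, run = seq_len(samples))

nrow(configs)  # 3 * 2 * 5 = 30 attempted fits
```

Narrowing `K`, fixing `tail` to a single value, or lowering `samples` reduces this number proportionally.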
x |
Input tibble (or data.frame) which is required to have a VAF column which reports the
frequency of the mutant allele (this should be computed adjusting the raw VAF for tumour purity
and copy number status). See also package |
K |
A vector with the number of Beta components to use. All values of |
samples |
Number of fits that should be attempted for each configuration of the model tested. |
init |
Initial values for the parameters of the model. With |
tail |
If |
epsilon |
Tolerance for convergence estimation. For an MLE fit this is compared to the differential of the negative log-likelihood (NLL); for an MM fit the largest differential among the mixing proportions (pi) is used. |
maxIter |
Maximum number of steps for a fit. If convergence is not achieved within these steps, the fit is interrupted. |
fit.type |
A string that determines the type of fit. |
seed |
Seed for the random number generator. |
model.selection |
Score to minimize to select the best model; this has to be one of |
trace |
If |
parallel |
Optional parameter to run the fits in parallel (default), or not. |
pi_cutoff |
Parameter passed to function |
N_cutoff |
Parameter passed to function |
auto_setup |
Overrides all the parameters with a predefined set of values, in order to implement different analyses. Available keys: 'FAST', which uses 1) at most 2 clones (1 subclone), 2) random initial conditions, 3) 2 samples per parameter set, 4) mild 'epsilon' and 'maxIter', sequential run. For reference, the default set of parameters represents a more exhaustive analysis. |
description |
A textual description of this dataset. |
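The stopping rules described for `epsilon` can be sketched in base R as follows. The helper names are hypothetical, and "differential" is assumed here to mean the absolute change between two consecutive iterations; the package's actual implementation may differ in detail.

```r
# Hypothetical MM-style check: the largest absolute change among the
# mixing proportions (pi) is compared to epsilon.
converged_mm <- function(pi_old, pi_new, epsilon = 1e-10) {
  max(abs(pi_new - pi_old)) < epsilon
}

# Hypothetical MLE-style check: the absolute change in the negative
# log-likelihood (NLL) is compared to epsilon.
converged_mle <- function(nll_old, nll_new, epsilon = 1e-10) {
  abs(nll_new - nll_old) < epsilon
}

converged_mm(c(0.30, 0.70), c(0.31, 0.69))  # FALSE: proportions still moving
converged_mm(c(0.30, 0.70), c(0.30, 0.70))  # TRUE: no change
```

Under this reading, a larger `epsilon` stops the fit earlier, and `maxIter` bounds the number of steps when the tolerance is never met.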
A list of all fits computed (objects of class dbpmm), the best fit, a table with the results of the fits, and a
variable that specifies which score has been used for model selection.
# Generate a random dataset
x = random_dataset(seed = 123, Beta_variance_scaling = 100, N = 200)
print(x) # Contains a ggplot object
# Fit, default models, changed epsilon for convergence
x = mobster_fit(x$data, epsilon = 1e-5)
plot(x$best)
print(x$best)
lapply(x$runs[1:3], plot)
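The preset mode mentioned in the description can be tried in the same way. This sketch reuses the simulated dataset from the example above and passes the documented key 'FAST' to `auto_setup`; the variable names `dataset` and `fit_fast` are arbitrary, and the code assumes the mobster package is installed.

```r
library(mobster)

# Same simulated dataset as in the example above.
dataset <- random_dataset(seed = 123, Beta_variance_scaling = 100, N = 200)

# 'FAST' is the preset key documented for auto_setup; it overrides the
# other fitting parameters with a lighter configuration.
fit_fast <- mobster_fit(dataset$data, auto_setup = "FAST")

print(fit_fast$best)
plot(fit_fast$best)
```

Because the preset tests fewer configurations than the defaults, it is better suited to a quick first look than to a final analysis.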