View source: R/find_informative_sites.R
| find_informative_sites | R Documentation |
This function generates a set of informative CpG sites to estimate the purity or the tumor content of a set of tumor samples.
find_informative_sites(
tumor_table,
control_table,
auc,
ref_table,
cores = 1,
max_sites = 20,
min_distance = 1000000,
percentiles = c(0, 100),
hyper_range = c(min = 0.4, max = 0.9),
hypo_range = c(min = 0.1, max = 0.6),
control_costraints = c(0.3, 0.7),
method = c("even", "top", "hyper", "hypo"),
full_info = FALSE
)
tumor_table |
A matrix of beta-values of tumor samples. |
control_table |
A matrix of beta-values of control/normal samples. |
auc |
A data.frame with AUC scores generated by |
ref_table |
A data.frame with first two columns reporting genomic location (chromosome, genomic_coordinates). |
cores |
Number of parallel processes. |
max_sites |
Maximum number of sites to retrieve (half hyper-, half hypo-methylated) (default=20). |
min_distance |
Exclude sites located less than 'min_distance' from a higher-ranking site (default = 1e6 bp). |
percentiles |
A vector of length 2. Min and max percentiles to select sites with beta values outside hypo- and hyper-ranges (default = c(0,100); i.e. only min and max beta should be outside of ranges). |
hyper_range |
A vector of length 2 with minimum lower and upper values required to select hyper-methylated informative sites. |
hypo_range |
A vector of length 2 with minimum lower and upper values required to select hypo-methylated informative sites. |
control_costraints |
To select a site, "first quartile"/"third quartile" of control data must be above/below these beta-values. |
method |
How to select sites: "even" (half hyper-, half hypo-methylated sites), "top" (highest AUC regardless of hyper- or hypo-methylation), "hyper" (hyper-methylated sites only), "hypo" (hypo-methylated sites only). |
full_info |
Return all informative sites with a column reporting whether to use a site or not (for debugging purposes). |
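The effect of min_distance can be illustrated with a minimal sketch in base R. It assumes a greedy pass over sites already ranked best-first, comparing each candidate only against sites kept so far on the same chromosome; the package's actual procedure may differ. The data frame, its column names, and the helper filter_by_distance are hypothetical and exist only for this illustration.

```r
# Hypothetical ranked sites (best first); probe names and positions are made up.
sites <- data.frame(
  probe = c("cg01", "cg02", "cg03", "cg04"),
  chr   = c("chr1", "chr1", "chr1", "chr2"),
  pos   = c(1000000, 1500000, 2500000, 1200000),
  stringsAsFactors = FALSE
)

# Greedy filter (an assumption, not the package's code): keep a site only if
# it lies at least `min_distance` away from every already-kept site on the
# same chromosome.
filter_by_distance <- function(sites, min_distance = 1e6) {
  keep <- logical(nrow(sites))
  for (i in seq_len(nrow(sites))) {
    kept <- sites[keep, , drop = FALSE]
    same_chr <- kept$chr == sites$chr[i]
    too_close <- any(abs(kept$pos[same_chr] - sites$pos[i]) < min_distance)
    keep[i] <- !too_close
  }
  sites[keep, , drop = FALSE]
}

filtered <- filter_by_distance(sites)
print(filtered$probe)
```

In this toy input, cg02 is dropped because it sits 500 kb from the higher-ranking cg01, while cg04 is kept because it is on a different chromosome.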
The control_costraints parameter restricts the selection to sites whose
upper/lower quartiles of control beta-values meet the thresholds given by
control_costraints. Sites are classified as hyper or hypo according to
their level of methylation relative to the average beta-value of normal
samples.
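As a rough illustration of the quantities described above, the following base R sketch labels sites as hyper or hypo and computes the control quartiles. It assumes that the comparison uses the mean tumor beta-value per site against the mean control beta-value, which the text does not state explicitly; the toy matrices and all variable names are hypothetical. The comparison of the quartiles against control_costraints is left out.

```r
# Hypothetical toy beta-value matrices: rows are CpG sites, columns are samples.
tumor   <- rbind(siteA = c(0.85, 0.90, 0.80),
                 siteB = c(0.10, 0.15, 0.05))
control <- rbind(siteA = c(0.20, 0.25, 0.15),
                 siteB = c(0.80, 0.85, 0.90))

# Assumption: a site is "hyper" when its mean tumor beta-value exceeds the
# mean beta-value of the control samples, and "hypo" otherwise.
control_mean <- rowMeans(control)
site_type <- ifelse(rowMeans(tumor) > control_mean, "hyper", "hypo")

# First and third quartiles of the control samples, per site. These are the
# values that thresholds such as control_costraints would be compared against.
control_q1 <- apply(control, 1, quantile, probs = 0.25)
control_q3 <- apply(control, 1, quantile, probs = 0.75)

print(data.frame(type = site_type, q1 = control_q1, q3 = control_q3))
```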
A data.frame reporting probe names and type ("hyper" and "hypo") of informative sites.
## WARNING: The following code doesn't retrieve any informative site
## It just shows how to use the tool
auc_data <- get_AUC(tumor_toy_data, control_toy_data)
info_sites <- find_informative_sites(tumor_toy_data, control_toy_data, auc_data, illumina27k_hg19)