Allows the user to set and examine a variety of RnBeads global options. They affect the way in which the package computes and displays its results.
Option names as
Option name in the form of a
rnb.options() with no arguments returns a list with the current values of the options. To access the
value of a single option, one should use, e.g.,
rnb.getOption("filtering.greedycut"), rather than
rnb.options("filtering.greedycut") which is a list of length one. Also, only a limited set of options
is available (see below). Attempting to get or set the value of a non-existing option results in an error.
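The access patterns described above can be sketched as follows. This is a minimal illustration that assumes the RnBeads package is installed; the guard skips the calls otherwise. The variable names are chosen here for illustration only.

```r
# Runs only when RnBeads is available; otherwise nothing is executed.
if (requireNamespace("RnBeads", quietly = TRUE)) {
  # All options and their current values, returned as a list
  all.opts <- RnBeads::rnb.options()

  # Value of a single option
  single.value <- RnBeads::rnb.getOption("filtering.greedycut")

  # The same request through rnb.options() yields a list of length one
  single.as.list <- RnBeads::rnb.options("filtering.greedycut")
  length(single.as.list)
}
```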
rnb.getOption, the current value for
rnb.options(), a list of all
RnBeads options and their current values. If option names are given, a list of all requested options
and their values. If option values are set,
rnb.options returns the previous values of the modified options.
character vector storing a short title of the analysis. If specified, this name appears in
the page title of every report.
Flag indicating if logging functionality is enabled in the automatic runs of the pipeline.
Email address associated with the analyses.
Genome assembly to be used. Currently only important for bisulfite mode. The supported genomes returned by the
Flag indicating if analysis on site or probe level is to be conducted. Note that the preprocessing module always operates on the site level (only), regardless of the value of this option.
Region types to carry out analysis on, in the form of a
NULL (default value)
signifies that all available region annotations (as returned by
rnb.region.types) are summarized
upon loading and normalization, and the other modules analyze all regions summarized in the dataset. If this
option is set to an empty vector, analysis on the region level is skipped.
Aggregation function to apply when calculating the methylation value for a region based on the values of the
CpGs associated with that region. Accepted values for this function are
"coverage.weighted". The last method is
applicable only for sequencing-based methylation datasets. It computes the weighted average of the values of
the associated CpGs, whereby weights are calculated based on the coverages of the respective sites.
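As a rough illustration of coverage-based weighting, the base R sketch below computes a weighted average with weights proportional to read coverage. The numbers are made up, and the assumption that weights are directly proportional to coverage is mine; the exact weighting used by RnBeads may differ.

```r
# Hypothetical methylation values and read coverages for the CpGs of one region
meth <- c(0.90, 0.50, 0.10)
coverage <- c(30, 10, 10)

# Weighted average, using the coverages as weights (assumed proportional weighting)
region.weighted <- sum(meth * coverage) / sum(coverage)

# Unweighted mean of the same sites, for comparison
region.mean <- mean(meth)
```

In this toy example the well-covered first site pulls the weighted value above the plain mean.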
If a number larger than 1 is specified, RnBeads will subdivide each region specified in the
region.types option into subsegments containing on average
region.subsegments sites per
subsegment. This is done by clustering the sites within each region according to their genomic coordinates.
These subsegments are then used for subsequent analysis.
Use cautiously as this will significantly increase the runtime of the pipeline.
The region types to which subsegmentation will be applied. Defaults to
region.types when set to
Column name or index in the table of phenotypic information to be used when plotting sample identifiers. If
this option is
NULL, or if it points to a non-existing column or to a column that does not list IDs, the default
identifiers are used. These are the row names of the sample phenotype table (and the column names of the beta
character vector of length 2 or more giving the color scheme for displaying categorical trait values in
plots. RnBeads displays missing values (NA) in grey; it is therefore not recommended to include shades
of grey in this vector. The default value of this option is the result of the
"Dark2" palette of
RColorBrewer with 8 values.
character vector of length 2 or more giving the color scheme for displaying continuous (gradient) trait
values in plots. RnBeads interpolates between the color values.
Minimum number of samples that each subgroup defined by a trait must contain in order for this trait to be considered in the
methylation profiles and in the differential methylation modules. This must be a positive
Maximum number of subgroups defined by a trait, in order for this trait to be considered in the methylation
profiles and in the differential methylation modules. This must be an
integer of value
more. As a special case, a value of
NULL (default) indicates that the maximum number of subgroups is
the number of samples in an analysis minus
1, i.e. traits with all unique values will be ignored.
Column name in the sample annotation table that indicates sample replicates. Replicates are expected to
contain the same value. Samples without replicates should contain unique or missing values. If this option is
NULL (default), replicate handling is disabled.
Flag indicating whether large output files should be compressed (in
Flag controlling whether a data import report should be generated. This option should be set to
FALSE only when
the provided data source is an object of type RnBSet, i.e. the data has been previously loaded
Type of data assumed to be supplied by default (Infinium 450k microarray).
For sequencing data, set this to
bs.bed.dir and save the options. See
rnb.execute.import for further details.
Separator used in the plain text data tables. See
rnb.execute.import for details.
Preset for bed-like formats.
"BisSNP", "Encode", "EPP", "bismarkCytosine" and "bismarkCov" are currently
supported. See the RnBeads vignette and the FAQ section on the website for more details.
Column indices in the supplied BED file with DNA methylation information.
These are represented by a named
integer vector, in which the names are:
"t". These names
correspond to the columns for chromosome, start position, end position, strand, methylation degree, read
coverage, number of reads with C and number of reads with T, respectively. Methylation degree and/or read
coverage, if not specified, are inferred from the values in the columns
Further details and examples of BED files can be found in Section 4.1 of the RnBeads vignette.
Singleton of type
integer specifying the frame shift between
the coordinates in the input BED file and the corresponding genomic reference. This (
is added to the coordinates from the BED file before matching the methylation sites to the annotated ones.
Perform a small loading test by reading 1000 rows from each BED file, after which normal loading is performed. See the RnBeads vignette and the FAQ section on the website for more details.
Perform only the small loading test, and skip loading all the data.
Skip the check of the loaded RnBSet object after loading. This helps keep the memory profile down.
Flag indicating if gender prediction is to be performed. Gender prediction is only supported for Infinium 450k datasets with signal intensity information. The value of this option is ignored for other datasets.
Flag controlling whether the data should be preprocessed (that is, whether quality filtering and, in the case of Infinium microarray data, normalization should be applied).
Flag controlling whether the data should be normalized and a normalization report generated. Setting this to
NULL (default) enables this step for analysis on Infinium datasets, but disables it in case of
sequencing-based datasets. Note that normalization is never applied in sequencing datasets; if this flag is
enabled, it will lead to a warning message.
Normalization method to be applied, or
"none". Multiple normalization methods are supported:
Illumina scaling normalization;
"swan" (default) - SWAN-normalization by Gordon et al., as implemented
beta-mixture quantile normalization method by Teschendorff et al.; as well as
"wm.swan" - all normalization methods implemented in the
wateRmelon package. When
setting this option to a specific algorithm, make sure its dedicated package is installed.
A character singleton specifying which background subtraction is to be performed during normalization.
The methylumi background
correction methods are supported. The following values are accepted:
Flag indicating if the report on normalization should include plots of shifts (degrees of beta value correction).
Flag indicating if the quality control module is to be executed.
[Infinium 450k] Add boxplots for all types of quality control probes to the quality control report. The boxplots give signal distribution across samples.
[Infinium 450k] Add barplots for each quality control probe to the quality control report.
[Infinium 450k] Add boxplot of negative control probe intensities for all samples.
[Infinium 450k] Flag indicating if intersample distances based on the beta values of SNP probes are to be displayed. This can help identify or validate genetically similar or identical samples.
[Infinium 450k] Add boxplot of beta-values for the SNP-calling probes. Can be useful for detection of sample mix-ups.
[Infinium 450k] Add bar plots of beta-values for the SNP-calling probes in each profiled sample.
[Infinium 450k] Maximal number of samples included in a single quality control barplot and negative control boxplot.
[Bisulfite sequencing] Add genome-wide sequencing coverage plot for each sample.
[Bisulfite sequencing] Values for coverage cutoffs to be shown in a coverage thresholds plot. This must be an
vector of positive values. Setting this to an empty vector disables the coverage thresholds plot.
[Bisulfite sequencing] Add sequencing coverage histogram for each sample.
[Bisulfite sequencing] Add sequencing coverage violin plot for each sample.
Name of a file specifying site or probe identifiers to be
whitelisted. Every line in this file must contain exactly one identifier. The whitelisted sites are always
retained in the analysed datasets, even if filtering criteria or blacklisting requires their removal.
For Infinium studies, the file must contain Infinium probe identifiers. For bisulfite sequencing studies,
the file must contain CpG positions in the form "chromosome:coordinate" (1-based coordinate of the cytosine),
e.g. chr2:48607772. Unknown identifiers are silently ignored.
Name of a file specifying site or probe identifiers to be
blacklisted. Every line in this file must contain exactly one identifier. The blacklisted sites are removed
from the analysed datasets as a first step in the preprocessing module. For Infinium studies, the file must
contain Infinium probe identifiers. For bisulfite sequencing studies, the file must contain CpG positions in
the form "chromosome:coordinate" (1-based coordinate of the cytosine), e.g.
Unknown identifiers are silently ignored.
character vector giving the list of probe context types to be removed as a filtering step. Possible
context values are
"Other". Probes in the second context measure CpG methylation; the last context denotes probes
dedicated to SNP detection. Setting this option to
NULL or an empty vector effectively disables the
step of context-specific probe removal.
Removal of sites or probes based on overlap with SNPs. The accepted values for this option are:
no SNP-based filtering;
filter out a probe when the last 3 bases in its target sequence overlap with SNP;
filter out a probe when the last 5 bases in its target sequence overlap with SNP;
filter out a CpG site or probe when any base in its target sequence overlaps with SNP.
Bisulfite sequencing datasets operate on sites instead of probes, therefore, the values
"5" are treated as
Flag indicating if the removal of potentially cross-reactive probes should be performed as a filtering step in the preprocessing module. A probe whose sequence maps to multiple genomic locations (allowing up to 3 mismatches) is considered cross-reactive.
Flag indicating if the Greedycut procedure should be run as a filtering step in the preprocessing module.
Threshold for the detection p-value to be used in Greedycut. This is a value between 0 and 1. This option has
effect only when
Indicator of what the behaviour of Greedycut should be in case of ties between the scores of rows (probes) and
columns (samples). The value of this option must be one of
last one indicating random choice. This option has effect only when
Flag indicating if the removal of probes located on sex chromosomes should be performed as a filtering step.
Number between 0 and 1, indicating the fraction of allowed missing values per site. A site is filtered out
when its methylation beta values are
NAs in a larger fraction of samples than this threshold. Setting
this option to 1 (default) retains all sites, and thus effectively disables the missing value filtering step
in the preprocessing module. If this is set to 0, all sites that contain missing values are filtered out.
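The threshold logic described above can be sketched in base R. The matrix and its site and sample names are invented for illustration, and the code is only a sketch of the stated rule (a site is removed when its fraction of NA values exceeds the threshold), not the package's own implementation.

```r
# Hypothetical beta value matrix: rows are sites, columns are samples
betas <- matrix(c(0.1,  NA, 0.3, 0.4,
                   NA,  NA,  NA, 0.8,
                  0.5, 0.6, 0.7, 0.9),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("site1", "site2", "site3"), paste0("S", 1:4)))

# Allowed fraction of missing values per site
threshold <- 0.5

# Fraction of samples with a missing value, per site
na.fraction <- rowMeans(is.na(betas))

# A site is retained unless its NA fraction is larger than the threshold
keep <- na.fraction <= threshold
betas.filtered <- betas[keep, , drop = FALSE]
```

With a threshold of 1 every site in this example is retained; with a threshold of 0 only site3, which has no missing values, remains.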
Threshold for minimal acceptable coverage. This must be a non-negative value. Setting this option to 0 (zero) effectively treats any known or unknown read coverage as sufficiently deep.
Flag indicating whether methylation values for low coverage sites should be set to missing. In combination
with filtering.missing.value.quantile, this can lead to the removal of sites.
(Bisulfite sequencing mode) Flag indicating whether methylation sites with a coverage of more than 10 times the 95th percentile of coverage should be removed.
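A small base R sketch of this cutoff follows. The coverage values are invented, and the text does not state whether the percentile is computed per sample or across the whole dataset; the sketch simply uses one vector of coverages as an assumption.

```r
# Hypothetical read coverages for 100 sites
coverage <- c(rep(10, 99), 5000)

# Cutoff: 10 times the 95th percentile of coverage
cutoff <- 10 * quantile(coverage, probs = 0.95, names = FALSE)

# Sites whose coverage exceeds the cutoff are treated as high-coverage outliers
is.outlier <- coverage > cutoff
outlier.index <- which(is.outlier)
```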
Threshold used to filter probes based on the variability of their assigned beta values. This must be a real
value between 0 and 1, denoting minimum standard deviation of the beta values in one site across all samples.
Any sites that have standard deviation lower than this threshold are filtered out. Note that sites with
undetermined variability, that is, sites for which there are no measurements (all beta values are NA),
are retained. Setting this option to 0 (default) disables filtering based on methylation variability.
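The variability rule can be illustrated with base R as follows. The site names and values are hypothetical, and the code is a sketch of the rule as stated (remove sites whose standard deviation is below the threshold, keep sites with undetermined variability), not the package's internal code.

```r
# Hypothetical beta values: rows are sites, columns are samples
betas <- rbind(siteA = c(0.50, 0.51, 0.49, 0.50),  # almost constant
               siteB = c(0.10, 0.90, 0.20, 0.80),  # highly variable
               siteC = c(NA, NA, NA, NA))          # no measurements

# Minimum standard deviation a site must show across samples
min.sd <- 0.05

# Standard deviation per site; sites without measurements yield NA
site.sd <- apply(betas, 1, sd, na.rm = TRUE)

# Keep sites with undetermined variability or with sufficient variability
keep <- is.na(site.sd) | site.sd >= min.sd
```

Here siteA is filtered out, siteB passes the threshold, and siteC is retained because its variability cannot be determined.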
Flag indicating if the covariate inference analysis module is to be executed.
Column names in the sample annotation table for which surrogate variable analysis (SVA) should be conducted. An empty vector (default) means that SVA is skipped.
Column name in the sample annotation table giving the assignment of samples to reference methylomes.
The target samples should have
NA values in this column.
Number of most variable CpGs which are tested for association with the reference cell types. Setting this
option to NULL forces the algorithm to use all available sites in the dataset, and may greatly
increase the running time for cell type composition estimation.
Number of top cell type markers used for determining cell type contributions to the target DNA methylation profiles using the projection method of Houseman et al.
Name of the method to be used for estimating the number of surrogate variables.
This must be either 'leek' or 'be'. See the
sva function for details.
Flag indicating if the exploratory analysis module is to be executed.
Traits, given as column names or indices in the sample annotation table, to be used in the exploratory
analysis. These traits are used in multiple steps in the module: they are visualized using point types and
colors in the dimension reduction plots; tested for strong correlations and associations with principal
components in a methylation space; used to define groups when plotting beta distributions and/or inter-sample
methylation variability. The default value of this parameter -
NULL - indicates that columns should be
automatically selected; see
rnb.sample.groups for how this is done.
Number of most variable probes, sites or regions to select prior to performing dimension reduction techniques and tests for associations. Preselection can significantly reduce the running time and memory usage in the exploratory analysis module. Setting this number to zero (default) disables preselection.
Maximum number of principal components to be tested for associations with other factors, such as control probe
states and sample traits. This must be an
integer value between
10. Setting this option to
0 disables such tests.
Significance threshold for a p-value resulting from applying a test for association. This is a value between 0 and 1.
Number of permutations in tests performed to check for associations between traits, and between control probe
intensities and coordinates in the principal component space. This must be a non-negative
Setting this option to
0 disables permutation tests.
[Infinium 450k] Flag indicating if quality-associated batch effects should be studied. This amounts to testing for
associations between intensities of quality control probes and principal components. This option has effect
only when exploratory.principal.components is non-zero.
Flag indicating whether beta value distributions for sample groups and probe or site categories should be computed.
Flag indicating if methylation variability in sample groups should be computed as part of the exploratory analysis module.
Flag indicating if the inter-sample methylation variability step in the exploratory analysis module should
include deviation plots. Deviation plots show intra-group methylation variability at the covered sites and
regions. Setting this option to
NULL (default) enables deviation plots on Infinium datasets, but
disables them in case of sequencing-based datasets, because their generation can be very computationally
intensive. This option has effect only when
Which sites should be used by clustering algorithms in the exploratory analysis module.
RnBeads performs several algorithms that cluster the samples in the dataset. If this option is set to
"all" (default), clustering is performed using all sites; a value of
"top" indicates that only
the most variable sites are used (see the option
Number of most variable sites to use when visualizing heatmaps. This must be a non-empty
containing positive values. This option is ignored when
Flag indicating if the generated methylation value heatmaps in the clustering section of the exploratory
analysis module should be saved as PDF files. Enabling this option is not recommended for large values of
exploratory.clustering.top.sites (more than 200), because heatmaps might generate very large PDF files.
Region types for generating regional methylation profiles. If
NULL (default), regional methylation
profiles are created only for the region types that are available for the targeted assembly and summarized in
the dataset of interest. Setting this option to an empty vector disables the region profiles step in the
exploratory analysis module.
A list of gene symbols to be used for custom locus profiling. Locus views will be generated for these genes.
Path to a bed file containing custom genomic regions. Locus views will be generated for these regions.
Flag indicating if the differential methylation module is to be executed.
Method to be used for calculating p-values on the site level. Currently supported options are "ttest" for a (paired)
t-test and "limma" for a linear modeling approach implemented in the
limma package for differential expression
Number of permutation tests performed to compute the p-value of rank permutation tests in the differential
methylation analysis. This must be a non-negative
integer. Setting this option to
0 disables permutation tests for rank permutations. Note that p-values for differential methylation are
computed and also considered for the ranking in any case.
Column names or indices in the sample annotation table to be used for group definition in the
differential methylation analysis. The default value -
NULL - indicates that columns should be
automatically selected. See
rnb.sample.groups for how this is done. By default,
the comparisons are done in a one vs. all manner if there are multiple
groups defined in a column.
Column names or indices in the sample annotation table to be used for group definition in the
differential methylation analysis in which all pairwise comparisons between groups should be conducted (the default
is one vs all if multiple groups are specified in a column).
Caution: for large numbers of sample groups this can lead to combinatorial explosion and thus to huge runtimes.
A value of
NULL (default) indicates that no column is explicitly selected for all pairwise comparisons.
If specified, the selected columns must be a subset of the columns that will be selected according to the
Column names or indices in the table of phenotypic information to be used for confounder adjustment in the
differential methylation analysis. Currently this is only supported for
A named vector that specifies, for each column for which paired analysis
should be performed (say columnA), the name or index of another column (say columnB) in which equal values indicate
paired samples. In this vector, columnB is the value and columnA is its name.
For more details see
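The shape of such a named vector can be sketched in base R. The column names "treatment" and "patient" are hypothetical and stand in for columnA and columnB, respectively.

```r
# Name = column to be analysed in a paired fashion (columnA);
# value = column whose equal entries identify paired samples (columnB).
# Both column names are hypothetical.
pairing <- c(treatment = "patient")

names(pairing)   # "treatment": the comparison column
unname(pairing)  # "patient": the column that defines the pairs
```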
Flag indicating if the differential methylation analysis should account for Surrogate Variables. If
TRUE, RnBeads looks for overlaps between the
inference.targets.sva options and includes the surrogate variables as confounding factors only for these
columns. In other words, it will only have an effect if the corresponding inference option (see the
inference.targets.sva option for details) is enabled.
Currently this is only supported for
Flag indicating whether the differential methylation analysis should account for cell type using the reference-based Houseman method.
It will only have an effect if the corresponding inference option is enabled (see
option for details). Currently this is only supported for
Flag indicating whether Gene Ontology (GO)-enrichment analysis is to be conducted on the identified differentially methylated regions.
Flag indicating whether a section corresponding to differential site methylation should be added to the report.
Has no effect on the actual analysis, just the report. To disable differential site methylation analysis entirely
Flag indicating whether the data should be exported to bed files.
character vector specifying which data types should be exported to
Track hub directories. Possible values
in the vector are
"bigWig". When this option is set to
NULL, track hub
export is disabled. Note that if
"bigBed" is contained in this option, bed files are created
Flag indicating whether methylation value matrices are to be exported to comma-separated value (CSV) files.
Flag indicating whether methylation values and differential methylation analysis settings should be exported to a format compatible with FaST-LMM-EWASher, a tool for adjusting for cell-type compositions. See Zou, J., et al., Nature Methods, 2014 for further details on the tool.
character vector of sites and region names to be exported. If
NULL, no region methylation values
Flag indicating whether big tables should be stored on disk rather than in main memory in order to keep memory requirements down. May slow down analysis!
Flag indicating if the active R session should be terminated when an error is encountered during execution.
When plotting methylation value distributions, this threshold specifies the number of observations drawn per
group. Distributions are estimated and plotted based on these random subsamples. This approach can
significantly reduce the memory requirements of the preprocessing and exploratory analysis modules, where
methylation value distributions are plotted. Setting this to
0 disables subsampling. More information
is presented in the Details section of
Flag indicating whether memory management should be actively enforced in some places of the code in order to achieve a better memory profile, i.e. garbage collection and variable removal are conducted actively. May slow down analysis.
Flag indicating whether disk-dumped big matrices (see
disk.dump.big.matrices option) should actively
be deleted when RnBSets are modified. You should switch it to
TRUE and the amount of hard drive space is also limited.