knitr::opts_chunk$set( warning=FALSE, message=FALSE )
biscuiteer
is package to process output from
biscuit into
bsseq objects. It includes a number
of features, such as VCF header parsing, shrunken M-value calculations (which
can be used for compartment inference), and age inference. However, the task of
locus- and region-level differential methylation inference is delegated to
other packages (such as dmrseq
).
From Bioconductor,
if (!requireNamespace("BiocManager", quietly=TRUE)) install.packages("BiocManager") BiocManager::install("biscuiteer")
A development version is available on GitHub and can be installed via:
if (!requireNamespace("BiocManager", quietly=TRUE)) install.packages("BiocManager") BiocManager::install("trichelab/biscuiteerData") BiocManager::install("trichelab/biscuiteer")
biscuiteer
can load either headered or header-free BED files produced from
biscuit vcf2bed
or biscuit mergecg
. In either case, a VCF file is needed
when loading biscuit
output. For practical purposes, only the VCF header is
for biscuiteer
. However, it is encouraged that the user keep the entire VCF,
as biscuit
can be used to call SNVs and allows for structural variant
detection in a similar manner to typical whole-genome sequencing tools.
Furthermore, biscuit
records the version of the software and the calling
arguments used during processing the output VCF, which allows for better
reproducibility.
NOTE: Both the input BED and VCF files must be tabix'ed before being input to
biscuiteer
. This can be done by running bgzip biscuit_output.xxx
followed by
tabix -p xxx biscuit_output.xxx.gz
, where xxx
is either bed
or vcf
.
Data can be loaded using the readBiscuit
function in biscuiteer
:
library(biscuiteer) orig_bed <- system.file("extdata", "MCF7_Cunha_chr11p15.bed.gz", package="biscuiteer") orig_vcf <- system.file("extdata", "MCF7_Cunha_header_only.vcf.gz", package="biscuiteer") bisc <- readBiscuit(BEDfile = orig_bed, VCFfile = orig_vcf, merged = FALSE)
Metadata from the biscuit
output can be viewed via:
biscuitMetadata(bisc)
In the instance where you have two separate BED files that you would like to
analyze in a single bsseq object, you can combine the files using unionize
,
which is a wrapper around the BiocGenerics function, combine
.
shuf_bed <- system.file("extdata", "MCF7_Cunha_chr11p15_shuffled.bed.gz", package="biscuiteer") shuf_vcf <- system.file("extdata", "MCF7_Cunha_shuffled_header_only.vcf.gz", package="biscuiteer") bisc2 <- readBiscuit(BEDfile = shuf_bed, VCFfile = shuf_vcf, merged = FALSE) comb <- unionize(bisc, bisc2)
The epiBED file format provides an easy way to analyze read- or fragment-level
methylation and genetic information at the same time. readEpibed
provides
functionality for parsing the RLE strings found in the epiBED file into a
GRanges object for analysis in R.
NOTE: The input file must be bgzip'ed and tabix'ed.
epibed.nome <- system.file("extdata", "hct116.nome.epibed.gz", package="biscuiteer") epibed.bsseq <- system.file("extdata", "hct116.bsseq.epibed.gz", package="biscuiteer") epibed.nome.gr <- readEpibed(epibed = epibed.nome, genome = "hg19", chr = "chr1") epibed.bsseq.gr <- readEpibed(epibed = epibed.bsseq, genome = "hg19", chr = "chr1")
A handful of analysis paths are available in biscuiteer
, including A/B
compartment inference, age estimation from WGBS data, hypermethylation of
Polycomb Repressor Complex (PRC) binding sites, and hypomethylation of
CpG-poor "partially methylated domains" (PMDs).
When performing A/B compartment inference, the goal is to have something that
has roughly gaussian error. getLogitFracMeth
uses Dirichlet smoothing to turn
raw measurements into lightly moderated, logit-transformed methylated-fraction
estimates, which can the be used as inputs to
compartmap
reg <- GRanges(seqnames = rep("chr11",5), strand = rep("*",5), ranges = IRanges(start = c(0,2.8e6,1.17e7,1.38e7,1.69e7), end= c(2.8e6,1.17e7,1.38e7,1.69e7,2.2e7)) ) frac <- getLogitFracMeth(bisc, minSamp = 1, r = reg) frac
biscuiteer
has the functionality to guess the age of the sample(s) provided
using the Horvath-style "clock" models (see
Horvath, 2013
for more information).
NOTE: The prediction accuracy of this function is entirely dependent on the parameters set by the user. As such, the defaults (as shown in the example below) should only be used as a starting point for exploration by the user.
NOTE: Please cite the appropriate papers for the epigenetic "clock" chosen:
horvath
or horvathshrunk
hannum
skinandblood
ages <- WGBSage(comb, "horvath") ages
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.