process_bismark_beatson_wrap is a wrapper method for processing HTS data and returning the methylation data for promoter regions together with the corresponding gene expression data for those promoter regions. Note that the BS-Seq data should be in the Bismark format and the RNA-Seq data in the Beatson bed format.
process_bismark_beatson_wrap(bs_files, rna_files, chrom_size_file = NULL,
  chr_discarded = NULL, upstream = -7000, downstream = 7000,
  min_bs_cov = 4, max_bs_cov = 1000, cpg_density = 10, sd_thresh = 0.1,
  ignore_strand = TRUE, gene_log2_transf = TRUE, gene_outl_thresh = TRUE,
  gex_outlier = 300, fmin = -1, fmax = 1)
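A call following the usage above might look like the sketch below. The file paths are hypothetical placeholders, and the package name processHTS is an assumption inferred from the class of the returned object; the call is guarded so that nothing runs unless the package and the input files are actually available.

```r
# Hypothetical input paths; replace with real Bismark and Beatson bed files.
bs_files  <- c("rep1_bismark.bed", "rep2_bismark.bed")  # BS-Seq replicates
rna_files <- "rna_beatson.bed"                          # single RNA-Seq file

# Assumption: the function is exported by a package named "processHTS".
can_run <- requireNamespace("processHTS", quietly = TRUE) &&
  all(file.exists(c(bs_files, rna_files)))

if (can_run) {
  hts <- processHTS::process_bismark_beatson_wrap(
    bs_files   = bs_files,
    rna_files  = rna_files,
    upstream   = -7000,
    downstream = 7000,
    min_bs_cov = 4,
    max_bs_cov = 1000
  )
  # Inspect the top-level components of the returned object.
  str(hts, max.level = 1)
}
```

The remaining arguments are left at the defaults listed in the usage.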
bs_files: Filename (or vector of filenames if there are replicates) of the BS-Seq '.bed' formatted data to read values from.
rna_files: Filename of the RNA-Seq '.bed' formatted data to read values from. This version does not support pooling RNA-Seq replicates.
chrom_size_file: Optional filename containing genome chromosome sizes.
chr_discarded: A vector with chromosome names to be discarded.
upstream: Integer defining the length of bp upstream of TSS for creating the promoter region.
downstream: Integer defining the length of bp downstream of TSS for creating the promoter region.
min_bs_cov: The minimum number of reads mapping to each CpG site. CpGs with fewer reads will be considered as noise and will be discarded.
max_bs_cov: The maximum number of reads mapping to each CpG site. CpGs with more reads will be considered as noise and will be discarded.
cpg_density: Optional integer defining the minimum number of CpGs that have to be in a methylated region. Regions with fewer than n CpGs are discarded.
sd_thresh: Optional numeric defining the minimum standard deviation of the methylation change in a region. This is used to filter regions with no methylation change.
ignore_strand: Logical, whether or not to ignore strand information.
gene_log2_transf: Logical, whether or not to log2 transform the gene expression data.
gene_outl_thresh: Logical, whether or not to remove outlier gene expression data.
gex_outlier: Numeric, denoting the threshold above which the gene expression data (before the log2 transformation) are considered as noise.
fmin: Optional minimum range value for region location scaling. Under this version, this parameter should be left at its default value.
fmax: Optional maximum range value for region location scaling. Under this version, this parameter should be left at its default value.
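To make the coverage and expression thresholds concrete, here is a minimal base R sketch on made-up data. It is not the package's implementation: the data frame layout and variable names are hypothetical, and the sketch assumes a plain log2 with no pseudocount, which the text above does not specify.

```r
# Toy CpG table for one sample; all values are made up for illustration.
cpgs <- data.frame(
  pos        = c(100, 250, 900, 1500, 5200),
  total      = c(2, 10, 1500, 35, 8),   # reads covering each CpG
  methylated = c(1, 7, 900, 20, 2)      # methylated reads at each CpG
)

min_bs_cov <- 4
max_bs_cov <- 1000

# Keep CpGs whose coverage lies within [min_bs_cov, max_bs_cov].
keep_cpg <- cpgs$total >= min_bs_cov & cpgs$total <= max_bs_cov
cpgs_filtered <- cpgs[keep_cpg, ]

# Toy expression values; outliers are removed before the log2 transform.
gex <- c(geneA = 0.5, geneB = 12, geneC = 450, geneD = 80)
gex_outlier <- 300
gex_kept <- gex[gex <= gex_outlier]

# Assumption: plain log2, no pseudocount (zero counts would give -Inf).
gex_log2 <- log2(gex_kept)
```

With these toy values, the CpGs covered by 2 and 1500 reads are dropped, and geneC is removed as an expression outlier.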
A processHTS object which contains the following information:
methyl_region: A list containing methylation data, where each entry in the list is an L_{i} x 3 matrix and L_{i} denotes the number of CpGs found in region i. The columns contain the following information:
1st column: Contains the locations of CpGs relative to TSS. Note that the actual locations are scaled to the (fmin, fmax) region.
2nd column: Contains the total reads of each CpG in the corresponding location.
3rd column: Contains the methylated reads of each CpG in the corresponding location.
gex: A vector containing the corresponding gene expression levels for each entry of the methyl_region list.
prom_region: A GRanges object containing the corresponding annotated promoter regions for each entry of the methyl_region list. The GRanges object has one additional metadata column named tss, which stores the TSS of each promoter.
rna_data: A GRanges object containing the corresponding RNA-Seq data for each entry of the methyl_region list. The GRanges object has three additional metadata columns which are explained in read_rna_encode_caltech.
upstream: Integer defining the length of bp upstream of TSS.
downstream: Integer defining the length of bp downstream of TSS.
cpg_density: Integer defining the minimum number of CpGs that have to be in a methylated region. Regions with fewer than n CpGs are discarded.
sd_thresh: Numeric defining the minimum standard deviation of the methylation change in a region. This is used to filter regions with no methylation change.
fmin: Minimum range value for region location scaling.
fmax: Maximum range value for region location scaling.
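The layout of a single methyl_region entry can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's code: the helper scale_to_range is hypothetical and assumes a linear rescaling of the promoter window onto (fmin, fmax), and the region filter assumes that sd_thresh is compared against the standard deviation of per-CpG methylation levels. All numbers are made up.

```r
# Hypothetical helper: linearly map [xmin, xmax] onto [fmin, fmax].
scale_to_range <- function(x, xmin, xmax, fmin = -1, fmax = 1) {
  fmin + (x - xmin) / (xmax - xmin) * (fmax - fmin)
}

tss        <- 50000
upstream   <- -7000
downstream <- 7000

# Toy CpGs inside the promoter window (made-up values).
cpg_pos <- c(43000, 46500, 50000, 53500, 57000)
total   <- c(10, 25, 8, 40, 12)   # total reads per CpG
meth    <- c(9, 20, 4, 6, 1)      # methylated reads per CpG

# L_i x 3 matrix: scaled location, total reads, methylated reads.
rel_pos <- cpg_pos - tss
region  <- cbind(scale_to_range(rel_pos, upstream, downstream), total, meth)

# Region-level filters; a small cpg_density is used here for the toy data.
cpg_density <- 3
sd_thresh   <- 0.1
meth_level  <- region[, 3] / region[, 2]
keep_region <- nrow(region) >= cpg_density && sd(meth_level) >= sd_thresh
```

With the default window of 7000 bp on each side and the default (fmin, fmax) of (-1, 1), this rescaling reduces to dividing the TSS-relative position by 7000.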
C.A.Kapourani C.A.Kapourani@ed.ac.uk