View source: R/deprecated_process_functions.R
(DEPRECATED) process_haib_caltech_wrap is a wrapper
function that processes high-throughput sequencing (HTS) data and returns
the methylation data for promoter regions together with the corresponding
gene expression data for those regions. The BS-Seq data must be in the
ENCODE HAIB bed format and the RNA-Seq data in the ENCODE Caltech bed format.
process_haib_caltech_wrap(
  bs_files,
  rna_files,
  chrom_size_file = NULL,
  chr_discarded = NULL,
  upstream = -7000,
  downstream = 7000,
  min_bs_cov = 4,
  max_bs_cov = 1000,
  cpg_density = 10,
  sd_thresh = 0.1,
  ignore_strand = TRUE,
  gene_log2_transf = TRUE,
  gene_outl_thresh = TRUE,
  gex_outlier = 300,
  fmin = -1,
  fmax = 1
)
| bs_files | Filename (or vector of filenames if there are replicates) of the BS-Seq '.bed' formatted data to read values from. | 
| rna_files | Filename of the RNA-Seq '.bed' formatted data to read values from. Currently, this version does not support pooling RNA-Seq replicates. | 
| chrom_size_file | Optional filename containing genome chromosome sizes. | 
| chr_discarded | A vector with chromosome names to be discarded. | 
| upstream | Integer defining the length of bp upstream of TSS for creating the promoter region. | 
| downstream | Integer defining the length of bp downstream of TSS for creating the promoter region. | 
| min_bs_cov | The minimum number of reads mapping to each CpG site. CpGs with fewer reads are considered noise and are discarded. | 
| max_bs_cov | The maximum number of reads mapping to each CpG site. CpGs with more reads are considered noise and are discarded. | 
| cpg_density | Optional integer defining the minimum number of CpGs that have to be in a methylated region. Regions with fewer CpGs than this are discarded. | 
| sd_thresh | Optional numeric defining the minimum standard deviation of the methylation change in a region. This is used to filter regions with no methylation change. | 
| ignore_strand | Logical, whether or not to ignore strand information. | 
| gene_log2_transf | Logical, whether or not to log2 transform the gene expression data. | 
| gene_outl_thresh | Logical, whether or not to remove outlier gene expression data. | 
| gex_outlier | Numeric, denoting the threshold above which the gene expression data (before the log2 transformation) are considered noise. | 
| fmin | Optional minimum range value for region location scaling. Under this version, this parameter should be left to its default value. | 
| fmax | Optional maximum range value for region location scaling. Under this version, this parameter should be left to its default value. | 
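The filtering arguments above can be pictured with a small base-R sketch. This is not the BPRMeth implementation: the helper names, the column names (`total_reads`, `meth_reads`), the inclusive bounds, and the pseudo-count added before the log2 transformation are all assumptions made for illustration.

```r
# Hypothetical helper: keep CpGs whose read coverage lies within
# [min_bs_cov, max_bs_cov]. Inclusive bounds are an assumption.
filter_by_coverage <- function(cpgs, min_bs_cov = 4, max_bs_cov = 1000) {
  keep <- cpgs$total_reads >= min_bs_cov & cpgs$total_reads <= max_bs_cov
  cpgs[keep, , drop = FALSE]
}

# Hypothetical helper: drop expression values above gex_outlier, then
# optionally log2-transform them. The pseudo-count is an assumption that
# avoids log2(0); the package may handle zeros differently.
process_expression <- function(gex,
                               gene_log2_transf = TRUE,
                               gene_outl_thresh = TRUE,
                               gex_outlier = 300,
                               pseudo_count = 0.1) {
  if (gene_outl_thresh) {
    gex <- gex[gex <= gex_outlier]
  }
  if (gene_log2_transf) {
    gex <- log2(gex + pseudo_count)
  }
  gex
}

# Toy BS-Seq data: one row per CpG site
cpgs <- data.frame(
  position    = c(100, 150, 220, 300),
  total_reads = c(2, 10, 1500, 40),
  meth_reads  = c(1, 7, 900, 12)
)

# CpGs with 2 reads (too few) and 1500 reads (too many) are dropped
filtered_cpgs <- filter_by_coverage(cpgs)

# The value 450 exceeds gex_outlier = 300 and is dropped before log2
filtered_gex <- process_expression(c(0, 12.5, 450, 80))
```

In the real wrapper a discarded gene would also remove its promoter region, so that methylation and expression entries stay aligned; the sketch treats the two data types separately only to keep it short.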
A processHTS object which contains the following information:
methyl_region: A list containing methylation data, where each entry in the list is an L_{i} x 3 matrix and L_{i} denotes the number of CpGs found in region i. The columns contain the following information:
1st column: Locations of the CpGs relative to the TSS. Note that the actual locations are scaled to the (fmin, fmax) range.
2nd column: Total number of reads for each CpG at the corresponding location.
3rd column: Number of methylated reads for each CpG at the corresponding location.
gex: A vector containing the corresponding gene expression levels for each entry of the methyl_region list.
prom_region: A GRanges object containing corresponding annotated promoter regions for each entry of the methyl_region list. The GRanges object has one additional metadata column named tss, which stores the TSS of each promoter.
rna_data: A GRanges object containing the corresponding RNA-Seq data for each entry of the methyl_region list. The GRanges object has three additional metadata columns which are explained in read_rna_encode_caltech.
upstream: Integer defining the length of bp upstream of TSS.
downstream: Integer defining the length of bp downstream of TSS.
cpg_density: Integer defining the minimum number of CpGs that have to be in a methylated region. Regions with less than n CpGs are discarded.
sd_thresh: Numeric defining the minimum standard deviation of the methylation change in a region. This is used to filter regions with no methylation change.
fmin: Minimum range value for region location scaling.
fmax: Maximum range value for region location scaling.
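To make the layout of a methyl_region entry concrete, the following base-R sketch builds one L_{i} x 3 matrix by hand and applies the region-level filters. It is an illustration under stated assumptions, not the package code: `scale_locations` is a hypothetical helper that assumes plain min-max scaling to (fmin, fmax), the promoter is assumed to lie on the + strand, and the sd_thresh check is assumed to use the standard deviation of per-CpG methylation levels.

```r
# Hypothetical helper: min-max scale positions from [xmin, xmax]
# to [fmin, fmax]. Assumed form of the "region location scaling".
scale_locations <- function(x, xmin, xmax, fmin = -1, fmax = 1) {
  (x - xmin) / (xmax - xmin) * (fmax - fmin) + fmin
}

# Toy promoter on the + strand: TSS at 50000, region spanning
# 7000 bp on either side (the defaults of upstream and downstream)
tss        <- 50000
upstream   <- -7000
downstream <- 7000

# Toy CpG data inside the promoter region
cpg_pos     <- c(43000, 46500, 50000, 53500, 57000)
total_reads <- c(10, 20, 8, 15, 30)
meth_reads  <- c(9, 10, 2, 3, 30)

# Column 1: scaled location relative to TSS, column 2: total reads,
# column 3: methylated reads
rel_pos <- cpg_pos - tss
region <- unname(cbind(
  scale_locations(rel_pos, upstream, downstream),
  total_reads,
  meth_reads
))

# Region-level filters. cpg_density is lowered to 4 for this toy
# region; the documented default is 10.
cpg_density <- 4
sd_thresh   <- 0.1
meth_level  <- region[, 3] / region[, 2]
keep_region <- nrow(region) >= cpg_density && sd(meth_level) >= sd_thresh
```

With the default cpg_density of 10, this five-CpG region would be discarded, which is why the toy threshold is lowered here.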
C.A.Kapourani C.A.Kapourani@ed.ac.uk
# Obtain the path to the files
rrbs_file <- system.file("extdata", "rrbs.bed", package = "BPRMeth")
rnaseq_file <- system.file("extdata", "rnaseq.bed", package = "BPRMeth")
proc_data <- process_haib_caltech_wrap(rrbs_file, rnaseq_file)