read_regions: Read genomic regions in BEDX+Y format
In rcavalcante/annotatr: Annotation of Genomic Regions to Genomic Annotations

Description

read_regions() reads genomic regions by calling rtracklayer::import(). Standard BED files with three to six columns (BED3 through BED6) are handled automatically. For files with additional non-standard columns (BED6+Y), use the extraCols argument to give the name and class of each extra column so that it is interpreted correctly.
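As a minimal sketch of the automatic BED3 to BED6 case, the snippet below writes a two-line BED6 file to a temporary path and reads it without extraCols. The file contents and region names are made up for illustration, and the call is only attempted when annotatr is installed. It assumes that rename_name and rename_score may be omitted, as their lack of a default in the usage suggests.

```r
# Write a tiny, made-up BED6 file (tab-separated) to a temporary path.
bed_lines = c(
  "chr1\t100\t200\tregionA\t10\t+",
  "chr1\t300\t450\tregionB\t25\t-"
)
bed_file = tempfile(fileext = ".bed")
writeLines(bed_lines, bed_file)

# Only attempt the import when annotatr is available.
if (requireNamespace("annotatr", quietly = TRUE)) {
  # No extraCols needed for BED3 through BED6. genome is left at its
  # default (NA), so no seqinfo lookup is attempted.
  gr = annotatr::read_regions(con = bed_file, format = "bed")
  print(gr)
}
```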

Usage

read_regions(
  con,
  genome = NA,
  format,
  extraCols = character(),
  rename_name,
  rename_score,
  ...
)

Arguments

`con`	A path, URL, connection or BEDFile object. See `rtracklayer::import()` documentation.
`genome`	From `rtracklayer::import()`: The identifier of a genome, or NA if unknown. Typically, this is a UCSC identifier like 'hg19'. An attempt will be made to derive the `seqinfo` on the return value using either an installed BSgenome package or UCSC, if network access is available.
`format`	From `rtracklayer::import()`: The format of the input file. If not missing, should be one of 'bed', 'bed15', 'bedGraph' or 'bedpe'. If missing and 'con' is a filename, the format is derived from the file extension. This argument is unnecessary when 'con' is a derivative of 'RTLFile'.
`extraCols`	From `rtracklayer::import()`: A character vector in the same form as 'colClasses' from 'read.table'. It should indicate the name and class of each extra/special column to read from the BED file. As BED does not encode column names, these are assumed to be the last columns in the file. This enables parsing of the various BEDX+Y formats.
`rename_name`	A string to rename the name column of the BED file. For example, if the name column actually contains a categorical variable.
`rename_score`	A string to rename the score column of the BED file. For example, if the score column represents a quantity about the data besides the score in the BED specification.
`...`	Parameters to pass onto the format-specific method of `rtracklayer::import()`.
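The form of extraCols can be illustrated with base R alone. The sketch below is an analogy, not a description of how rtracklayer parses files: it builds a named character vector in the colClasses style and uses read.table() to show that the extra columns are taken to be the last ones, after the six standard BED fields. The file contents and the column names diff_meth and status are hypothetical.

```r
# A made-up BED6+2 file: six standard columns plus two extra ones.
bed_lines = c(
  "chr1\t100\t200\tregionA\t10\t+\t0.5\thyper",
  "chr1\t300\t450\tregionB\t25\t-\t-0.2\thypo"
)
bed_file = tempfile(fileext = ".bed")
writeLines(bed_lines, bed_file)

# Named character vector: names are column names, values are column classes.
extraCols = c(diff_meth = "numeric", status = "character")

# Base-R analogy: the standard BED6 fields come first, the extra columns last.
standard = c(chrom = "character", start = "integer", end = "integer",
             name = "character", score = "numeric", strand = "character")
all_cols = c(standard, extraCols)
tab = read.table(bed_file, sep = "\t", col.names = names(all_cols),
                 colClasses = unname(all_cols), stringsAsFactors = FALSE)
str(tab)

# The same vector would be passed to read_regions(), if annotatr is installed.
if (requireNamespace("annotatr", quietly = TRUE)) {
  gr = annotatr::read_regions(con = bed_file, format = "bed",
                              extraCols = extraCols)
}
```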

Details

NOTE: The 4th and 5th columns of a BED file are labelled name and score on import, whatever they actually contain. If these columns have a particular meaning in your data (for example, a categorical status or a p-value), rename them with the rename_name and/or rename_score parameters.
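To see what the renaming changes, one can compare the metadata column names with and without the rename arguments. This is a sketch on a made-up BED6 file. It runs only when annotatr is installed and uses S4Vectors::mcols() to list the metadata columns of the returned GRanges. The replacement names status and pval are arbitrary.

```r
# A made-up BED6 file whose 4th column is a category and 5th a p-value.
bed_lines = c(
  "chr1\t100\t200\thyper\t0.01\t+",
  "chr1\t300\t450\thypo\t0.20\t-"
)
bed_file = tempfile(fileext = ".bed")
writeLines(bed_lines, bed_file)

if (requireNamespace("annotatr", quietly = TRUE)) {
  # Default labels for the 4th and 5th columns.
  gr_default = annotatr::read_regions(con = bed_file, format = "bed")
  print(names(S4Vectors::mcols(gr_default)))

  # Same file, with labels that describe the data.
  gr_renamed = annotatr::read_regions(con = bed_file, format = "bed",
                                      rename_name = "status",
                                      rename_score = "pval")
  print(names(S4Vectors::mcols(gr_renamed)))
}
```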

Value

A GRanges object.

Examples


   library(annotatr)

   # Example of reading a BED6+3 file where the last 3 columns are non-standard
   file = system.file('extdata', 'IDH2mut_v_NBM_multi_data_chr9.txt.gz', package = 'annotatr')
   # Name and class of each of the 3 extra columns, in file order
   extraCols = c(diff_meth = 'numeric', mu0 = 'numeric', mu1 = 'numeric')
   # The name and score columns are renamed to reflect what they contain
   gr = read_regions(con = file, genome = 'hg19', extraCols = extraCols, format = 'bed',
       rename_name = 'DM_status', rename_score = 'pval')

rcavalcante/annotatr documentation built on Aug. 22, 2024, 7:33 a.m.