read_regions: Read genomic regions in BEDX+Y format

View source: R/read.R

read_regions (R Documentation)

Description

read_regions() reads genomic regions by calling rtracklayer::import(). Files in BED3 through BED6 format are handled automatically. For BED6+Y files, use the extraCols argument so that the Y extra columns are interpreted correctly.

Usage

read_regions(
  con,
  genome = NA,
  format,
  extraCols = character(),
  rename_name,
  rename_score,
  ...
)

Arguments

con

A path, URL, connection or BEDFile object. See rtracklayer::import() documentation.

genome

From rtracklayer::import(): The identifier of a genome, or NA if unknown. Typically, this is a UCSC identifier like 'hg19'. An attempt will be made to derive the seqinfo on the return value using either an installed BSgenome package or UCSC, if network access is available.

format

From rtracklayer::import(): The format of the input. If not missing, should be one of 'bed', 'bed15', 'bedGraph' or 'bedpe'. If missing and 'con' is a filename, the format is derived from the file extension. This argument is unnecessary when 'con' is a derivative of 'RTLFile'.

extraCols

From rtracklayer::import(): A character vector in the same form as 'colClasses' from 'read.table'. It should indicate the name and class of each extra/special column to read from the BED file. As BED does not encode column names, these are assumed to be the last columns in the file. This enables parsing of the various BEDX+Y formats.
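The form of extraCols can be illustrated with base R alone. The sketch below is a minimal illustration, not the package's implementation: the file contents, the column names effect and n_sites, and the use of read.table() are assumptions made only to show what a named character vector "in the same form as colClasses" looks like, and how it lines up with the trailing columns of a BED6+2 file.

```r
# Hypothetical BED6+2 content: six standard BED columns plus two extra columns.
bed_lines <- c(
  "chr9\t100\t200\tregionA\t0\t+\t0.25\t12",
  "chr9\t300\t450\tregionB\t0\t-\t0.75\t7"
)
bed_file <- tempfile(fileext = ".bed")
writeLines(bed_lines, bed_file)

# extraCols-style vector: names are the column names, values are the classes.
# The names 'effect' and 'n_sites' are made up for this illustration.
extra_cols <- c(effect = "numeric", n_sites = "integer")

# For illustration only: appending the vector to the six standard BED columns
# yields a colClasses vector that read.table() understands directly.
standard_cols <- c(chrom = "character", start = "integer", end = "integer",
                   name = "character", score = "numeric", strand = "character")
col_classes <- c(standard_cols, extra_cols)

bed_df <- read.table(bed_file, sep = "\t", header = FALSE,
                     col.names = names(col_classes), colClasses = col_classes)
str(bed_df)
```

With read_regions(), only the extra_cols part would be supplied as extraCols; the standard BED columns do not need to be described.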

rename_name

A string used to rename the name column of the BED file, for example when the name column actually contains a categorical variable.

rename_score

A string used to rename the score column of the BED file, for example when the score column holds a quantity other than the score defined in the BED specification.

...

Additional arguments passed on to the format-specific method of rtracklayer::import().

Details

NOTE: By default, the 4th and 5th columns of the BED file are named name and score, respectively. If these columns have a particular meaning in your data, rename them with the rename_name and/or rename_score parameters.
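The effect of renaming can be checked by comparing the metadata column names with and without the rename arguments. This is a sketch under stated assumptions: the two-line BED6 file is made up, the new names DM_status and pval are borrowed from the package example, and the calls only run if annotatr is installed.

```r
# Hypothetical BED6 content where column 4 is a category and column 5 a p-value.
bed_lines <- c(
  "chr9\t100\t200\thyper\t0.01\t+",
  "chr9\t300\t450\thypo\t0.20\t-"
)
bed_file <- tempfile(fileext = ".bed")
writeLines(bed_lines, bed_file)

# Guarded so the sketch does nothing when annotatr is not available.
if (requireNamespace("annotatr", quietly = TRUE)) {
  # Without the rename arguments, the metadata columns keep their default names.
  gr_default <- annotatr::read_regions(con = bed_file, format = "bed")
  print(colnames(GenomicRanges::mcols(gr_default)))

  # With the rename arguments, the 4th and 5th columns get the supplied names.
  gr_renamed <- annotatr::read_regions(con = bed_file, format = "bed",
                                       rename_name = "DM_status",
                                       rename_score = "pval")
  print(colnames(GenomicRanges::mcols(gr_renamed)))
}
```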

Value

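A GRanges object containing the imported regions. The name and score metadata columns are renamed if rename_name and/or rename_score are supplied.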

Examples


   # Example of reading a BED6+3 file where the last 3 columns are non-standard
   file = system.file('extdata', 'IDH2mut_v_NBM_multi_data_chr9.txt.gz', package = 'annotatr')
   extraCols = c(diff_meth = 'numeric', mu0 = 'numeric', mu1 = 'numeric')
   gr = read_regions(con = file, genome = 'hg19', extraCols = extraCols, format = 'bed',
       rename_name = 'DM_status', rename_score = 'pval')


rcavalcante/annotatr documentation built on Aug. 22, 2024, 7:33 a.m.