readRegionsFromBedFile: Read Genomic Regions from BED File

readRegionsFromBedFile    R Documentation

Read Genomic Regions from BED File

Description

Reads a BED file and returns the genomic regions as a GRanges object.

Usage

readRegionsFromBedFile(file, header=FALSE, sep="\t",
                       col.names=c("chrom", "chromStart",
                                   "chromEnd", "names"),
                       ignoreMcols=TRUE, seqInfo=NULL)

Arguments

file

the name of the file, text-mode connection, or URL to read data from

header,sep,col.names

arguments passed on to read.table

ignoreMcols

if TRUE (default), further columns are ignored; if FALSE, further columns are appended to the resulting GRanges object as metadata columns (see details below).

seqInfo

can be NULL (default) or an object of class Seqinfo (see details below).

Details

This function is a simple wrapper around the read.table function that reads from a BED file and returns the genomic regions as a GRanges object. How the file is split into columns is controlled by the arguments header, sep, and col.names, which are passed on to read.table unchanged. The choice of the col.names argument is crucial: the names must be given in the order in which the columns appear in the file, and a wrong col.names argument results in erroneous assignment of columns. The function readRegionsFromBedFile requires columns named “chrom”, “chromStart”, and “chromEnd” to be present in the object returned from read.table upon reading from the BED file. If a column named “strand” is contained in the BED file, this column is used as strand info in the resulting GRanges object.
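
The role of col.names can be illustrated with a minimal sketch. The six-column BED file, its regions, and the column name "score" are made up for illustration and are not part of the package. The read.table call only mirrors the documented wrapper behavior, and the podkat call is run only if the package is installed.

```r
## Write a small six-column BED file with made-up regions (illustration only)
bedLines <- c("chr1\t1000\t2000\tregionA\t0\t+",
              "chr1\t5000\t6500\tregionB\t0\t-")
bedFile <- tempfile(fileext=".bed")
writeLines(bedLines, bedFile)

## col.names must name every column in file order; "chrom", "chromStart",
## and "chromEnd" are required, "score" is a hypothetical extra column
colNames <- c("chrom", "chromStart", "chromEnd", "names", "score", "strand")

## The read.table step, with header, sep, and col.names passed as given
bed <- read.table(bedFile, header=FALSE, sep="\t", col.names=colNames)
stopifnot(all(c("chrom", "chromStart", "chromEnd") %in% colnames(bed)))
print(bed)

## The actual call, run only if podkat is installed
if (requireNamespace("podkat", quietly=TRUE))
    print(podkat::readRegionsFromBedFile(bedFile, col.names=colNames))
```

With the default four-element col.names, read.table would stop with an error on this file, because the file has more columns than names were supplied.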

If ignoreMcols=TRUE (default), further columns are ignored. If ignoreMcols=FALSE, all columns other than “chrom”, “chromStart”, “chromEnd”, “names”, “strand”, and “width” are appended to the resulting GRanges object as metadata columns.
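
As a sketch of this rule, the following applies the documented exclusion list to a hypothetical seven-column file. The file contents and the column names "score" and "type" are assumptions made for illustration, and the podkat call is run only if the package is installed.

```r
## Made-up BED file with two extra columns (illustration only)
bedLines <- c("chr1\t1000\t2000\tregionA\t0.5\t+\texon",
              "chr1\t5000\t6500\tregionB\t0.9\t-\tintron")
bedFile <- tempfile(fileext=".bed")
writeLines(bedLines, bedFile)

colNames <- c("chrom", "chromStart", "chromEnd", "names",
              "score", "strand", "type")

## Columns that, per the rule stated above, end up as metadata columns
reserved <- c("chrom", "chromStart", "chromEnd", "names", "strand", "width")
mcolNames <- setdiff(colNames, reserved)
print(mcolNames)  # "score" "type"

## The actual call, run only if podkat is installed
if (requireNamespace("podkat", quietly=TRUE))
    print(podkat::readRegionsFromBedFile(bedFile, col.names=colNames,
                                         ignoreMcols=FALSE))
```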

Note that the default for col.names changed in version 1.23.2 of the package. Starting with this version, the BED file is no longer assumed to contain strand and width information.

The seqInfo argument can be used to assign the right metadata, such as genome, chromosome names, and chromosome lengths, to the resulting GRanges object.
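
A Seqinfo object does not have to come from one of the data sets shipped with the package. The following sketch builds one by hand for a toy genome. The chromosome length, the genome name "toyGenome", and the regions are invented for illustration, and the code assumes that attaching podkat also makes Seqinfo() and seqinfo() available through its dependencies.

```r
## Made-up four-column BED file matching the default col.names
bedLines <- c("chr1\t1000\t2000\tregionA",
              "chr1\t5000\t6500\tregionB")
bedFile <- tempfile(fileext=".bed")
writeLines(bedLines, bedFile)

## Run only if podkat is installed
if (requireNamespace("podkat", quietly=TRUE)) {
    library(podkat)

    ## Hand-made sequence information for a hypothetical genome
    toyInfo <- Seqinfo(seqnames="chr1", seqlengths=1000000,
                       isCircular=FALSE, genome="toyGenome")

    gr <- readRegionsFromBedFile(bedFile, seqInfo=toyInfo)
    print(seqinfo(gr))
}
```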

Value

a GRanges object

Author(s)

Ulrich Bodenhofer bodenhofer@bioinf.jku.at

References

http://www.bioinf.jku.at/software/podkat

http://genome.ucsc.edu/FAQ/FAQformat.html#format1

See Also

read.table

Examples

## basic example (hg38 regions of HBA1 and HBA2)
bedFile <- system.file("examples/HBA.bed", package="podkat")
readRegionsFromBedFile(bedFile)

## example with enforcing seqinfo
data(hg38Unmasked)
readRegionsFromBedFile(bedFile, seqInfo=seqinfo(hg38Unmasked))

##
## example with regions targeted by Illumina TruSeq Exome Enrichment kit:
## download file "truseq_exome_targeted_regions.hg19.bed.chr.gz" from
## http://support.illumina.com/downloads/truseq_exome_targeted_regions_bed_file.ilmn
## (follow link "TruSeq Exome Targeted Regions BED file"; these regions
##  are based on hg19)
##
## Not run: 
readRegionsFromBedFile("truseq_exome_targeted_regions.hg19.bed.chr.gz")

data(hg19Unmasked)
readRegionsFromBedFile("truseq_exome_targeted_regions.hg19.bed.chr.gz",
                       seqInfo=seqinfo(hg19Unmasked))
## End(Not run)

UBod/podkat documentation built on Feb. 18, 2024, 2:32 a.m.