readGFF: Reads a file in GFF format

Description Usage Arguments Value Author(s) See Also Examples

View source: R/readGFF.R

Description

Reads a file in GFF format and creates a data frame or DataFrame object from it. This is a low-level function that should not be called by user code.

Usage

1
2
3
4
5
readGFF(filepath, version=0,
        columns=NULL, tags=NULL, filter=NULL, nrows=-1,
        raw_data=FALSE)

GFFcolnames(GFF1=FALSE)

Arguments

filepath

A single string containing the path or URL to the file to read. Alternatively can be a connection.

version

readGFF should do a pretty descent job at detecting the GFF version. Use this argument only if it doesn't or if you want to force it to parse and import the file as if its 9-th column was in a different format than what it really is (e.g. specify version=1 on a GTF or GFF3 file to interpret its 9-th column as the "group" column of a GFF1 file). Supported versions are 1, 2, and 3.

columns

The standard GFF columns to load. All of them are loaded by default.

tags

The tags to load. All of them are loaded by default.

filter
nrows

-1 or the maximum number of rows to read in (after filtering).

raw_data
GFF1

Value

A DataFrame with columns corresponding to those in the GFF.

Author(s)

H. Pages

See Also

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
## Standard GFF columns.
GFFcolnames()
GFFcolnames(GFF1=TRUE)  # "group" instead of "attributes"

tests_dir <- system.file("tests", package="rtracklayer")
test_gff3 <- file.path(tests_dir, "genes.gff3")

## Load everything.
df0 <- readGFF(test_gff3)
head(df0)

## Load some tags only (in addition to the standard GFF columns).
my_tags <- c("ID", "Parent", "Name", "Dbxref", "geneID")
df1 <- readGFF(test_gff3, tags=my_tags)
head(df1)

## Load no tags (in that case, the "attributes" standard column
## is loaded).
df2 <- readGFF(test_gff3, tags=character(0))
head(df2)

## Load some standard GFF columns only (in addition to all tags).
my_columns <- c("seqid", "start", "end", "strand", "type")
df3 <- readGFF(test_gff3, columns=my_columns)
df3
table(df3$seqid, df3$type)
makeGRangesFromDataFrame(df3, keep.extra.columns=TRUE)

## Combine use of 'columns' and 'tags' arguments.
readGFF(test_gff3, columns=my_columns, tags=c("ID", "Parent", "Name"))
readGFF(test_gff3, columns=my_columns, tags=character(0))

## Use the 'filter' argument to load only features of type "gene"
## or "mRNA" located on chr10.
my_filter <- list(type=c("gene", "mRNA"), seqid="chr10")
readGFF(test_gff3, filter=my_filter)
readGFF(test_gff3, columns=my_columns, tags=character(0), filter=my_filter)

Bioconductor-mirror/rtracklayer documentation built on July 28, 2017, 5:27 a.m.