load_gff_annotations: Extract annotation information from a gff file into a df

View source: R/annotation_gff.R

load_gff_annotations    R Documentation

Extract annotation information from a gff file into a df

Description

Try to make import.gff a little more robust. I acquire (hopefully) valid gff files from various sources: yeastgenome.org, microbesonline, tritrypdb, ucsc, ncbi. To my eyes they all look like reasonably good gff3 files, but some of them load only with import.gff2, others only with import.gff3, and so on, which is annoying. I also nearly always call as.data.frame() on whatever valid result I get from rtracklayer, so this function does that for me; I have another function which returns the IRanges etc. Because the import functions sometimes fail in unpredictable ways, this function wraps the import.gff/import.gff3/import.gff2 calls in try().
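The try()-based fallback described above can be illustrated with a minimal base R sketch. This is not the hpgltools implementation: import_with_fallback, strict_reader and lenient_reader are hypothetical names invented for this example, and the two readers are stand-ins so the sketch runs without rtracklayer installed.

```r
# Hypothetical helper (not part of hpgltools): call each importer in turn
# and return the first result that is not a try-error.
import_with_fallback <- function(path, importers) {
  for (name in names(importers)) {
    result <- try(importers[[name]](path), silent = TRUE)
    if (!inherits(result, "try-error")) {
      message("Import succeeded with ", name, ".")
      return(result)
    }
  }
  stop("None of the importers could read ", path)
}

# Stand-in importers (assumptions for illustration only).
# The first always fails; the second reads the tab-separated columns.
strict_reader <- function(path) stop("unsupported gff version")
lenient_reader <- function(path) {
  read.delim(path, header = FALSE, comment.char = "#",
             stringsAsFactors = FALSE)
}

# A one-feature gff3 file written to a temporary location.
demo_gff <- tempfile(fileext = ".gff")
writeLines(c("##gff-version 3",
             "chr1\tdemo\tgene\t1\t100\t.\t+\t.\tID=gene1"),
           demo_gff)

# The failing reader is skipped and the second one supplies the result.
annot <- import_with_fallback(
  demo_gff,
  list(strict = strict_reader, lenient = lenient_reader)
)
dim(annot)
```

With rtracklayer available, the importers list could presumably hold rtracklayer::import.gff3, rtracklayer::import.gff2 and rtracklayer::import.gff instead of the stand-ins; the order in which load_gff_annotations actually tries them is not stated here.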

Usage

load_gff_annotations(
  gff,
  type = NULL,
  id_col = "ID",
  ret_type = "data.frame",
  second_id_col = "locus_tag",
  try = NULL,
  row.names = NULL
)

Arguments

gff

Gff filename.

type

Subset the gff file for entries of a specific type.

id_col

Column in a successful import containing the IDs of interest.

ret_type

Type of object to return; the default is "data.frame".

second_id_col

Second column to check for IDs.

try

Give your own function call to use for importing.

row.names

Choose another column for setting the rownames of the data frame.

Value

Data frame of the annotation information found in the gff file.

See Also

[rtracklayer] [GenomicRanges]

Examples

 example_gff <- system.file("share", "gas.gff", package = "hpgldata")
 gas_gff_annot <- load_gff_annotations(example_gff)
 dim(gas_gff_annot)

elsayed-lab/hpgltools documentation built on May 9, 2024, 5:02 a.m.