annotate_probes: Annotate probes

annotate_probes    R Documentation

Annotate probes

Description

Get genome annotation for oligonucleotide sequences

Usage

annotate_probes(
  source = "data.frame",
  ann.data = NULL,
  gff.path = NULL,
  org.name,
  db = "refseq",
  refs = TRUE,
  probe.id.var,
  probe.start.var,
  probe.stop.var,
  file.annot = NULL,
  save.format = "txt",
  sep = ";",
  return = "add.resume",
  priority = c("CDS", "gene", "region"),
  data,
  data.probe.id.var,
  delete.downloads = FALSE,
  verbose = TRUE
)

Arguments

source

character; genome annotation source. Possible values are: "data.frame" (from a data frame), "giff" (from a GFF file), "load" (download from NCBI with the getGFF function)

ann.data

genome annotation data frame

gff.path

character; .gff file name and path

org.name

character; the scientific name of the organism of interest

db

character; database from which the genome shall be retrieved; possible values are "refseq", "genbank", "ensembl"

refs

logical; download genome if it isn't marked in the database as either a reference or a representative genome

probe.id.var

vector of probes' identification numbers

probe.start.var, probe.stop.var

integer; vectors of the probes' start and end coordinates

file.annot

character; resulting annotation file name and path

save.format

character; format of resulting annotation file; possible values are "txt", "csv"

sep

character; field separator string

return

character; returned object; possible values are: "annotation" (annotation data frame), "resume" (annotation attributes only), "add.resume" (user's data frame with added annotation attributes)

priority

character; vector of sequence ontology types, in order of preference, used to choose the annotation attribute returned in the resume

data, data.probe.id.var

user's data frame and the probes' identification variable in it (used if return = "add.resume")

delete.downloads

logical; delete files that were downloaded from NCBI

verbose

logical; show messages

Details

This function uses the genome annotation retrieval tools of the biomartr package. See getGFF for details. If retrieval is not available, a GFF file may be used instead.

This function creates an annotation ".txt" or ".csv" file. By default, the file is created in the working directory. Optionally, the function returns an annotation resume, i.e. the annotation attribute for a specified sequence ontology (SO) type. The priorities of the SO types are set by the user in the priority parameter. For example, if priority = c("CDS", "gene", "region"), the function returns the resume for the "CDS" SO; if there is none, for the "gene" SO, and so on. If several attributes match the priority, the first annotation attribute is returned. If none of the priority SOs is found, the first annotation attribute is returned.
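The priority rule described above can be illustrated with a small base R sketch. This is not the code used by annotate_probes; the helper pick_resume, the toy data frame and its column names ("type", "attributes") are hypothetical and serve only to show the selection logic for a single probe.

```r
# Hypothetical helper, NOT the disprose implementation: it only illustrates
# the priority rule for the annotations found for one probe.
# Assumed input: a data frame with one row per annotation and the
# (assumed) columns "type" (SO type) and "attributes".
pick_resume <- function(ann, priority = c("CDS", "gene", "region")) {
  for (so in priority) {
    hit <- which(ann$type == so)
    if (length(hit) > 0) {
      # several rows may match this SO type: take the first one
      return(ann$attributes[hit[1]])
    }
  }
  # none of the priority SO types is present: fall back to the first row
  ann$attributes[1]
}

# Toy annotations for one probe (made-up values)
toy <- data.frame(
  type = c("region", "gene", "CDS", "CDS"),
  attributes = c("ID=region1", "ID=gene1", "ID=cds1", "ID=cds2"),
  stringsAsFactors = FALSE
)

res_default <- pick_resume(toy)                              # "CDS" comes first
res_gene    <- pick_resume(toy, priority = c("exon", "gene")) # no "exon", so "gene"
res_none    <- pick_resume(toy, priority = "exon")            # no match, first row
```

In this sketch the order of the priority vector alone decides which attribute is reported; the order of rows only matters as a tie-breaker within one SO type and as the fallback.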

The number of annotations found is given in the "ann.n" column of the returned data.

Value

An annotation data frame, annotation attributes, or the user's data frame with added annotation attributes, depending on the return parameter. An annotation file is also created.

Author(s)

Elena N. Filatova

Examples

library(disprose)
# use a subdirectory so that the session's temporary directory itself is not deleted
path <- file.path(tempdir(), "annotate_example")
dir.create(path) # create temporary directory
data(ann.data) # load genome annotation data frame
annotation <- annotate_probes(source = "data.frame", ann.data = ann.data,
                probe.id.var = 1:5,
                probe.start.var = c(1, 100, 200, 300, 400),
                probe.stop.var = c(99, 199, 299, 399, 499),
                file.annot = file.path(path, "annotation.txt"), save.format = "txt",
                return = "resume")
file.remove(file.path(path, "annotation.txt")) # delete the annotation file
unlink(path, recursive = TRUE) # delete the temporary directory


disprose documentation built on March 19, 2022, 2:15 a.m.