fetch_uniprot: Fetch protein data from UniProt

View source: R/fetch_uniprot.R

fetch_uniprot R Documentation

Fetch protein data from UniProt

Description

Fetches protein metadata from UniProt.

Usage

fetch_uniprot(
  uniprot_ids,
  columns = c("protein_name", "length", "sequence", "gene_names", "xref_geneid",
    "xref_string", "go_f", "go_p", "go_c", "cc_interaction", "ft_act_site", "ft_binding",
    "cc_cofactor", "cc_catalytic_activity", "xref_pdb"),
  batchsize = 200,
  max_tries = 10,
  timeout = 20,
  show_progress = TRUE
)

Arguments

uniprot_ids

a character vector of UniProt accession numbers.

columns

a character vector of metadata columns that should be imported from UniProt (all possible columns are listed in the UniProt documentation of return fields). For cross-referenced databases, provide the database name with the prefix "xref_", e.g. "xref_pdb".

batchsize

a numeric value that specifies the number of proteins processed in a single query. The default and maximum value is 200.

max_tries

a numeric value that specifies how many times the function tries to download the data if an error occurs. Default is 10.

timeout

a numeric value that specifies the maximum request time per try. Default is 20 seconds.

show_progress

a logical value that determines if a progress bar will be shown. Default is TRUE.
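
The batchsize and max_tries arguments describe two generic mechanisms: splitting the input into chunks and repeating a failed request. The sketch below illustrates both in base R. It is not protti's actual implementation; the helpers split_into_batches() and try_request() and the placeholder IDs are hypothetical names introduced only for this illustration.

```r
# Hypothetical helper: split a vector of IDs into batches of at most
# `batchsize` elements (mirrors the idea behind the batchsize argument).
split_into_batches <- function(ids, batchsize = 200) {
  stopifnot(batchsize >= 1)
  if (length(ids) == 0) {
    return(list())
  }
  unname(split(ids, ceiling(seq_along(ids) / batchsize)))
}

# Hypothetical helper: call `request_fun` until it succeeds or until
# `max_tries` attempts have been used (mirrors the idea behind max_tries).
try_request <- function(request_fun, max_tries = 10) {
  stopifnot(max_tries >= 1)
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(request_fun(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
  }
  stop("Request failed after ", max_tries, " tries: ", conditionMessage(result))
}

# Placeholder IDs, not real UniProt accessions
ids <- sprintf("ID%03d", 1:450)
batches <- split_into_batches(ids, batchsize = 200)
lengths(batches)
```

With 450 IDs and a batch size of 200, the input is split into three batches, the last one being shorter than the others.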

Value

A data frame that contains all protein metadata specified in columns for the proteins provided. The input_id column contains the provided UniProt IDs. If a provided ID is not completely valid but contains a valid UniProt ID, the valid portion is still fetched and reported in the accession column, while the input_id column keeps the original ID as provided.
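
To make the relationship between input_id and accession concrete, the following base R sketch extracts a valid accession from a partially valid ID. It assumes that the first match of the accession pattern published in UniProt's accession number documentation is the one that is kept; whether fetch_uniprot() works exactly this way is an assumption, and extract_first_accession() is a hypothetical helper, not a protti function.

```r
# Accession pattern as published in UniProt's accession number documentation
uniprot_pattern <-
  "[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9]([A-Z][A-Z0-9]{2}[0-9]){1,2}"

# Hypothetical helper: return the first accession-like substring of each
# input, or NA if the input contains none.
extract_first_accession <- function(input_id) {
  hit <- regexpr(uniprot_pattern, input_id)
  out <- rep(NA_character_, length(input_id))
  out[hit > 0] <- regmatches(input_id, hit)
  out
}

input_id <- c("P02545", "P02545;P20700", "not_an_id")
data.frame(
  input_id = input_id,
  accession = extract_first_accession(input_id)
)
```

In this sketch, "P02545;P20700" is reduced to "P02545", while an input without any accession-like substring yields NA.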

Examples


fetch_uniprot(c("P36578", "O43324", "Q00796"))

# Not completely valid ID
fetch_uniprot(c("P02545", "P02545;P20700"))
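
A further call that restricts the output to a few columns might look like the sketch below. The column names are taken from the defaults in the Usage section. The guard around the call is an addition for this illustration, since the request needs the protti package and network access to UniProt.

```r
# Subset of the default column names listed under Usage
wanted_columns <- c("protein_name", "gene_names", "length", "xref_pdb")

# Only run interactively and when protti is installed, because the call
# sends a request to UniProt.
if (interactive() && requireNamespace("protti", quietly = TRUE)) {
  annotation <- protti::fetch_uniprot(
    uniprot_ids = c("P36578", "O43324", "Q00796"),
    columns = wanted_columns,
    show_progress = FALSE
  )
  print(annotation)
}
```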


protti documentation built on Oct. 22, 2024, 1:06 a.m.