mapUniProt: Mapping identifiers with the UniProt API

mapping-and-querying    R Documentation

Mapping identifiers with the UniProt API

Description

These functions are the main workhorses for mapping identifiers from one database to another. They use the current UniProt API (documented at https://www.uniprot.org/help/api).

Usage

mapUniProt(
    from = "UniProtKB_AC-ID",
    to = "UniRef90",
    columns = character(0L),
    query,
    verbose = FALSE,
    debug = FALSE,
    paginate = TRUE,
    pageSize = 500L
)
queryUniProt(
    query = character(0L),
    fields = c("accession", "id"),
    collapse = " OR ",
    n = Inf,
    pageSize = 25L
)
allToKeys(fromName = "UniProtKB_AC-ID")
allFromKeys()
returnFields()

Arguments

from

character(1) The identifier type to map from, by default "UniProtKB_AC-ID", short for UniProt accession identifiers. See a list of all 'from' type identifiers with allFromKeys.

to

character(1) The target mapping identifier, by default "UniRef90". It can be any one of those returned by allToKeys from the appropriate fromName argument.

columns, fields

character() Additional information to be retrieved from the UniProt service. See the full list of possible return fields at https://www.uniprot.org/help/return_fields. Example fields include "accession", "id", "gene_names", "xref_pdb", "xref_hgnc", and "sequence".

query

character() or named list() Typically a character vector of the target accession identifiers, but it can also be a named list based on the available query fields. See https://www.uniprot.org/help/query-fields for a list of query fields. A typical query might include only a character vector of UniProt accession identifiers, e.g., c("A0A0C5B5G6", "A0A1B0GTW7", "A0JNW5", "A0JP26", "A0PK11", "A1A4S6").

collapse

character(1) A string indicating either " OR " or " AND " for combining query clauses.

n

numeric(1) Maximum number of rows to return.

fromName

character(1) A from key to use as the basis of mapping to other keys, by default, "UniProtKB_AC-ID".

verbose

logical(1) Whether the operations should provide verbose updates (default FALSE).

debug

logical(1) Whether to display the URLs of the API endpoints, for advanced debugging (default FALSE).

paginate

logical(1) Whether to use the pagination API (i.e., the "results" endpoint rather than "stream") for the request responses. It is set to TRUE by default for performance.

pageSize

integer(1) Number of records per page. It corresponds to the size parameter in the API request.
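
The collapse argument can be pictured as a separator placed between the query clauses. The following is a minimal base R sketch under that assumption; combineClauses is a hypothetical helper, not part of UniProt.ws, and it may not reflect how queryUniProt builds its request internally.

```r
# Hypothetical helper (not part of UniProt.ws): join query clauses in the
# way the 'collapse' argument is described, using base R only.
combineClauses <- function(query, collapse = " OR ") {
    # accept only the two documented separators
    stopifnot(
        is.character(query),
        collapse %in% c(" OR ", " AND ")
    )
    paste(query, collapse = collapse)
}

clauses <- c("accession:A5YMT3", "organism_id:9606")
combined <- combineClauses(clauses, collapse = " AND ")
print(combined)
```

With " AND ", both clauses must match; with the default " OR ", a record matching either clause would be returned.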

Details

Note that mapUniProt is used internally by the select method but is also exported for API queries that need finer control. Use values from the name column of returnFields as the columns input to either mapUniProt or the select method.
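
One way to follow this advice is to check the requested columns against the available field names before making a request. The sketch below is only illustrative: checkColumns is a hypothetical helper, not part of UniProt.ws, and the available vector is a small stand-in for the name column of returnFields so that the code runs without network access.

```r
# Hypothetical helper (not part of UniProt.ws): split requested columns into
# those found in a vector of available field names and those that are not.
checkColumns <- function(columns, available) {
    list(
        valid = intersect(columns, available),
        unknown = setdiff(columns, available)
    )
}

# Stand-in for returnFields()[["name"]]; an illustrative subset only.
available <- c("accession", "id", "gene_names", "xref_pdb", "sequence")
checked <- checkColumns(c("accession", "xref_pdb", "not_a_field"), available)
print(checked)
```

With the package loaded, available could instead be taken from returnFields()[["name"]], and checked$valid passed as the columns argument of mapUniProt.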

When using from = "Gene_Name", you may restrict the search results to a specific organism by including, e.g., taxId = 9606 as a named list element in query. See the examples below.

Value

  • mapUniProt: A data.frame of returned results

  • allToKeys: A sorted character vector of possible "To" keytypes based on the given "From" type

  • allFromKeys: A sorted character vector of possible "From" keytypes

  • returnFields: A data.frame of entries for the columns input in mapUniProt; see the 'name' column

Author(s)

Marcel Ramos

Examples

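## map UniProt accessions to RefSeq protein identifiers
mapUniProt(
    from = "UniProtKB_AC-ID",
    to = "RefSeq_Protein",
    query = c("P13368", "Q9UM73", "P97793", "Q17192")
)

## map GeneID identifiers to UniProtKB
mapUniProt(
    from = "GeneID", to = "UniProtKB", query = c("1", "2", "3", "9", "10")
)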

mapUniProt(
    from = "UniProtKB_AC-ID",
    to = "UniProtKB",
    columns = c("accession", "id"),
    query = list(organism_id = 10090, ids = c('Q7TPG8', 'P63318'))
)

## restrict 'from = Gene_Name' result to taxId 9606
mapUniProt(
    from = "Gene_Name",
    to = "UniProtKB-Swiss-Prot",
    columns = c("accession", "id"),
    query = list(taxId = 9606, ids = 'TP53')
)

mapUniProt(
    from = "UniProtKB_AC-ID", to = "UniProtKB",
    query = c("P31946", "P62258"),
    columns = c("accession", "id", "xref_pdb", "xref_hgnc", "sequence")
)

queryUniProt(
    query = c("accession:A5YMT3", "organism_id:9606"),
    fields = c("accession", "id", "reviewed"),
    collapse = " AND "
)

allToKeys(fromName = "UniRef100")

head(allFromKeys())

head(returnFields())

Bioconductor/UniProt.ws documentation built on Nov. 7, 2024, 4:25 a.m.