cdssubmit: Submit a set of queries to NCBI's CD-SEARCH via R.

View source: R/cdsearchr.R

cdssubmit    R Documentation

Submit a set of queries to NCBI's CD-SEARCH via R.

Description

Core internal cdsearchr function that prepares and submits queries to the CD-SEARCH webserver located at a specified URL. Uses httr to POST the request to the server.

Usage

cdssubmit(queries = NA, cdsurl = NULL,
          db = c("cdd", "pfam", "smart", "tigrfam", "cog", "kog"),
          smode = c("auto", "prec", "live"), evalue = 0.01, useid1 = TRUE,
          compbasedadj = 1, biascompfilter = TRUE,
          tdata = c("hits", "aligns", "feats"), alnfmt = NA,
          dmode = c("rep", "std", "full"), qdefl = TRUE, cddefl = TRUE,
          maxhit = 500, nseqs = NULL)
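
The vector-valued defaults in the signature above (db, smode, tdata, dmode) list the permitted choices, and the documented default is the first element of each. The following is a minimal sketch of how such arguments are commonly resolved in R with match.arg() and collected into a named list that could serve as a form body. The helper name build_cds_body and the form field names are hypothetical and not taken from the seqvisr source; whether cdssubmit itself uses match.arg() is an assumption.

```r
# Hypothetical helper (not part of seqvisr): resolve choice arguments and
# assemble a named list that could be used as a form body for a POST request.
build_cds_body <- function(db = c("cdd", "pfam", "smart", "tigrfam", "cog", "kog"),
                           smode = c("auto", "prec", "live"),
                           evalue = 0.01,
                           useid1 = TRUE,
                           tdata = c("hits", "aligns", "feats"),
                           dmode = c("rep", "std", "full")) {
  # match.arg() returns the first choice when an argument is left at its default
  db <- match.arg(db)
  smode <- match.arg(smode)
  tdata <- match.arg(tdata)
  dmode <- match.arg(dmode)

  list(
    db = db,
    smode = smode,
    evalue = format(evalue, scientific = FALSE),
    # Assumption: the server expects lowercase "true"/"false" strings
    useid1 = tolower(as.character(useid1)),
    tdata = tdata,
    dmode = dmode
  )
}

# Override only the search mode; everything else falls back to its default
body <- build_cds_body(smode = "live")
str(body)
```

A list of this shape could then be passed to httr::POST() via its body argument (with encode = "form"), together with the URL given in cdsurl.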

Arguments

queries

(character string, mandatory) path to a FASTA file containing the query protein sequences.

cdsurl

(character string, mandatory) the URL at which the remote CD-SEARCH server is accessible. Should be inherited from cdsearchr. (Set to NULL by default.)

db

(character string, optional) controls which databases CD-SEARCH should search the queries against. Please refer to "database selection" under the URL https://www.ncbi.nlm.nih.gov/Structure/cdd/cdd_help.shtml#BatchRPSBSearchMode for particulars on the databases. This parameter only has an effect if smode is set to "live" (see below). (Set to "cdd" by default.)

smode

(character string, optional) controls which search mode CD-SEARCH should use. "auto" first checks the queries against a set of precalculated results by matching query identifiers (this only works for sequences that are already in NCBI) and, if that fails, performs a "live" search against the CD-SEARCH database. "prec" returns results only for queries that have a result in the precalculated database. "live" searches every query anew against the databases even if precalculated results exist for that query. (Set to "auto" by default.)

evalue

(numeric, optional) expect value (statistical significance threshold) used for filtering and reporting annotation matches. (Set to 0.01 by default.)

useid1

(binary, optional) controls whether queries should also be searched against archived sequence identifiers if the query's identifier (if it happens to be an NCBI identifier) does not match anything in the current Entrez Protein database records. (Set to TRUE by default.)

compbasedadj

(integer, optional) should CD-SEARCH use compositionally corrected scoring? (0 - correction turned off; 1 - correction turned on.) (Set to 1 by default.)

biascompfilter

(binary, optional) should compositionally biased regions of the queries be filtered out? (Set to TRUE by default.)

tdata

(character string, optional) what type of target data should be returned: "hits" (domain hits), "aligns" (domain alignments), or "feats" (domain features). Changing from the default might break functionality as of the current version of seqvisr. (Set to "hits" by default.)

alnfmt

(character string, optional) data format to be used for downloading alignment data in the event tdata is set to "aligns". This will never be the case for cdsearchr, and this option exists only for the sake of completeness. (Set to NA by default.)

dmode

(character string, optional) which data mode must be used for the results. This dictates what set of domains are returned: the highest scoring hit for each region of the sequence ("rep"), the best hits from each database available in CD-SEARCH (so multiple hits per query region are possible; "std"), or all hits ("full"). (Set to "rep" by default.)

qdefl

(binary, optional) should query titles be included in the results? (Set to TRUE by default.)

cddefl

(binary, optional) should domain titles be included in the results? (Set to TRUE by default.)

maxhit

(integer, optional) maximum number of results per query that should be retrieved. Only matters if smode is set to "live". (Set to 500 by default.)

nseqs

(integer, mandatory) number of sequences being submitted to the CD-SEARCH server. This value should be automatically calculated and passed on from cdsearchr. (Set to NULL by default.)
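
As an illustration of how a value for nseqs could be derived from the FASTA file given in queries, the sketch below counts header lines (lines beginning with ">") using base R only. The helper name count_fasta_seqs is hypothetical; how cdsearchr actually computes this number is not stated here, so this is an assumption about one reasonable approach.

```r
# Hypothetical helper (not part of seqvisr): count FASTA records by their
# header lines, which start with ">".
count_fasta_seqs <- function(path) {
  lines <- readLines(path, warn = FALSE)
  sum(startsWith(lines, ">"))
}

# Self-contained demonstration using a temporary two-sequence FASTA file
fasta <- tempfile(fileext = ".fasta")
writeLines(c(">seq1", "MKTAYIAKQR", ">seq2", "MSLNNEQ", "GGAT"), fasta)
n <- count_fasta_seqs(fasta)
print(n)
unlink(fasta)
```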

Value

An httr response object containing, most importantly, the unique cdsid that refers to the submitted queries and search request and that can subsequently be used to query the server.
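
A minimal sketch of how the cdsid might be pulled out of the response text follows. It assumes the response body contains a line of the form "#cdsid" followed by whitespace and the identifier; that layout, the helper name extract_cdsid, and the sample text are all assumptions made for illustration and are not taken from the seqvisr source.

```r
# Hypothetical helper (not part of seqvisr): pull the search ID from the
# text body of a submission response.
extract_cdsid <- function(txt) {
  lines <- unlist(strsplit(txt, "\r?\n"))
  hit <- grep("^#cdsid\\s+", lines, value = TRUE)
  # No matching line: signal this with NA rather than an error
  if (length(hit) == 0) return(NA_character_)
  sub("^#cdsid\\s+(\\S+).*$", "\\1", hit[1])
}

# Made-up response text for illustration only; the ID is not a real search ID
sample_txt <- "#Batch CD-search tool\n#cdsid\tQM3-qcdsearch-EXAMPLE123\n#status\t3\n"
id <- extract_cdsid(sample_txt)
print(id)
```

With a real response object, the text could be obtained with httr::content(resp, as = "text", encoding = "UTF-8") before being passed to such a helper.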


vragh/seqvisr documentation built on April 20, 2024, 10:06 a.m.