EPM_job_split: Split a PubMed Retrieval Job into Manageable Batches.

View source: R/epm_all_fx.R

EPM_job_split    R Documentation

Split a PubMed Retrieval Job into Manageable Batches.

Description

Assess the number of PubMed records expected from a user-provided query and split the job into multiple sub-queries if this number is larger than "max_records_per_batch" (typically, n=10,000). Sub-queries are built by partitioning records according to their "Create Date" (i.e., "[CRDT]"). Jobs in which more than "max_records_per_batch" (typically, n=10,000) records share the same "Create Date" cannot be split further and are not supported.
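
For illustration, the date-based splitting relies on PubMed's "[CRDT]" field tag: a broad query can be restricted to a given "Create Date" window by appending a date-range filter, as in the sketch below. The date boundaries shown are hypothetical, and the exact sub-query strings built internally by EPM_job_split() may differ.

# Illustrative sketch only (not the internal implementation):
# restrict a query to a Create Date window via a "[CRDT]" date range.
base_qry <- 'Damiano Fantini[AU]'
sub_qry  <- paste0('(', base_qry, ') AND ',
                   '("2018/01/01"[CRDT] : "2018/06/30"[CRDT])')
sub_qry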

Usage

EPM_job_split(
  query_string,
  api_key = NULL,
  max_records_per_batch = 9999,
  verbose = FALSE
)

Arguments

query_string

String (character vector of length 1), corresponding to the PubMed query of interest.

api_key

String (character vector of length 1), corresponding to the NCBI API key. Can be NULL.

max_records_per_batch

Integer, maximum number of records expected per sub-query. This number should be in the range 1,000 to 10,000 (typically, max_records_per_batch = 10,000).

verbose

Logical, shall progress information be printed to the console?

Value

Character vector including the response from the server.

Author(s)

Damiano Fantini, damiano.fantini@gmail.com

References

https://www.data-pulse.com/dev_site/easypubmed/

Examples

# Note: a time limit can be set in order to kill the operation when/if 
# the NCBI/Entrez server becomes unresponsive.
setTimeLimit(elapsed = 4.9)
try({
  qry <- 'Damiano Fantini[AU] AND "2018"[PDAT]'
  easyPubMed:::EPM_job_split(query_string = qry, verbose = TRUE)
}, silent = TRUE)
setTimeLimit(elapsed = Inf)
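
# Additional example (illustrative only). The structure of the object
# returned by EPM_job_split() is not detailed here; a simple way to
# inspect the sub-queries generated for a job is to capture the result
# and examine it with str(). The query and batch size below are arbitrary.
setTimeLimit(elapsed = 4.9)
try({
  qry2 <- 'Fantini D[AU] AND "mutation"[TIAB]'
  res <- easyPubMed:::EPM_job_split(query_string = qry2,
                                    max_records_per_batch = 1000)
  str(res)
}, silent = TRUE)
setTimeLimit(elapsed = Inf)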