perform_search_session: Wrapper function to acquire citation data from multiple...

View source: R/Record_search.R

perform_search_session R Documentation

Wrapper function to acquire citation data from multiple sources

Description

Prefer this function over the individual search_* tools, since it also automatically acquires manually downloaded records (e.g., for Embase and Scopus, for which automatic search is not available).

Usage

perform_search_session(
  query,
  year_query = NULL,
  actions = c("API", "parsed"),
  sources = c("IEEE", "WOS", "Pubmed", "Scopus", "Embase"),
  session_name = "Session1",
  query_name = "Query1",
  records_folder = "Records",
  overwrite = FALSE,
  skip_on_failure = FALSE,
  journal = "Session_journal.csv"
)

Arguments

query

A boolean query with AND/OR/NOT operators, brackets for term grouping and quotation marks for n-grams.

year_query

A year-based filtering query. See clean_date_filter_arg() for more info.

actions

Whether to acquire records through automatic search ("API"), parsing of manually downloaded data ("parsed"), or both.

sources

The sources for which records should be collected.

session_name

The name of the current search session. It will be used to create a folder to collect search results. It should be the same as the name used for the classification session of the same records.

query_name

A label for the current query. It will be used to name a folder inside the session_name folder. It is useful to separate records acquired with different queries in the same search session.

records_folder

The path to the folder in which to store search results.

overwrite

Whether to overwrite results for a given session_name/query_name/sources if the search is repeated and a result file already exists.

skip_on_failure

Whether to skip problematic sources or fail.

journal

A path to a file (Excel or CSV) to store a summary of the search results. If the file already exists, the summary of the new session_name/query_name/sources will be added to the file.

Details

The function organizes search results into folders following the pattern records_folder/session_name/query_name. This allows for different search sessions (and classification sessions) and multiple queries per session, which is useful when a single query would be too complex to convey all the information. These folders can be created manually, and manually downloaded citation data must be placed in them to be acquired by the function.
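As a rough illustration of the folder pattern described above, the following base R sketch builds and creates the path for one session and query. The values mirror the defaults shown in Usage; the use of a temporary directory is only an assumption made so the sketch does not touch a real project, and the code is not taken from the package itself.

```r
# Values mirror the defaults shown in Usage. A temporary directory is used
# here only so the sketch does not write into a real project (assumption).
records_folder <- file.path(tempdir(), "Records")
session_name <- "Session1"
query_name <- "Query1"

# Build records_folder/session_name/query_name and create it,
# including any missing parent folders.
query_folder <- file.path(records_folder, session_name, query_name)
dir.create(query_folder, recursive = TRUE, showWarnings = FALSE)

# Manually downloaded citation files for this query would be placed here.
dir.exists(query_folder)
```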

To be acquired, manually downloaded files must have a name containing the source as written in sources. There can be multiple files for the same source, since all research databases have download limits and users may need to download results in batches. The function acquires these files and parses them into a standard format, creating a new file for each source.
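The naming rule can be pictured with a small base R sketch that groups files by the source label found in their names. The file names are hypothetical, and the case-insensitive substring match is an assumption for illustration; the package's actual matching rule may differ.

```r
# Source labels as listed in the Usage section.
sources <- c("IEEE", "WOS", "Pubmed", "Scopus", "Embase")

# Hypothetical names of manually downloaded files, two batches for Scopus.
files <- c(
  "Scopus_batch1.csv", "Scopus_batch2.csv",
  "Embase_export.csv", "notes.txt"
)

# Assumption: a file belongs to a source when its name contains the source
# label, compared case-insensitively.
files_by_source <- lapply(
  setNames(sources, sources),
  function(src) files[grepl(src, files, ignore.case = TRUE)]
)

# Keep only the sources for which at least one file was found.
files_by_source <- Filter(length, files_by_source)
names(files_by_source)
```

Files whose names contain no source label, such as notes.txt here, are left out of every group in this sketch.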

The output is a "journal" file storing all information about the queries, the sources used, the number of results, etc., which allows keeping track of all search sessions. If a journal file is already present, the new results will be added to it.
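The "add to an existing journal" behavior can be sketched in base R as follows. The helper append_to_journal(), the column names, and the result counts are all hypothetical and chosen only for illustration; the real journal layout is defined by the package.

```r
# Temporary CSV path, so the sketch never touches a real journal file.
journal_path <- tempfile(fileext = ".csv")

# Hypothetical helper: append new summary rows to the journal if the file
# already exists, otherwise create it.
append_to_journal <- function(new_rows, path) {
  if (file.exists(path)) {
    old_rows <- read.csv(path, stringsAsFactors = FALSE)
    new_rows <- rbind(old_rows, new_rows)
  }
  write.csv(new_rows, path, row.names = FALSE)
  invisible(new_rows)
}

# Hypothetical columns and made-up counts, for illustration only.
first <- data.frame(
  Session = "Session1", Query = "Query1", Source = "Scopus",
  Results = 120L, stringsAsFactors = FALSE
)
second <- data.frame(
  Session = "Session1", Query = "Query2", Source = "Embase",
  Results = 85L, stringsAsFactors = FALSE
)

append_to_journal(first, journal_path)              # creates the file
journal <- append_to_journal(second, journal_path)  # adds to it
nrow(journal)
```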

Value

A "Journal" data frame containing a summary of the search results grouped by session_name/query_name/sources/actions.

Examples

## Not run: 
# Initial query to be built on domain knowledge. It accepts OR, AND, NOT
# boolean operators and round brackets to group terms.
query <- '((model OR models OR modeling OR network OR networks) AND
(dissemination OR transmission OR spread OR diffusion) AND (nosocomial OR
hospital OR "long-term-care" OR "long term care" OR "longterm care" OR
"long-term care" OR "healthcare associated") AND (infection OR resistance OR
resistant))'

# Year filter. The framework converts it to the API-specific format seamlessly.
# Common logical comparators can be used, i.e. <, <=, >, >=, while dashes
# denote inclusive date intervals. A single year restricts results to that
# year only.
year_filter <- "2010-2020"

journal <- perform_search_session(
  query = query, year_query = year_filter,
  session_name = "Session1", query_name = "Query1",
  records_folder = "Records",
  journal = "Session_journal.csv"
)

## End(Not run)

bakaburg1/BaySREn documentation built on March 30, 2022, 12:16 a.m.