First, load the medrxivr package:
library(medrxivr)
To find records that contain any of several terms, pass the terms as a vector to the mx_search() function, as in the code chunk below. Query terms can include regular expression syntax; see the section at the end of this document on common regular expressions that may be useful when searching.
myquery <- c("dementia", "vascular", "alzheimer's")  # Combined with Boolean OR

mx_results <- mx_search(data = mx_snapshot(),  # Use daily snapshot for data
                        query = myquery)
To find records relevant to more than one topic domain, create a vector for each topic (note: there is no upper limit on the number of topics you can have) and combine these vectors into a list, which is then passed to the mx_search() function:
topic1 <- c("dementia", "vascular", "alzheimer's")  # Combined with Boolean OR
topic2 <- c("lipids", "statins", "cholesterol")     # Combined with Boolean OR

myquery <- list(topic1, topic2)                     # Combined with Boolean AND

mx_results <- mx_search(data = mx_snapshot(),
                        query = myquery)
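The OR-within-a-topic, AND-across-topics logic can be illustrated with base R alone. This is a minimal sketch run on a made-up vector of titles, not the internal implementation of mx_search(); the helper match_topics() and the toy titles are hypothetical.

```r
# Toy titles standing in for preprint records (made up for illustration)
titles <- c(
  "statins and dementia risk",
  "vascular outcomes in diabetes",
  "cholesterol levels in athletes"
)

topic1 <- c("dementia", "vascular", "alzheimer's")
topic2 <- c("lipids", "statins", "cholesterol")

# Hypothetical helper: OR within each topic, AND across topics
match_topics <- function(text, topics) {
  hits <- lapply(topics, function(terms) {
    grepl(paste(terms, collapse = "|"), text)  # any term in this topic
  })
  Reduce(`&`, hits)                            # every topic must match
}

matched <- match_topics(titles, list(topic1, topic2))
titles[matched]
```

Only the first title mentions a term from both topics, so it is the only one kept.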
By default, a range of fields (title, abstract, first author, subject, and link, which contains the DOI) are searched, but you can limit the search to a subset of these using the fields argument:
# Limit search to title/abstract
mx_results <- mx_search(data = mx_snapshot(),
                        query = "dementia",
                        fields = c("title", "abstract"))

# Search by DOI
mx_results <- mx_search(data = mx_snapshot(),
                        query = "10.1101/2020.01.30.20019836",
                        fields = "link")
Often it is useful to be able to exclude records that contain a certain term that is not relevant to your search. For example, in the search below, we are looking for records related to "dementia" alone by excluding those that mention "mild cognitive impairment":
mx_results <- mx_search(data = mx_snapshot(), query = "dementia", NOT = "[Mm]ild cognitive impairment")
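The effect of an exclusion term can be sketched in base R as "matches the query and does not match the NOT pattern". This is an illustration on made-up abstracts, not the code mx_search() runs internally.

```r
# Toy abstracts (made up for illustration)
abstracts <- c(
  "Dementia incidence in older adults",
  "Mild cognitive impairment and dementia progression",
  "dementia and mild cognitive impairment"
)

include <- grepl("[Dd]ementia", abstracts)                    # query term
exclude <- grepl("[Mm]ild cognitive impairment", abstracts)   # NOT term

kept <- abstracts[include & !exclude]
kept
```

Because matching is case sensitive, the exclusion pattern uses [Mm] so that both "Mild" and "mild" are caught.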
You can define the earliest date, the latest date, or both, for the records you wish to include. Note: the search is inclusive of both dates specified:
mx_results <- mx_search(data = mx_snapshot(),
                        query = "dementia",
                        from_date = "2020-01-01",  # 1st Jan 2020
                        to_date = "2020-01-08")    # 8th Jan 2020
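What "inclusive of both dates" means can be shown with a small base R sketch. The data frame and its column names are invented for the example and are not the snapshot's actual structure.

```r
# Toy records; titles and dates are made up for illustration
records <- data.frame(
  title = c("a", "b", "c", "d"),
  date  = as.Date(c("2019-12-31", "2020-01-01", "2020-01-08", "2020-01-09")),
  stringsAsFactors = FALSE
)

from_date <- as.Date("2020-01-01")
to_date   <- as.Date("2020-01-08")

# Inclusive filter: records dated exactly on either boundary are kept
in_range <- records[records$date >= from_date & records$date <= to_date, ]
in_range$title
```

Records "b" and "c", which fall exactly on the two boundary dates, are retained.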
medRxiv allows authors to upload a new version of their preprint as often as they like. By default, medrxivr returns only the most recent version of the preprint, but if you are interested in exploring how a record changed over time, you can retrieve all versions of the preprint by setting deduplicate = FALSE:
mx_results <- mx_search(data = mx_snapshot(), query = "10.1101/2020.01.30.20019836", fields = "link", deduplicate = FALSE)
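The default behaviour of keeping only the latest version can be pictured with the following base R sketch. It assumes a table with one row per version and columns named doi and version; these names and the placeholder DOIs are assumptions made for the illustration, and this is not how medrxivr necessarily performs the step.

```r
# Toy table with one row per preprint version (placeholder DOIs)
records <- data.frame(
  doi     = c("10.1101/aaa", "10.1101/aaa", "10.1101/bbb"),
  version = c(1, 2, 1),
  stringsAsFactors = FALSE
)

# Sort so the highest version of each DOI comes first, then drop repeats
sorted <- records[order(records$doi, -records$version), ]
latest <- sorted[!duplicated(sorted$doi), ]
latest
```

Skipping the last two lines and working with records directly corresponds to the deduplicate = FALSE case, where every version is kept.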
Example regex: [Dd]ementia
Description: The search is case sensitive, so this syntax allows you to find both Dementia and dementia using a single term, rather than having to enter them separately. However, setting the auto_caps argument of mx_search() to TRUE will automatically search for both capitalised and uncapitalised versions of your search terms (e.g. with auto_caps = TRUE you just need to search for "dementia" to find both Dementia and dementia; behind the scenes, "dementia" is converted to "[Dd]ementia").
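The conversion from "dementia" to "[Dd]ementia" can be sketched in a few lines of base R. The helper add_caps() is hypothetical and only illustrates the idea; it is not the function medrxivr uses.

```r
# Hypothetical helper: wrap the first letter in a [Xx] character class
add_caps <- function(term) {
  first <- substr(term, 1, 1)
  rest  <- substr(term, 2, nchar(term))
  paste0("[", toupper(first), tolower(first), "]", rest)
}

pattern <- add_caps("dementia")
hits <- grepl(pattern, c("Dementia", "dementia", "DEMENTIA"))
hits
```

Only the first letter is made case-flexible, so a fully upper-case "DEMENTIA" is still not matched by this pattern.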
Example regex: randomi*ation
Description: The wildcard operator "*" stands for any single alphanumeric character; in this case, the term will find both randomisation and randomization.
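Note that this meaning of "*" differs from standard regular expressions, where "*" is a quantifier ("zero or more of the preceding character"). A rough base R equivalent of the wildcard, assuming it is expanded to a single-character class, might look like this; the exact expansion medrxivr performs may differ.

```r
term <- "randomi*ation"

# Assumed expansion: replace the literal "*" with "one alphanumeric character"
pattern <- gsub("*", "[[:alnum:]]", term, fixed = TRUE)

hits <- grepl(pattern, c("randomisation", "randomization", "randomiation"))
hits
```

The third string has no character in the wildcard position, so it is not matched.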
Example regex: systematic NEAR4 review
Description: The "NEAR4" operator specifies that up to 4 words can appear between systematic and review and the search will still find the record. To change how far apart the terms are allowed to be, change the number following NEAR (e.g. to find terms that are only one word apart, the syntax would be systematic NEAR1 review). Please note that the search is directional: the example term here will find "systematic methods for the review", but will not find "the review was systematic".
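One way to picture the NEAR operator is as shorthand for a regular expression that allows a bounded number of intervening words. The sketch below uses a hypothetical helper, near_to_regex(), written for this illustration; the pattern it produces is an assumption and may differ from the one medrxivr generates.

```r
# Hypothetical helper: rewrite "a NEARn b" as "a, then up to n words, then b"
near_to_regex <- function(term) {
  gsub(
    "[[:space:]]+NEAR([0-9]+)[[:space:]]+",
    "([[:space:]]+[^[:space:]]+){0,\\1}[[:space:]]+",
    term
  )
}

pattern <- near_to_regex("systematic NEAR4 review")

texts <- c(
  "systematic methods for the review",  # 3 words in between
  "the review was systematic",          # wrong order
  "systematic review",                  # 0 words in between
  "systematic a b c d e review"         # 5 words in between
)
hits <- grepl(pattern, texts)
hits
```

The second string fails because the pattern is directional, and the fourth fails because five words separate the two terms.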
Example regex: \\bNCOV\\b
Description: Sometimes it is useful to be able to define the start and end of terms. For example, if you were searching for NCOV-19, simply using ncov as your search term would also return records containing uncovered. Using \\b allows you to define where the term begins and ends, thus excluding false positive matches.
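The effect of the word-boundary marker can be checked directly with base R's grepl(). The two test strings are made up for the example. The backslash is doubled because R string literals require a backslash to be escaped.

```r
# Toy strings (made up for illustration)
texts <- c("NCOV-19 cases", "UNCOVERED risk factors")

without_boundary <- grepl("NCOV", texts)        # matches inside UNCOVERED too
with_boundary    <- grepl("\\bNCOV\\b", texts)  # whole-word matches only

without_boundary
with_boundary
```

The hyphen in "NCOV-19" counts as a word edge, so the bounded pattern still finds it while rejecting "UNCOVERED".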
To find records that contain "Mendelian" within 4 words of "randomisation" (with varying capitalisation of "Mendelian" and UK/US spellings of "randomisation"), the following syntax is correct:
mx_results <- mx_search(data = mx_snapshot(), query = "mendelian NEAR4 randomi*ation", auto_caps = TRUE)
To check whether your search term will find what you expect it to, there is a useful regex tester, designed by Adam Spannbauer.