The goal of openalex is to give R users access to data from OpenAlex, an open and comprehensive catalog of scholarly papers, authors, institutions and more, through the OpenAlex REST API.
You can install the current version of openalex from GitHub with:
# install.packages("devtools")
devtools::install_github("kth-library/openalex", dependencies = TRUE)
This is a basic example which shows you how to get information for papers and authors:
library(openalex)
library(dplyr)
suppressPackageStartupMessages(library(purrr))
library(knitr)

iid <- openalex:::openalex_autocomplete(
    query = "Royal Institute of Technology",
    entity_type = "institution",
    format = "table") |>
  head(1) |>
  pull("id")

data <- openalex_crawl(
  entity = "works", verbose = TRUE, fmt = "tables",
  query = openalex:::openalex_query(
    filter = sprintf("institutions.id:%s,publication_year:2025", iid)
  )
)

# return only first six rows from each table
res <- data |> map(head)
res
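The filter in the example above is a plain string of comma-separated key:value pairs, which the OpenAlex API combines with a logical AND. As a minimal sketch in base R, such a string could be composed like this. Note that build_filter() is a hypothetical helper and not part of the openalex package, and the institution id is a placeholder:

```r
# Hypothetical helper (not part of the openalex package): turns named
# arguments into an OpenAlex-style filter string "key:value,key:value".
build_filter <- function(...) {
  args <- list(...)
  stopifnot(
    length(args) > 0,
    !is.null(names(args)),
    all(nzchar(names(args)))
  )
  paste(sprintf("%s:%s", names(args), unlist(args)), collapse = ",")
}

# "I0000000000" is a placeholder id, not a real institution
flt <- build_filter(institutions.id = "I0000000000", publication_year = 2025)
print(flt)
```

The resulting string has the same shape as the one built with sprintf() in the example and could be passed as the filter argument there.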
By providing an email address you enter the "polite pool", which is subject to less rate limiting for API requests.
You can provide it in ~/.Renviron by setting OPENALEX_USERAGENT=http://github.com/hadley/httr (mailto:your_email@your_institution.org).
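To check the value before editing ~/.Renviron, the same string can be assembled and set for the current session with base R. This is only a sketch: polite_user_agent() is a hypothetical helper, not a function from the openalex package, and it is an assumption that the package reads OPENALEX_USERAGENT at request time rather than only at load time.

```r
# Hypothetical helper (not part of the openalex package): formats a user
# agent string in the same shape as the .Renviron entry described above.
polite_user_agent <- function(email, base = "http://github.com/hadley/httr") {
  stopifnot(is.character(email), length(email) == 1, nzchar(email))
  sprintf("%s (mailto:%s)", base, email)
}

ua <- polite_user_agent("your_email@your_institution.org")

# Set the variable for the current R session only; ~/.Renviron is untouched
Sys.setenv(OPENALEX_USERAGENT = ua)
Sys.getenv("OPENALEX_USERAGENT")
```

Values in ~/.Renviron are read when R starts, so a restart is needed after editing that file.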
You can also set it just for the session with the helper function openalex_polite(), which temporarily sets or unsets the email used in the user agent string when making API requests:
library(openalex)

# set an email to use for the session
openalex_polite("you@example.com")

# unset, and use default user agent string...
openalex_polite("")
A premium subscription API key can be used by setting OPENALEX_KEY=secret_premium_api_key in your .Renviron, or temporarily in a session using:
library(openalex)

# temporarily use a premium subscription API key
openalex_key("secret_premium_api_key")

# unset to not use the premium subscription API key
openalex_key("")
With a key set, API calls can return the latest available records, for example filtered by recent creation dates or recent last-modification timestamps.
# we do not require an API key for the publish date
published_since_ <- openalex_works_published_since(since_days = 7)

# but an API key is needed when using "from_created_date" and "from_updated_date" fields.
created_since_7d <- openalex_works_created_since(since_days = 7)
updated_since_1h <- openalex_works_updated_since(since_minutes = 60)

# first few rows of each of these retrievals
created_since_7d |> _$work_ids |> head() |> knitr::kable()
updated_since_1h |> _$work_ids |> head() |> knitr::kable()
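The "from_created_date" and "from_updated_date" fields mentioned above take a cutoff date as their value. How the package computes that cutoff internally is not shown here, so the following is only a sketch in base R of how a "since N days" argument might translate into such a filter. since_filter() is a hypothetical helper, and the fixed date is used only to make the example reproducible:

```r
# Hypothetical helper (not part of the openalex package): builds a
# "field:YYYY-MM-DD" filter for a cutoff `since_days` before `today`.
since_filter <- function(field, since_days, today = Sys.Date()) {
  stopifnot(
    is.character(field), length(field) == 1,
    is.numeric(since_days), length(since_days) == 1, since_days >= 0
  )
  cutoff <- as.Date(today) - since_days
  sprintf("%s:%s", field, format(cutoff, "%Y-%m-%d"))
}

# A fixed reference date keeps the result stable across runs
created_flt <- since_filter("from_created_date", since_days = 7,
                            today = as.Date("2025-03-10"))
print(created_flt)
```

With the default today = Sys.Date(), the cutoff moves with the current date.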
When data from openalex is displayed publicly, this attribution also needs to be displayed:
library(openalex)
openalex_attribution()