Why does the WoS API sometimes return a different number of records than the WoS web interface?

The API does not perform lemmatization (matching variant forms of a search term, such as plurals) before applying your query, while the web app does. As a result, the API typically returns fewer records than the web app for the same query.

What are the throttling limits on the WoS and InCites APIs?

There are two important limits that you should be aware of:

Why doesn't pull_incites() return data for all of my publications?

Not all publications that are indexed in the Web of Science database are also indexed in InCites.
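To see which of your publications are missing from InCites, you can compare the identifiers you asked for against the ones that came back. The sketch below uses small mock data so that it runs without API access. It assumes that the data frame returned by pull_incites() has a ut column, and the UT values shown are made up for illustration.

```r
# Mock inputs standing in for real API results (all values are hypothetical).
# In real use these might come from something like:
#   wos_uts <- data$publication$ut
#   incites <- pull_incites(wos_uts)
wos_uts <- c("WOS:000000000000001", "WOS:000000000000002", "WOS:000000000000003")
incites <- data.frame(
  ut = c("WOS:000000000000001", "WOS:000000000000003"),
  tot_cites = c(4L, 10L),
  stringsAsFactors = FALSE
)

# UTs that were requested but that have no row in the InCites result
missing_uts <- setdiff(wos_uts, incites$ut)
missing_uts
```

Any identifiers left in missing_uts are publications that are in the Web of Science but have no matching InCites row.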

How do I link together the author and address data frames returned by pull_wos()?

You can join the data frames using the author_address linking table, like so:

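library(wosr)
library(dplyr)

# Download the data for all publications matching the query
data <- pull_wos("TS = \"dog welfare\"")

# author_address maps each author (ut + author_no) to an address (ut + addr_no)
data$author %>%
  left_join(data$author_address, by = c("ut", "author_no")) %>%
  left_join(data$address, by = c("ut", "addr_no"))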

How do I download data for a query that returns more than 100,000 records?

The WoS API doesn't allow you to download data for a query that matches 100,000 or more publications. You can get around this by breaking your query into pieces using the publication year tag (PY). For example, a broad query like "TS = dog" (which matches over 250,000 records) can be split into four sub-queries with contiguous, non-overlapping date ranges, each of which returns fewer than 100,000 records:

queries <- c(
  "TS = dog AND PY = 1900-1980", 
  "TS = dog AND PY = 1981-2000", 
  "TS = dog AND PY = 2001-2010", 
  "TS = dog AND PY = 2011-2018"
)
results <- pull_wos_apply(queries)
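If you need to split several queries this way, you could generate the year-range sub-queries rather than typing them out. The following is a minimal sketch; build_year_queries() is a hypothetical helper written for this example and is not part of wosr.

```r
# Hypothetical helper (not part of wosr): builds one sub-query per year range.
# `starts` holds the first year of each range; `last_year` closes the final range.
build_year_queries <- function(base_query, starts, last_year) {
  # Each range ends the year before the next one starts
  ends <- c(starts[-1] - 1, last_year)
  sprintf("%s AND PY = %d-%d", base_query, starts, ends)
}

queries <- build_year_queries("TS = dog", c(1900, 1981, 2001, 2011), 2018)
queries
```

This reproduces the four queries shown above, which can then be passed to pull_wos_apply(). If one range still matches too many records, add another start year to split it further.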

There are some fields that I'm interested in that pull_wos() doesn't return. How do I get them?

Open an issue on wosr's issue page describing the field(s) that you want.

[^1]: To accommodate this limit, pull_incites() sleeps for a given amount of time (determined by how many times it has received a throttling error for the request it is trying to make) before retrying the request.
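
The retry behavior described in this footnote can be illustrated with a generic pattern. The sketch below is not wosr's actual implementation: retry_with_backoff(), its arguments, and the linear wait schedule are all assumptions made for illustration. It waits longer after each failed attempt before trying again.

```r
# Illustrative only (not wosr's code): retry a request, sleeping longer after
# each failure. `request_fn` is any zero-argument function that signals an
# error when it fails (for example, when the request is throttled).
retry_with_backoff <- function(request_fn, max_tries = 5, base_wait = 1,
                               sleep_fn = Sys.sleep) {
  for (attempt in seq_len(max_tries)) {
    result <- tryCatch(request_fn(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    if (attempt == max_tries) {
      stop(result)  # out of attempts: re-signal the last error
    }
    # Wait time grows with the number of failures seen so far
    sleep_fn(base_wait * attempt)
  }
}
```

The sleep_fn argument exists only so that the waiting can be replaced with a stub when trying the function out.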



vt-arc/wosr documentation built on Sept. 27, 2022, 5:44 a.m.