query_bookworm | R Documentation |
This function retrieves word frequency data from the Hathi Trust Bookworm Server at https://bookworm.htrc.illinois.edu/develop/, with options to group the results according to various forms of metadata and to limit according to that same metadata. It uses code authored by Ben Schmidt (from https://github.com/bmschmidt/edinburgh/).
query_bookworm(
  word,
  groups = "date_year",
  ignore_case = TRUE,
  counttype = "WordsPerMillion",
  method = c("data", "returnPossibleFields", "search_results"),
  format = c("json", "csv", "tsv", "feather"),
  lims = c(1920, 2000),
  compare_to,
  as_json = FALSE,
  verbose = FALSE,
  query,
  ...
)
word |
Term to get frequencies for. Can be a vector of strings. It can be left empty if one is interested primarily in statistics about the corpus as a whole. |
groups |
Category to group results by. The default is "date_year". |
ignore_case |
Default is TRUE. |
counttype |
The default is words per million ("WordsPerMillion").
It is possible to combine some count types, e.g., counttype = c("TextCount",
"TextPercent"). But it is not possible to combine |
method |
Type of results to return. Can be "data", "returnPossibleFields", or "search_results". |
format |
Format of returned results. In theory the Bookworm DB should be able to return results as "json", "tsv", "csv", or even "feather"; currently only "json" works (and it's the only supported format here). |
lims |
Min and max year as a two-element numeric vector. Default is c(1920, 2000). |
compare_to |
A word to compare relative frequencies to. Currently this
is most useful with |
as_json |
Whether to return the raw json. Useful for complex queries where the function does not know how to return a tibble, or when you want to use the raw json to produce a different data structure. |
verbose |
If |
query |
You can directly pass on a query string (in JSON). This is
useful for very complex queries, but there's no checking that the
parameters are correct so you may encounter unexpected errors. See
https://bookworm-project.github.io/Docs/query_structure.html for more on
the query structure. If you use |
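As a rough illustration of what such a query string might look like, the sketch below builds one with the jsonlite package. The field names (database, method, format, search_limits, groups, counttype) and the `$gte`/`$lte` range operators follow the Bookworm query structure documentation linked above; the database name is a made-up placeholder, and using jsonlite here is an assumption about tooling, not something this function requires.

```r
library(jsonlite)

# The database name is a hypothetical placeholder; substitute the name
# of the Bookworm database you are actually querying.
query_list <- list(
  database = "example_database",
  method = "data",
  format = "json",
  search_limits = list(
    # I() keeps a one-element vector as a JSON array under auto_unbox
    word = I("democracy"),
    date_year = list(`$gte` = 1920, `$lte` = 2000)
  ),
  groups = I("date_year"),
  counttype = I("WordsPerMillion")
)

# Serialize to a plain character string containing the JSON query
query_json <- as.character(toJSON(query_list, auto_unbox = TRUE))
cat(query_json, "\n")

# Not run here (needs network access and a valid database name):
# query_bookworm(query = query_json)
```

Because nothing checks the parameters of a hand-built query, validating the string locally (for example with jsonlite::validate) before sending it can catch syntax slips early.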
... |
Additional parameters passed to the query builder; these would be
the fields that method = |
A tidy tibble whenever possible, with columns for each grouping
parameter, the word (if any), and the counts and counttypes. For
method = "search_results", a workset that can be used in browse_htids and
get_workset_meta.
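Because the result is tidy (one row per combination of group, word, and count type), ordinary data frame operations apply to it directly. The base R sketch below uses a small mock data frame in place of a real result; the column names (date_year, word, counttype, value) and all the numbers are invented for illustration and may not match what the function actually returns.

```r
# Mock stand-in for a query_bookworm() result; column names and values
# are made up for illustration only.
result <- data.frame(
  date_year = rep(c(1940, 1941, 1942), times = 2),
  word = rep(c("democracy", "monarchy"), each = 3),
  counttype = "WordsPerMillion",
  value = c(30, 45, 40, 5, 4, 6)
)

# Keep a single count type, then pick the row with the highest value
# for each word.
wpm <- result[result$counttype == "WordsPerMillion", ]
peaks <- do.call(
  rbind,
  lapply(split(wpm, wpm$word), function(d) d[which.max(d$value), ])
)
print(peaks[, c("word", "date_year", "value")])
```

Filtering on the count type first matters when several count types were requested in one call, since their values are on different scales.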
Ben Schmidt
query_bookworm(word = c("democracy", "monarchy"), lims = c(1760, 2000),
               counttype = c("WordsPerMillion", "WordCount"))
query_bookworm(word = "democracy", groups = c("date_year", "lc_classes"),
               lims = c(1900, 2000))
query_bookworm(word = "democracy", groups = "date_year", date_year = "1941",
               lc_classes = "Education", method = "search_results")