ulan_match: Match names to the Getty ULAN

Description

Queries the Getty ULAN to find matching entries for a given string. You may filter the results by specifying an early or late date.

Usage

ulan_match(names, early_year = -9999, late_year = 2090,
  strictly_between = FALSE, method = c("sparql", "local"),
  max_results = 5, cutoff_score = NULL)

ulan_id(names, early_year = -9999, late_year = 2090,
  strictly_between = FALSE, method = c("sparql", "local"),
  max_results = 1)

ulan_data(names, early_year = -9999, late_year = 2090,
  strictly_between = FALSE, method = c("sparql", "local"),
  max_results = 1)

Arguments

names

A character vector of names to match to a canonical ULAN id.

early_year

Match only artists who died after this year. Like late_year, this argument should be a numeric vector of length 1, or of the same length as names. If length 1, the same date restrictions will be used to match every value of names. Otherwise, each name match can be restricted to its own pair of early_year and late_year. If no early_year or late_year are specified, then artists from all time periods will be eligible for matching. Any NA values in early_year or late_year will be coerced to default maxima and minima.

late_year

Match only artists who were born before this year.

strictly_between

Method for filtering search results using the early_year/late_year parameters. TRUE will only include artists whose life dates fall entirely within the range [early_year, late_year]. FALSE (the default) will include artists whose life dates intersect with [early_year, late_year].

method

This value determines which method will be used to match the name to a canonical ULAN id. sparql will query the Getty's live endpoint, relying on its Lucene index for finding close matches, while local instead uses string cosine similarity based on a local table of ULAN entries.

max_results

The maximum number of results to return. Defaults to 5 for ulan_match, and to 1 for ulan_id and ulan_data. Depending on the query, the actual number of results returned may be lower. When method = "sparql" is used, values over 50 will be ignored.

cutoff_score

The minimum similarity score that must be returned by the chosen method for a candidate to be included in results. NULL will use default values for each method: 0.95 for the local method, and 3 for the sparql method.
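The date arguments above can be illustrated with a minimal base R sketch. The helper in_date_range() is hypothetical and is not part of ulanr; it assumes inclusive comparisons and only mirrors the documented behaviour: NA bounds fall back to the defaults, and strictly_between switches between containment and overlap.

```r
# Hypothetical helper, not part of ulanr: illustrates the documented
# date-filter semantics on plain vectors of birth and death years.
# Inclusive comparisons (>=, <=) are an assumption of this sketch.
in_date_range <- function(birth_year, death_year,
                          early_year = -9999, late_year = 2090,
                          strictly_between = FALSE) {
  # NA bounds fall back to the documented default minimum and maximum
  early_year[is.na(early_year)] <- -9999
  late_year[is.na(late_year)] <- 2090

  if (strictly_between) {
    # The whole life span must fall inside [early_year, late_year]
    birth_year >= early_year & death_year <= late_year
  } else {
    # The life span only needs to overlap [early_year, late_year]
    death_year >= early_year & birth_year <= late_year
  }
}

# An artist who lived 1606-1669, tested against the window 1650-1700
overlap  <- in_date_range(1606, 1669, early_year = 1650, late_year = 1700)
contained <- in_date_range(1606, 1669, early_year = 1650, late_year = 1700,
                           strictly_between = TRUE)
# An NA lower bound is treated as -9999, so containment now holds
na_bound <- in_date_range(1606, 1669, early_year = NA, late_year = 1700,
                          strictly_between = TRUE)
```

Because the comparisons are vectorised, passing early_year and late_year vectors of the same length as the birth and death years applies one window per artist. An entry with no assigned birth or death year yields NA here; how ulanr itself treats such entries is not stated above.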

Value

A named list of data.frames, one per submitted name, with 7 columns and no more than max_results rows:

id

integer. ULAN id.

pref_name

character. ULAN preferred name

birth_year

integer. Artist birth year, if assigned.

death_year

integer. Artist death year, if assigned.

gender

character. Artist gender, if assigned.

nationality

character. Artist nationality, if assigned.

score

numeric. The score of the result. When method = "sparql", this is the Lucene index score. When method = "local", it will instead be a cosine similarity score. Results with a score below cutoff_score are dropped.

Unmatched names will return a data.frame with NAs for all values save name.
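To give a feel for the cosine similarity score reported when method = "local", here is a small base R sketch that scores strings by their character bigram counts and then applies the documented default cutoff of 0.95. The functions bigrams() and cosine_sim() are hypothetical, the candidate strings are illustrative rather than actual ULAN results, and the use of bigrams is an assumption; the tokenisation ulanr really uses is not described here.

```r
# Hypothetical helpers, not ulanr internals: cosine similarity between
# the character bigram counts of two strings.
bigrams <- function(x) {
  x <- tolower(x)
  n <- nchar(x)
  if (n < 2) return(character(0))
  substring(x, 1:(n - 1), 2:n)
}

cosine_sim <- function(a, b) {
  ga <- bigrams(a)
  gb <- bigrams(b)
  grams <- union(ga, gb)
  # Count vectors over the shared bigram vocabulary
  va <- tabulate(match(ga, grams), nbins = length(grams))
  vb <- tabulate(match(gb, grams), nbins = length(grams))
  denom <- sqrt(sum(va^2)) * sqrt(sum(vb^2))
  if (denom == 0) return(0)  # strings too short to have bigrams
  sum(va * vb) / denom
}

# Illustrative candidate strings (not real query output)
candidates <- c("Rembrandt van Rijn", "Rembrandt Peale", "Mark Rothko")
scores <- vapply(candidates, cosine_sim, numeric(1),
                 b = "Rembrandt van Rijn")

# Keep only candidates at or above the documented local default cutoff
kept <- scores[scores >= 0.95]
```

With a cutoff this high, only near-identical strings survive, which is why a lower cutoff_score may be worth trying for names with variant spellings.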

Functions

Note

cutoff_score will be ignored for

method = "sparql" requires an internet connection.

Examples

## Not run: 
# Match a single name, restricted to artists active between 1600 and 1700
ulan_id("Rembrandt", early_year = 1600,
        late_year = 1700, method = "sparql")

# Match several names, each with its own date window
ulan_id(c("Rembrandt", "Rothko"), early_year = c(1600, 1900),
        late_year = c(1700, 2000), method = "sparql")

## End(Not run)

mdlincoln/ulanr documentation built on May 22, 2019, 4:16 p.m.