KorAPQuery-class: Class KorAPQuery

KorAPQuery-class (R Documentation)

Class KorAPQuery

Description

This class provides methods to perform different kinds of queries on the KorAP API server. KorAPQuery objects, which are typically created by the corpusQuery() method, represent the current state of a query to a KorAP server.

corpusQuery performs a corpus query via a connection to a KorAP API server.

fetchNext fetches the next batch (page) of results of a KorAP query.

fetchAll fetches all results of a KorAP query.

frequencyQuery combines corpusQuery(), corpusStats() and ci() to compute a table with the relative frequencies and confidence intervals of one or multiple search terms across one or multiple virtual corpora.

Usage

## S4 method for signature 'KorAPQuery'
initialize(
  .Object,
  korapConnection = NULL,
  request = NULL,
  vc = "",
  totalResults = 0,
  nextStartIndex = 0,
  fields = c("corpusSigle", "textSigle", "pubDate", "pubPlace", "availability",
    "textClass", "snippet"),
  requestUrl = "",
  webUIRequestUrl = "",
  apiResponse = NULL,
  hasMoreMatches = FALSE,
  collectedMatches = NULL
)

## S4 method for signature 'KorAPConnection'
corpusQuery(
  kco,
  query = if (missing(KorAPUrl))
    stop("At least one of the parameters query and KorAPUrl must be specified.", call. =
    FALSE) else httr::parse_url(KorAPUrl)$query$q,
  vc = if (missing(KorAPUrl)) "" else httr::parse_url(KorAPUrl)$query$cq,
  KorAPUrl,
  metadataOnly = TRUE,
  ql = if (missing(KorAPUrl)) "poliqarp" else httr::parse_url(KorAPUrl)$query$ql,
  fields = c("corpusSigle", "textSigle", "pubDate", "pubPlace", "availability",
    "textClass", "snippet"),
  accessRewriteFatal = TRUE,
  verbose = kco@verbose,
  expand = length(vc) != length(query),
  as.df = FALSE,
  context = NULL
)

## S4 method for signature 'KorAPQuery'
fetchNext(
  kqo,
  offset = kqo@nextStartIndex,
  maxFetch = maxResultsPerPage,
  verbose = kqo@korapConnection@verbose,
  randomizePageOrder = FALSE
)

## S4 method for signature 'KorAPQuery'
fetchAll(kqo, verbose = kqo@korapConnection@verbose, ...)

## S4 method for signature 'KorAPQuery'
fetchRest(kqo, verbose = kqo@korapConnection@verbose, ...)

## S4 method for signature 'KorAPConnection'
frequencyQuery(
  kco,
  query,
  vc = "",
  conf.level = 0.95,
  as.alternatives = FALSE,
  ...
)

buildWebUIRequestUrl(
  kco,
  query = if (missing(KorAPUrl))
    stop("At least one of the parameters query and KorAPUrl must be specified.", call. =
    FALSE) else httr::parse_url(KorAPUrl)$query$q,
  vc = if (missing(KorAPUrl)) "" else httr::parse_url(KorAPUrl)$query$cq,
  KorAPUrl,
  metadataOnly = TRUE,
  ql = if (missing(KorAPUrl)) "poliqarp" else httr::parse_url(KorAPUrl)$query$ql,
  fields = c("corpusSigle", "textSigle", "pubDate", "pubPlace", "availability",
    "textClass", "snippet"),
  accessRewriteFatal = TRUE
)

## S3 method for class 'KorAPQuery'
format(x, ...)

## S4 method for signature 'KorAPQuery'
show(object)

Arguments

.Object

the KorAPQuery object to be initialized

korapConnection

KorAPConnection object

request

query part of the request URL

vc

string describing the virtual corpus in which the query should be performed. An empty string (default) means the whole corpus, as far as it is license-wise accessible.

totalResults

number of hits the query has yielded

nextStartIndex

at what index to start the next fetch of query results

fields

(meta)data fields that will be fetched for every match.

requestUrl

complete URL of the API request

webUIRequestUrl

URL of a web frontend request corresponding to the API request

apiResponse

data-frame representation of the JSON response of the API request

hasMoreMatches

logical that signals if more query results can be fetched

collectedMatches

matches already fetched from the KorAP-API-server

kco

KorAPConnection() object (obtained e.g. from new("KorAPConnection"))

query

string that contains the corpus query. The query language depends on the ql parameter. Either query or KorAPUrl must be provided.

KorAPUrl

instead of providing the query and vc string parameters, you can also simply copy a KorAP query URL from your browser and use it here (and in KorAPConnection) to provide all necessary information for the query.
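
The Usage section shows that, when KorAPUrl is given, the defaults for query, vc and ql are read from the URL's q, cq and ql parameters via httr::parse_url(). A minimal sketch of that extraction, reusing the URL from the Examples section (it needs only the httr package and no server connection):

```r
library(httr)

# URL copied from the KorAP web frontend (taken from the Examples section)
korap_url <- "https://korap.ids-mannheim.de/?q=Ameise&cq=pubDate+since+2017&ql=poliqarp"

# parse_url() returns a list; its $query element holds the URL parameters
params <- parse_url(korap_url)$query

params$q   # default for the query argument
params$cq  # default for the vc argument (virtual corpus definition)
params$ql  # default for the ql argument (query language)
```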

metadataOnly

logical that determines whether queries should return only metadata without any snippets. This can also be useful to prevent access rewrites. Note that the default value is TRUE. If you want your corpus queries to return not only metadata but also KWICs, you need to authorize your RKorAPClient application, as explained in the authorization section of the RKorAPClient Readme on GitHub, and set the metadataOnly parameter to FALSE.

ql

string to choose the query language (see the section on Query Parameters in the Kustvakt Wiki for possible values).

accessRewriteFatal

abort if query or given vc had to be rewritten due to insufficient rights (not yet implemented).

verbose

print progress information if true

expand

logical that decides if query and vc parameters are expanded to all of their combinations
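
What "all of their combinations" means can be illustrated in base R. This is only a sketch of the two pairing modes, not the package's internal code: the search terms and virtual corpus definitions are made up, and the element-by-element pairing in the non-expanded case is an assumption.

```r
query <- c("Ameise", "Ameisenplage")      # illustrative search terms
vc    <- paste("pubDate in", 2000:2002)   # illustrative virtual corpora

# Default from the Usage section: expand when the lengths differ
expand <- length(vc) != length(query)

if (expand) {
  # every query is combined with every virtual corpus
  grid <- expand.grid(query = query, vc = vc, stringsAsFactors = FALSE)
} else {
  # assumption: queries and virtual corpora are paired element by element
  grid <- data.frame(query = query, vc = vc, stringsAsFactors = FALSE)
}

grid
nrow(grid)  # 2 queries x 3 virtual corpora
```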

as.df

return result as data frame instead of as S4 object?

context

string that specifies the size of the left and the right context returned in snippet (provided that metadataOnly is set to FALSE and that the necessary access rights are met). The format of the context size specification (e.g. 3-token,3-token) is described in the Service: Search GET documentation of the Kustvakt Wiki. If the parameter is not set, the default context size specification of the KorAP server instance will be used. Note that you cannot overrule the maximum context size set in the KorAP server instance, as this is typically legally motivated.

kqo

object obtained from corpusQuery()

offset

start offset for query results to fetch

maxFetch

maximum number of query results to fetch

randomizePageOrder

fetch result pages in pseudo random order if true. Use set.seed() to set seed for reproducible results.
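
The effect of set.seed() on a pseudorandom page order can be sketched in base R. The page count below is hypothetical, and sample() merely stands in for whatever shuffling the package performs internally.

```r
n_pages <- 10  # hypothetical number of result pages

set.seed(42)
order_a <- sample(seq_len(n_pages))

set.seed(42)
order_b <- sample(seq_len(n_pages))

# Same seed, same page order
identical(order_a, order_b)
```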

...

further arguments passed to or from other methods

conf.level

confidence level of the returned confidence interval (passed through ci() to prop.test()).
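
Since conf.level ends up in prop.test(), the kind of interval it controls can be previewed without a server connection. The counts below are hypothetical, and the sketch calls prop.test() directly rather than going through ci().

```r
hits   <- 150   # hypothetical number of query hits
tokens <- 1e6   # hypothetical size of the virtual corpus in tokens

test <- prop.test(hits, tokens, conf.level = 0.95)

test$estimate  # relative frequency, hits / tokens
test$conf.int  # lower and upper bound of the 95% confidence interval
```

A higher conf.level (e.g. 0.99) widens the interval; a lower one narrows it.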

as.alternatives

logical that specifies if the query terms should be treated as alternatives. If as.alternatives is TRUE, the sum over all query hits, instead of the respective vc token sizes, is used as the total for the calculation of relative frequencies.
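
The difference between the two totals can be shown with plain arithmetic. The hit counts and the corpus size are hypothetical; the sketch only mirrors the description above and does not call the package.

```r
hits      <- c(Muecke = 300, Schnake = 100)  # hypothetical hit counts
vc_tokens <- 2e6                             # hypothetical vc size in tokens

# as.alternatives = FALSE: the total is the token size of the virtual corpus
rel_freq_vc <- hits / vc_tokens

# as.alternatives = TRUE: the total is the sum over all query hits,
# so the relative frequencies are shares that add up to 1
rel_freq_alt <- hits / sum(hits)

rel_freq_vc
rel_freq_alt
```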

x

KorAPQuery object

object

KorAPQuery object

Value

Depending on the as.df parameter, a table or a KorAPQuery() object that, among other information, contains the total number of results in @totalResults. The resulting object can be used to fetch all query results (with fetchAll()) or the next page of results (with fetchNext()). A corresponding URL to be used within a web browser is contained in @webUIRequestUrl. Please make sure to check $collection$rewrites to see if any unforeseen access rewrites of the query's virtual corpus had to be performed.

For fetchNext(), fetchAll() and fetchRest(): the kqo input object with updated slots collectedMatches, apiResponse, nextStartIndex and hasMoreMatches.

References

https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/9026

See Also

KorAPConnection(), fetchNext(), fetchRest(), fetchAll(), corpusStats()

Examples

## Not run: 

# Fetch metadata of every query hit for "Ameisenplage" and show a summary
new("KorAPConnection") %>% corpusQuery("Ameisenplage") %>% fetchAll()

## End(Not run)

## Not run: 

# Use the copy of a KorAP-web-frontend URL for an API query of "Ameise" in a virtual corpus
# and show the number of query hits (but don't fetch them).

new("KorAPConnection", verbose = TRUE) %>%
  corpusQuery(KorAPUrl =
    "https://korap.ids-mannheim.de/?q=Ameise&cq=pubDate+since+2017&ql=poliqarp")

## End(Not run)

## Not run: 

# Plot the time/frequency curve of "Ameisenplage"
library(dplyr)  # mutate(), group_by(), summarise()
library(tidyr)  # complete()
new("KorAPConnection", verbose=TRUE) %>%
  { . ->> kco } %>%
  corpusQuery("Ameisenplage") %>%
  fetchAll() %>%
  slot("collectedMatches") %>%
  mutate(year = lubridate::year(pubDate)) %>%
  dplyr::select(year) %>%
  group_by(year) %>%
  summarise(Count = dplyr::n()) %>%
  mutate(Freq = mapply(function(f, y)
    f / corpusStats(kco, paste("pubDate in", y))@tokens, Count, year)) %>%
  dplyr::select(-Count) %>%
  complete(year = min(year):max(year), fill = list(Freq = 0)) %>%
  plot(type = "l")

## End(Not run)

## Not run: 

# Fetch the next page of results and show the matches collected so far
q <- new("KorAPConnection") %>% corpusQuery("Ameisenplage") %>% fetchNext()
q@collectedMatches

## End(Not run)

## Not run: 

# Fetch all results and show the collected matches
q <- new("KorAPConnection") %>% corpusQuery("Ameisenplage") %>% fetchAll()
q@collectedMatches

## End(Not run)

## Not run: 

# Fetch the remaining results and show the collected matches
q <- new("KorAPConnection") %>% corpusQuery("Ameisenplage") %>% fetchRest()
q@collectedMatches

## End(Not run)

## Not run: 

# Relative frequencies of "Mücke" and "Schnake" in the years 2000 to 2003
new("KorAPConnection", verbose = TRUE) %>%
  frequencyQuery(c("Mücke", "Schnake"), paste0("pubDate in ", 2000:2003))

## End(Not run)


RKorAPClient documentation built on Aug. 9, 2023, 1:07 a.m.