solr_all: All purpose search
In ropensci/solrium: General Purpose R Interface to 'Solr'

Description

Runs a search and returns documents, facets, groups, mlt, stats, and highlights

Usage

solr_all(
  conn,
  name = NULL,
  params = NULL,
  body = NULL,
  callopts = list(),
  raw = FALSE,
  parsetype = "df",
  concat = ",",
  optimizeMaxRows = TRUE,
  minOptimizedRows = 50000L,
  progress = NULL,
  ...
)

Arguments

`conn`	A solrium connection object, see SolrClient
`name`	Name of a collection or core. Or leave as `NULL` if not needed.
`params`	(list) a named list of parameters; results in a GET request as long as no body parameters are given
`body`	(list) a named list of parameters; if given, a POST request is performed
`callopts`	Call options passed on to crul::HttpClient
`raw`	(logical) If TRUE, returns raw data in format specified by wt param
`parsetype`	(character) One of 'list' or 'df'
`concat`	(character) Character used to concatenate elements longer than length 1. Note that this only works reliably when the data format is json (wt='json'). The parsing is more complicated in XML format, but you can do that on your own.
`optimizeMaxRows`	(logical) If `TRUE`, the rows parameter is adjusted to the number of results returned for the same constraints. It is only applied if the rows parameter is higher than `minOptimizedRows`. Default: `TRUE`
`minOptimizedRows`	(numeric) Used by the `optimizeMaxRows` parameter; the minimum optimized rows. Default: 50000
`progress`	A function with logic for printing a progress bar for an HTTP request, ultimately passed down to curl. Only supports `httr::progress` for now. See the README for an example.
`...`	Further args to be combined into query
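
To make the `concat` argument concrete, here is a minimal sketch in plain R of what collapsing multi-valued fields looks like. The helper `collapse_multivalued()` and the sample document are hypothetical and written only for illustration; this is not solrium's internal parsing code.

```r
# Illustrative helper (hypothetical, not part of solrium): collapse every
# element longer than length 1 into a single string using `concat`.
collapse_multivalued <- function(doc, concat = ",") {
  lapply(doc, function(x) {
    if (length(x) > 1) paste(x, collapse = concat) else x
  })
}

# A made-up document with one multi-valued field
doc <- list(id = "doc-1", author = c("A. Smith", "B. Jones"))

# Every field now has length 1
flat <- collapse_multivalued(doc, concat = ",")

# With single-valued fields the document fits into one data.frame row
row <- as.data.frame(flat, stringsAsFactors = FALSE)
```

Collapsing to one value per field is what lets each document occupy a single row when `parsetype = "df"`.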

Value

XML, JSON, a list, or data.frame

Parameters

q Query terms, defaults to '*:*', or everything.
sort Field to sort on. You can specify ascending (e.g., score asc) or descending (e.g., score desc), sort by two fields (e.g., score desc, price asc), or sort by a function (e.g., sum(x_f, y_f) desc, which sorts by the sum of x_f and y_f in descending order).
start Record to start at, defaults to the beginning.
rows Number of records to return. Default: 10.
pageDoc If you expect to be paging deeply into the results (say beyond page 10, assuming rows=10) and you are sorting by score, you may wish to add the pageDoc and pageScore parameters to your request. These two parameters tell Solr (and Lucene) what the last result (Lucene internal docid and score) of the previous page was, so that when scoring the query for the next set of pages, it can ignore any results that occur higher than that item. To get the Lucene internal doc id, you will need to add [docid] to the &fl list.
pageScore See pageDoc notes.
fq Filter query; this does not affect the search, only what gets returned. This parameter can accept multiple items in a list or vector. You can't pass more than one parameter of the same name, so we get around it by passing multiple queries and parsing them internally.
fl Fields to return; can be a character vector like c('id', 'title'), or a single string with one or more comma-separated names, like 'id,title'
defType Specify the query parser to use with this request.
timeAllowed The time allowed for a search to finish. This value only applies to the search and not to requests in general. Time is in milliseconds. Values <= 0 mean no time restriction. Partial results may be returned (if there are any).
qt Which query handler to use. Options: dismax, others?
NOW Set a fixed time for evaluating date-based expressions
TZ Time zone, you can override the default.
echoHandler If TRUE, Solr places the name of the handler used in the response to the client for debugging purposes. Default:
echoParams The echoParams parameter tells Solr what kinds of Request parameters should be included in the response for debugging purposes, legal values include:
- none - don't include any request parameters for debugging
- explicit - include the parameters explicitly specified by the client in the request
- all - include all parameters involved in this request, either specified explicitly by the client, or implicit because of the request handler configuration.
wt (character) One of json, xml, or csv. Data type returned, defaults to 'csv'. If json, uses jsonlite::fromJSON() to parse. If xml, uses xml2::read_xml() to parse. If csv, uses read.table() to parse. wt=csv gave the fastest performance in all the cases we have tested, so it is the default value for wt.
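
The `fq` and `fl` notes above can be illustrated with a small sketch in plain R. The helper `expand_params()` is hypothetical and only shows one plausible way a named list with vector values could be turned into query-string pairs (a repeated `fq`, a comma-joined `fl`); it is not solrium's actual implementation, it skips URL encoding, and the second filter query is a made-up placeholder.

```r
# Hypothetical helper, for illustration only: flatten a named list of
# parameters into "name=value" pairs without URL encoding.
expand_params <- function(params) {
  # fl: collapse a character vector into one comma-separated value
  if (!is.null(params$fl)) params$fl <- paste(params$fl, collapse = ",")
  # any other multi-valued entry (such as fq) is repeated under the same name
  values <- unlist(lapply(params, as.character), use.names = FALSE)
  keys <- rep(names(params), lengths(params))
  paste0(keys, "=", values)
}

params <- list(
  q = "ecology",
  fq = c("doc_type:full", "some_field:value"),  # second filter is a placeholder
  fl = c("id", "score"),
  rows = 5
)

pairs <- expand_params(params)

# The same list would be passed to solr_all() on a live connection (not run):
# cli <- SolrClient$new(host = "api.plos.org", path = "search", port = NULL)
# solr_all(cli, params = params)
```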

References

See https://lucene.apache.org/solr/guide/8_2/searching.html for more information.

Examples

## Not run: 
# connect
(cli <- SolrClient$new(host = "api.plos.org", path = "search", port = NULL))

solr_all(cli, params = list(q='*:*', rows=2, fl='id'))

# facets
solr_all(cli, params = list(q='*:*', rows=2, fl='id', facet="true",
  facet.field="journal"))

# mlt
solr_all(cli, params = list(q='ecology', rows=2, fl='id', mlt='true',
  mlt.count=2, mlt.fl='abstract'))

# facets and mlt
solr_all(cli, params = list(q='ecology', rows=2, fl='id', facet="true",
  facet.field="journal", mlt='true', mlt.count=2, mlt.fl='abstract'))

# stats
solr_all(cli, params = list(q='ecology', rows=2, fl='id', stats='true',
  stats.field='counter_total_all'))

# facets, mlt, and stats
solr_all(cli, params = list(q='ecology', rows=2, fl='id', facet="true",
  facet.field="journal", mlt='true', mlt.count=2, mlt.fl='abstract',
  stats='true', stats.field='counter_total_all'))

# group
solr_all(cli, params = list(q='ecology', rows=2, fl='id', group='true',
 group.field='journal', group.limit=3))

# facets, mlt, stats, and groups
solr_all(cli, params = list(q='ecology', rows=2, fl='id', facet="true",
 facet.field="journal", mlt='true', mlt.count=2, mlt.fl='abstract',
 stats='true', stats.field='counter_total_all', group='true',
 group.field='journal', group.limit=3))

# using wt = xml
solr_all(cli, params = list(q='*:*', rows=50, fl=c('id','score'),
  fq='doc_type:full', wt="xml"), raw=TRUE)

## End(Not run)
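
The `start` and `rows` parameters can be combined to page through results. The sketch below only builds the per-page parameter lists and does not contact a server. The helper `page_starts()` and the total of 25 records are assumptions made for illustration; the commented-out call reuses the connection settings from the examples above.

```r
# Hypothetical helper: zero-based start offsets needed to fetch `total`
# records, `rows` at a time.
page_starts <- function(total, rows = 10) {
  if (total <= 0) return(numeric(0))
  seq(0, total - 1, by = rows)
}

# Assume 25 matching records and pages of 10
starts <- page_starts(total = 25, rows = 10)

# One parameter list per page
page_params <- lapply(starts, function(s) {
  list(q = "ecology", fl = "id", start = s, rows = 10)
})

# With a live connection, each page could then be requested (not run):
# cli <- SolrClient$new(host = "api.plos.org", path = "search", port = NULL)
# pages <- lapply(page_params, function(p) solr_all(cli, params = p))
```

For deep paging while sorting by score, the pageDoc and pageScore parameters described under Parameters may be a better fit than large `start` values.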

ropensci/solrium documentation built on Sept. 12, 2022, 3:01 p.m.