Returns only matched documents, and doesn't return other items, including facets, groups, mlt, stats, and highlights.
conn | A solrium connection object, see SolrClient
name | Name of a collection or core. Or leave as
params | (list) A named list of parameters. Results in a GET request as long as no body parameters are given
body | (list) A named list of parameters. If given, a POST request is performed
callopts | Call options passed on to crul::HttpClient
raw | (logical) If TRUE, returns raw data in the format specified by the wt parameter
parsetype | (character) One of 'list' or 'df'
concat | (character) Character used to concatenate elements of length greater than 1. Note that this only works reliably when the data format is JSON (wt='json'). Parsing is more complicated in XML format, but you can do that on your own.
optimizeMaxRows | (logical) If
minOptimizedRows | (numeric) used by
progress | A function with logic for printing a progress bar for an HTTP request, ultimately passed down to curl. Only supports
... | Further args to be combined into query
XML, JSON, a list, or data.frame
q Query terms, defaults to '*:*', or everything.
sort Field to sort on. You can specify ascending (e.g., score asc) or descending (e.g., score desc), sort by two fields (e.g., score desc, price asc), or sort by a function (e.g., sum(x_f, y_f) desc, which sorts by the sum of x_f and y_f in descending order).
start Record to start at, defaults to the beginning.
rows Number of records to return. Default: 10.
pageDoc If you expect to be paging deeply into the results (say beyond page 10, assuming rows=10) and you are sorting by score, you may wish to add the pageDoc and pageScore parameters to your request. These two parameters tell Solr (and Lucene) what the last result (Lucene internal docid and score) of the previous page was, so that when scoring the query for the next set of pages, it can ignore any results that occur higher than that item. To get the Lucene internal doc id, you will need to add [docid] to the &fl list.
pageScore See pageDoc notes.
fq Filter query. This does not affect how the search is scored, only which documents are returned. This parameter can accept multiple items in a list or vector. You can't pass more than one parameter of the same name, so we get around that by accepting multiple queries in one value and parsing them internally.
fl Fields to return. Can be a character vector like c('id', 'title'), or a single character string with one or more comma-separated names, like 'id,title'.
defType Specify the query parser to use with this request.
timeAllowed The time allowed for a search to finish. This value only applies
to the search and not to requests in general. Time is in milliseconds. Values <= 0
mean no time restriction. Partial results may be returned (if there are any).
qt Which query handler to use. Options: dismax, others?
NOW Set a fixed time for evaluating date-based expressions.
TZ Time zone, you can override the default.
echoHandler If TRUE, Solr places the name of the handler used in the response to the client for debugging purposes. Default:
echoParams The echoParams parameter tells Solr what kinds of Request parameters should be included in the response for debugging purposes, legal values include:
none - don't include any request parameters for debugging
explicit - include the parameters explicitly specified by the client in the request
all - include all parameters involved in this request, either specified explicitly by the client, or implicit because of the request handler configuration.
wt (character) One of json, xml, or csv. Data type returned, defaults to 'csv'. If json, uses jsonlite::fromJSON() to parse. If xml, uses xml2::read_xml() to parse. If csv, uses read.table() to parse. wt=csv gives the fastest performance, at least in all the cases we have tested, thus it's the default value for wt.
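The fl and fq descriptions above say that vectors and lists are accepted and handled internally. The following is only a sketch of that idea in base R, not solrium's actual implementation: expand_params() is a hypothetical helper that collapses fl into one comma-separated string and repeats the parameter name for each fq value.

```r
# Hypothetical helper (not part of solrium): turn a named list of
# parameters into name/value pairs, one row per value sent to Solr.
expand_params <- function(params) {
  # fl may be a character vector; collapse it into "id,title" form
  if (!is.null(params[["fl"]])) {
    params[["fl"]] <- paste(params[["fl"]], collapse = ",")
  }
  # flatten each entry to a character vector (fq may hold several queries)
  values <- lapply(params, function(x) as.character(unlist(x)))
  data.frame(
    name = rep(names(values), lengths(values)),
    value = unlist(values, use.names = FALSE),
    stringsAsFactors = FALSE
  )
}

pairs <- expand_params(list(
  q = "*:*",
  fl = c("id", "title"),
  fq = list("doc_type:full", "alm_twitterCount:[100 TO 10000]"),
  rows = 5
))
pairs
```

Here fq appears twice in the name column, which mirrors how repeated fq parameters would appear in a query string.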
Because solr_search() returns a data.frame, metadata doesn't fit into the output data.frame itself. You can access the number of results (numFound) in the attributes of the results. For example, use attr(x, "numFound") for the number of results, and attr(x, "start") for the offset value (if one was given). Or you can get all attributes with attributes(x). These metadata are not in the attributes when raw=TRUE, as they are in the payload (unless wt="csv").
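As a rough illustration of reading that metadata, the sketch below uses a mock data.frame in place of a real solr_search() result, so it runs without a Solr server. The ids and the numFound value are made up, and the page calculation assumes rows = 10.

```r
# Mock object standing in for the data.frame returned by solr_search();
# the ids and attribute values are invented for illustration.
x <- data.frame(id = c("doc1", "doc2"), stringsAsFactors = FALSE)
attr(x, "numFound") <- 25L
attr(x, "start") <- 0L

attr(x, "numFound")   # total number of matching documents
attr(x, "start")      # offset of the first returned row
names(attributes(x))  # all attribute names, including the standard ones

# Number of pages needed to fetch everything, assuming rows = 10
rows <- 10
total_pages <- ceiling(attr(x, "numFound") / rows)
total_pages
```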
Solr v1.2 was the first version to support csv. See https://issues.apache.org/jira/browse/SOLR-66
See https://lucene.apache.org/solr/guide/8_2/searching.html for more information.
solr_highlight(), solr_facet()
## Not run:
# Connect to a local Solr instance
(cli <- SolrClient$new())
cli$search("gettingstarted", params = list(q = "features:notes"))
solr_search(cli, "gettingstarted")
solr_search(cli, "gettingstarted", params = list(q = "features:notes"))
solr_search(cli, "gettingstarted", body = list(query = "features:notes"))
(cli <- SolrClient$new(host = "api.plos.org", path = "search", port = NULL))
cli$search(params = list(q = "*:*"))
cli$search(params = list(q = "title:golgi", fl = c('id', 'title')))
cli$search(params = list(q = "*:*", facet = "true"))
# search
solr_search(cli, params = list(q='*:*', rows=2, fl='id'))
# search and return all rows
solr_search(cli, params = list(q='*:*', rows=-1, fl='id'))
# Search for word ecology in title and cell in the body
solr_search(cli, params = list(q='title:"ecology" AND body:"cell"',
fl='title', rows=5))
# Search for word "cell" and not "lines" in the title field
solr_search(cli, params = list(q='title:"cell" -title:"lines"', fl='title',
rows=5))
# Wildcards
## Search for word that starts with "cell" in the title field
solr_search(cli, params = list(q='title:"cell*"', fl='title', rows=5))
# Proximity searching
## Search for words "sports" and "alcohol" within seven words of each other
solr_search(cli, params = list(q='everything:"sports alcohol"~7',
fl='abstract', rows=3))
# Range searches
## Search for articles with Twitter count between 5 and 50
solr_search(cli, params = list(q='*:*', fl=c('alm_twitterCount','id'),
fq='alm_twitterCount:[5 TO 50]', rows=10))
# Boosts
## Assign higher boost to title matches than to abstract matches
## (compare the two calls)
solr_search(cli, params = list(q='title:"cell" abstract:"science"',
fl='title', rows=3))
solr_search(cli, params = list(q='title:"cell"^1.5 AND abstract:"science"',
fl='title', rows=3))
# FunctionQuery queries
## This kind of query allows you to use the actual values of fields to
## calculate relevancy scores for returned documents
## Here, we search on the product of counter_total_all and alm_twitterCount
## metrics for articles in PLOS Journals
solr_search(cli, params = list(q="{!func}product($v1,$v2)",
v1 = 'sqrt(counter_total_all)',
v2 = 'log(alm_twitterCount)', rows=5, fl=c('id','title'),
fq='doc_type:full'))
## here, search on the product of counter_total_all and alm_twitterCount,
## using a new temporary field "_val_"
solr_search(cli,
params = list(q='_val_:"product(counter_total_all,alm_twitterCount)"',
rows=5, fl=c('id','title'), fq='doc_type:full'))
## papers with most citations
solr_search(cli, params = list(q='_val_:"max(counter_total_all)"',
rows=5, fl=c('id','counter_total_all'), fq='doc_type:full'))
## papers with most tweets
solr_search(cli, params = list(q='_val_:"max(alm_twitterCount)"',
rows=5, fl=c('id','alm_twitterCount'), fq='doc_type:full'))
## many fq values
solr_search(cli, params = list(q="*:*", fl=c('id','alm_twitterCount'),
fq=list('doc_type:full','subject:"Social networks"',
'alm_twitterCount:[100 TO 10000]'),
sort='counter_total_month desc'))
## using wt = csv
solr_search(cli, params = list(q='*:*', rows=50, fl=c('id','score'),
fq='doc_type:full', wt="csv"))
solr_search(cli, params = list(q='*:*', rows=50, fl=c('id','score'),
fq='doc_type:full'))
# using a proxy
# cli <- SolrClient$new(host = "api.plos.org", path = "search", port = NULL,
# proxy = list(url = "http://186.249.1.146:80"))
# solr_search(cli, params = list(q='*:*', rows=2, fl='id'),
#   callopts=list(verbose=TRUE))
# Pass on curl options to modify request
## verbose
solr_search(cli, params = list(q='*:*', rows=2, fl='id'),
callopts = list(verbose=TRUE))
# using a cursor for deep paging
(cli <- SolrClient$new(host = "api.plos.org", path = "search", port = NULL))
## json, raw data
res <- solr_search(cli, params = list(q = '*:*', rows = 100, sort = "id asc", cursorMark = "*"),
parsetype = "json", raw = TRUE, callopts=list(verbose=TRUE))
res
## data.frame
res <- solr_search(cli, params = list(q = '*:*', rows = 100, sort = "id asc", cursorMark = "*"))
res
attributes(res)
attr(res, "nextCursorMark")
## list
res <- solr_search(cli, params = list(q = '*:*', rows = 100, sort = "id asc", cursorMark = "*"),
parsetype = "list")
res
attributes(res)
attr(res, "nextCursorMark")
## End(Not run)
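The cursor examples above fetch a single page. A minimal sketch of looping over pages follows, assuming the nextCursorMark attribute shown above and Solr's convention that the cursor stops changing once results are exhausted. fetch_all() and fake_fetch() are hypothetical helpers, not solrium functions; fake_fetch() stands in for a real request so the loop can run offline.

```r
# Hypothetical helper: keep requesting pages until the cursor stops moving.
# fetch_page is any function(cursor) that returns a data.frame carrying a
# "nextCursorMark" attribute.
fetch_all <- function(fetch_page, max_pages = 10) {
  cursor <- "*"
  pages <- list()
  for (i in seq_len(max_pages)) {
    res <- fetch_page(cursor)
    next_cursor <- attr(res, "nextCursorMark")
    if (nrow(res) > 0) pages[[length(pages) + 1]] <- res
    # an unchanged (or missing) cursor means there is nothing left to fetch
    if (is.null(next_cursor) || identical(next_cursor, cursor)) break
    cursor <- next_cursor
  }
  do.call(rbind, pages)
}

# Fake fetcher over five made-up ids, two per page. With a live server
# this could instead be something like:
# function(cursor) solr_search(cli, params = list(q = "*:*", rows = 100,
#   sort = "id asc", cursorMark = cursor))
fake_fetch <- function(cursor) {
  ids <- sprintf("doc%d", 1:5)
  done <- if (identical(cursor, "*")) 0L else as.integer(cursor)
  idx <- seq_along(ids)
  idx <- idx[idx > done & idx <= done + 2L]
  res <- data.frame(id = ids[idx], stringsAsFactors = FALSE)
  attr(res, "nextCursorMark") <- if (length(idx)) as.character(max(idx)) else cursor
  res
}

all_docs <- fetch_all(fake_fetch)
all_docs$id
```

The max_pages cap is a safety limit chosen for this sketch; a real loop over a large index would need a larger limit or none.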