solr_mlt: Solr "more like this" search
In solr: General Purpose R Interface to Solr


Solr "more like this" search

solr_mlt(q = "*:*", fq = NULL, mlt.count = NULL, mlt.fl = NULL,
  mlt.mintf = NULL, mlt.mindf = NULL, mlt.minwl = NULL,
  mlt.maxwl = NULL, mlt.maxqt = NULL, mlt.maxntp = NULL,
  mlt.boost = NULL, mlt.qf = NULL, fl = NULL, wt = "json", start = 0,
  rows = NULL, key = NULL, base = NULL, callopts = list(),
  raw = FALSE, parsetype = "df", concat = ",", verbose = TRUE)

`q`	Query terms. Defaults to '*:*', which matches everything.
`fq`	Filter query. This does not affect the search itself, only which results are returned.
`mlt.count`	The number of similar documents to return for each result. Default is 5.
`mlt.fl`	The fields to use for similarity. NOTE: if possible, these should have a stored TermVector. DEFAULT_FIELD_NAMES = new String[] {"contents"}
`mlt.mintf`	Minimum Term Frequency - the frequency below which terms will be ignored in the source doc. DEFAULT_MIN_TERM_FREQ = 2
`mlt.mindf`	Minimum Document Frequency - words that do not occur in at least this many docs will be ignored. DEFAULT_MIN_DOC_FREQ = 5
`mlt.minwl`	Minimum word length, below which words will be ignored. DEFAULT_MIN_WORD_LENGTH = 0
`mlt.maxwl`	Maximum word length, above which words will be ignored. DEFAULT_MAX_WORD_LENGTH = 0
`mlt.maxqt`	Maximum number of query terms that will be included in any generated query. DEFAULT_MAX_QUERY_TERMS = 25
`mlt.maxntp`	Maximum number of tokens to parse in each example doc field that is not stored with TermVector support. DEFAULT_MAX_NUM_TOKENS_PARSED = 5000
`mlt.boost`	[true/false] Whether the query will be boosted by the interesting term relevance. DEFAULT_BOOST = false
`mlt.qf`	Query fields and their boosts using the same format as that used in DisMaxQParserPlugin. These fields must also be specified in mlt.fl.
`fl`	Fields to return. We force 'id' to be returned so that there is a unique identifier with each record.
`wt`	Data type returned, defaults to 'json'
`start`	Record to start at. Defaults to the beginning (0).
`rows`	Number of records to return. Defaults to 10.
`key`	API key, if needed.
`base`	URL endpoint.
`callopts`	Call options passed on to httr::GET
`raw`	(logical) If TRUE, returns raw data in format specified by wt param
`parsetype`	(character) One of 'list' or 'df'
`concat`	(character) Character used to concatenate elements of length greater than 1. Note that this only works reliably when the data format is JSON (wt='json'). Parsing is more complicated for XML, but you can do that on your own.
`verbose`	If TRUE (default), the URL used for the call is printed to the console.
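
To make the mapping from the arguments above to a Solr request more concrete, here is a minimal sketch in base R of how such arguments could be turned into a query string. The helper `build_mlt_query` is hypothetical and is not part of the solr package. The sketch assumes that arguments left as NULL are simply omitted so that the server-side defaults apply, and it relies on the fact that Solr's MoreLikeThis component is switched on with `mlt=true`.

```r
# Hypothetical helper, NOT part of the solr package: build a Solr
# "more like this" query string from a named list of arguments.
build_mlt_query <- function(args) {
  # Drop arguments left at NULL (assumed: Solr then uses its own defaults)
  args <- Filter(Negate(is.null), args)
  # Join multi-valued arguments (e.g. several fields) with commas
  vals <- vapply(args, function(x) paste(x, collapse = ","), character(1))
  # Percent-encode values so characters such as ':' and ',' are URL-safe
  vals <- vapply(vals, utils::URLencode, character(1), reserved = TRUE)
  paste(paste0(names(vals), "=", vals), collapse = "&")
}

query <- build_mlt_query(list(
  q = "ecology", mlt = "true", mlt.fl = "abstract",
  mlt.count = 2, mlt.mintf = NULL, fl = c("id", "score"), wt = "json"
))
print(query)
```

The resulting string would be appended to the `base` URL after a `?`. The real function may assemble and encode its request differently, for example by handing a list of parameters to httr::GET.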

XML, JSON, a list, or data.frame
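
As a rough illustration of what `parsetype = "df"` together with `concat` implies for the returned data.frame, the sketch below collapses multi-valued fields into single strings before binding documents into rows. The `docs` list and the `collapse_fields` helper are made up for this example. The package's actual parsing code may differ.

```r
# Hypothetical parsed documents: 'author' holds more than one value in doc1
docs <- list(
  list(id = "doc1", author = c("Smith", "Jones")),
  list(id = "doc2", author = "Lee")
)

# Hypothetical helper: collapse every field to a single string using the
# concat character, so each document fits in one data.frame row
collapse_fields <- function(doc, concat = ",") {
  lapply(doc, function(x) paste(x, collapse = concat))
}

# One single-row data.frame per document, then stack them
rows <- lapply(docs, function(d) {
  as.data.frame(collapse_fields(d), stringsAsFactors = FALSE)
})
df <- do.call(rbind, rows)
print(df)
```

With `parsetype = "list"` no such flattening would be needed, since a list can hold fields of different lengths as they are.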

See http://wiki.apache.org/solr/MoreLikeThis for more information.

## Not run: 
url <- 'http://api.plos.org/search'

solr_mlt(q='*:*', mlt.count=2, mlt.fl='abstract', fl='score', base=url,
   fq="doc_type:full")
solr_mlt(q='*:*', rows=2, mlt.fl='title', mlt.mindf=1, mlt.mintf=1, fl='alm_twitterCount',
   base=url)
solr_mlt(q='title:"ecology" AND body:"cell"', mlt.fl='title', mlt.mindf=1, mlt.mintf=1,
   fl='counter_total_all', rows=5, base=url)
solr_mlt(q='ecology', mlt.fl='abstract', fl='title', rows=5, base=url)
solr_mlt(q='ecology', mlt.fl='abstract', fl=c('score','eissn'), rows=5, base=url)

# get raw data, and parse later if needed
out <- solr_mlt(q='ecology', mlt.fl='abstract', fl='title', rows=2, base=url,
   raw=TRUE)
library(rjson)
fromJSON(out)
solr_parse(out, "df")

## End(Not run)