index: Index API operations


Description

Index API operations

Usage

index_get(index = NULL, features = NULL, raw = FALSE,
  verbose = TRUE, ...)

index_exists(index, ...)

index_delete(index, raw = FALSE, verbose = TRUE, ...)

index_create(index = NULL, body = NULL, raw = FALSE,
  verbose = TRUE, ...)

index_recreate(index = NULL, body = NULL, raw = FALSE,
  verbose = TRUE, ...)

index_close(index, ...)

index_open(index, ...)

index_stats(index = NULL, metric = NULL, completion_fields = NULL,
  fielddata_fields = NULL, fields = NULL, groups = NULL,
  level = "indices", ...)

index_settings(index = "_all", ...)

index_settings_update(index = NULL, body, ...)

index_segments(index = NULL, ...)

index_recovery(index = NULL, detailed = FALSE, active_only = FALSE,
  ...)

index_optimize(index = NULL, max_num_segments = NULL,
  only_expunge_deletes = FALSE, flush = TRUE, wait_for_merge = TRUE,
  ...)

index_forcemerge(index = NULL, max_num_segments = NULL,
  only_expunge_deletes = FALSE, flush = TRUE, ...)

index_upgrade(index = NULL, wait_for_completion = FALSE, ...)

index_analyze(text = NULL, field = NULL, index = NULL,
  analyzer = NULL, tokenizer = NULL, filters = NULL,
  char_filters = NULL, body = list(), ...)

index_flush(index = NULL, force = FALSE, full = FALSE,
  wait_if_ongoing = FALSE, ...)

index_clear_cache(index = NULL, filter = FALSE, filter_keys = NULL,
  fielddata = FALSE, query_cache = FALSE, id_cache = FALSE, ...)

Arguments

index

(character) A character vector of index names

features

(character) A single feature. One of settings, mappings, or aliases

raw

If FALSE (default), data is parsed to a list. If TRUE, raw JSON is returned.

verbose

If TRUE (default), the URL of the call used is printed to the console.

...

Curl args passed on to httr::POST(), httr::GET(), httr::PUT(), httr::HEAD(), or httr::DELETE()

body

Query, either a list or json.

metric

(character) A character vector of metrics to display. Possible values: "_all", "completion", "docs", "fielddata", "filter_cache", "flush", "get", "id_cache", "indexing", "merge", "percolate", "refresh", "search", "segments", "store", "warmer".

completion_fields

(character) A character vector of fields for completion metric (supports wildcards)

fielddata_fields

(character) A character vector of fields for fielddata metric (supports wildcards)

fields

(character) Fields to add.

groups

(character) A character vector of search groups for search statistics.

level

(character) Return stats aggregated on "cluster", "indices" (default) or "shards"

detailed

(logical) Whether to display detailed information about shard recovery. Default: FALSE

active_only

(logical) Display only those recoveries that are currently on-going. Default: FALSE

max_num_segments

(character) The number of segments the index should be merged into. Default: "dynamic"

only_expunge_deletes

(logical) Specify whether the operation should only expunge deleted documents

flush

(logical) Specify whether the index should be flushed after performing the operation. Default: TRUE

wait_for_merge

(logical) Specify whether the request should block until the merge process is finished. Default: TRUE

wait_for_completion

(logical) Should the request wait for the upgrade to complete. Default: FALSE

text

The text on which the analysis should be performed (when request body is not used)

field

Use the analyzer configured for this field (instead of passing the analyzer name)

analyzer

The name of the analyzer to use

tokenizer

The name of the tokenizer to use for the analysis

filters

A character vector of filters to use for the analysis

char_filters

A character vector of character filters to use for the analysis

force

(logical) Whether a flush should be forced even if it is not necessarily needed, i.e., if no changes will be committed to the index.

full

(logical) If TRUE, a new index writer is created and settings related to the index writer that have been changed will be refreshed.

wait_if_ongoing

If TRUE and another flush operation is already executing, the flush operation will block until it can be executed. The default is FALSE, which causes an exception to be thrown at the shard level if another flush operation is already running.

filter

(logical) Clear filter caches

filter_keys

(character) A vector of keys to clear when using the filter_cache parameter (default: all)

fielddata

(logical) Clear field data

query_cache

(logical) Clear query caches

id_cache

(logical) Clear ID caches for parent/child

Details

index_analyze: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-analyze.html This method can accept a string of text in the request body, but for simplicity this function passes it as a parameter in a GET request.

index_flush: https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-flush.html From the ES website: The flush process of an index basically frees memory from the index by flushing data to the index storage and clearing the internal transaction log. By default, Elasticsearch uses memory heuristics in order to automatically trigger flush operations as required in order to clear memory.

index_status: The API endpoint for this function was deprecated in Elasticsearch v1.2.0, and will likely be removed soon. Use index_recovery() instead.

index_settings_update: There are a lot of options you can change with this function. See https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-update-settings.html for all the options.
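As a minimal sketch of what such a request body can look like (the setting names below are standard Elasticsearch index settings, not specific to this function), a JSON body passed to index_settings_update might be:

```json
{
  "index": {
    "number_of_replicas": 1,
    "refresh_interval": "30s"
  }
}
```

The same body can equivalently be supplied as a nested R list, as shown in the Examples section.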

Author(s)

Scott Chamberlain [email protected]

References

https://www.elastic.co/guide/en/elasticsearch/reference/current/indices.html

Examples

## Not run: 
# get information on an index
index_get(index='shakespeare')
## this one is the same as running index_settings('shakespeare')
index_get(index='shakespeare', features='settings')
index_get(index='shakespeare', features='mappings')
index_get(index='shakespeare', features='aliases')

# check for index existence
index_exists(index='shakespeare')
index_exists(index='plos')

# create an index
if (index_exists('twitter')) index_delete('twitter')
index_create(index='twitter')
if (index_exists('things')) index_delete('things')
index_create(index='things')
if (index_exists('plos')) index_delete('plos')
index_create(index='plos')

# re-create an index
index_recreate("deer")
index_recreate("deer", verbose = FALSE)

# delete an index
if (index_exists('plos')) index_delete(index='plos')

## with a body
body <- '{
 "settings" : {
  "index" : {
    "number_of_shards" : 3,
    "number_of_replicas" : 2
   }
 }
}'
if (index_exists('alsothat')) index_delete('alsothat')
index_create(index='alsothat', body=body)

## with mappings
body <- '{
 "mappings": {
   "record": {
     "properties": {
       "location" : {"type" : "geo_point"}
      }
   }
 }
}'
if (!index_exists('gbifnewgeo')) index_create(index='gbifnewgeo', body=body)
gbifgeo <- system.file("examples", "gbif_geosmall.json", package = "elastic")
docs_bulk(gbifgeo)

# close an index
index_create('plos')
index_close('plos')

# open an index
index_open('plos')

# Get stats on an index
index_stats('plos')
index_stats(c('plos','gbif'))
index_stats(c('plos','gbif'), metric='refresh')
index_stats(metric = "indexing")
index_stats('shakespeare', metric='completion')
index_stats('shakespeare', metric='completion', completion_fields = "completion")
index_stats('shakespeare', metric='fielddata')
index_stats('shakespeare', metric='fielddata', fielddata_fields = "evictions")
index_stats('plos', level="indices")
index_stats('plos', level="cluster")
index_stats('plos', level="shards")

# Get segments information that a Lucene index (shard level) is built with
index_segments()
index_segments('plos')
index_segments(c('plos','gbif'))

# Get recovery information that provides insight into ongoing index shard recoveries
index_recovery()
index_recovery('plos')
index_recovery(c('plos','gbif'))
index_recovery("plos", detailed = TRUE)
index_recovery("plos", active_only = TRUE)

# Optimize an index, or many indices
if (gsub("\\.", "", ping()$version$number) < 500) {
  ### ES < v5 - use optimize
  index_optimize('plos')
  index_optimize(c('plos','gbif'))
} else {
  ### ES >= v5 - use forcemerge
  index_forcemerge('plos')
}

# Upgrade one or more indices to the latest format. The upgrade process
# converts segments written in previous formats to the latest format.
if (gsub("\\.", "", ping()$version$number) < 500) {
  index_upgrade('plos')
  index_upgrade(c('plos','gbif'))
}

# Perform the analysis process on some text and return the token
# breakdown of the text
index_analyze(text = 'this is a test', analyzer='standard')
index_analyze(text = 'this is a test', analyzer='whitespace')
index_analyze(text = 'this is a test', analyzer='stop')
index_analyze(text = 'this is a test', tokenizer='keyword', 
  filters='lowercase')
index_analyze(text = 'this is a test', tokenizer='keyword', filters='lowercase',
   char_filters='html_strip')
index_analyze(text = 'this is a test', index = 'plos')
index_analyze(text = 'this is a test', index = 'shakespeare')
index_analyze(text = 'this is a test', index = 'shakespeare', 
  config=verbose())

## NGram tokenizer
body <- '{
        "settings" : {
             "analysis" : {
                 "analyzer" : {
                     "my_ngram_analyzer" : {
                         "tokenizer" : "my_ngram_tokenizer"
                     }
                 },
                 "tokenizer" : {
                     "my_ngram_tokenizer" : {
                         "type" : "nGram",
                         "min_gram" : "2",
                         "max_gram" : "3",
                         "token_chars": [ "letter", "digit" ]
                     }
                 }
             }
      }
}'
if (index_exists("shakespeare2")) {
   index_delete("shakespeare2")
}
tokenizer_set(index = "shakespeare2", body=body)
index_analyze(text = "art thouh", index = "shakespeare2", 
  analyzer='my_ngram_analyzer')

# Explicitly flush one or more indices.
index_flush(index = "plos")
index_flush(index = "shakespeare")
index_flush(index = c("plos","shakespeare"))
index_flush(index = "plos", wait_if_ongoing = TRUE)
library('httr')
index_flush(index = "plos", config=verbose())

# Clear either all caches or specific caches associated with one or more indices.
index_clear_cache()
index_clear_cache(index = "plos")
index_clear_cache(index = "shakespeare")
index_clear_cache(index = c("plos","shakespeare"))
index_clear_cache(filter = TRUE)
library('httr')
index_clear_cache(config=verbose())

# Index settings
## get settings
index_settings()
index_settings("_all")
index_settings('gbif')
index_settings(c('gbif','plos'))
index_settings('*s')
## update settings
if (index_exists('foobar')) index_delete('foobar')
index_create("foobar")
settings <- list(index = list(number_of_replicas = 4))
index_settings_update("foobar", body = settings)
index_get("foobar")$foobar$settings

## End(Not run)

ropensci/elastic documentation built on Dec. 17, 2018, 11:08 a.m.