chomp_hits: Hits to data.tables


View source: R/chomp_hits.R

Description

Converts Elasticsearch hits into an R data.table. It parses the JSON with fromJSON using flatten = TRUE to produce an R data.frame, then converts that data.frame into a data.table.
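The parsing step described above can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: it assumes `fromJSON` is the one from the jsonlite package, uses made-up sample data, and strips the `_source.` prefix only because the example output on this page shows column names without it.

```r
library(jsonlite)
library(data.table)

# Hypothetical two-hit payload, much smaller than a real search response
hits <- '[{"_source":{"name":"Austin","details":{"city":"chicago"}}},
          {"_source":{"name":"Nick","details":{"city":"cannes"}}}]'

# flatten = TRUE turns nested objects into dot-separated column names
flat_df <- jsonlite::fromJSON(hits, flatten = TRUE)

# Convert the data.frame to a data.table
flat_dt <- data.table::as.data.table(flat_df)

# Assumption: drop the "_source." prefix so names match the example output style
data.table::setnames(flat_dt, gsub("^_source\\.", "", names(flat_dt)))

print(flat_dt)
```

In this sketch the nested `details` object becomes a `details.city` column, mirroring the `details.cust_class` and `details.location` columns in the example output.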

Usage

chomp_hits(hits_json = NULL, keep_nested_data_cols = TRUE)

Arguments

hits_json

A character vector. If its length is greater than 1, its elements will be pasted together. It can contain the JSON returned by a search query in Elasticsearch, or a file path or URL pointing to such JSON.
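To illustrate why a multi-element character vector is acceptable, the sketch below joins the pieces before parsing. The separator is an assumption (the text above only says the elements are "pasted together"); an empty string is used here, and the sample data is hypothetical.

```r
library(jsonlite)

# A JSON document that arrived split across two vector elements
parts <- c('[{"_source":{"a":1}},', '{"_source":{"a":2}}]')

# Assumption: join with no separator so the pieces form one JSON string
joined <- paste(parts, collapse = "")

# The joined string is valid JSON and parses to one row per hit
parsed <- jsonlite::fromJSON(joined, flatten = TRUE)
print(parsed)
```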

keep_nested_data_cols

A boolean (default TRUE) indicating whether to keep columns that are nested arrays in the original JSON. A warning is given if these columns are deleted.
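One way to see which columns count as nested is to look for list columns after parsing. The snippet below is a hedged sketch using jsonlite and data.table directly with hypothetical data; it is not necessarily how chomp_hits detects or drops these columns internally.

```r
library(jsonlite)
library(data.table)

# Hypothetical hits in which "purchases" is an array of objects
hits <- '[{"_source":{"name":"Austin","purchases":[{"film":"Zootopia"}]}},
          {"_source":{"name":"Nick","purchases":[{"film":"Minions"},{"film":"Rogue One"}]}}]'

dt <- data.table::as.data.table(jsonlite::fromJSON(hits, flatten = TRUE))

# Nested arrays survive flattening as list columns
nested_cols <- names(dt)[vapply(dt, is.list, logical(1))]
print(nested_cols)

# Roughly what dropping them would look like (on a copy, to keep dt intact)
dropped <- data.table::copy(dt)
if (length(nested_cols) > 0) {
  dropped[, (nested_cols) := NULL]
}
print(names(dropped))
```

Keeping the nested columns, as the default does, leaves them available for a later call to unpack_nested_data.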

Examples

# A sample raw result from a hits query:
result <- '[{"_source":{"timestamp":"2017-01-01","cust_name":"Austin","details":{
"cust_class":"big_spender","location":"chicago","pastPurchases":[{"film":"The Notebook",
"pmt_amount":6.25},{"film":"The Town","pmt_amount":8.00},{"film":"Zootopia","pmt_amount":7.50,
"matinee":true}]}}},{"_source":{"timestamp":"2017-02-02","cust_name":"James","details":{
"cust_class":"peasant","location":"chicago","pastPurchases":[{"film":"Minions",
"pmt_amount":6.25,"matinee":true},{"film":"Rogue One","pmt_amount":10.25},{"film":"Bridesmaids",
"pmt_amount":8.75},{"film":"Bridesmaids","pmt_amount":6.25,"matinee":true}]}}},{"_source":{
"timestamp":"2017-03-03","cust_name":"Nick","details":{"cust_class":"critic","location":"cannes",
"pastPurchases":[{"film":"Aala Kaf Ifrit","pmt_amount":0,"matinee":true},{
"film":"Dopo la guerra (Apres la Guerre)","pmt_amount":0,"matinee":true},{
"film":"Avengers: Infinity War","pmt_amount":12.75}]}}}]'

# Chomp into a data.table
sampleChompedDT <- chomp_hits(hits_json = result, keep_nested_data_cols = TRUE)
print(sampleChompedDT)

# (Note: use es_search() to get here in one step)

# Unpack by details.pastPurchases
unpackedDT <- unpack_nested_data(chomped_df = sampleChompedDT
                                 , col_to_unpack = "details.pastPurchases")
print(unpackedDT)

Example output

INFO [2018-04-26 10:26:16] Keeping the following nested data columns. Consider using unpack_nested_data for one:
 details.pastPurchases
    timestamp cust_name details.cust_class details.location
1: 2017-01-01    Austin        big_spender          chicago
2: 2017-02-02     James            peasant          chicago
3: 2017-03-03      Nick             critic           cannes
   details.pastPurchases
1:          <data.frame>
2:          <data.frame>
3:          <data.frame>
                                film pmt_amount matinee  timestamp cust_name
 1:                     The Notebook       6.25      NA 2017-01-01    Austin
 2:                         The Town       8.00      NA 2017-01-01    Austin
 3:                         Zootopia       7.50    TRUE 2017-01-01    Austin
 4:                          Minions       6.25    TRUE 2017-02-02     James
 5:                        Rogue One      10.25      NA 2017-02-02     James
 6:                      Bridesmaids       8.75      NA 2017-02-02     James
 7:                      Bridesmaids       6.25    TRUE 2017-02-02     James
 8:                   Aala Kaf Ifrit       0.00    TRUE 2017-03-03      Nick
 9: Dopo la guerra (Apres la Guerre)       0.00    TRUE 2017-03-03      Nick
10:           Avengers: Infinity War      12.75      NA 2017-03-03      Nick
    details.cust_class details.location
 1:        big_spender          chicago
 2:        big_spender          chicago
 3:        big_spender          chicago
 4:            peasant          chicago
 5:            peasant          chicago
 6:            peasant          chicago
 7:            peasant          chicago
 8:             critic           cannes
 9:             critic           cannes
10:             critic           cannes

uptasticsearch documentation built on Sept. 12, 2019, 1:04 a.m.