enigma_stats: Get statistics on columns of a dataset from Enigma.


View source: R/enigma_stats.r

Description

Get statistics on columns of a dataset from Enigma.

Usage

enigma_stats(dataset = NULL, select, conjunction = NULL, operation = NULL,
  by = NULL, of = NULL, limit = 500, search = NULL, where = NULL,
  sort = NULL, page = NULL, key = NULL, ...)

Arguments

dataset

Dataset name. Required.

select

(character) Column to get statistics on. Required.

conjunction

(character) One of "and" or "or". Only applicable when more than one search or where parameter is provided. Default: "and".

operation

(character) Operation to run on a given column. For a numerical column, valid operations are sum, avg, stddev, variance, max, min and frequency. For a date column, valid operations are max, min and frequency. For all other columns, the only valid operation is frequency. Defaults to all available operations based on the column's type.
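The rules above can be summarized as a small lookup. This is only a sketch: valid_operations is a hypothetical helper, not part of the enigma package, and the type labels "numeric" and "date" are names assumed here for illustration.

```r
# Hypothetical helper (not part of the enigma package): map a column type
# to the operations listed as valid for it in the operation argument.
valid_operations <- function(type) {
  switch(type,
    numeric = c("sum", "avg", "stddev", "variance", "max", "min", "frequency"),
    date = c("max", "min", "frequency"),
    "frequency"  # fallback for all other column types
  )
}

valid_operations("date")     # "max" "min" "frequency"
valid_operations("varchar")  # "frequency"
```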

by

(character) Compound operation to run on a given pair of columns. Valid compound operations are sum and avg. When running a compound operation query, the of parameter is required (see below).

of

(character) Numerical column to compare against when running a compound operation. Required when using the by parameter. Must be a numerical column.

limit

(numeric) Limit the number of frequency, compound sum, or compound average results returned. Max: 500; Default: 500.

search

(character) Filter results by returning only rows that match a search query. By default this searches the entire table for matching text. To search particular fields only, use the query format "@fieldname query". To match any of several queries, use the | (or) operator, e.g. "query1|query2".
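The two query formats described above are plain strings, so they can be assembled before being passed as the search argument. A minimal sketch follows; search_field and search_any are hypothetical helpers introduced here, not functions from the enigma package.

```r
# Hypothetical helpers (not part of the enigma package) that build strings
# in the query formats described for the search parameter.

# Restrict a query to one field: "@fieldname query"
search_field <- function(field, query) paste0("@", field, " ", query)

# Match any of several queries: "query1|query2"
search_any <- function(...) paste(c(...), collapse = "|")

search_field("fieldname", "query")  # "@fieldname query"
search_any("query1", "query2")      # "query1|query2"
```

The resulting string would then be supplied as, for example, search = search_any("query1", "query2").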

where

(character) Filter results with a SQL-style "where" clause. Only applies to numerical columns - use the search parameter for strings. Valid operators are >, < and =. Only one where clause per request is currently supported.

sort

(character) Sort frequency, compound sum, or compound average results in a given direction. + denotes ascending order, - denotes descending order.

page

(numeric) Paginate frequency, compound sum, or compound average results and return the nth page of results. Pages are calculated based on the current limit, which defaults to 500.
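To illustrate how page and limit interact, the sketch below computes which result positions a given page would cover. It assumes pages are numbered from 1 and that each page holds exactly limit results; page_range is a hypothetical helper, not part of the enigma package, and the numbering is an assumption rather than documented API behavior.

```r
# Hypothetical helper (not part of the enigma package).
# Assumes 1-based page numbers and `limit` results per page.
page_range <- function(page, limit = 500) {
  first <- (page - 1) * limit + 1
  c(first = first, last = page * limit)
}

page_range(1)              # first = 1,  last = 500
page_range(3, limit = 10)  # first = 21, last = 30
```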

key

(character) Required. An Enigma API key. Supply it in the function call, store it in your .Renviron file as ENIGMA_KEY=<your key>, or set it in your .Rprofile file as options(enigmaKey = "<your key>"). To obtain an API key, create an account with Enigma at http://enigma.io, then get the key from your account page.
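The sketch below shows the two storage options from within a running R session, using a placeholder key. The get_key helper and the order in which it checks the three sources are assumptions made for illustration; they are not taken from the package source.

```r
# Placeholder value for illustration only; substitute your own key.
Sys.setenv(ENIGMA_KEY = "your-key")  # session equivalent of a .Renviron entry
options(enigmaKey = "your-key")      # the same call you would place in .Rprofile

# Hypothetical lookup (not the package's own code): prefer an explicit
# argument, then the environment variable, then the R option.
get_key <- function(key = NULL) {
  if (!is.null(key)) return(key)
  env <- Sys.getenv("ENIGMA_KEY", unset = "")
  if (nzchar(env)) return(env)
  opt <- getOption("enigmaKey")
  if (is.null(opt)) stop("no Enigma API key found")
  opt
}

get_key()           # "your-key"
get_key("explicit") # "explicit"
```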

...

Named curl options passed on to HttpClient

Value

A list with items:

References

https://app.enigma.io/api#stats

Examples

## Not run: 
# After obtaining an API key from Enigma's website, either pass it to the
# function call via the key parameter or set it in your options (see the
# instructions for the key parameter above).

# stats on a varchar column
x <- 'gov.mx.imss.compras.main'
enigma_stats(x, select='provider_id', limit = 10)

# stats on a numeric column
enigma_stats(x, select='serialid', limit = 10)

# stats on a date column
pakistan <- 'gov.pk.secp.business-registry.all-entities'
enigma_metadata(dataset=pakistan)
enigma_stats(dataset=pakistan, select='registration_date', limit = 10)

# stats on a date column, by the average of a numeric column
aust <- 'gov.au.government-spending.federal-contracts'
enigma_metadata(dataset=aust)
enigma_stats(dataset=aust, select='contractstart', by='avg', of='value', 
  limit = 10)

# Get frequency of distances traveled
## get columns for the air carrier dataset
dset <- 'us.gov.dot.rita.trans-stats.air-carrier-statistics.t100d-market-all-carrier'
enigma_metadata(dset)$columns$table[, 1:4]
enigma_stats(dset, select='distance', limit = 10)

## End(Not run)

enigma documentation built on May 29, 2017, 12:28 p.m.