druid.query.timeseries: Query time series data

Description

Queries Druid for timeseries data and returns the result as a data frame.

Usage

druid.query.timeseries(url = druid.url(), dataSource, intervals, aggregations,
  filter = NULL, granularity = "all", postAggregations = NULL,
  context = NULL, rawData = FALSE, verbose = F, ...)

Arguments

url

URL used to connect to Druid; defaults to druid.url()

dataSource

name of the data source to query

intervals

time period to retrieve data for, given as an interval object or a list of interval objects

aggregations

list of metric aggregations to compute for this datasource

filter

filter specifying the subset of the data to extract

granularity

time granularity at which to aggregate

postAggregations

post-aggregations to perform on the aggregations

context

query context

rawData

if TRUE, returns the result object as is, without converting it to a data frame

verbose

if TRUE, prints the JSON query sent to Druid

...

additional parameters to pass to druid.resulttodf
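
Druid's native query format expresses each interval as an ISO 8601 "start/end" string. The following is a minimal base R sketch of that representation, useful when reading the JSON printed by verbose. The helper iso_interval is hypothetical and not part of RDruid, and the exact string RDruid emits (for example, millisecond precision) may differ.

```r
# Hypothetical helper (not part of RDruid): format a start/end pair as an
# ISO 8601 interval string of the form "start/end", rendered in UTC.
iso_interval <- function(start, end, tz = "UTC") {
  fmt   <- "%Y-%m-%dT%H:%M:%SZ"
  start <- as.POSIXct(start, tz = tz)
  end   <- as.POSIXct(end, tz = tz)
  stopifnot(start < end)  # an interval must run forward in time
  paste(format(start, fmt, tz = "UTC"),
        format(end,   fmt, tz = "UTC"),
        sep = "/")
}

iso_interval("2012-07-01", "2012-07-15")
# "2012-07-01T00:00:00Z/2012-07-15T00:00:00Z"
```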

Value

Returns a data frame where each column represents a time series
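
To make the shape of the result concrete, here is a minimal base R sketch that builds a stand-in data frame by hand rather than querying Druid. The column names (timestamp, count, length) are assumptions chosen to mirror the aggregations in the examples, and the values are made up for illustration; they are not query output.

```r
# Stand-in for a query result; names and values are illustrative assumptions.
result <- data.frame(
  timestamp = as.POSIXct("2012-07-01", tz = "UTC") + 3600 * 0:2,  # hourly rows
  count     = c(10, 20, 40),
  length    = c(500, 1200, 2000)
)

# Same arithmetic as a post-aggregation such as
# field("length") / field("count"), computed locally instead of in Druid.
result$avg_length <- result$length / result$count

# Every column other than the timestamp can be treated as its own series.
series <- setdiff(names(result), "timestamp")
```

Computing the ratio in a post-aggregation keeps the work on the Druid side; doing it locally, as sketched here, may be convenient when the raw sums are needed as well.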

See Also

druid.query.groupBy, druid.query.topN, granularity

Examples

## Not run: 

   # interval() and ymd() come from the lubridate package
   library(RDruid)
   library(lubridate)

   # Get the time series associated with the twitter hashtag #druid, by hour
   druid.query.timeseries(url = druid.url(host = "<hostname>"),
                         dataSource   = "twitter",
                         intervals    = interval(ymd("2012-07-01"), ymd("2012-07-15")),
                         aggregations = sum(metric("count")),
                         filter       = dimension("hashtag") == "druid",
                         granularity  = granularity("hour"))

   # Average tweet length for a combination of hashtags in a given time zone
   druid.query.timeseries(url = druid.url(host = "<hostname>"),
                         dataSource   = "twitter",
                         intervals    = interval(ymd("2012-07-01"), ymd("2012-08-30")),
                         aggregations = list(
                                           sum(metric("count")),
                                           sum(metric("length"))
                                        ),
                         postAggregations = list(
                                           avg_length = field("length") / field("count")
                                        ),
                         filter       =   dimension("hashtag") == "london2012"
                                        | dimension("hashtag") == "olympics",
                         granularity  = granularity("PT6H", timeZone = "Europe/London"))
  
## End(Not run)

druid-io/RDruid documentation built on May 15, 2019, 2:54 p.m.