filter_time()
attempts to make filtering data frames by date much easier
than dplyr::filter()
. It includes a flexible shorthand notation that allows
you to specify entire date ranges with very little typing. The general form
of the time_formula
that you will use to filter rows is from ~ to
, where
the left hand side (LHS) is the character start date, and the right hand side (RHS) is
the character end date. Both endpoints are included in the result.
Each side of the time_formula
can be maximally specified as
the character 'YYYY-MM-DD HH:MM:SS'
.
library(tibbletime) library(dplyr) # Facebook stock prices. data(FB) # Convert FB to tbl_time FB <- as_tbl_time(FB, index = date) # FANG stock prices data(FANG) # Convert FANG to tbl_time and group FANG <- as_tbl_time(FANG, index = date) %>% group_by(symbol)
In dplyr
, if you wanted to get the dates for 2013
in the FB
dataset, you
might do something like this:
filter(FB, date >= as.Date("2013-01-01"), date <= as.Date("2013-12-31"))
That's a lot of typing for one filter step. With tibbletime
, because
the index
was specified at creation, we can do this:
filter_time(FB, time_formula = '2013-01-01' ~ '2013-12-31')
At first glance, this might not look like less code, but this is before any
shorthand is applied. Note how the filtering condition is specified as a
formula
separated by a ~
.
Using filter_time
shorthand, this can be written:
filter_time(FB, '2013' ~ '2013')
Or even more succinctly as:
filter_time(FB, ~'2013')
The shorthand notation works as follows. In the first example, '2013' ~ '2013'
is
expanded to '2013-01-01 + 00:00:00' ~ '2013-12-31 + 23:59:59'
. It works by
identifying the periodicity of the provided input (yearly), and expanding it
to the beginning and end of that period. The one sided formula ~'2013'
works
similarly, and is useful when you want to select every date inside a period.
As another example of this shorthand, if you wanted to select every date in March, 2015:
filter_time(FB, ~'2015-03') # In dplyr it looks like this # (and you have to think, does March have 30 or 31 days?) # filter(FB, date >= as.Date("2015-03-01"), date <= as.Date("2015-03-31"))
Two keywords are available to assist with filtering:
'start'
- The start of the series'end'
- The end of the seriesThis filters from the start of the series to the end of 2015.
filter_time(FB, 'start' ~ '2015')
Working with grouped tbl_time
objects is just as you might expect.
FANG %>% filter_time('2013-01-01' ~ '2013-01-04')
Filtering can also be done by hour / minute / second. Note that the form of this
is slightly different than the standard, 'YYYY-MM-DD HH:MM:SS'
.
# Dummy example. Every second in a day example <- create_series(~'2013-01-01', period = 's') # The first 2 minutes of the day example %>% filter_time('2013-01-01' ~ '2013-01-01 00:02') # 3 specific hours of the day # Equivalent to: # '2013-01-01 + 03:00:00' ~ '2013-01-01 + 06:59:59' example %>% filter_time('2013-01-01 3' ~ '2013-01-01 6')
[
syntaxFor interactive use, to get an even quicker look at a dataset you can use
the traditional extraction operator [
with the formula syntax.
FB[~'2013']
FB['2013'~'2014-02', c(1,2,3)]
Each side of the time formula is unquoted and evaluated in the environment
that is was created using rlang
. This means that you can use variables inside
the call the filter_time()
.
date_var <- as.Date("2014-01-01") filter_time(FB, 'start' ~ date_var) date_char <- "2014-02" filter_time(FB, ~ date_char)
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.