This package contains several functions that I find myself constantly rewriting while doing data analysis. It is still very much under development.
This is a very simple but useful function. Often I like to break time-series data up by season to create separate models or plots. The season function allows this to be accomplished easily. For example,
    require(lubridate)
    require(dplyr)
    require(myhelpr)

    x <- dmy("3/2/2015")
    season(x, label = FALSE)

    x <- data_frame(
      ts = dmy_hm(c("1/5/2015 12:00", "1/7/2015 15:00", "1/11/2016 2:00"))
    )
    x %>% mutate(Season = season(ts))
The label parameter specifies whether to return a numeric value (1-4, where Summer = 1) or a character string label. This parameter defaults to TRUE.
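To make the label behaviour concrete, here is a minimal base-R sketch of how such a mapping could be written. It is not the package's implementation: season_sketch is a hypothetical name, and the Southern Hemisphere meteorological seasons (Summer = December to February) are an assumption inferred from Summer being numbered 1 and December being grouped with January and February later in this document.

```r
# Hypothetical stand-in for season(); NOT the myhelpr implementation.
# Assumes Southern Hemisphere meteorological seasons:
#   Summer = Dec-Feb (1), Autumn = Mar-May (2),
#   Winter = Jun-Aug (3), Spring = Sep-Nov (4).
season_sketch <- function(x, label = TRUE) {
  month_num <- as.integer(format(x, "%m"))
  # Map December to 0, then bucket the months into blocks of three
  season_num <- (month_num %% 12) %/% 3 + 1
  if (!label) {
    return(season_num)
  }
  c("Summer", "Autumn", "Winter", "Spring")[season_num]
}

dates <- as.Date(c("2015-02-03", "2015-05-01", "2015-07-01", "2016-11-01"))
season_sketch(dates)                 # "Summer" "Autumn" "Winter" "Spring"
season_sketch(dates, label = FALSE)  # 1 2 3 4
```

The real function may return a factor or use different labels; the sketch only illustrates the numeric-versus-label switch.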
Often we want to calculate the financial year of a date-time object. The fye and fyb functions calculate the financial year ending and financial year beginning values, respectively. For example,
    x <- data_frame(ts = dmy("1/1/2010") + months(0:11))
    x %>% mutate(fye = fye(ts), fyb = fyb(ts))
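As a rough illustration of the calculation, the base-R sketch below derives both values from the month and year of each date. The names fye_sketch and fyb_sketch are hypothetical, and the 1 July to 30 June financial year is an assumption; the package may use a different start month.

```r
# Hypothetical stand-ins for fye() and fyb(); NOT the myhelpr implementation.
# Assumes a financial year running from 1 July to 30 June.
fye_sketch <- function(x) {
  year_num <- as.integer(format(x, "%Y"))
  month_num <- as.integer(format(x, "%m"))
  # July onwards belongs to the financial year ending in the next calendar year
  year_num + (month_num >= 7)
}

fyb_sketch <- function(x) {
  # The financial year begins in the calendar year before it ends
  fye_sketch(x) - 1
}

dates <- as.Date(c("2010-01-01", "2010-06-01", "2010-07-01", "2010-12-01"))
fye_sketch(dates)  # 2010 2010 2011 2011
fyb_sketch(dates)  # 2009 2009 2010 2010
```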
This can be particularly useful when grouping data frames by year and season. An easy mistake to make is grouping on season and calendar year: December is then grouped with the January and February of the same calendar year, rather than with the January and February of the following year, which belong to the same summer. The code below shows a simple way to avoid this issue,
    # Create data frame with clear trend and some noise
    x <- data_frame(ts = dmy("1/1/2010") + months(0:35),
                    value = 1:36 + rnorm(36, sd = 0.2))

    # Incorrect analysis
    x_bad_summary <- x %>%
      mutate(Season = season(ts), Year = year(ts)) %>%
      group_by(Year, Season) %>%
      summarise(mean_val = mean(value))
    with(x_bad_summary, plot(mean_val))

    # Fixed analysis
    x_good_summary <- x %>%
      mutate(Season = season(ts),
             Year = ifelse(Season == "Summer", fye(ts), year(ts))) %>%
      group_by(Year, Season) %>%
      summarise(mean_val = mean(value))
    with(x_good_summary, plot(mean_val))
In the first plot, January and February have been grouped with the following summer's December, which causes an unexpected spike for each summer. The second plot corrects this by setting the Year value of every summer observation to the financial year ending, which groups each December with the January and February that follow it.
A new function season_year has been added to streamline the above code,
    x_season_year <- x %>%
      mutate(Season = season(ts), Year = season_year(ts)) %>%
      group_by(Year, Season) %>%
      summarise(mean_val = mean(value))
    with(x_season_year, plot(mean_val))
which gives the same result. This is likely to be a bit slower because the season is effectively calculated twice, but I think the convenience and clarity are worth it.
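For readers curious about what a season-aware year looks like, here is a minimal base-R sketch. It is not the package's season_year: season_year_sketch is a hypothetical name, and it assumes Summer runs from December to February, so that December is the only month whose year needs to change.

```r
# Hypothetical stand-in for season_year(); NOT the myhelpr implementation.
# Assumes Summer = Dec-Feb, so only December moves into the next year.
season_year_sketch <- function(x) {
  year_num <- as.integer(format(x, "%Y"))
  month_num <- as.integer(format(x, "%m"))
  year_num + (month_num == 12)
}

dates <- as.Date(c("2010-11-15", "2010-12-15", "2011-01-15", "2011-02-15"))
season_year_sketch(dates)  # 2010 2011 2011 2011
```

Under these assumptions the sketch works from the month alone, so it never computes the season; whether the package could take the same shortcut depends on how it defines seasons.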