pre_aggregated | R Documentation |
Used whenever the dataframe to be analyzed is pre-aggregated, i.e. the data has already been grouped into periods (corresponding to itemsets).
pre_aggregated(df, include_date = FALSE, multiset = FALSE, summary_stats = TRUE, output_directory = "~")
df |
A dataframe with either 3 or 4 columns: 3 columns in the order id, period, event if the date is not to be included, or 4 columns in the order id, date, period, event if the date is to be included. |
include_date |
Logical indicator which controls whether the date variable is included in the returned data. If reports are created with the -generate_reports- function of approxmapR, the dates will be included in the alignment_with_date output file when this argument is TRUE. Defaults to FALSE. |
multiset |
Beta; Logical indicator which controls the exclusion of multiple events within the same event set. |
summary_stats |
Logical indicator which controls whether summary statistics about the aggregation are printed. Defaults to TRUE. |
output_directory |
The path to the directory where the exports should be placed. Defaults to "~". |
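The two input layouts can be sketched in base R as follows. This is a minimal illustration that mirrors the columns the example below passes to pre_aggregated(); the toy values, the object names (events_3col, events_4col) and the column types are assumptions, not requirements stated by the package.

```r
# Hypothetical toy data: two ids, events already assigned to periods.

# 3-column layout (dates not included): id, period, event
events_3col <- data.frame(
  id     = c(1, 1, 2),
  period = c(1L, 2L, 1L),
  event  = c("A", "B", "A"),
  stringsAsFactors = FALSE
)

# 4-column layout (dates included): id, date, period, event
# The Date class used here is an assumption for illustration.
events_4col <- data.frame(
  id     = c(1, 1, 2),
  date   = as.Date(c("2020-01-01", "2020-01-05", "2020-02-10")),
  period = c(1L, 2L, 1L),
  event  = c("A", "B", "A"),
  stringsAsFactors = FALSE
)

# With approxmapR installed, the calls would then look like:
# agg <- approxmapR::pre_aggregated(events_3col)
# agg <- approxmapR::pre_aggregated(events_4col, include_date = TRUE)
```

Column order matters in both layouts, so it is worth checking names(df) before calling the function.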
Returns a properly classed dataframe.
library(approxmapR)
library(tidyverse)

data("demo1")
demo1 <- data.frame(do.call("rbind", strsplit(as.character(demo1$id.date.item), ",")))
names(demo1) <- c("id", "period", "event")

# Identifying the earliest date per -id- and setting it as the -index_dt-
demo1 <- demo1 %>%
  group_by(id) %>%
  mutate(index_dt = min(as.Date(period, "%m/%d/%Y")))

# Creating an Index from the earliest date
demo1 <- demo1 %>%
  mutate(date = as.Date(period, "%m/%d/%Y")) %>%
  mutate(period = as.numeric(difftime(date, index_dt, units = "days"))) %>%
  select(id, period, event) %>%
  arrange(id, period)

# Aggregating custom aggregation frames with the following groupings:
#   [] index date will be first period (1),
#   [] the first 28 days after the index date will be grouped into weekly periods (2 - 4), and then
#   [] events which occurred on the 29th day or more from the index day will be grouped in a monthly frame (5+)
demo1 <- demo1 %>%
  group_by(id) %>%
  mutate(date = period,
         n_ndays7 = period / 7,
         period = as.integer(case_when(period == 0 ~ 1,
                                       ceiling(n_ndays7) < 5 ~ ceiling(n_ndays7) + 1,
                                       TRUE ~ floor(n_ndays7) + 2))
  ) %>%
  select(id, date, period, event)

# Since -demo1- has the date column, need to select only the id, period, and event columns if the dates are not
# to be included
agg <- demo1 %>%
  select(id, period, event) %>%
  pre_aggregated()

# No need to select specific columns if the dates are desired to be included
agg <- demo1 %>%
  pre_aggregated(include_date = TRUE)
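To see which period a given day offset lands in, the case_when() expression from the example can be restated in base R and applied to a few values. This is only a sketch for checking the period boundaries against the groupings described in the comments; the helper name day_to_period is hypothetical and is not part of approxmapR.

```r
# Base-R restatement of the period mapping used in the example above.
# `days` is the number of days since the index date (0 = index date).
day_to_period <- function(days) {
  weeks <- days / 7
  as.integer(
    ifelse(days == 0, 1,                          # index date
           ifelse(ceiling(weeks) < 5,
                  ceiling(weeks) + 1,             # days 1 to 28
                  floor(weeks) + 2))              # day 29 onward
  )
}

offsets <- c(0, 1, 7, 8, 28, 29, 35)
data.frame(days = offsets, period = day_to_period(offsets))
# periods returned: 1, 2, 2, 3, 5, 6, 7
```

Tabulating the mapping this way before aggregating is a cheap way to confirm that the custom frames match the intended weekly and monthly boundaries.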