trafficaccidents: Traffic Accidents in Slovenia

Traffic Accidents in Slovenia

Darko Bergant August 2017

pre code { font-size: 0.8rem; } code { line-height: 1.1; }

About

This tutorial is an overview of trafficaccidents data package. The package contains information about traffic accidents in Slovenia examined by the police from 2005 to 2016. Original files^[The original data is published on http://www.policija.si/] were converted to R data frames without changes, except for:

Additional accident geo positions in WGS 84
Additional calendar table (with local holidays)
Translation to English

The tutorial was created using R language [@R-base] and several R extensions. See the list of all used packages at the end of the document.

Note that this is not a thorough analysis of traffic accidents. The only goal is to bring the data and visualization tools to a larger group of users.

Data Structure

Information about accidents is stored in three data frames:

event contains accident events with date, time, position and other accident attributes.
party contains attributes of the parties involved.
calendar contains all dates from 2005 to 2016.

{"x":{"diagram":"#data_model\ndigraph {\ngraph [rankdir=BT tooltip=\"Data Model\" rankdir = LR]\n\nnode [margin=0 fontcolor = \"#444444\" ]\n\nedge [color = \"#555555\", arrowsize = 1, ]\n\n\n \"calendar\" [label = <<TABLE ALIGN=\"LEFT\" BORDER=\"1\" CELLBORDER=\"0\" CELLSPACING=\"0\" COLOR=\"#555555\">\n <TR>\n <TD COLSPAN=\"3\" BGCOLOR=\"#EFEBDD\" BORDER=\"0\"><FONT COLOR=\"#000000\">calendar<\/FONT>\n<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><U>date<\/U><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">Date<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">date_POSIXct<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">POSIXct, POSIXt<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">year<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">numeric<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">month<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">numeric<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">day<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">integer<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">wday<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">numeric<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">year_day<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">numeric<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">working_day<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">factor<\/TD>\n <\/TR>\n <\/TABLE>>, shape = \"plaintext\"] \n\n \"event\" [label = <<TABLE ALIGN=\"LEFT\" BORDER=\"1\" CELLBORDER=\"0\" CELLSPACING=\"0\" COLOR=\"#555555\">\n <TR>\n <TD COLSPAN=\"3\" BGCOLOR=\"#EFEBDD\" BORDER=\"0\"><FONT COLOR=\"#000000\">event<\/FONT>\n<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><U>src_file<\/U><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">character<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><U>event_id<\/U><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">integer<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">injury<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">factor<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">~<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">date<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">Date<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">hour<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">integer<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">in_city<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">factor<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">road_type<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">factor<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">location_id<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">factor<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">road_section_id<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">factor<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">pos_number<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">integer<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">location_type<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">factor<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">cause<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">factor<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">event_type<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">factor<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">weather<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">factor<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">traffic<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">factor<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">surface_conditions<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">factor<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">surface<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">factor<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">pos_x<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">integer<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">pos_y<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">integer<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">lon<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">numeric<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">lat<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">numeric<\/TD>\n <\/TR>\n <\/TABLE>>, shape = \"plaintext\"] \n\n \"party\" [label = <<TABLE ALIGN=\"LEFT\" BORDER=\"1\" CELLBORDER=\"0\" CELLSPACING=\"0\" COLOR=\"#555555\">\n <TR>\n <TD COLSPAN=\"3\" BGCOLOR=\"#EFEBDD\" BORDER=\"0\"><FONT COLOR=\"#000000\">party<\/FONT>\n<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">~<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">src_file<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">character<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">~<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">event_id<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">integer<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">accident_role<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">factor<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">age<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">integer<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">gender<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">factor<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">country<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">factor<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">p_injury<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">factor<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">traffic_role<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">factor<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">seat_belt_or_helmet<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">factor<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">experience_y<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">integer<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">experience_m<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">integer<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">alcotest<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">numeric<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">alco_exam<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">numeric<\/TD>\n <\/TR>\n <TR>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\"><\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">country_iso<\/TD>\n <TD ALIGN=\"LEFT\" BGCOLOR=\"#FFFFFF\">factor<\/TD>\n <\/TR>\n <\/TABLE>>, shape = \"plaintext\"] \n\n\"event\"->\"calendar\"\n\"party\"->\"event\"\n}","config":{"engine":null,"options":null}},"evals":[],"jsHooks":[]}

Data structure

Accident Locations

The majority of accidents (93%) are geocoded. Use pos_x and pos_y columns for geographic location.^[ Columns pos_x and pos_y are geocoded in D48/GK coordinate reference system. Use columns lon and lat for WGS84] For example, we can plot all accidents locations as points, coloured by road category (road_type):

library(dplyr)
library(ggplot2)
library(viridis)
library(trafficaccidents)

dat_roads <- 
  event %>% 
  filter(!is.na(road_type), !is.na(pos_x)) %>% 
  arrange(desc(road_type))

ggplot(data = dat_roads, aes(pos_y, pos_x, color = road_type) ) +
  geom_point(alpha = 0.03, size = 0.2) +
  coord_fixed() +
  scale_color_viridis(discrete = TRUE, end = 0.7, option = "A") +
  guides(colour = guide_legend(
    override.aes = list(alpha = 1, size = 2), title = "Road category")) +
  theme_void() +
  theme(legend.position = c(0.95,0.3))

Traffic accidents locations by road category

Time

Working days have different time distribution than weekends and holidays.

days_count <-
  calendar %>% 
  group_by(year, working_day) %>% 
  summarise(days = n())

accidents_per_hour <- 
  event %>% 
  left_join(calendar, by = "date") %>% 
  group_by(year, working_day, hour) %>% 
  count %>% 
  left_join(days_count, by = c("year", "working_day")) %>% 
  mutate(p = n / days)

ggplot(accidents_per_hour, aes(x = hour, y = p)) +
    geom_point(alpha = 0.2, size = 0.2, position = position_jitter(0.1)) +
    geom_smooth(method = "loess", span = 1/3, color = "gray") +
    scale_x_continuous(breaks = 1:7*3, name = "Hour", minor_breaks = NULL) +
    scale_y_continuous(limits = c(0, NA), breaks = 0:6 * 2, 
                       name = "Average number of accidents") +
    coord_cartesian(y=c(0, 9)) +
    facet_wrap(~ working_day, ncol = 1, scales = "fixed") + 
    theme_minimal()

Commuting Pattern

With "joy plot" from ggjoy package [@R-ggjoy] we can discover the differences of time distributions between road categories. The peaks at rush hour are more visible on roads than on city streets.

Time distribution of accidents on working day

library(ggjoy)

dat_rt_wd <- 
  event %>% 
  left_join(calendar, by = "date") %>% 
  filter(
    working_day == "Working day", 
    !is.na(road_type)
  )

ggplot(dat_rt_wd) +
  geom_joy(aes(x = hour, y = road_type, fill = road_type), alpha = 0.5 ) +
  scale_x_continuous(breaks = 1:7*3, minor_breaks = NULL,
                     limits = c(0,23), name = "Hour") +
  scale_fill_viridis(discrete = TRUE, end = 0.7, option = "A")+
  scale_y_discrete(limits = rev(levels(dat_rt_wd$road_type)), name = NULL)+
  theme_minimal() +
  theme(legend.position = "none")

Time and Place

With animation it is possible to see accident locations by hour. The following animation was created with gganimate package [@R-gganimate].

event_wd <-
  event %>%
  filter(!is.na(pos_x)) %>%
  mutate(alpha = 0.2) %>%
  left_join(calendar, by = "date") %>%
  filter(working_day == "Working day")

shift_hour <- function(x, n, a) {
  mutate(x, alpha = a, hour = (hour + n) %% 24)
}

dat_wd_shift <-
  bind_rows(
    event_wd, shift_hour(event_wd, 1, 0.1), shift_hour(event_wd, 2, 0.07)
  ) %>%
  arrange(hour, desc(road_type))

clock <- list(x = 378938 + 30000, y = 191590 - 10000, width = 8500)

p1 <-
  ggplot(data = dat_wd_shift, aes(frame = hour)) +
  geom_point(aes(pos_y, pos_x, alpha = alpha, color = road_type), size = 0.4) +
  scale_alpha_identity() +
  scale_color_viridis(discrete = TRUE, end = 0.7, option = "A", guide = "none",
                      limits = levels(dat_wd_shift$road_type)) +
  coord_fixed() +
  geom_text(aes(label = sprintf("%02d", hour)), color = "gray",
            x = clock$x, y = clock$y, size = 9) +
  geom_segment(size = 2, aes(
    x = clock$x + clock$width * sin(hour/6 * pi),
    y = clock$y + clock$width * cos(hour/6 * pi),
    xend = clock$x + clock$width*1.4 * sin(hour/6 * pi),
    yend = clock$y + clock$width*1.4 * cos(hour/6 * pi),
    color = levels(dat_wd_shift$road_type)[as.integer(hour >= 12)*8+1]
  )) +
  theme_void()


library(gganimate)
library(animation)

magickPath <- shortPathName("magick.exe")
ani.options(convert=magickPath)
#ani.options(ani.width = 1024, ani.height = 640+55)
ani.options(ani.width = 1024*1.3, ani.height = (640+55)*1.3)
ani.options(interval = 0.1)
gganimate(p1, "docs/animation/accidents_animation.gif", title_frame = FALSE )

Severity

All accidents are marked with injury severity (column injury). The plot below compares time distributions of accidents for different levels of severities.

Distribution of accidents over time

dat <- 
  event %>% 
  left_join(calendar, by = "date")



ggplot(dat) +
  geom_joy(aes(x = hour, y = injury, alpha = injury), fill = "darkred") +
  scale_x_continuous(breaks = 1:7*3, minor_breaks = NULL, 
                     limits = c(0,23), name = "Hour") +
  scale_y_discrete(limits = rev(levels(dat$injury)), name = NULL)+
  scale_alpha_discrete(range = c(0.06, 0.7), guide = "none") +
  theme_minimal() +
  facet_wrap(~working_day, ncol = 1)

Cause

Police classifies each accident by main cause. The plot below represent the number of accidents as a circle size and injury severity as color opacity for different accident causes:

dat_type <- 
  event %>% 
  group_by(cause, injury) %>% 
  count 

ggplot(dat_type, aes(x = 1, y=1, size = sqrt(n) / 7, alpha = injury)) +
  scale_size_identity() +
  geom_point(color = "darkred") +
  scale_y_discrete(limits = rev(levels(dat_type$cause))) +
  coord_fixed(0.7) +
  scale_alpha_discrete(range = c(0.06, 0.7)) +
  guides(alpha = guide_legend(override.aes = list(size = 6))) +
  theme_void(base_size = 12) +
  #theme(strip.text = element_text(size = 10)) +
  labs(x = NULL, y = NULL, alpha = "Severity") +
  facet_wrap(~cause, ncol = 3)

Causes and Accident Types

The accident_type column defines what happened (e.g. side collision or vehicle rollover). The plot below explores its relation to accident cause.

dat_cause_type <- 
  event %>% 
  group_by(cause, event_type, injury) %>% 
  count

ggplot(dat_cause_type, 
       aes(x = event_type, y=cause, size = sqrt(n) / 7, alpha = injury)) +
  geom_point(color = "darkred") +
  scale_size_identity() +
  scale_alpha_discrete(range = c(0.06, 0.7)) +
  guides(alpha = guide_legend(override.aes = list(size = 10))) +
  scale_x_discrete(position = "top") +
  scale_y_discrete(limits = rev(levels(dat_cause_type$cause))) +
  coord_fixed() +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle=90,hjust = 0, vjust = 0.3, size = 12),
    axis.text.y = element_text(size = 12),
    legend.position = "bottom"
  ) +
  labs(x = "Accident type", y = "Cause", alpha = "Severity")

Parties

There can be several parties involved in same accident. The data includes also some cases where there are several parties at fault for the same accident. Here the events_party data frame is created with single event and some attributes from one party at fault.

party_at_fault <- 
  party %>% 
  filter(accident_role == "At fault") 

other_party <- 
  party %>% 
  filter(accident_role != "At fault") 

events_party <- 
  party_at_fault %>% 
  group_by(src_file, event_id) %>%
  summarise(
    age = first(age),
    gender = first(gender),
    alco =  first(alcotest)  > 0, 
    experience = first(experience_y) + first(experience_m)/12,
    traffic_role = first(traffic_role)
  )

events_party %>% 
  group_by(traffic_role) %>% 
  summarise(accidents = n()) %>% 
  arrange(desc(accidents)) %>% 
  head(8) %>% 
  kable

traffic_role accidents

Car driver 192340 Truck driver 24174 Cyclist 7862 Motorcycle driver 5290 Motorized bicycle driver 3344 Bus driver 1704 Pedestrian 1637 Tractor driver 1492

Car Driver's Age

Most frequent parties at fault are car drivers. Is it true, that there are more accidents involving young drivers, driving at night?

hour_intervals <- setNames(
  c(0, 6, 22), 
  c("Night (22h - 06h)", "Day (6h-21h)", "Night (22h - 06h)")
)

events_party_car_drivers <-
  events_party %>% 
  filter(traffic_role %in% c("Car driver")) %>% 
  filter(age >= 18) %>% 
  inner_join(event, by = c("src_file", "event_id")) %>% 
  inner_join(calendar, by = "date") %>% 
  mutate(
    hour_interval = findInterval(hour, hour_intervals),
    hour_interval = names(hour_intervals)[hour_interval],
    age = round(age) 
  )

events_party_car_drivers %>% 
  filter(date >= as.Date("2011-01-01")) %>% 
  group_by(age, hour_interval, src_file) %>% 
  count %>% 

  ggplot(aes(x = age, y = n)) +
  #geom_point(alpha = 0.7) +
  geom_point(alpha = 0.5, mapping = aes(color = src_file)) +
  geom_smooth(alpha = 0.2, method = "loess") +
  facet_wrap(~hour_interval, scales = "free_y") +
  labs(x = "Age", y = "Number of accidents", color = NULL) +
  scale_x_continuous(breaks = 1:10*10, limits = c(18, 85)) +
  scale_color_brewer(palette = "Blues") +
  theme_bw()

Number of accidents per year by car driver's age

Alcohol Involved Accidents by Age

Alcohol involved accidents, number of accidents by age at day/night:

events_party_car_drivers %>% 
  filter(date >= as.Date("2011-01-01"), alco) %>% 
  group_by(age, hour_interval, src_file) %>% 
  count %>% 

  ggplot(aes(x = age, y = n)) +
  #geom_point(alpha = 0.7) +
  geom_point(alpha = 0.3, mapping = aes(color = src_file)) +
  geom_smooth(method = "loess", color = "orange", span = 0.6) +
  facet_wrap(~hour_interval) +
  labs(x = "Age", y = "Number of accidents", color = NULL) +
  scale_x_continuous(breaks = 1:10*10, limits = c(18, 80)) +
  scale_color_brewer(palette = "Oranges") +
  theme_bw()+
  theme(legend.position = "none")

Number of alcohol induced accidents per year, by car driver age

R Packages

knitr 1.16

Xie Y (2017). knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.16, http://yihui.name/knitr/.

rmarkdown 1.5

Allaire J, Cheng J, Xie Y, McPherson J, Chang W, Allen J, Wickham H, Atkins A, Hyndman R and Arslan R (2017). rmarkdown: Dynamic Documents for R. R package version 1.5, https://CRAN.R-project.org/package=rmarkdown.

tidyr 0.6.2

Wickham H (2017). tidyr: Easily Tidy Data with 'spread()' and 'gather()' Functions. R package version 0.6.2, https://CRAN.R-project.org/package=tidyr.

dplyr 0.5.0

Wickham H and Francois R (2016). dplyr: A Grammar of Data Manipulation. R package version 0.5.0, https://CRAN.R-project.org/package=dplyr.

ggjoy 0.2.0

Wilke C (2017). ggjoy: Joyplots in 'ggplot2'. R package version 0.2.0, https://CRAN.R-project.org/package=ggjoy.

viridis 0.4.0

Garnier S (2017). viridis: Default Color Maps from 'matplotlib'. R package version 0.4.0, https://CRAN.R-project.org/package=viridis.

viridisLite 0.2.0

Garnier S (2017). viridisLite: Default Color Maps from 'matplotlib' (Lite Version). R package version 0.2.0, https://CRAN.R-project.org/package=viridisLite.

ggplot2 2.2.1

Wickham H (2009). ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York. ISBN 978-0-387-98140-6, http://ggplot2.org.

datamodelr 0.2.1.9001

Bergant D (2017). datamodelr: Define and Plot Data Model Diagrams. R package version 0.2.1.9001, https://github.com/bergant/datamodelr.

trafficaccidents 0.1.0

Bergant D (2018). trafficaccidents: Traffic Accidents in Slovenia. R package version 0.1.0, http://github.com/bergant/trafficaccidents.

tufte 0.2

Xie Y and Allaire J (2016). tufte: Tufte's Styles for R Markdown Documents. R package version 0.2, https://CRAN.R-project.org/package=tufte.

bergant/trafficaccidents documentation built on May 6, 2019, 3:30 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

bergant/trafficaccidents
Traffic Accidents in Slovenia

docs/index.md
In bergant/trafficaccidents: Traffic Accidents in Slovenia

Traffic Accidents in Slovenia

About

Data Structure

Accident Locations

Time

Commuting Pattern

Time and Place

Severity

Cause

Causes and Accident Types

Parties

Car Driver's Age

Alcohol Involved Accidents by Age

R Packages

R Package Documentation

Browse R Packages

We want your feedback!

bergant/trafficaccidents Traffic Accidents in Slovenia

docs/index.md In bergant/trafficaccidents: Traffic Accidents in Slovenia

Traffic Accidents in Slovenia

About

Data Structure

Accident Locations

Time

Commuting Pattern

Time and Place

Severity

Cause

Causes and Accident Types

Parties

Car Driver's Age

Alcohol Involved Accidents by Age

R Packages

R Package Documentation

Browse R Packages

We want your feedback!

bergant/trafficaccidents
Traffic Accidents in Slovenia

docs/index.md
In bergant/trafficaccidents: Traffic Accidents in Slovenia