AirlineArrival | R Documentation
Flights categorized by destination city, airline, and whether or not the flight was on time.
A data frame with 11000 observations on the following 3 variables.
airport: a factor with levels LosAngeles, Phoenix, SanDiego, SanFrancisco, Seattle
result: a factor with levels Delayed, OnTime
airline: a factor with levels Alaska, AmericaWest
Barnett, Arnold. 1994. “How numbers can trick you.” Technology Review, vol. 97, no. 7, pp. 38–45.
These and similar data appear in many textbooks under the topic of Simpson's paradox.
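library(dplyr)      # group_by(), summarise(), mutate(), filter()
library(ggformula)  # gf_line(), gf_point(), gf_hline(), gf_labs()
library(mosaic)     # loaded last so that its tally() is the one used
# Percent of flights belonging to each airline, within each result
tally(
  airline ~ result, data = AirlineArrival,
  format = "perc", margins = TRUE)
# Percent delayed and on time, within each airline and airport combination
tally(
  result ~ airline + airport,
  data = AirlineArrival, format = "perc", margins = TRUE)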
# Percent of flights delayed for each airline at each airport
AirlineArrival2 <-
  AirlineArrival %>%
  group_by(airport, airline, result) %>%
  summarise(count = n()) %>%
  group_by(airport, airline) %>%
  mutate(total = sum(count), percent = count/total * 100) %>%
  filter(result == "Delayed")
# Percent of flights delayed for each airline, all airports combined
AirlineArrival3 <-
  AirlineArrival %>%
  group_by(airline, result) %>%
  summarise(count = n()) %>%
  group_by(airline) %>%
  mutate(total = sum(count), percent = count/total * 100) %>%
  filter(result == "Delayed")
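# Solid lines and points: percent delayed at each airport (point size = number of flights).
# Dashed lines: each airline's percent delayed over all airports combined.
gf_line(percent ~ airport, color = ~ airline, group = ~ airline,
        data = AirlineArrival2) %>%
  gf_point(percent ~ airport, color = ~ airline, size = ~total,
           data = AirlineArrival2) %>%
  gf_hline(yintercept = ~ percent, color = ~airline,
           data = AirlineArrival3, linetype = "dashed") %>%
  gf_labs(y = "percent delayed")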
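The comparison drawn in the plot can also be checked numerically. The following is a minimal sketch in base R, assuming the mosaicData package (which supplies AirlineArrival) is installed. The object names overall, by_airport, best_overall, best_by_airport, and paradox are introduced here for illustration and are not part of the package.

```r
# Sketch: numeric check for Simpson's paradox using base R only.
# Assumption: the mosaicData package is installed and provides AirlineArrival.
data("AirlineArrival", package = "mosaicData")

# Proportion of delayed flights per airline, all airports combined
overall_tab <- xtabs(~ airline + result, data = AirlineArrival)
overall <- prop.table(overall_tab, margin = 1)[, "Delayed"]

# Proportion of delayed flights per airline at each airport
counts <- xtabs(~ airline + airport + result, data = AirlineArrival)
by_airport <- prop.table(counts, margin = c(1, 2))[, , "Delayed"]

# Airline with the lower delay rate overall, and at each airport
best_overall <- names(which.min(overall))
best_by_airport <- apply(by_airport, 2, function(p) names(which.min(p)))

# TRUE when the airline that looks better overall is worse at every airport
paradox <- all(best_by_airport != best_overall)

print(round(100 * overall, 1))
print(round(100 * by_airport, 1))
print(paradox)
```

If paradox is TRUE, the airline with the lower overall delay rate has the higher delay rate at every individual airport, which is the reversal Simpson's paradox describes. The total column in AirlineArrival2 suggests where to look for the cause: the two airlines distribute their flights very differently across airports.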