Introduction

"It's not the fall that kills you, it's the sudden stop." (Douglas Adams)

Data analysis is a growing field within aviation safety. The United States' civil aviation authority, the Federal Aviation Administration (FAA), requires large commercial operators to develop and operate a Safety Management System as outlined in 14 CFR §5, and has issued guidance on how to comply with the regulation through Advisory Circular 120-92. Both the regulation and the Advisory Circular mention data management several times, but data analysis techniques and best practices are still a work in progress. The 'safetydata' library is an R package that helps aviation safety data analysts perform common tasks.

Datasets

The following datasets are currently available:

Functions

The following functions are currently available:

Examples

Combine several functions to determine ground speed from raw aircraft data.

library(safetydata)

heading <- 048 # aircraft true heading in bearing degrees
wind_from <- 290 # wind direction (from) in bearing degrees
wind_speed <- 12 # wind speed in knots
temperature <- 26 # outside air temperature in degrees Celsius
dewpoint <- 10 # dewpoint in degrees Celsius
altimeter <- 29.90 # altimeter setting in inches of mercury
airspeed <- 280 # equivalent airspeed in knots

# Calculate true airspeed
TAS <- EAS_to_TAS(airspeed, altimeter, temperature)

# Calculate headwind
headwind <- windVectors_headwind(heading, wind_from, wind_speed)

# Calculate ground speed
GS <- TAS_to_GS(TAS, headwind)

# Print results
TAS
headwind
GS
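
The arithmetic behind the wind step can be sketched in base R. This is a minimal illustration of the standard trigonometric relationship, not the package's source code. The helper names `headwind_component()` and `ground_speed()` are hypothetical, and the sign convention (positive for a headwind, negative for a tailwind) is an assumption that may differ from what `windVectors_headwind()` returns.

```r
# Hypothetical helpers; not part of 'safetydata'.
# Assumed sign convention: positive = headwind, negative = tailwind.
headwind_component <- function(heading, wind_from, wind_speed) {
  # Angle between the aircraft heading and the direction the wind blows from
  angle_rad <- (wind_from - heading) * pi / 180
  wind_speed * cos(angle_rad)
}

# Simplification: ignores crosswind drift, so ground speed is TAS minus headwind
ground_speed <- function(tas, headwind) {
  tas - headwind
}

hw <- headwind_component(heading = 48, wind_from = 290, wind_speed = 12)
hw                      # negative here, i.e. a tailwind component
ground_speed(300, hw)   # 300 kt TAS is an arbitrary illustrative value
```

With the inputs from the example, the wind blows from behind the aircraft, so under this convention the headwind is negative and ground speed exceeds true airspeed.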

Create statistical process control p- and np-charts. Note that this function is currently built to work with .CSV files, since data is commonly extracted from other systems in that format. The function expects certain specific formatting in the .CSV file and can be quite picky.

library(safetydata)
library(tidyverse)
library(nycflights13)
set.seed(78345)

# Create a .CSV file with formatting appropriate for this function
flights %>%
  group_by(month) %>%
  summarise(Operations = n()) %>%
  ungroup() %>%
  mutate(Deficiencies = round(rnorm(12, mean = 20, sd = 5))) %>% # Note: the "deficiencies" are pseudo-random counts, rounded to whole numbers
  mutate(month = month.abb[month]) %>%
  mutate(Date = paste(month, 2013, sep = ", ")) %>%
  select(Date,
         Deficiencies,
         Operations) %>%
  write_csv(file.path(getwd(), "example.csv")) # output file passed positionally; the `path` argument name is deprecated in newer readr

# np-chart
spcChart("example.csv", title = "np-Chart", type = "np", startdate = as.Date("2013-01-01"), enddate = as.Date("2013-12-31"))

# p-chart
spcChart("example.csv", title = "p-Chart", type = "p", startdate = as.Date("2013-01-01"), enddate = as.Date("2013-12-31"))
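
To make the chart types concrete, the conventional 3-sigma limits for a p-chart can be sketched in base R. This is an illustration of the textbook formulas under stated assumptions, not a description of how `spcChart()` computes its limits. The helper name `p_chart_limits()` is hypothetical and the counts are made up.

```r
# Hypothetical helper; not part of 'safetydata'.
# Computes conventional 3-sigma p-chart limits for each subgroup.
p_chart_limits <- function(deficiencies, operations) {
  p_bar <- sum(deficiencies) / sum(operations)      # centre line
  sigma <- sqrt(p_bar * (1 - p_bar) / operations)   # per-subgroup standard error
  data.frame(
    p      = deficiencies / operations,             # observed proportion
    centre = p_bar,
    lcl    = pmax(0, p_bar - 3 * sigma),            # proportions cannot go below 0
    ucl    = pmin(1, p_bar + 3 * sigma)             # or above 1
  )
}

# Made-up counts for illustration only
limits <- p_chart_limits(deficiencies = c(18, 25, 21),
                         operations   = c(27000, 25000, 28800))
limits
```

An np-chart plots the deficiency counts themselves rather than proportions. For a constant subgroup size n, its centre line is n times the average proportion, with limits of plus or minus 3 * sqrt(n * p * (1 - p)) around it.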

Plot actual vs. theoretical normal distribution quantiles to check for normality of data. In the example below, the sample does not come from a normally distributed dataset.

library(safetydata)
library(tidyverse)
library(nycflights13)
set.seed(78345)

# Use density plot or histogram to review the data visually
flights %>%
  ggplot(mapping = aes(x = arr_delay)) + geom_density()

# Use the Q-Q plot to support conclusions from the distribution plot
data <- flights %>%
  select(arr_delay) %>%
  sample_n(5000) %>%
  unlist()

ggplot(data = NULL, mapping = aes(sample = data)) + 
  geom_qq() + 
  geom_abline(slope = qq_abline_slope(data), intercept = qq_abline_intersect(data))

# Back up Q-Q plot with a Shapiro-Wilk test
flights %>%
  select(arr_delay) %>%
  sample_n(5000) %>%
  unlist() %>%
  shapiro.test()
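
One common way to obtain a Q-Q reference line is to draw it through the first and third quartiles of the sample, which is what `stats::qqline()` does by default. The sketch below shows that calculation in base R. Whether `qq_abline_slope()` and `qq_abline_intersect()` use the same approach is an assumption, and `qq_line_coefs()` is a hypothetical helper name.

```r
# Hypothetical helper; not part of 'safetydata'.
# Reference line through the first and third quartiles,
# as stats::qqline() does by default.
qq_line_coefs <- function(x) {
  x <- x[!is.na(x)]                                   # drop missing values first
  sample_q <- quantile(x, c(0.25, 0.75), names = FALSE)
  theory_q <- qnorm(c(0.25, 0.75))                    # standard normal quartiles
  slope <- diff(sample_q) / diff(theory_q)
  intercept <- sample_q[1] - slope * theory_q[1]
  c(slope = slope, intercept = intercept)
}

# Deterministic, roughly normal sample with mean 3 and standard deviation 2
x <- 3 + 2 * qnorm(ppoints(1001))
qq_line_coefs(x)
```

For a sample that is close to normal, the slope approximates the standard deviation and the intercept approximates the mean. A heavily skewed variable such as `arr_delay` will bend away from the line in the tails.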


peconeto/safetydata documentation built on May 24, 2019, 6:14 a.m.