"It's not the fall that kills you, it's the sudden stop." (Douglas Adams)
Data analysis is a growing field within aviation safety. The United States' civil aviation authority, the Federal Aviation Administration, requires large commercial operators to develop and operate a Safety Management System as outlined in 14 CFR ยง5 and has issued guidance on how to comply with the regulation through Advisory Circular 120-92. In both the regulation and the Advisory Circular, data management is mentioned several times, but data analysis techniques and best practices are still a work in progress. The 'safetydata' library is a R package to help aviation safety data analysts perform common tasks.
The following datasets are currently available:
The following functions are currently available:
densityAltitude is a function that calculates density altitude using the National Weather Service method and data points that are commonly available to pilots and airlines.
downloadNTSBdf is a function that downloads the NTSB accidents and incidents database as a R data.frame. This function can take some time to run, so the NTSBData_20170328 dataset is available as pre-downloaded static copy.
qq_abline_intersect and qq_abline_slope are helpful when checking for the normality of data. They are intended to be used with ggplot2.
EAS_to_TAS and TAS_to_GS are functions used to convert between common aircraft speeds.
spcChart is a function that plots np-charts or p-charts using calculations from the qcc library and graphics from the ggplot2 library.
windVectors_crosswind and windVectors_headwind are functions used to extract the longitudinal and transverse vectors of wind relative to an aircraft.
rare_events is a function used to determine rare events that require validation in EMS
Combine several functions to determine ground speed from raw aircraft data.
library(safetydata) heading <- 048 # aircraft true heading in bearing degrees wind_from <- 290 # wind direction (from) in bearing degrees wind_speed <- 12 # wind speed in knots temperature <- 26 # outside air temperature in degrees Celcius dewpoint <- 10 # dewpoint in degrees Celcius altimeter <- 29.90 # altimeter in inches of mercury airspeed <- 280 # airspeed in equivalent airspeed knots # Calculate true airspeed TAS <- EAS_to_TAS(airspeed, altimeter, temperature) # Calculate headwind headwind <- windVectors_headwind(heading, wind_from, wind_speed) # Calculate ground speed GS <- TAS_to_GS(TAS, headwind) # Print results TAS headwind GS
Create statistical process control p- and np-charts. Note that this function is currently built to work with .CSV files since data is commonly extracted from other systems. The function expects cretain specific formatting in the .CSV file and can be quite picky.
library(safetydata) library(tidyverse) library(nycflights13) set.seed(78345) # Create a .CSV file with formatting appropriate for this function flights %>% group_by(month) %>% summarise(Operations = n()) %>% ungroup() %>% mutate(Deficiencies = rnorm(12, mean = 20, sd = 5)) %>% # Note: The "deficiencies" for this are pseudo-randomly created mutate(month = month.abb[month]) %>% mutate(Date = paste(month, 2013, sep = ", ")) %>% select(Date, Deficiencies, Operations) %>% write_csv(path = paste(getwd(), "example.csv", sep = "/")) # np-chart spcChart("example.csv", title = "np-Chart", type = "np", startdate = as.Date("2013-01-01"), enddate = as.Date("2013-12-31")) # p-chart spcChart("example.csv", title = "p-Chart", type = "p", startdate = as.Date("2013-01-01"), enddate = as.Date("2013-12-31"))
Plot actual vs. theoretical normal distribution quantiles to check for normality of data. In the example below, the sample does not come from a normally distributed dataset.
library(safetydata) library(tidyverse) library(nycflights13) set.seed(78345) # Use density plot or histogram to review the data visually flights %>% ggplot(mapping = aes(x = arr_delay)) + geom_density() # Use the Q_Q plot to support conclusions from the distribution plot data <- flights %>% select(arr_delay) %>% sample_n(5000) %>% unlist() ggplot(data = NULL, mapping = aes(sample = data)) + geom_qq() + geom_abline(slope = qq_abline_slope(data), intercept = qq_abline_intersect(data)) # Back up Q-Q plot with a Shapiro-Wilk test flights %>% select(arr_delay) %>% sample_n(5000) %>% unlist() %>% shapiro.test()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.