This R package facilitates access to an analysis-ready dataset of county-level daily nClimGrid data from the National Oceanic and Atmospheric Administration (NOAA) National Centers for Environmental Information (NCEI), prepared for streamlined use in health models. The complete dataset (6.5 GB) used in this tutorial is available at this link. The data consist of daily time series of average temperature (Temp_avg), maximum temperature (Temp_max), minimum temperature (Temp_min), and precipitation (Precipitation) for every county in the continental United States from 1951 to the present. The package provides functions for basic data-wrangling tasks. It was developed as a complement to the NOAA Climate.gov data page, which was originally built for infectious disease modeling of COVID-19, and it extends to other health models and modelers. The data used in this application are available for direct download from the NCEI FTP or HTTPS download here.

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
library(EpiNOAA)
library(dplyr)
library(ggplot2)
library(magrittr)

Because the data are stored at the county level, the get_data_dictionary() function returns a list containing state-level and county-level metadata. The columns Temp_avg, Temp_min, Temp_max, and Precipitation report coverage as a fraction between zero and one: zero means every value is missing, and one means no values are missing.

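# Read the raw CSV (no header row) and assign column names explicitly.
dailyTemperatures <- read.csv(file = "E:/SecureData/CTL/R package/ARC3years.csv",
                              header = FALSE,
                              col.names = c("Date", "Year", "Month", "Day",
                                            "State", "County_Name",
                                            "FIPS_corrected", "Temp_avg",
                                            "Temp_max", "Temp_min",
                                            "Precipitation"))
# Rebuild Date as a proper Date object from the Year, Month, and Day columns.
dailyTemperatures$Date <- as.Date(with(dailyTemperatures,
                                       paste(Year, Month, Day, sep = "-")),
                                  format = "%Y-%m-%d")
# Build the state- and county-level metadata and inspect the first rows.
EpiNOAA_meta <- get_data_dictionary(dailyTemperatures)
head(EpiNOAA_meta$County_Level)
head(EpiNOAA_meta$State_level)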
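
To make the coverage columns concrete, here is a minimal sketch of how a per-county coverage fraction could be computed by hand with dplyr. It is not the implementation of get_data_dictionary(); the toy data frame and its values are made up for illustration, and only the column names follow the ones used in this tutorial.

```r
library(dplyr)

# Toy stand-in for dailyTemperatures (values are invented for illustration).
toy <- data.frame(
  FIPS_corrected = c(37001, 37001, 37001, 37003, 37003, 37003),
  Temp_avg       = c(21.5, NA, 19.0, NA, NA, NA),
  Precipitation  = c(0.0, 1.2, 0.4, 3.1, NA, 0.0)
)

# Coverage = share of non-missing values per county and measurement.
# 0 means every value is missing; 1 means no value is missing.
coverage <- toy %>%
  group_by(FIPS_corrected) %>%
  summarise(
    across(c(Temp_avg, Precipitation), ~ mean(!is.na(.x))),
    .groups = "drop"
  )

print(coverage)
```

In this toy example, county 37003 has a Temp_avg coverage of zero because all three of its temperature values are missing.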

Calling data(fips_codes) after loading the tidycensus package provides a data.frame of states, counties, and FIPS codes. The example below extracts data for ten counties in North Carolina between a given start date and end date.

# Ten counties in North Carolina, identified by their five-digit FIPS codes.
# County information is available in tidycensus via data(fips_codes).
NC_fips_codes <- c(37001, 37003, 37005, 37007, 37009,
                   37011, 37013, 37015, 37017, 37019)
NC_10_data <- get_daily_FIPS_averages(county_FIPS_codes = NC_fips_codes,
                                      dailyTemperatures = dailyTemperatures,
                                      measurements = c("Temp_avg", "Temp_min",
                                                       "Temp_max", "Precipitation"),
                                      start_date = "2018-09-21",
                                      end_date = "2018-10-21")
head(NC_10_data)
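
For readers who want to see what this kind of extraction amounts to, the following is a rough dplyr sketch of subsetting by FIPS code and date range. The helper filter_fips_dates() is hypothetical and is not part of EpiNOAA; treating both the start and end date as inclusive is an assumption, and the toy values are invented.

```r
library(dplyr)

# Toy stand-in for dailyTemperatures (values are invented for illustration).
toy <- data.frame(
  Date = as.Date(c("2018-09-20", "2018-09-21", "2018-10-21",
                   "2018-10-22", "2018-09-25")),
  FIPS_corrected = c(37001, 37001, 37003, 37003, 37183),
  Temp_avg = c(24.1, 23.8, 15.2, 14.9, 22.0)
)

# Hypothetical helper (not an EpiNOAA function): keep rows whose FIPS code
# is in `fips` and whose Date lies in [start_date, end_date], both inclusive.
filter_fips_dates <- function(data, fips, start_date, end_date) {
  data %>%
    filter(
      FIPS_corrected %in% fips,
      Date >= as.Date(start_date),
      Date <= as.Date(end_date)
    )
}

subset_toy <- filter_fips_dates(toy, c(37001, 37003),
                                "2018-09-21", "2018-10-21")
print(subset_toy)
```

One thing to keep in mind with numeric FIPS codes: states whose code starts with zero (for example Alabama, 01) lose the leading zero when stored as numbers, so 01001 becomes 1001. North Carolina codes (37xxx) are unaffected.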

The plot below shows the daily mean temperature over the selected period, with one line per county.

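# Add the day of the week as an extra column (not used in the plot below).
NC_10_data$DayOfWeek <- weekdays(NC_10_data$Date)
# One line per county: group the series by County_Name.
ggplot(NC_10_data, aes(x = Date, y = Temp_avg, group = County_Name)) +
  geom_point() +
  geom_line() +
  ylab("Mean Temperature (C)") +
  ggtitle("Time series of mean temperatures grouped by county.")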

Use the get_historic_data() function to access the data for a specific date.

get_historic_data(NC_10_data,historic_date = "2018-09-21") %>% head()


shrikanth95/EpiNOAA documentation built on Dec. 23, 2021, 1:27 a.m.