filter_hourly: Filter NOAA ISD stations based on "coverage" requirements,...

View source: R/hourly_helpers.R

filter_hourly    R Documentation

Filter NOAA ISD stations based on "coverage" requirements, and calculate coverage and statistical information for each station-variable combination.

Description

Filters available weather stations based on a specified minimum coverage (i.e., percent non-missing hourly observations). Weather stations whose share of non-missing hourly observations falls below coverage are excluded from the county average.

Usage

filter_hourly(fips, hourly_data, coverage = NULL)

Arguments

fips

A character string giving the five-digit U.S. FIPS county code of the county for which the user wants to pull weather data.

hourly_data

A dataframe as returned by the df element from an isd_monitors_data call.

coverage

A numeric value between 0 and 1 that specifies the minimum required coverage for each weather variable, expressed as a proportion (i.e., the share of each weather variable's observations that must be non-missing for a station's data to be included when calculating hourly values averaged across stations).
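The coverage rule described above can be illustrated with a minimal base-R sketch. This is not the package's implementation: the toy data frame `hourly`, its station ids, and the single `temperature` column are hypothetical, and the sketch assumes coverage is the proportion of non-missing hourly values per station.

```r
# Toy hourly data for two hypothetical stations and one weather variable.
hourly <- data.frame(
  station = rep(c("A-1", "B-2"), each = 4),
  temperature = c(10, 11, NA, 12, NA, NA, NA, 9)
)

# Coverage per station: proportion of non-missing hourly observations
# (an assumption about how coverage is measured).
calc_coverage <- tapply(hourly$temperature, hourly$station,
                        function(x) mean(!is.na(x)))

# Keep only stations that meet the minimum coverage.
coverage <- 0.5
keep <- names(calc_coverage)[calc_coverage >= coverage]
filtered <- hourly[hourly$station %in% keep, ]
```

In this toy data, station "A-1" has three of four values present and is kept, while "B-2" has one of four and is dropped.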

Value

A list with two elements: df and stations.

df is a dataframe of hourly weather data filtered based on the specified coverage. It also includes a column ("var"_reporting) for each weather variable giving the number of stations contributing to the average for that variable in each hour.

stations is a dataframe giving statistical information for the stations that meet the specified coverage requirements. The column station gives the station id (USAF and WBAN identification numbers pasted together, separated by "-"). Note: One of these identification numbers is sometimes missing; for example, a value in station might be 722029-NA. The column var gives the weather variable associated with each row of statistical values. calc_coverage gives the percentage coverage for each station-weather variable combination; these values will all be greater than or equal to the specified coverage value. standard_dev gives the standard deviation of values for each station-weather variable combination. max and min give the maximum and minimum values, and range gives the range of values, for each station-weather variable combination. These last four statistics (standard_dev, max, min, and range) are only included for the seven core hourly weather variables: "wind_direction", "wind_speed", "ceiling_height", "visibility_distance", "temperature", "temperature_dewpoint", and "air_pressure". (The values of these columns are set to NA for other variables, such as quality flag data.)
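As a rough illustration of the kind of output described above, the following base-R sketch builds a per-station summary table and an hourly "reporting" count from toy data. It is not the package's code: the data frame `hourly`, the `date_time` column, and the station ids are hypothetical, and range is assumed here to mean the maximum minus the minimum.

```r
# Toy data: two hypothetical stations observed over three hours.
hourly <- data.frame(
  station = rep(c("A-1", "B-2"), each = 3),
  date_time = rep(1:3, times = 2),
  temperature = c(10, NA, 14, 12, 13, NA)
)

# One summary row per station for the "temperature" variable.
stats <- do.call(rbind, lapply(split(hourly, hourly$station), function(d) {
  x <- d$temperature
  data.frame(
    station = d$station[1],
    var = "temperature",
    calc_coverage = mean(!is.na(x)),
    standard_dev = sd(x, na.rm = TRUE),
    min = min(x, na.rm = TRUE),
    max = max(x, na.rm = TRUE),
    # Assumption: range is max minus min.
    range = max(x, na.rm = TRUE) - min(x, na.rm = TRUE)
  )
}))

# Hourly average across stations, ignoring missing values.
temperature <- tapply(hourly$temperature, hourly$date_time,
                      mean, na.rm = TRUE)

# Number of stations contributing a non-missing value in each hour.
temperature_reporting <- tapply(!is.na(hourly$temperature),
                                hourly$date_time, sum)
```

Here `temperature_reporting` plays the role of a "var"_reporting column: it records how many stations contributed to each hourly average.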


leighseverson/countyweather documentation built on April 9, 2022, 11:38 a.m.