daily_fips | R Documentation |
Given a particular county FIPS code, this function returns data and meta-data
for weather data, either for all available dates or for dates within a
requested date range. For a few of the most common weather variables, we
convert from the original units to more commonly-used units. For example, in
NOAA's data, the temperature values (tmin, tmax, and tavg) are recorded in
tenths of degrees Celsius. This code converts those to degrees Celsius.
Similarly, in NOAA's data the precipitation (prcp) is recorded in tenths of
millimeters, which this code converts to millimeters. All other units are left
as in NOAA's original data. See NOAA's README file for the GHCND data
(https://www1.ncdc.noaa.gov/pub/data/ghcn/daily/readme.txt) for more
information on the source data that are being pulled.
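As a rough illustration of the unit conversion described above, the following sketch divides raw values by 10. The data frame `raw` and its values are made up for the example, and this is not the package's internal code.

```r
# Hypothetical raw values as stored by NOAA: tenths of degrees Celsius
# for temperature and tenths of millimeters for precipitation.
raw <- data.frame(
  date = as.Date(c("2010-01-01", "2010-01-02")),
  tmax = c(56, -11),   # tenths of degrees C
  tmin = c(-28, -94),  # tenths of degrees C
  prcp = c(0, 25)      # tenths of mm
)

# Divide the affected columns by 10 to get degrees Celsius and millimeters.
tenths_vars <- c("tmax", "tmin", "prcp")
converted <- raw
converted[tenths_vars] <- lapply(raw[tenths_vars], function(x) x / 10)

converted
```

Any other variable in such a data frame would be left in its original units, matching the behavior described above.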
daily_fips(
  fips,
  coverage = NULL,
  date_min = NULL,
  date_max = NULL,
  var = "all",
  average_data = TRUE,
  station_label = FALSE,
  limit_20_longest = TRUE,
  exclude_less_than_one_year = FALSE,
  verbose = TRUE
)
fips |
The five-digit U.S. FIPS code of a county, in numeric, character, or factor format. |
coverage |
A numeric value in the range of 0 to 1 that specifies
the desired coverage for the weather variable, expressed as a proportion
(i.e., the share of each weather variable's values that must be non-missing
for a monitor's data to be included when calculating daily values averaged
across monitors). The
default is to include all monitors with any available data (i.e.,
|
date_min |
A string with the desired starting date in character, ISO format ("yyyy-mm-dd"). The dataframe returned will include only stations that have data for dates including and after the specified date. In other words, if you specify that this equals "1981-02-16", then it will return only the stations with at least some data recorded after Feb. 16, 1981. If a station stopped recording data before Feb. 16, 1981, it will be removed from the set of stations. If not specified, the function will include available stations, regardless of the date when the station started recording data. |
date_max |
A string with the desired ending date in character, ISO format ("yyyy-mm-dd"). The dataframe returned will include only stations that have data for dates up to and including the specified date. If not specified, the function will include available stations, regardless of the date when the station stopped recording data. |
var |
A character vector specifying desired weather variables. For
example, |
average_data |
TRUE / FALSE to indicate if you want the function to average daily weather data across multiple monitors. If you choose FALSE, the function will return a dataframe with separate entries for each monitor, while TRUE (the default) outputs a single estimate for each day in the dataset, giving the average value of the weather metric across all available monitors in the county that day. |
station_label |
TRUE / FALSE to indicate if you want your plot of weather station locations to include labels with station ids. |
limit_20_longest |
A logical value, indicating whether the stations should be limited to the 20 with the longest records of data (otherwise, there may be so many stations that it will take extremely long to pull data from all of them). The default shown in the usage above is TRUE; set this to FALSE to pull data from every station, keeping in mind that this may take a long time. |
exclude_less_than_one_year |
A logical value, indicating whether stations with less than one year's worth of data should be automatically excluded. The default shown in the usage above is FALSE. |
verbose |
TRUE / FALSE to indicate if you want the function to print out the name of the county it's processing. |
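To make the coverage argument more concrete, here is a minimal sketch of a coverage filter in base R. The station ids and values are invented, and the use of `>=` as the comparison is an assumption for illustration; the package's own calculation may differ.

```r
# Hypothetical long-format data: one row per station per day.
obs <- data.frame(
  id   = rep(c("STATION_A", "STATION_B"), each = 4),
  date = rep(as.Date("2010-01-01") + 0:3, times = 2),
  prcp = c(0.0, 1.2, NA, 0.4,   # STATION_A: 3 of 4 days reported
           NA,  NA,  NA, 2.0)   # STATION_B: 1 of 4 days reported
)

# Proportion of non-missing values for each station.
calc_coverage <- tapply(obs$prcp, obs$id, function(x) mean(!is.na(x)))

# Keep only stations that meet the requested threshold
# (assumed here to be an inclusive comparison).
coverage <- 0.70
keep_ids <- names(calc_coverage)[calc_coverage >= coverage]
filtered <- obs[obs$id %in% keep_ids, ]

calc_coverage
filtered
```

With a threshold of 0.70, only the station reporting on three of the four days is retained in this toy data.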
A list with three elements. The first element (daily_data) is a dataframe of
daily weather data averaged across multiple stations, as well as columns
("var"_reporting) for each weather variable showing the number of stations
contributing to the average for that variable on that day. The second element
(station_metadata) is a dataframe of station metadata for stations included in
the daily_data dataframe, as well as statistical information about these
values. Columns include id, name, var, latitude, longitude, calc_coverage,
standard_dev, min, max, and range. The third element (station_map) is a plot
showing locations of all weather stations for a particular county satisfying
the conditions present in daily_fips's arguments (coverage, date_min,
date_max, and/or var).
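The relationship between the averaged values and the reporting counts in daily_data can be sketched as follows. The input data are invented, and the column name prcp_reporting simply follows the "var"_reporting pattern described above; this is an illustration rather than the package's implementation.

```r
# Hypothetical station-level observations for two days.
obs <- data.frame(
  id   = rep(c("STATION_A", "STATION_B", "STATION_C"), each = 2),
  date = rep(as.Date(c("2010-01-01", "2010-01-02")), times = 3),
  prcp = c(1.0, 2.0,   # STATION_A reports on both days
           3.0, NA,    # STATION_B reports on the first day only
           NA,  NA)    # STATION_C reports on neither day
)

# For each day, average across stations and count how many contributed.
daily_data <- do.call(rbind, lapply(split(obs, obs$date), function(d) {
  data.frame(
    date           = d$date[1],
    prcp           = mean(d$prcp, na.rm = TRUE),
    prcp_reporting = sum(!is.na(d$prcp))
  )
}))
rownames(daily_data) <- NULL

daily_data
```

A low reporting count on a given day is a signal that the daily average rests on very few stations.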
Because this function uses the NOAA API to identify the weather monitors
within a U.S. county, you will need to get an access token from NOAA to use
this function. Visit NOAA's token request page
(http://www.ncdc.noaa.gov/cdo-web/token) to request a token by email. You then
need to set that API key in your R session (e.g., using
options(noaakey = "your key"), replacing "your key" with the API key you've
requested from NOAA). See the package vignette for more details.
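One way to avoid typing the key into scripts is to read it from an environment variable before setting the option. In this sketch, the helper `set_noaa_key()` and the variable name `NOAA_KEY` are hypothetical names chosen for illustration; only `options(noaakey = ...)` comes from the documentation above.

```r
# Hypothetical helper: copy an API key from an environment variable into
# the noaakey option for the current R session.
set_noaa_key <- function(env_var = "NOAA_KEY") {
  key <- Sys.getenv(env_var, unset = "")
  if (!nzchar(key)) {
    stop("Environment variable ", env_var, " is empty or not set.")
  }
  options(noaakey = key)
  invisible(TRUE)
}

# Usage, assuming NOAA_KEY was defined beforehand (e.g., in ~/.Renviron):
# set_noaa_key()
# getOption("noaakey")
```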
For some weather observations pulled using this function, missing values are coded as a series of "9"s, in some cases starting with a negative symbol. The function underlying this one will automatically convert any value of -9999 to a missing value for the variables "prcp", "tmax", "tmin", "tavg", "snow", and "snwd". However, for some weather observations, there still may be missing values coded using a series of "9"s of some length. You will want to check your final data to see if there are lurking missing values given with series of "9"s.
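A quick check for such lurking values might look like the sketch below. The helper `flag_nines()` and the cutoff of four or more nines are assumptions made for this example; real values can occasionally look similar, so inspect flagged entries before replacing them.

```r
# Hypothetical helper: flag values whose digits are all 9s (ignoring the
# sign and any decimal point), with at least `min_nines` digits.
flag_nines <- function(x, min_nines = 4) {
  digits <- gsub("[-.]", "", as.character(x))
  grepl(paste0("^9{", min_nines, ",}$"), digits)
}

# Made-up precipitation values, including a few suspicious codes.
prcp <- c(0, 2.5, 9999, -99999, 999.9, 19.9, NA)

suspect <- flag_nines(prcp)
prcp[suspect]                             # inspect the flagged values first
prcp_clean <- replace(prcp, suspect, NA)  # then recode them as missing
```

The decimal point is stripped before matching because a code such as 9999 in tenths of a unit would appear as 999.9 after unit conversion.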
## Not run:
denver_ex <- daily_fips("08031", coverage = 0.90, date_min = "2010-01-01",
                        date_max = "2010-02-01", var = "prcp")
head(denver_ex$daily_data)
denver_ex$station_map

mobile_ex <- daily_fips("01097", date_min = "1997-07-13",
                        date_max = "1997-07-25", var = "prcp",
                        average_data = FALSE)
library(ggplot2)
ggplot(mobile_ex$daily_data, aes(x = date, y = prcp, color = id)) +
  geom_line()
## End(Not run)