R/weather_history.R

#' Weather History Data
#'
#' @format A data frame with 1000 observations of 8 variables.
#' \describe{
#'   \item{temperature_c}{Mean temperature in degrees Celsius}
#'   \item{humidity}{Mean humidity (percentage)}
#'   \item{wind_speed_km_h}{Mean wind speed (km/h)}
#'   \item{wind_bearing_degrees}{Mean wind bearing (degrees)}
#'   \item{visibility_km}{Mean visibility (km)}
#'   \item{pressure_millibars}{Mean pressure (millibars); values of 0 in the original were converted to NA}
#'   \item{summary}{Mode of the hourly summaries. Five levels: Clear, Foggy, Mostly Cloudy, Overcast, Partly Cloudy}
#'   \item{precip_type}{Mode of the hourly precipitation type. Three levels: rain, snow, null}
#' }
#'
#' The original hourly data were aggregated by day: the daily mean was taken for numerical variables, and the daily mode
#' for categorical variables. Pressure had many values of 0, which were converted to NA.
#' 
#' The data were "decimated" so that only about one out of every 4 days appears in the data set. This was done to alleviate concerns about dependence between observations from nearby days.
#' 
#' The dates have been removed; there is almost certainly dependence across years for similar dates, which could have been modeled had the dates been kept.
#'
#' @source https://www.kaggle.com/budincsevity/szeged-weather
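#'
#' @examples
#' # A minimal usage sketch, not part of the original documentation.
#' # Assumption: the package is attached, so weather_history is available
#' # (for example through lazy data loading).
#' str(weather_history)
#'
#' # Zero pressure readings were recoded as NA; count the missing values.
#' sum(is.na(weather_history$pressure_millibars))
#'
#' # Frequency of each daily modal weather summary.
#' table(weather_history$summary)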
"weather_history"
speegled/ardata documentation built on March 26, 2022, 5 a.m.