join_weather: Join gridded weather data to an event table

View source: R/join_weather.R

join_weather    R Documentation

Description

Attach gridded weather variables from NASA POWER to rows of an event table. The function:

  • standardizes/validates time input (single timestamp column or multiple time columns),

  • plans efficient provider calls by clustering locations (default) and splitting sparse time ranges,

  • caches downloaded weather segments locally and reuses them,

  • joins weather back to events using exact or rolling joins.
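
As an illustration of the "splitting sparse time ranges" step, the following minimal sketch groups event dates into contiguous request windows whenever the gap between consecutive dates exceeds a threshold. The threshold name max_gap_days and its value are assumptions made for this example; the rule the package actually applies is not documented here.

```r
# Event dates that fall into two widely separated periods.
dates <- as.Date(c("2024-01-01", "2024-01-02", "2024-01-03",
                   "2024-06-10", "2024-06-11"))

max_gap_days <- 30  # assumed threshold, not a documented package default

d <- sort(unique(dates))

# A new segment starts wherever the gap to the previous date is too large.
gap_days <- as.numeric(diff(d), units = "days")
segment  <- cumsum(c(TRUE, gap_days > max_gap_days))

# One start/end pair per segment; each pair would become one provider request.
parts  <- unname(split(d, segment))
ranges <- data.frame(
  start = do.call(c, lapply(parts, min)),
  end   = do.call(c, lapply(parts, max))
)
ranges
```

Requesting two short windows instead of one window spanning January to June avoids downloading months of weather that no event needs.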

Usage

join_weather(
  x,
  params,
  time,
  lat_col = "lat",
  lon_col = "lon",
  time_api = c("guess", "hourly", "daily"),
  tz = "UTC",
  roll = c("nearest", "last", "none"),
  roll_max_hours = NULL,
  spatial_mode = c("cluster", "exact", "by_group"),
  group_col = NULL,
  cluster_radius_m = 250,
  site_elevation = c("constant", "auto"),
  elev_constant = 100,
  elev_fun = NULL,
  community = "ag",
  cache_scope = c("user", "project"),
  cache_dir = NULL,
  verbose = FALSE,
  ...
)

Arguments

x

A data.frame/data.table with event rows.

params

Character vector of NASA POWER parameter codes (e.g. "T2M").

time

Either the name of a single column containing time (POSIXct/Date/character/numeric), or a character vector of column names from which a timestamp is assembled (e.g. c("YEAR","MO","DY","HR")).
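
To make the multi-column form concrete, this sketch shows how year, month, day and hour columns can be combined into a single POSIXct timestamp in base R. It illustrates the idea only and is not the package's internal code; the table contents are made up.

```r
# Hypothetical event table with the time split across four integer columns.
events <- data.frame(
  YEAR = c(2024L, 2024L),
  MO   = c(3L, 11L),
  DY   = c(15L, 2L),
  HR   = c(6L, 18L)
)

# ISOdatetime() builds POSIXct values from components; minutes and seconds
# are set to zero because the table only carries whole hours.
events$timestamp <- ISOdatetime(
  year = events$YEAR, month = events$MO, day = events$DY,
  hour = events$HR, min = 0, sec = 0, tz = "UTC"
)

format(events$timestamp, "%Y-%m-%d %H:%M", tz = "UTC")
# "2024-03-15 06:00" "2024-11-02 18:00"
```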

lat_col, lon_col

Column names for latitude and longitude (decimal degrees).

time_api

One of "guess", "hourly", "daily". If "daily" is chosen while the input contains time-of-day information, timestamps are downsampled to dates (with a fixed hour). If "hourly" is chosen but the input has no time-of-day information, an error is raised.
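
The downsampling described for "daily" can be pictured with the base R sketch below. The check for time-of-day information and the choice of 12:00 as the fixed hour are assumptions for illustration; the documentation does not state which hour the package uses.

```r
ts <- as.POSIXct(c("2024-03-15 06:30:00", "2024-03-15 21:10:00"), tz = "UTC")

# Treat any timestamp that is not exactly midnight as carrying time of day.
has_time_of_day <- any(format(ts, "%H:%M:%S", tz = "UTC") != "00:00:00")

fixed_hour <- 12L  # assumed value, chosen only for this example

# Drop the time of day, then pin every date to the same hour.
day      <- as.Date(ts, tz = "UTC")
daily_ts <- as.POSIXct(
  paste(format(day), sprintf("%02d:00:00", fixed_hour)),
  tz = "UTC"
)

format(daily_ts, "%Y-%m-%d %H:%M", tz = "UTC")
# "2024-03-15 12:00" "2024-03-15 12:00"
```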

tz

Time zone used to interpret/construct input timestamps (default "UTC"). Weather is requested from NASA POWER in UTC.

roll

Join behaviour when matching timestamps: "nearest" (default, recommended), "last", or "none" (exact). Rolling is applied when joining hourly weather to event times.

roll_max_hours

Maximum allowed time distance (hours) for a rolling match. If NULL, a safe default is used: 1 hour for hourly joins and 24 hours for daily joins.
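
The interaction between roll = "nearest" and roll_max_hours can be sketched in base R as follows. This is a simplified stand-in for a rolling join, not the package's implementation: the helper nearest_match() is hypothetical, the weather values are made up, and treating the limit as inclusive is an assumption.

```r
# Hourly weather grid with invented temperatures.
weather <- data.frame(
  time = as.POSIXct("2024-03-15 00:00:00", tz = "UTC") + 3600 * 0:5,
  T2M  = c(4.1, 3.8, 3.5, 3.9, 5.2, 6.8)
)

event_time <- as.POSIXct(
  c("2024-03-15 01:20:00", "2024-03-15 03:45:00", "2024-03-15 09:00:00"),
  tz = "UTC"
)

roll_max_hours <- 1

# Return the index of the closest grid time, or NA if it is too far away.
nearest_match <- function(t, grid, max_hours) {
  gap_hours <- abs(as.numeric(difftime(grid, t, units = "hours")))
  i <- which.min(gap_hours)
  if (gap_hours[i] <= max_hours) i else NA_integer_
}

idx <- vapply(
  seq_along(event_time),
  function(k) nearest_match(event_time[k], weather$time, roll_max_hours),
  integer(1)
)

matched <- data.frame(event_time = event_time, T2M = weather$T2M[idx])
matched$T2M
# 3.8 5.2 NA
```

The third event lies four hours past the last weather record, so it exceeds the limit and receives NA rather than a stale value.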

spatial_mode

How to reduce many points to representative locations before calling POWER: "cluster" (default), "exact", or "by_group". Clustering keeps the number of provider calls from growing with every distinct coordinate and is consistent with POWER's coarse spatial resolution.

group_col

Grouping column used when spatial_mode="by_group".

cluster_radius_m

Clustering radius in meters when spatial_mode="cluster".
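
To show what a clustering radius means in practice, the sketch below assigns points to clusters with a simple greedy rule based on great-circle (haversine) distance. The clustering algorithm used by the package is not documented here, so haversine_m() and greedy_cluster() are hypothetical helpers and the coordinates are made up.

```r
# Great-circle distance in meters between points given in decimal degrees.
haversine_m <- function(lon1, lat1, lon2, lat2, r = 6371000) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# Greedy rule: join the nearest existing cluster centre within radius_m,
# otherwise start a new cluster centred on the current point.
greedy_cluster <- function(lon, lat, radius_m) {
  cluster    <- integer(length(lon))
  centre_lon <- numeric(0)
  centre_lat <- numeric(0)
  for (i in seq_along(lon)) {
    d   <- haversine_m(lon[i], lat[i], centre_lon, centre_lat)
    hit <- which(d <= radius_m)
    if (length(hit) > 0) {
      cluster[i] <- hit[which.min(d[hit])]
    } else {
      centre_lon <- c(centre_lon, lon[i])
      centre_lat <- c(centre_lat, lat[i])
      cluster[i] <- length(centre_lon)
    }
  }
  cluster
}

# Two points about 100 m apart and a third roughly 13 km away.
sites <- data.frame(
  lon = c(5.6650, 5.6650, 5.6650),
  lat = c(51.9850, 51.9859, 52.1000)
)

greedy_cluster(sites$lon, sites$lat, radius_m = 250)
# 1 1 2
```

With a 250 m radius the first two points share one representative location, so only two locations would need weather data instead of three.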

site_elevation

Elevation strategy for POWER calls: "constant" or "auto". Elevation is resolved for representative locations and becomes part of the cache identity.

elev_constant

Constant elevation (meters) used when site_elevation="constant" and as a fallback for "auto".

elev_fun

Optional function with the signature function(lon, lat, ...) that returns elevation (meters) for representative points.
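
A function of the required shape might look like the sketch below, which returns the elevation of the nearest entry in a small reference table. The table known_sites, its values, and the name lookup_elev are all hypothetical; the example also assumes the function receives vectors of equal length and must return one numeric value per point.

```r
# Made-up reference elevations for two sites.
known_sites <- data.frame(
  lon    = c(5.67, 6.10),
  lat    = c(51.97, 52.50),
  elev_m = c(12, 4)
)

# Nearest-neighbour lookup in plain degree space; adequate for a toy table,
# too crude for real terrain.
lookup_elev <- function(lon, lat, ...) {
  vapply(seq_along(lon), function(i) {
    d2 <- (known_sites$lon - lon[i])^2 + (known_sites$lat - lat[i])^2
    known_sites$elev_m[which.min(d2)]
  }, numeric(1))
}

lookup_elev(lon = c(5.70, 6.00), lat = c(52.00, 52.40))
# 12 4
```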

community

Passed to nasapower::get_power() (e.g. "ag").

cache_scope

Where the cache is stored by default: "user" or "project".

cache_dir

Optional explicit cache directory. If NULL, determined by cache_scope.

verbose

If TRUE, print progress messages.

...

Passed through to nasapower::get_power().

Value

A data.table containing the rows of x with weather columns appended. Rows with missing or invalid inputs are retained with their original values and receive NA in the weather columns.

See Also

wj_cache_list, wj_cache_clear, weatherjoin_options
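
Examples

The call below is a hedged sketch of typical usage built only from the arguments documented above. It assumes that join_weather is exported by the weatherjoin package and that network access to NASA POWER is available, so the call is guarded to run only in an interactive session with the package installed. The event table is made up.

```r
# Two events at the same site, twelve hours apart.
events <- data.frame(
  lat = c(51.97, 51.97),
  lon = c(5.67, 5.67),
  timestamp = as.POSIXct(
    c("2024-03-15 06:00:00", "2024-03-15 18:00:00"),
    tz = "UTC"
  )
)

# Guarded because the call downloads data (assumption: exported function).
if (interactive() && requireNamespace("weatherjoin", quietly = TRUE)) {
  out <- weatherjoin::join_weather(
    x = events,
    params = "T2M",            # 2 m air temperature, as in the params example
    time = "timestamp",
    time_api = "hourly",
    roll = "nearest",
    roll_max_hours = 1,
    spatial_mode = "cluster",
    cluster_radius_m = 250,
    verbose = TRUE
  )
  print(out)
}
```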


weatherjoin documentation built on Feb. 4, 2026, 5:11 p.m.