prepare_data_for_modelling: Prepare Data for Training a Model

View source: R/modelling.R

prepare_data_for_modelling    R Documentation

Prepare Data for Training a Model

Description

Prepares environmental data for modelling by filtering for the relevant components, converting the data from long to wide format, and adding temporal features. Call this function before split_data_counterfactual().
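The first two steps named above (filtering and reshaping) can be sketched with data.table as follows. This is only an illustration on toy data, not the package's implementation: the object names env_long, keep, env_filtered and env_wide are hypothetical, and the use of dcast() for the reshape is an assumption.

```r
library(data.table)

# Toy long-format input using the column names documented for env_data.
# "O3" is included only to show that unrelated components are dropped.
env_long <- data.table(
  Station = "StationA",
  Komponente = c("NO2", "TMP", "O3", "NO2", "TMP"),
  Wert = c(50, 20, 30, 40, 18),
  date = as.POSIXct(
    c("2023-01-01 10:00:00", "2023-01-01 10:00:00", "2023-01-01 10:00:00",
      "2023-01-01 11:00:00", "2023-01-01 11:00:00"),
    tz = "UTC"
  )
)

# Step 1: keep only the target and the meteorological variables
keep <- c("NO2", "TMP")
env_filtered <- env_long[Komponente %in% keep]

# Step 2: long to wide, one row per timestamp and one column per component
env_wide <- dcast(env_filtered, date ~ Komponente, value.var = "Wert")
print(env_wide)
```

Timestamps at which a component has no measurement end up as NA in that component's column after the reshape.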

Usage

prepare_data_for_modelling(env_data, params)

Arguments

env_data

A data.table in long format. Must include the following columns:

Station

Station identifier for the data.

Komponente

The environmental component being measured (e.g., temperature, NO2).

Wert

The measured value of the component.

date

Timestamp as POSIXct object in YYYY-MM-DD HH:MM:SS format.

Komponente_txt

A textual description of the component.

params

A list of modelling parameters loaded from params.yaml. Must include:

meteo_variables

A vector of meteorological variable names.

target

The name of the target variable.
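The requirements listed for both arguments can be checked before calling the function. The helper below is a minimal sketch: check_inputs() is a hypothetical name and is not part of ubair, and the values in the Komponente_txt column are made up for illustration.

```r
library(data.table)

# Hypothetical helper (not part of ubair): verifies the documented
# requirements for env_data and params and stops with a message otherwise.
check_inputs <- function(env_data, params) {
  required_cols <- c("Station", "Komponente", "Wert", "date", "Komponente_txt")
  missing_cols <- setdiff(required_cols, names(env_data))
  if (length(missing_cols) > 0) {
    stop("env_data is missing columns: ", paste(missing_cols, collapse = ", "))
  }
  if (!inherits(env_data$date, "POSIXct")) {
    stop("env_data$date must be a POSIXct timestamp")
  }
  missing_params <- setdiff(c("meteo_variables", "target"), names(params))
  if (length(missing_params) > 0) {
    stop("params is missing entries: ", paste(missing_params, collapse = ", "))
  }
  invisible(TRUE)
}

# Toy input that satisfies all documented requirements
env_ok <- data.table(
  Station = "StationA",
  Komponente = c("NO2", "TMP"),
  Wert = c(50, 20),
  date = as.POSIXct(c("2023-01-01 10:00:00", "2023-01-01 11:00:00"), tz = "UTC"),
  Komponente_txt = c("Nitrogen dioxide", "Temperature")
)
params_ok <- list(meteo_variables = c("TMP"), target = "NO2")

check_inputs(env_ok, params_ok)
```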

Value

A data.table in wide format with the following columns: date, one column per component, and temporal features such as date_unix, day_julian, weekday, and hour.
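To give a sense of what the temporal feature columns contain, the sketch below derives columns with the same names from a date column. How the package computes them is not stated here, so the definitions used (seconds since the Unix epoch, day of the year, day of the week with Sunday as 1, hour of the day) are assumptions, and the table wide is a hypothetical stand-in for the returned object.

```r
library(data.table)

# Hypothetical wide-format table with a POSIXct date column
wide <- data.table(
  date = as.POSIXct(c("2023-01-01 10:00:00", "2023-01-02 12:00:00"), tz = "UTC"),
  NO2 = c(50, 40)
)

# Assumed definitions of the temporal features named in the Value section
wide[, `:=`(
  date_unix = as.numeric(date),            # seconds since 1970-01-01 UTC
  day_julian = data.table::yday(date),     # day of the year, 1 to 366
  weekday = data.table::wday(date),        # 1 = Sunday, ..., 7 = Saturday
  hour = data.table::hour(date)            # hour of the day, 0 to 23
)]
print(wide)
```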

Examples

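# Long-format input: one row per station, component, and timestamp
env_data <- data.table::data.table(
  Station = c("StationA", "StationA", "StationA"),
  Komponente = c("NO2", "TMP", "NO2"),
  Wert = c(50, 20, 40),
  date = as.POSIXct(c("2023-01-01 10:00:00", "2023-01-01 11:00:00", "2023-01-02 12:00:00"))
)
# Modelling parameters: NO2 is the target, TMP the only meteorological variable
params <- list(meteo_variables = c("TMP"), target = "NO2")
# Filter, reshape to wide format, and add temporal features
prepared_data <- prepare_data_for_modelling(env_data, params)
print(prepared_data)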


ubair documentation built on April 12, 2025, 2:12 a.m.