View source: R/data_cleaning.R
clean_data | R Documentation |
Cleans a data table of environmental measurements by filtering for a specific station, removing duplicates, and optionally aggregating the data on a daily basis using the mean.
clean_data(env_data, station, aggregate_daily = FALSE)
env_data | A data table in long format. Must include columns:
station | Character. Name of the station to filter by.
aggregate_daily | Logical. If TRUE, the data are aggregated to daily means. Defaults to FALSE.
Duplicate rows (by date, Komponente, and Station) are removed. A warning is issued if duplicates are found.
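The duplicate handling described above can be sketched with data.table as follows. This is a minimal illustration, not the package's actual implementation: the sample rows, the warning text, and the names key_cols, n_dupes, and deduped are assumptions made for this example.

```r
library(data.table)

# Hypothetical sample with one duplicated (date, Komponente, Station) combination
env_data <- data.table(
  Station    = c("DENW094", "DENW094", "DENW094"),
  Komponente = c("NO2", "NO2", "O3"),
  Wert       = c(45, 45, 30),
  date       = as.POSIXct(c(
    "2023-01-01 08:00:00", "2023-01-01 08:00:00", "2023-01-01 09:00:00"
  ), tz = "UTC")
)

# Columns that define a duplicate, as listed in the documentation
key_cols <- c("date", "Komponente", "Station")

# Count rows that repeat an earlier key combination
n_dupes <- sum(duplicated(env_data, by = key_cols))
if (n_dupes > 0) {
  # Warning text is an assumption; the real message may differ
  warning(sprintf("Removed %d duplicate row(s).", n_dupes))
}

# Keep the first row of each key combination
deduped <- unique(env_data, by = key_cols)
print(deduped)
```

Restricting the comparison to the three key columns means two rows with the same timestamp, component, and station count as duplicates even if their Wert values differ; in that case this sketch keeps the first one.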
A data.table:
If aggregate_daily = TRUE: Contains columns for station, component, day, year, and the daily mean value of the measurements.
If aggregate_daily = FALSE: Contains cleaned data with duplicates removed.
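To make the aggregated return value more concrete, here is one way such a daily mean could be computed with data.table. It is a sketch under assumptions: the output column names (day, year, Wert), the use of as.Date() to derive the day, the UTC time zone, and na.rm = TRUE are all choices made for this example and may not match what clean_data does internally.

```r
library(data.table)

# Hypothetical cleaned input: two NO2 readings on one day, one on the next
cleaned <- data.table(
  Station    = "DENW094",
  Komponente = "NO2",
  Wert       = c(40, 50, 30),
  date       = as.POSIXct(c(
    "2023-01-01 08:00:00", "2023-01-01 09:00:00", "2023-01-02 08:00:00"
  ), tz = "UTC")
)

# Group by station, component, calendar day, and year, then average Wert.
# The grouping column names are assumptions, not the documented output schema.
daily <- cleaned[
  ,
  .(Wert = mean(Wert, na.rm = TRUE)),
  by = .(
    Station,
    Komponente,
    day  = as.Date(date, tz = "UTC"),
    year = data.table::year(date)
  )
]
print(daily)
```

With this input the result has one row per day: the two readings on 2023-01-01 are averaged together and the single reading on 2023-01-02 is returned unchanged.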
# Example data
env_data <- data.table::data.table(
Station = c("DENW094", "DENW094", "DENW006", "DENW094"),
Komponente = c("NO2", "O3", "NO2", "NO2"),
Wert = c(45, 30, 50, 40),
date = as.POSIXct(c(
"2023-01-01 08:00:00", "2023-01-01 09:00:00",
"2023-01-01 08:00:00", "2023-01-02 08:00:00"
)),
Komponente_txt = c(
"Nitrogen Dioxide", "Ozone", "Nitrogen Dioxide", "Nitrogen Dioxide"
)
)
# Clean data for station DENW094 without aggregation
cleaned_data <- clean_data(env_data, station = "DENW094", aggregate_daily = FALSE)
print(cleaned_data)