View source: R/Data_handling.R
merge_eddy | R Documentation |
Merge generated regular date-time sequence with single or multiple data frames.
merge_eddy(
  x,
  start = NULL,
  end = NULL,
  check_dupl = TRUE,
  interval = NULL,
  format = "%Y-%m-%d %H:%M",
  tz = "GMT"
)
x |
List of data frames, each with |
start , end |
A value specifying the first (last) value of the generated
date-time sequence. If |
check_dupl |
A logical value specifying whether rows with duplicated
date-time values checked across |
interval |
A numeric value specifying the time interval (in seconds) of the generated date-time sequence. |
format |
A character string. Format of |
tz |
A time zone (see |
The primary purpose of merge_eddy is to combine chunks of data vertically along their column "timestamp" with date-time information. This "timestamp" is expected to be regular with the given time interval. The resulting data frame contains added rows for the expected date-time values that were missing in the "timestamp" column, with the remaining columns filled with NAs. If check_dupl = TRUE and "timestamp" values across x elements overlap, the detected duplicated rows are removed (the order in which duplicates are evaluated depends on the order of x elements). In the special case when x has only one element, merge_eddy fills the missing date-time values in the "timestamp" column of the given data frame.
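The gap-filling idea described above can be sketched in base R. This is only an illustration of the concept, not the merge_eddy implementation; the data frame chunk, the column H and the 1800 s interval are assumptions made up for the example.

```r
# Base R sketch only; not the openeddy implementation.
# Half-hourly data with two records missing (positions 3 and 4 dropped).
full <- seq(ISOdatetime(2021, 3, 20, 0, 0, 0, tz = "GMT"),
            by = "30 mins", length.out = 6)
chunk <- data.frame(timestamp = full[c(1, 2, 5, 6)], H = c(10, 12, 15, 11))

# Regular sequence spanning the observed range at the expected interval (s).
interval <- 1800
grid <- data.frame(
  timestamp = seq(min(chunk$timestamp), max(chunk$timestamp), by = interval)
)

# Left join onto the grid: missing date-times become rows filled with NA.
filled <- merge(grid, chunk, by = "timestamp", all.x = TRUE)
print(filled)
```

Joining onto a pre-built regular grid, rather than onto the data itself, is what guarantees that every expected date-time appears exactly once in the result.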
The storage mode of the "timestamp" column is set to integer instead of double. This simplifies the application of round_df but could lead to unexpected behavior if the date-time information is expected to resolve fractional seconds.
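The effect of integer storage on sub-second resolution can be shown with a small base R sketch. The timestamp value below is invented for the illustration, and the use of storage.mode<- is an assumption about how such a conversion can be done, not a statement about the package internals.

```r
# Illustration in base R; the timestamp value is made up for this example.
t_dbl <- as.POSIXct("2021-03-20 00:30:00", tz = "GMT") + 0.25
typeof(t_dbl)  # POSIXct values are normally stored as double

# Switch the storage mode; class and time zone attributes are kept,
# but the fractional part of the second is truncated.
t_int <- t_dbl
storage.mode(t_int) <- "integer"
typeof(t_int)

# The difference is the sub-second part that was dropped.
lost <- as.numeric(t_dbl) - as.numeric(t_int)
print(lost)
```

For half-hourly or similar data the truncation is harmless, but high-frequency records with fractional seconds would collapse onto whole seconds.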
The list of data frames, each with column "timestamp", is sequentially merged using Reduce. A (full) outer join, i.e. merge(..., all = TRUE), is performed to keep all columns of x elements. The order of x elements can affect the result. Duplicated column names within x elements are corrected using make.unique. The merged data frame is then merged on the validated "timestamp" column, which can be either automatically extracted from x or manually specified.
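A minimal base R sketch of the sequential outer join described above follows. The data frames a, b and c3 and their columns are hypothetical, and the snippet only mirrors the documented Reduce and merge(..., all = TRUE) pattern; it is not the package source.

```r
# Base R sketch; data frames a, b and c3 are invented for illustration.
ts <- seq(ISOdatetime(2021, 3, 20, 0, 0, 0, tz = "GMT"),
          by = "30 mins", length.out = 4)
a  <- data.frame(timestamp = ts[1:2], H = c(10, 12))
b  <- data.frame(timestamp = ts[3:4], H = c(15, 11))
c3 <- data.frame(timestamp = ts[2:4], LE = c(40, 42, 45))
x  <- list(a, b, c3)

# Sequential full outer join, i.e. merge(merge(a, b), c3) with all = TRUE,
# so that all rows and all columns of the list elements are kept.
merged <- Reduce(function(d1, d2) merge(d1, d2, all = TRUE), x)

# Rows unmatched in a join are not guaranteed to be in chronological order,
# so sort explicitly on the timestamp.
merged <- merged[order(merged$timestamp), ]
print(merged)

# make.unique() disambiguates repeated names by appending a numeric suffix.
unique_names <- make.unique(c("timestamp", "H", "H"))
print(unique_names)
```

Because merge() joins on all shared column names by default, the result of each step depends on which columns the previous elements had in common, which is one reason the order of x elements matters.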
For horizontal merging (adding columns instead of rows), check_dupl = FALSE must be set, but a simple merge could be preferred. Combining vertical and horizontal merging in one call should be avoided, as the result depends on the order of x elements and can lead to row duplication. Instead, data chunks from different data sources should first be merged vertically, each source separately, and then merged horizontally in a following step.
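The recommended two-step workflow might look like the following base R sketch. The two data sources (flux and met) and their columns are hypothetical, and rbind merely stands in for the vertical merging step that merge_eddy would perform on each source.

```r
# Base R sketch; the sources and values are invented for illustration.
ts <- seq(ISOdatetime(2021, 3, 20, 0, 0, 0, tz = "GMT"),
          by = "30 mins", length.out = 4)

# Source 1 (fluxes) and source 2 (meteorology), each delivered in two chunks.
flux1 <- data.frame(timestamp = ts[1:2], H = c(10, 12))
flux2 <- data.frame(timestamp = ts[3:4], H = c(15, 11))
met1  <- data.frame(timestamp = ts[1:2], Tair = c(5.1, 5.3))
met2  <- data.frame(timestamp = ts[3:4], Tair = c(5.8, 6.0))

# Step 1: vertical merge within each source (rbind stands in for it here).
flux <- rbind(flux1, flux2)
met  <- rbind(met1, met2)

# Step 2: horizontal merge of the two sources on "timestamp".
both <- merge(flux, met, by = "timestamp", all = TRUE)
print(both)
```

Keeping the two steps separate means each source already has one row per date-time before columns are combined, so the horizontal join cannot multiply rows.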
A data frame with attributes varnames and units for each column, containing date-time information in column "timestamp".
merge, Reduce, strptime, time zones, make.unique
library(openeddy) # provides merge_eddy(), ex(), varnames() and units()

set.seed(123)
n <- 20 # number of half-hourly records in this example
tstamp <- seq(c(ISOdate(2021,3,20)), by = "30 mins", length.out = n)
x <- data.frame(
  timestamp = tstamp,
  H = rf(n, 1, 2, 1),
  LE = rf(n, 1, 2, 1),
  qc_flag = sample(c(0:2, NA), n, replace = TRUE)
)
openeddy::varnames(x) <- c("timestamp", "sensible heat", "latent heat",
                           "quality flag")
openeddy::units(x) <- c("-", "W m-2", "W m-2", "-")
str(x)

# Split x into two chunks of rows and merge them back vertically
y1 <- ex(x, 1:10)
y2 <- ex(x, 11:20)
y <- merge_eddy(list(y1, y2))
str(y)
attributes(y$timestamp)
typeof(y$timestamp)

# Duplicated rows and different number of columns
z1 <- ex(x, 8:20, 1:3)
z <- merge_eddy(list(y1, z1))