Gapfill_em | R Documentation |
This function automatically gap-fills the missing data points (marked as "NA") in the flux dataset using the expectation-maximization (EM) algorithm with up to three parallel-measured reference flux time series. The function is based on the algorithms in the package 'mtsdi'.
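To give a feel for the EM idea, here is a minimal base-R sketch. It is not the 'mtsdi' implementation: it assumes a single, fully observed reference series and a bivariate normal model, and it leaves out the time series filtering step. All variable names and the simulated data are illustrative.

```r
set.seed(1)
n <- 200
# Hypothetical reference series (fully observed) and target series
x <- sin(seq(0, 4 * pi, length.out = n)) + rnorm(n, sd = 0.1)
y <- 2 + 1.5 * x + rnorm(n, sd = 0.2)
y_obs <- y
y_obs[60:80] <- NA            # artificial gap
miss <- is.na(y_obs)

# Starting values: observed means and variances, zero covariance
mu <- c(mean(x), mean(y_obs, na.rm = TRUE))
S <- diag(c(var(x), var(y_obs, na.rm = TRUE)))

for (iter in seq_len(200)) {
  # E-step: conditional mean and variance of the missing y given x
  beta <- S[1, 2] / S[1, 1]
  cond_var <- S[2, 2] - S[1, 2]^2 / S[1, 1]
  y_fill <- y_obs
  y_fill[miss] <- mu[2] + beta * (x[miss] - mu[1])

  # M-step: update mean and covariance from the completed data,
  # adding the conditional variance for each imputed value
  mu_new <- c(mean(x), mean(y_fill))
  dx <- x - mu_new[1]
  dy <- y_fill - mu_new[2]
  S_new <- matrix(c(mean(dx^2), mean(dx * dy),
                    mean(dx * dy), mean(dy^2) + sum(miss) * cond_var / n),
                  nrow = 2)

  # Stop once the parameter estimates no longer change
  change <- max(abs(c(mu_new - mu, S_new - S)))
  mu <- mu_new
  S <- S_new
  if (change < 1e-8) break
}

# Estimated slope of the target on the reference series
slope <- S[1, 2] / S[1, 1]
```

In this simplified setting the filled values are just the regression prediction from the reference series. The package additionally supports several reference series and a univariate filter (see the method argument).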
Gapfill_em( data, ref1, ref2 = NULL, ref3 = NULL, Flux = "Flux", Flux1 = Flux, Flux2 = Flux, Flux3 = Flux, Date = "Date", Date_form = "ymd_hms", win = 5, interval = 10, ts = TRUE, method = "spline", sp_df = 10, fail = "ave", ... )
data |
a data frame that includes the flux (with NA indicating the missing data) |
ref1 |
a data frame that includes parallel-measured reference flux time series #1; it does not need to have the same length as the target data to be filled |
ref2 |
a data frame that includes parallel-measured reference flux time series #2 (optional); it does not need to have the same length as the target data to be filled. Default: NULL |
ref3 |
a data frame that includes parallel-measured reference flux time series #3 (optional); it does not need to have the same length as the target data to be filled. Default: NULL |
Flux |
a string indicating the column name of the flux variable to be gap-filled. Default: "Flux" |
Flux1 |
a string indicating the column name of the reference time series in ref1. Default: same as Flux |
Flux2 |
a string indicating the column name of the reference time series in ref2. Default: same as Flux |
Flux3 |
a string indicating the column name of the reference time series in ref3. Default: same as Flux |
Date |
a string indicating the column name for the date in data, ref1, ref2 and ref3; the column must include the time information. All data frames must use the same name for the date column. Default: "Date" |
Date_form |
a string indicating the format of the date in data, ref1, ref2 and ref3, either "ymd_hms" (default), "mdy_hms" or "dmy_hms". All data frames must use the same date format. |
win |
a number indicating the required sampling window length around each gap (total over both sides), unit: days (default: 5) |
interval |
a number indicating the temporal resolution of the measurements in the dataset, unit: minutes (default: 10) |
ts |
logical. TRUE if the data are a time series. Default: TRUE |
method |
a string indicating the method for univariate time series filtering, either "spline" (default), "arima", or "gam". See details in the package 'mtsdi'. |
sp_df |
an integer indicating the degrees of freedom to be used for the splines (default: 10). If set to NULL, the degrees of freedom are chosen by cross-validation. See details in the package 'mtsdi'. |
fail |
a string or a number indicating what to do when the model fails to converge: 1. use the mean value in the sampling window to fill the gap ("ave", default), or 2. use the value assigned here to fill the gap (e.g., 9999, NA, etc.) |
... |
other arguments passed to 'mnimput' |
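As a rough illustration of how Date_form, win, interval and fail = "ave" relate to each other, the base-R sketch below parses a "ymd_hms"-style date column, computes how many records a full window spans, and fills a gap with the window mean. The data are simulated, and the rule of taking half the window on each side of the gap is an assumption for illustration; the package's own window selection may differ.

```r
# Hypothetical 10-minute flux record covering 10 days
interval <- 10   # minutes
win <- 5         # days, total over both sides of the gap
times <- seq(as.POSIXct("2020-06-01 00:00:00", tz = "UTC"),
             by = interval * 60, length.out = 10 * 24 * 60 / interval)
df <- data.frame(Date = format(times, "%Y-%m-%d %H:%M:%S"),
                 Flux = sin(seq_along(times) / 20) + 2)
df$Flux[700:720] <- NA   # artificial gap

# Parse the date strings; this layout corresponds to "ymd_hms"
t <- as.POSIXct(df$Date, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Number of records that a full window spans
n_win <- win * 24 * 60 / interval

# Assumed rule: keep records within win / 2 days of either end of the gap
gap <- which(is.na(df$Flux))
half <- win / 2 * 24 * 60 * 60      # half window in seconds
in_win <- t >= t[min(gap)] - half & t <= t[max(gap)] + half

# Fallback fill in the spirit of fail = "ave": mean of the observed
# values inside the window
fill_value <- mean(df$Flux[in_win], na.rm = TRUE)
filled <- df$Flux
filled[gap] <- fill_value
```

With the defaults (win = 5, interval = 10), a full window corresponds to 720 records, which gives a sense of how much data surrounds each gap when the model is fitted.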
A data frame that includes the original data, the gap-filled data ("filled") and a "mark" column indicating whether each value in "filled" is: 0. original, 1. gap-filled, or 2. failed to converge
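The "mark" column makes it straightforward to audit the result. The sketch below uses a small hand-made data frame with the columns described above (the values are invented, not real output) to count rows per category and to mask values where the model failed to converge.

```r
# Hypothetical output with the columns described above
out <- data.frame(
  Flux   = c(1.2, NA, NA, 1.5, NA, 1.1),
  filled = c(1.2, 1.3, 1.4, 1.5, 1.3, 1.1),
  mark   = c(0, 1, 1, 0, 2, 0)
)

# Count rows per category
counts <- table(factor(out$mark, levels = 0:2,
                       labels = c("original", "gap-filled", "failed")))
print(counts)

# Keep only values that are original or successfully gap-filled
out$filled_ok <- ifelse(out$mark == 2, NA, out$filled)
```

Masking the rows marked 2 is one option when the fallback value (for example, the window mean) should not enter later analyses.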
# read example data
df <- read.csv(file = system.file("extdata", "Soil_resp_example.csv", package = "FluxGapsR"),
               header = TRUE)
df_ref <- read.csv(file = system.file("extdata", "Soil_resp_ref_example.csv", package = "FluxGapsR"),
                   header = TRUE)

# gap-fill using one reference time series
df_filled <- Gapfill_em(data = df, ref1 = df_ref)

# visualize the gap-filled results
plot(df_filled$filled, col = "red")
points(df_filled$Flux)