library(fluxtools)
fluxtools is an R package that provides an interactive Shiny-based QA/QC environment for data in the AmeriFlux BASE format. In just a few clicks, you can select and flag suspect data points, replace them with NA, and export a cleaned CSV together with an R script for reproducibility.
This vignette shows you how to install, launch, and use the main Shiny app, run_flux_qaqc(), and walks through a typical workflow.
You can install fluxtools from CRAN, or directly from GitHub:
# Install from CRAN
install.packages("fluxtools")

# Install from GitHub
library(devtools)
devtools::install_github("kesondrakey/fluxtools")
Load fluxtools and launch the QA/QC application:
library(fluxtools)

# Run the app
run_flux_qaqc()
Example workflow
Upload: Select your AmeriFlux-style CSV (e.g., US_VT1_HH_202401010000_202501010000.csv). Files can be up to 500 MB, although larger files may slow down the Shiny interface.
Choose Year(s): By default “all” is selected, but you can subset to specific years.
Choose variables: TIMESTAMP_START is on the x-axis by default. Change the y-axis to your variable of interest (e.g., FC_1_1_1). The generated R code applies removals to the y-axis variable only.
Select data: Use the box or lasso to select points. This populates the “Current” code box with something like:
```r
df <- df %>%
  mutate(
    FC_1_1_1 = case_when(
      TIMESTAMP_START == '202401261830' ~ NA_real_,
      TIMESTAMP_START == '202401270530' ~ NA_real_,
      …
      TRUE ~ FC_1_1_1
    )
  )
```
Flag data and Accumulate code: With points still selected, click “Flag data.” Selected points turn orange, and code is appended to the “Accumulated” box, allowing multiple selections per session.
Unflag data: Use the box or lasso to de-select points and remove them from the Accumulated code box.
Clear Selection: Click “Clear Selection” to reset all selections for the current y-variable.
Switch variables: Change y to any other variable (e.g., SWC_1_1_1) and select more points. Click “Flag data” and code for both variables appears:
```r
df <- df %>%
  mutate(
    FC_1_1_1 = case_when(
      TIMESTAMP_START == '202401261830' ~ NA_real_,
      TIMESTAMP_START == '202401270530' ~ NA_real_,
      …
      TRUE ~ FC_1_1_1
    )
  )

df <- df %>%
  mutate(
    SWC_1_1_1 = case_when(
      TIMESTAMP_START == '202403261130' ~ NA_real_,
      TIMESTAMP_START == '202403270800' ~ NA_real_,
      …
      TRUE ~ SWC_1_1_1
    )
  )
```
Compare variables: Choose the variables you would like to compare (e.g., change y to TA_1_1_1 and x to T_SONIC_1_1_1). The app computes an R² via simple linear regression. The top R² is based on all points before removals; once points are selected, a second R² appears, computed from the regression with the selected points removed.
Highlight outliers: Use the slider to choose a ±σ residual threshold. Click “Select all ±σ outliers” to append them to the Accumulated code. Click “Clear ±σ outliers” to deselect them and remove them from the code box.
Copy all: Click the copy icon to the right of the Current or Accumulated code box and paste the code into your own R script for documentation.
Apply Removals: Click “Apply Removals” to replace each selected point of the current y-variable with NA and remove it from view. The change is written to a new .csv (the raw data is unaffected), available via “Export cleaned data”.
Reload original data: Made a mistake or want a fresh start? Click “Reload original data” to reload the original .csv and start over.
Export cleaned data: Download a zip file containing the cleaned .csv, which reflects the removals confirmed with “Apply Removals”, and a compiled R script with the code for those removals.
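The comparison and outlier steps above can also be reproduced outside the app. The following is a minimal base-R sketch, not the app's actual implementation: it assumes the reported R² corresponds to summary(lm())$r.squared and that "±σ" means residuals beyond a chosen multiple of the residual standard deviation. The synthetic data, the injected spikes, and the sigma_cut value are all illustrative.

```r
# Synthetic half-hourly data (illustrative values, not real tower data)
set.seed(42)
n <- 200
df <- data.frame(
  TIMESTAMP_START = format(
    seq(as.POSIXct("2024-01-01 00:00", tz = "UTC"), by = "30 min", length.out = n),
    "%Y%m%d%H%M"
  ),
  T_SONIC_1_1_1 = rnorm(n, mean = 10, sd = 5),
  stringsAsFactors = FALSE
)
df$TA_1_1_1 <- df$T_SONIC_1_1_1 + rnorm(n, sd = 0.5)

# Inject a few obvious spikes to play the role of bad data
bad_rows <- c(20, 75, 140)
df$TA_1_1_1[bad_rows] <- df$TA_1_1_1[bad_rows] + 25

# R² before any removals
fit_before <- lm(TA_1_1_1 ~ T_SONIC_1_1_1, data = df)
r2_before <- summary(fit_before)$r.squared

# Flag points whose residual exceeds sigma_cut residual standard deviations
sigma_cut <- 3  # hypothetical slider value
res <- residuals(fit_before)
flagged <- abs(res) > sigma_cut * sd(res)

# Replace flagged y-values with NA in a copy; the original df is untouched
df_clean <- df
df_clean$TA_1_1_1[flagged] <- NA_real_

# R² after removals (lm() drops rows with NA by default)
fit_after <- lm(TA_1_1_1 ~ T_SONIC_1_1_1, data = df_clean)
r2_after <- summary(fit_after)$r.squared

cat("Flagged timestamps:", df$TIMESTAMP_START[flagged], "\n")
cat(sprintf("R2 before: %.3f, after: %.3f\n", r2_before, r2_after))
```

The NA assignment has the same effect as the case_when() code shown earlier: the flagged y-values become NA while every row is kept, so the timestamps stay aligned.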
The Physical Range Module (PRM) sets out-of-range values to NA. Ranges are applied to families of similar variables, matched by column-name patterns like ^SWC($|_) or ^P($|_).
Columns containing "QC" are skipped by default. No columns are removed.
Source of ranges: AmeriFlux Technical Documents, Table A1 (Physical Range Module).
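To make the matching rule concrete, here is a minimal base-R sketch of how a pattern such as ^SWC($|_) selects a variable family, how "QC" columns can be skipped, and how out-of-range values become NA. It is an illustration only, not the package's implementation: the helper clip_to_na() is hypothetical, and the 0 to 100 bound used for SWC is an assumption made for the example, so consult get_prm_rules() for the actual limits.

```r
cols <- c("SWC", "SWC_1_1_1", "SWC_QC", "SWCX", "P", "P_RAIN", "PA", "PPFD_IN")

# The pattern matches the bare name, or the name followed by an underscore
is_swc <- grepl("^SWC($|_)", cols)
cols[is_swc]  # "SWC" "SWC_1_1_1" "SWC_QC"

# The same idea keeps P and P_RAIN but excludes PA and PPFD_IN
is_p <- grepl("^P($|_)", cols)
cols[is_p]    # "P" "P_RAIN"

# Skip any column whose name contains "QC"
swc_targets <- cols[is_swc & !grepl("QC", cols)]

# Hypothetical helper: set values outside [lo, hi] to NA, leave existing NAs alone
clip_to_na <- function(x, lo, hi) {
  x[!is.na(x) & (x < lo | x > hi)] <- NA_real_
  x
}

toy <- data.frame(
  SWC_1_1_1 = c(10, 150, NA, 99, -3),
  SWC_QC    = c(0, 1, 2, 0, 1)
)

# Assumed bound for illustration only: 0 to 100
for (nm in intersect(swc_targets, names(toy))) {
  toy[[nm]] <- clip_to_na(toy[[nm]], lo = 0, hi = 100)
}
print(toy)
```

Anchoring the pattern with ^ and ($|_) is what prevents a short name like P from also capturing unrelated variables such as PA or PPFD_IN.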
# Tiny demo dataset with a few out-of-range values
set.seed(1)
df <- tibble::tibble(
  TIMESTAMP_START = seq.POSIXt(as.POSIXct("2024-01-01", tz = "UTC"),
                               length.out = 10, by = "30 min"),
  SWC_1_1_1 = c(10, 20, 150, NA, 0.5, 99, 101, 50, 80, -3),  # bad: 150, 101, -3; 0.5 triggers SWC unit note
  P         = c(0, 10, 60, NA, 51, 3, 0, 5, 100, -1),        # bad: 60, 51, 100, -1
  RH_1_1_1  = c(10, 110, 50, NA, 0, 100, -5, 101, 75, 30),   # bad: 110, -5, 101
  SWC_QC    = sample(0:2, 10, replace = TRUE)                # QC col should be ignored
)

# To see the Physical Range Module (PRM) rules:
get_prm_rules()

# Apply filter to all relevant variables
res <- apply_prm(df)

# PRM summary (counts and % replaced per column)
res$summary

# Only set range for SWC
df_filtered_swc <- apply_prm(df, include = "SWC")

# Only set range for SWC + P
df_filtered_swc_P <- apply_prm(df, include = c("SWC", "P"))
library(dplyr)
library(knitr)

rules_tbl <- get_prm_rules() |>
  transmute(
    Variable = variable,
    Description = description,
    Units = units,
    `Min to Max` = sprintf("%s to %s",
                           ifelse(is.na(min), "NA", min),
                           ifelse(is.na(max), "NA", max))
  ) |>
  arrange(Variable)

note_txt <- if (knitr::is_latex_output()) {
  "\\emph{Source: AmeriFlux Technical Documents}"
} else {
  "<em>Source: AmeriFlux Technical Documents</em>"
}

use_kableExtra <- requireNamespace("kableExtra", quietly = TRUE)
tbl <- fluxtools::get_prm_rules()

if (use_kableExtra) {
  kableExtra::kbl(tbl, booktabs = TRUE) |>
    kableExtra::kable_styling(full_width = FALSE)
} else {
  knitr::kable(tbl)
}

# kbl(rules_tbl,
#     caption = "Physical Range Module (PRM) bounds",
#     booktabs = TRUE, escape = FALSE) |>
#   kable_styling(full_width = FALSE, latex_options = c("threeparttable")) |>
#   footnote(general = note_txt, general_title = "", escape = FALSE)
Fluxtools is an independent project and is not affiliated with or endorsed by the AmeriFlux Network. “AmeriFlux” is a registered trademark of Lawrence Berkeley National Laboratory and is used here for identification purposes only.