View source: R/remove_partial_data.R
remove_partial_data | R Documentation |
This function removes groups from a dataframe that do not have sufficient
data points. Groups of one data point will automatically be removed. Single
data points are common after using aggregate_Datetime().
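The idea of "sufficient data points" can be illustrated with a small base R sketch. It is not the function's implementation. The toy data frame, the column names `Id` and `value`, and the rule "proportion of missing values at or below the threshold" are assumptions made for illustration only.

```r
# Hypothetical toy data: three groups with differing amounts of missing values
toy <- data.frame(
  Id = rep(c("A", "B", "C"), times = c(10, 10, 1)),
  value = c(
    1:10,                      # group A: complete
    c(1, NA, NA, NA, 5:10),    # group B: 3 of 10 values missing (30%)
    42                         # group C: a single data point
  )
)

# Assumed maximum tolerated proportion of missing values per group
threshold <- 0.2

# Per-group summary: number of rows and proportion of missing values
n_rows <- tapply(toy$value, toy$Id, length)
prop_missing <- tapply(toy$value, toy$Id, function(x) mean(is.na(x)))

# Keep groups that stay within the threshold and have more than one data point
keep <- names(n_rows)[prop_missing <= threshold & n_rows > 1]
reduced <- toy[toy$Id %in% keep, ]

unique(reduced$Id)
```

In this sketch only group A remains: group B exceeds the assumed 20% limit, and group C consists of a single data point.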
remove_partial_data(
dataset,
Variable.colname = Datetime,
threshold.missing = 0.2,
by.date = FALSE,
Datetime.colname = Datetime,
show.result = FALSE,
handle.gaps = FALSE
)
dataset |
A light logger dataset. Expects a dataframe. If not imported by LightLogR, take care to choose sensible variables for the Datetime.colname and Variable.colname. |
Variable.colname |
Column name that contains the variable for which to
assess sufficient datapoints. Expects a symbol. Needs to be part of the
dataset. Default is Datetime. |
threshold.missing |
either
|
by.date |
Logical. Should the data be (additionally) grouped by day?
Defaults to FALSE. |
Datetime.colname |
Column name that contains the datetime. Defaults to "Datetime" which is automatically correct for data imported with LightLogR. Expects a symbol. Needs to be part of the dataset. Must be of type POSIXct. |
show.result |
Logical, whether the output of the function is a summary of the data (TRUE) or the reduced dataset (FALSE, the default). |
handle.gaps |
Logical, whether the data shall be treated with
|
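For data not imported with LightLogR, the datetime column may need converting before it meets the POSIXct requirement. A minimal sketch in base R, assuming a hypothetical data frame `df` whose timestamps were read in as character strings in UTC:

```r
# Hypothetical data frame with timestamps stored as text
df <- data.frame(
  Datetime = c("2023-08-29 08:00:00", "2023-08-29 08:00:10"),
  MEDI = c(120, 135)
)

# Check the class before conversion
is_posix_before <- inherits(df$Datetime, "POSIXct")

# Convert to POSIXct; the time zone is an assumption and should match the recording
df$Datetime <- as.POSIXct(df$Datetime, tz = "UTC")

# Check the class after conversion
is_posix_after <- inherits(df$Datetime, "POSIXct")
```

After the conversion, the column can be passed as Datetime.colname.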
If show.result = FALSE (default), a reduced dataframe without the
groups that did not have sufficient data. If show.result = TRUE, a
summary of the data.
#create sample data with gaps
gapped_data <-
sample.data.environment |>
dplyr::filter(MEDI < 30000)
#check their status, based on the MEDI variable
gapped_data |> remove_partial_data(MEDI, handle.gaps = TRUE, show.result = TRUE)
#the function will produce a warning if implicit gaps are present
gapped_data |> remove_partial_data(MEDI, show.result = TRUE)
#one group (Environment) does not make the cut of 20% missing data
gapped_data |> remove_partial_data(MEDI, handle.gaps = TRUE) |> dplyr::count(Id)
#for comparison
gapped_data |> dplyr::count(Id)
#If the threshold is set differently, e.g., to 2 days allowed missing, results vary
gapped_data |>
remove_partial_data(MEDI, handle.gaps = TRUE, threshold.missing = "2 days") |>
dplyr::count(Id)
#The removal can be automatically switched to daily detections within groups
gapped_data |>
remove_partial_data(MEDI, handle.gaps = TRUE, by.date = TRUE, show.result = TRUE) |>
head()