clean_playground_data: Clean tracking data (complete pipeline)


clean_playground_data    R Documentation

Clean tracking data (complete pipeline)

Description

Master function that runs the complete data cleaning pipeline:

  1. Map raw IDs to participant IDs

  2. Mark analysis and bell time periods

  3. Standardize to fixed time intervals

  4. Interpolate gaps (two-phase)

  5. Optionally export to CSV
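Steps 1 and 2 can be pictured as a lookup followed by two time-window flags. The base-R sketch below is an illustration only, not the package's implementation: the toy data, the layout of the mapping table (columns ID and id_code), the order of exclusion and mapping, the logical flag type, and the inclusive window endpoints are all assumptions.

```r
# Toy raw data using the default column names ("ID", "At", "X", "Y")
t0 <- as.POSIXct("2025-03-18 11:46:58", tz = "UTC")
raw <- data.frame(
  ID = c(101, 102, 103, 101, 102),
  At = t0 + c(0, 1, 2, 3, 4),
  X  = c(1.0, 2.0, 3.0, 1.5, 2.5),
  Y  = c(0.5, 0.6, 0.7, 0.8, 0.9)
)

# Hypothetical mapping table (column layout assumed, not taken from the package)
mapping <- data.frame(
  ID      = c(101, 102, 103),
  id_code = c("P01", "P02", "P03"),
  stringsAsFactors = FALSE
)
exclude_ids <- c(103)

# Step 1: drop excluded raw IDs, then look up the participant ID for each row
kept <- raw[!(raw$ID %in% exclude_ids), , drop = FALSE]
kept$id_code <- mapping$id_code[match(kept$ID, mapping$ID)]

# Step 2: flag rows inside the analysis and bell windows
# (assumption: both endpoints are inclusive, flags are logical)
analyze_start <- as.POSIXct("2025-03-18 11:47:00", tz = "UTC")
analyze_end   <- as.POSIXct("2025-03-18 11:57:00", tz = "UTC")
bell_start    <- as.POSIXct("2025-03-18 11:53:00", tz = "UTC")
bell_end      <- as.POSIXct("2025-03-18 11:58:00", tz = "UTC")

kept$Analyze <- kept$At >= analyze_start & kept$At <= analyze_end
kept$Bell    <- kept$At >= bell_start & kept$At <= bell_end
```

Using match() rather than merge() keeps the original row order, which matters when later steps assume time-sorted data.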

Usage

clean_playground_data(
  data,
  id_mapping,
  exclude_ids = NULL,
  analyze_start,
  analyze_end,
  bell_start = NULL,
  bell_end = NULL,
  unit = "second",
  time_step = 1,
  max_gap_small = 10,
  max_gap_large = NULL,
  max_position_change = 0.3,
  output_file = NULL,
  verbose = TRUE,
  time_col = "At",
  x_col = "X",
  y_col = "Y",
  raw_id_col = "ID",
  id_col = "id_code",
  analyze_col = "Analyze",
  bell_col = "Bell"
)

Arguments

data

Raw tracking data frame

id_mapping

Path to ID mapping CSV file or mapping data frame

exclude_ids

Vector of raw IDs to exclude from analysis

analyze_start

Start time for analysis period (character or POSIXct)

analyze_end

End time for analysis period (character or POSIXct)

bell_start

Start time for bell period (optional)

bell_end

End time for bell period (optional)

unit

Time interval for standardization, passed to standardize_to_seconds() (default: "second"). Use "2 seconds", "5 seconds", etc. for coarser intervals.

time_step

Expected time step in seconds between consecutive observations after standardization (default: 1). Must match the numeric value of unit, e.g. set time_step = 2 when unit = "2 seconds".

max_gap_small

Maximum gap for phase 1 interpolation in seconds (default: 10)

max_gap_large

Maximum gap for phase 2 interpolation in seconds (default: NULL)

max_position_change

Maximum position change for phase 2 in meters (default: 0.3)

output_file

Path to save cleaned data as CSV (optional)

verbose

Print progress messages (default: TRUE)

time_col

Name of the timestamp column (default: "At")

x_col

Name of the x-coordinate column (default: "X")

y_col

Name of the y-coordinate column (default: "Y")

raw_id_col

Name of the raw device ID column in the input data (default: "ID")

id_col

Name of the output column for standardized participant IDs (default: "id_code")

analyze_col

Name of the analysis period flag column (default: "Analyze")

bell_col

Name of the bell period flag column (default: "Bell")
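The arguments max_gap_small, max_gap_large and max_position_change suggest one possible reading of the two-phase interpolation: phase 1 fills every short gap, and phase 2 fills longer gaps only when the position barely changed across them. The sketch below implements that reading in base R. It is an assumption about the behaviour, not the package's code; fill_gaps() is a hypothetical helper, and "gap" is taken here to mean the elapsed time between the two observations surrounding a run of missing values.

```r
# Hypothetical helper: linearly fill NA runs in x/y sampled on a regular grid.
fill_gaps <- function(x, y, time_step = 1, max_gap_small = 10,
                      max_gap_large = NULL, max_position_change = 0.3) {
  n <- length(x)
  ok <- !is.na(x) & !is.na(y)
  runs <- rle(!ok)                         # TRUE runs are missing stretches
  ends <- cumsum(runs$lengths)
  starts <- ends - runs$lengths + 1
  for (k in which(runs$values)) {
    a <- starts[k] - 1                     # last observed index before the gap
    b <- ends[k] + 1                       # first observed index after the gap
    if (a < 1 || b > n) next               # leading/trailing gaps stay NA
    gap <- (b - a) * time_step             # elapsed seconds across the gap
    moved <- sqrt((x[b] - x[a])^2 + (y[b] - y[a])^2)
    phase1 <- gap <= max_gap_small
    phase2 <- !is.null(max_gap_large) && gap <= max_gap_large &&
      moved <= max_position_change
    if (phase1 || phase2) {
      inside <- (a + 1):(b - 1)
      w <- (inside - a) / (b - a)          # linear interpolation weights
      x[inside] <- x[a] + w * (x[b] - x[a])
      y[inside] <- y[a] + w * (y[b] - y[a])
    }
  }
  list(x = x, y = y)
}

# Toy series: a 3 s gap, a 15 s gap with little movement, a 15 s gap with a jump
x <- c(0, NA, NA, 3, rep(NA, 14), 3.1, rep(NA, 14), 10)
y <- ifelse(is.na(x), NA, 0)

filled_default <- fill_gaps(x, y)                      # phase 1 only
filled_both    <- fill_gaps(x, y, max_gap_large = 20)  # phase 1 and phase 2
```

Under these assumptions, the default max_gap_large = NULL fills only the 3-second gap, while max_gap_large = 20 also fills the first 15-second gap (0.1 m of movement) and leaves the second one (6.9 m of movement) missing.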

Value

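Cleaned data frame: raw IDs are mapped to participant IDs (id_col), the analysis and bell periods are flagged (analyze_col, bell_col), timestamps are standardized to the interval given by unit, and gaps are interpolated.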

Examples

# Complete pipeline using bundled example data
library(readr)
raw_data <- read_csv(system.file("extdata", "raw_tracking_data.csv",
                                 package = "trackclean"))

cleaned_data <- clean_playground_data(
  data = raw_data,
  id_mapping = system.file("extdata", "id_mapping.csv", package = "trackclean"),
  analyze_start = "2025-03-18 11:47:00",
  analyze_end   = "2025-03-18 11:57:00",
  bell_start    = "2025-03-18 11:53:00",
  bell_end      = "2025-03-18 11:58:00"
)
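When the pipeline is run on a coarser grid, unit and time_step have to change together (for example unit = "2 seconds" with time_step = 2). The sketch below shows one way to check that agreement before calling clean_playground_data(), and what binning onto a 2-second grid looks like. unit_to_seconds() is a hypothetical helper, not part of trackclean, and flooring is an assumption about how timestamps are aligned.

```r
# Hypothetical helper: convert "second", "2 seconds", ... to a number of seconds.
# Only second-based unit strings are handled in this sketch.
unit_to_seconds <- function(unit) {
  m <- regmatches(unit, regexec("^([0-9]*)[[:space:]]*seconds?$", unit))[[1]]
  if (length(m) == 0) stop("unsupported unit: ", unit)
  if (nzchar(m[2])) as.numeric(m[2]) else 1
}

unit <- "2 seconds"
time_step <- 2
stopifnot(unit_to_seconds(unit) == time_step)  # the two arguments must agree

# Align irregular timestamps to the 2-second grid (assumption: floor, not round)
t0 <- as.POSIXct("2025-03-18 11:47:00", tz = "UTC")
stamps <- t0 + c(0.4, 1.2, 2.9, 5.1)
step <- unit_to_seconds(unit)
binned <- as.POSIXct(floor(as.numeric(stamps) / step) * step,
                     origin = "1970-01-01", tz = "UTC")
```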

trackclean documentation built on July 1, 2026, 5:07 p.m.