rune: Parse composite data frame into component data frames by...

View source: R/dataParse.R

rune    R Documentation

Parse composite data frame into component data frames by variable prefix

Description

This function takes a data frame containing multiple measures and separates it into individual data frames for each measure detected in the data. It identifies the appropriate identifier column (e.g., participantId, workerId) and splits the data based on column name prefixes.

Usage

rune(df, lower = TRUE)

Arguments

df

A data frame containing multiple prefixed measures.

lower

Logical. If TRUE (the default), prefixes are converted to lower case.

Details

The function performs the following steps:

  • Identifies which identifier column to use (participantId, workerId, PROLIFIC_PID, or src_subject_id)

  • Determines survey prefixes by analyzing column names

  • Creates separate dataframes for each survey prefix found

  • Assigns each dataframe to the global environment with names matching the survey prefixes
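
The steps above can be sketched in base R as follows. This is an illustration only, not the wizaRdry source: the function name split_by_prefix_sketch is hypothetical, and the prefix rule (text before the first underscore, kept only when at least two columns share it) is an assumption made so that columns such as interview_date are not treated as a measure. Unlike rune(), the sketch returns a named list instead of writing to the global environment.

```r
# Hypothetical sketch of the documented steps; NOT the package implementation.
split_by_prefix_sketch <- function(df, lower = TRUE) {
  # Step 1: use the first known identifier column that is present in df.
  id_candidates <- c("participantId", "workerId", "PROLIFIC_PID", "src_subject_id")
  id_col <- intersect(id_candidates, names(df))[1]
  if (is.na(id_col)) stop("No known identifier column found.")

  # Step 2 (assumed rule): a prefix is the text before the first underscore,
  # and it counts as a measure only if at least two columns share it.
  other_cols <- setdiff(names(df), id_col)
  prefixed_cols <- other_cols[grepl("_", other_cols, fixed = TRUE)]
  prefixes <- sub("_.*$", "", prefixed_cols)
  counts <- table(prefixes)
  measure_prefixes <- names(counts)[as.vector(counts) >= 2]

  # Step 3: build one data frame per prefix, keeping the identifier column.
  out <- lapply(measure_prefixes, function(p) {
    df[, c(id_col, prefixed_cols[prefixes == p]), drop = FALSE]
  })

  # Name each piece after its prefix, lower-cased when lower = TRUE.
  names(out) <- if (lower) tolower(measure_prefixes) else measure_prefixes
  out
}

# Small demonstration with made-up data.
demo_df <- data.frame(
  src_subject_id = c("SUB001", "SUB002"),
  site = c("Yale", "NU"),
  interview_date = c("2023-01-15", "2023-03-10"),
  FOO_1 = c(1, 3),
  FOO_2 = c("a", "b"),
  bar_1 = c(2, 4),
  bar_2 = c("w", "x")
)
parts <- split_by_prefix_sketch(demo_df)
str(parts$foo)
```

To mirror the last step, the list could then be copied into the global environment with list2env(parts, envir = globalenv()). Returning a list first keeps the side effect explicit and avoids silently overwriting existing objects that share a prefix name.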

Value

Called for its side effects. Creates one data frame in the global environment for each survey detected in the data, each named after its survey prefix.

Examples

# Parse a data frame containing multiple surveys
combined_df <- data.frame(
  record_id = c("REC001", "REC002", "REC003", "REC004"),
  src_subject_id = c("SUB001", "SUB002", "SUB003", "SUB004"),
  subjectkey = c("KEY001", "KEY002", "KEY003", "KEY004"),
  site = c("Yale", "NU", "Yale", "NU"),
  phenotype = c("A", "B", "A", "C"),
  visit = c(1, 2, 2, 1),
  state = c("complete", "completed baseline", "in progress", NA),
  status = c(NA, NA, NA, "complete"),
  lost_to_followup = c(FALSE, FALSE, TRUE, NA),
  interview_date = c("2023-01-15", "2023/02/20", NA, "2023-03-10"),
  foo_1 = c(1, 3, 5, 7),
  foo_2 = c("a", "b", "c", "d"),
  bar_1 = c(2, 4, 6, 8),
  bar_2 = c("w", "x", "y", "z")
)
rune(combined_df)

# After running, access individual survey dataframes directly:
head(foo)  # Access the foo dataframe
head(bar)  # Access the bar dataframe


wizaRdry documentation built on June 8, 2025, 11:30 a.m.