camr_SWA_linking_code: Link Records for School-Wide Assessment Data

View source: R/R12-SWA_linking_code.R

camr_SWA_linking_code    R Documentation

Link Records for School-Wide Assessment Data

Description

Function to link records (e.g., across different time points) using a set of linking items.

Usage

camr_SWA_linking_code(
  dtf_long,
  lst_link_across = NULL,
  obj_link_using = NULL,
  lst_link_combo = NULL,
  lst_link_threshold = NULL,
  lgc_progress = TRUE
)

Arguments

dtf_long

A data frame. It must have a column with integer values for time points ('SSS.INT.Time_point') and the relevant columns for the linking items.

lst_link_across

A list of lists, with each sublist specifying 'Base' and 'Add' logical vectors for the pair of data subsets in dtf_long to link over (e.g., 'Base' would subset the first time point and 'Add' would subset the second time point). If NULL, the function infers all possible pairings over time points from the 'SSS.INT.Time_point' variable. If the 'Base' and 'Add' logical vectors are for the same subset, the function checks for duplicate records instead.
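
As a minimal sketch of an explicit pairing, the base R code below builds 'Base' and 'Add' logical vectors for linking the first time point to the second. The toy data frame dtf_toy and the set name TP1_to_TP2 are hypothetical; only the column name 'SSS.INT.Time_point' and the element names 'Base' and 'Add' come from the description above.

```r
# Hypothetical toy data; only the time point column is needed here
dtf_toy <- data.frame(
  SSS.INT.Time_point = c( 1L, 1L, 2L, 2L, 2L )
)

# One sublist per pair of subsets to link over;
# the set name 'TP1_to_TP2' is an illustrative label (assumption)
lst_link_across <- list(
  TP1_to_TP2 = list(
    Base = dtf_toy$SSS.INT.Time_point == 1L,
    Add = dtf_toy$SSS.INT.Time_point == 2L
  )
)

# Each logical vector has one element per row of the data frame
int_lengths <- sapply( lst_link_across$TP1_to_TP2, length )
```

Setting 'Base' and 'Add' to the same logical vector would instead request the duplicate-record check described above.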

obj_link_using

Either a character vector with the column names for the linking items, or a list of character vectors, one vector for each set defined in lst_link_across. Passing a list with separate vectors allows using different linking items for different sets when necessary. If NULL, assumes the standard set of linking items: SSS.INT.School.Code, IDX.INT.Origin.LASID, SBJ.FCT.Sex, SBJ.FCT.Link.BirthMonth, SBJ.FCT.Link.OlderSiblings, SBJ.FCT.Link.OlderSiblings, SBJ.FCT.Link.EyeColor, SBJ.FCT.Link.EyeColor, SBJ.FCT.Link.MiddleInitial, SBJ.CHR.Link.Streetname, and SBJ.INT.Link.KindergartenYearEst.
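
The two accepted forms might look as follows. This is a sketch only: the choice of items and the assumption of two sets in lst_link_across are illustrative, and the description above does not say whether the list elements need names, so they are left unnamed here.

```r
# Same linking items for every set: a single character vector
obj_link_using <- c(
  'SSS.INT.School.Code',
  'SBJ.FCT.Sex',
  'SBJ.FCT.Link.BirthMonth'
)

# Different linking items per set: a list of character vectors,
# one per set in lst_link_across (two hypothetical sets assumed)
obj_link_using_by_set <- list(
  c( 'SSS.INT.School.Code', 'SBJ.FCT.Sex', 'SBJ.FCT.Link.BirthMonth' ),
  c( 'SSS.INT.School.Code', 'SBJ.FCT.Link.MiddleInitial' )
)
```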

lst_link_combo

A list of lists, where each sublist consists of an integer vector indexing the combination of linking items to consider in order of priority. One sublist of integer vectors must be defined for each set defined by lst_link_across. For a given sublist, indices apply to the character vector defined for the relevant set in obj_link_using, meaning that if character vectors differ across sets, indices should be defined accordingly.
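
To make the indexing concrete, the sketch below defines combinations for a single set and translates the indices back into column names. The three-item vector chr_items and the particular combinations are hypothetical; the description above only fixes the overall shape (one sublist per set, integer vectors in order of priority).

```r
# Hypothetical linking items for one set
chr_items <- c(
  'SSS.INT.School.Code',
  'SBJ.FCT.Sex',
  'SBJ.FCT.Link.BirthMonth'
)

# One sublist per set; integer vectors listed in order of priority
lst_link_combo <- list(
  list(
    1:3,          # First priority: all three items
    c( 1L, 3L )   # Second priority: school code and birth month
  )
)

# Translate each combination's indices into column names
lst_combo_names <- lapply(
  lst_link_combo[[1]],
  function( int_index ) chr_items[ int_index ]
)
```

If the character vectors in obj_link_using differ across sets, the same index can point to a different column in each set, so the combinations should be written per set.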

lst_link_threshold

A list of lists, where each sublist consists of one integer vector for each combination of linking items defined in lst_link_combo. Each integer vector has 4 values: the minimum dissimilarity score and the unique count of that score, for the base set and the linking set, respectively.
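
The sketch below shows only the expected shape of this argument and how it lines up with lst_link_combo. The threshold values are placeholders rather than recommended settings, and the sketch does not assert how the four values are ordered within each vector.

```r
# Hypothetical combinations for a single set (two combinations)
lst_link_combo <- list(
  list( 1:3, c( 1L, 3L ) )
)

# One length-4 integer vector per combination;
# values are placeholders, not recommended thresholds
lst_link_threshold <- list(
  list(
    c( 0L, 1L, 0L, 1L ),
    c( 0L, 1L, 0L, 1L )
  )
)

# Shape checks: same number of vectors per set as combinations,
# and every threshold vector has exactly 4 values
lgc_same_n <- all( lengths( lst_link_threshold ) == lengths( lst_link_combo ) )
lgc_len_4 <- all( unlist( lapply( lst_link_threshold, lengths ) ) == 4L )
```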

lgc_progress

A logical value; if TRUE displays progress of the function call.

Value

A data frame.

Author(s)

Michael Pascale and Kevin Potter

Examples

# Linking across time points
dtf_demo <- camr_SWA_linking_code_simulate('demo')
dtf_demo_linked <- camr_SWA_linking_code(dtf_demo)

# Identifying duplicate records
dtf_dup <- camr_SWA_linking_code_simulate( 'duplicate' )
dtf_dup_linked <- camr_SWA_linking_code(
  dtf_dup,
  lst_link_across = list(
    DR2023F = list(
      Base = rep( TRUE, nrow(dtf_dup) ),
      Add = rep( TRUE, nrow(dtf_dup) )
    )
  )
)


rettopnivek/camrprojects documentation built on Dec. 20, 2024, 10:17 p.m.