readHSTS: Connect to HSTS Data

readHSTS R Documentation

Connect to HSTS Data

Description

Opens a connection to the High School Transcript Study (HSTS) data files for the year 2019. Returns an edsurvey.data.frame with information about the file and data.

Usage

readHSTS(
  dataFilePath = getwd(),
  spssPrgPath = dataFilePath,
  year = c("2019"),
  verbose = TRUE
)

Arguments

dataFilePath

a character value giving the root directory path of the extracted set of ASCII data files (.txt or .dat file extension). readHSTS searches the sub-directories of this path for the data files expected for the specified year.

spssPrgPath

a character value giving the directory path where the extracted set of .sps program files is located. Each data file and its associated SPSS program file *must have the same filename* (differing only in file extension); this is how readHSTS determines which files belong together. readHSTS searches the sub-directories of this path for the SPSS program files expected for the specified year.

year

a character value indicating the year of the dataset. Only one year is supported per readHSTS call. The year is required to help determine study-specific information. Only the 2019 study is currently supported.

verbose

a logical value; if TRUE, readHSTS prints progress messages while it runs. The default value is TRUE.
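The filename-matching rule for dataFilePath and spssPrgPath can be checked before calling readHSTS. The following is a minimal base R sketch of such a check, not EdSurvey's internal logic: the sub-directory names "data" and "spss", the empty demo files, and the case-insensitive comparison are all assumptions made for illustration.

```r
# Build a throwaway directory tree that mimics an extracted download.
# The sub-directory names "data" and "spss" are assumptions for illustration.
root <- file.path(tempdir(), "hsts_pairing_demo")
unlink(root, recursive = TRUE)
dir.create(file.path(root, "data"), recursive = TRUE)
dir.create(file.path(root, "spss"), recursive = TRUE)
file.create(file.path(root, "data", c("student.dat", "school.dat")))
file.create(file.path(root, "spss", c("student.sps", "tests.sps")))

# Search all sub-directories for data files and SPSS program files.
dataFiles <- list.files(root, pattern = "\\.(dat|txt)$", recursive = TRUE,
                        ignore.case = TRUE, full.names = TRUE)
spsFiles <- list.files(root, pattern = "\\.sps$", recursive = TRUE,
                       ignore.case = TRUE, full.names = TRUE)

# Compare filenames with the extension removed.
# Ignoring case here is an assumption, not documented readHSTS behavior.
baseName <- function(x) tolower(tools::file_path_sans_ext(basename(x)))

paired <- intersect(baseName(dataFiles), baseName(spsFiles))
unpairedData <- setdiff(baseName(dataFiles), baseName(spsFiles))
unpairedSps <- setdiff(baseName(spsFiles), baseName(dataFiles))
```

In this demo only "student" has both a data file and an SPSS program file; "school" lacks a .sps file and "tests" lacks a data file, so both would need attention before reading the data.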

Details

HSTS data has a complex structure with unique characteristics, all of which EdSurvey handles internally. The structure allows automatic dynamic linking across the data 'levels' based on the requested variables. The student data level is the primary analysis unit. Requesting variables from both the tests level and the transcript level results in an error, because the two cannot be returned together in a single call. Situations may arise where the analyst must derive variables for analysis; see the documentation for the merge and $<- functions for more detail. All merge operations are done at the student level (the main analysis unit).
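The restriction on combining tests and transcript data can be motivated with a toy example. The sketch below uses invented data and base R's merge; it illustrates the general problem of joining two one-to-many tables on the same key and is not a description of how EdSurvey links levels internally. All column names and values are hypothetical.

```r
# Toy data only: two students, each with several test and transcript records.
students <- data.frame(id = c(1, 2))
tests <- data.frame(id = c(1, 1, 2),
                    score = c(21, 24, 19))
transcripts <- data.frame(id = c(1, 1, 1, 2, 2),
                          course = c("ALG1", "BIO", "ENG9", "ALG1", "CHEM"))

# Joining one child table to students keeps one row per child record.
studentTests <- merge(students, tests, by = "id")

# Joining the second child table pairs every test with every transcript
# record of the same student, so rows no longer map to real records.
allLevels <- merge(studentTests, transcripts, by = "id")

nrow(studentTests) # 3 test records
nrow(transcripts)  # 5 transcript records
nrow(allLevels)    # 2 * 3 + 1 * 2 = 8 rows
```

Each test score is repeated once per course taken by the same student, which would distort any summary computed from the combined table.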

File Layout for HSTS 2019:

  • School (school.dat) - School level variables.

    • School Catalog (catalog.dat) - Catalog variables joined to School data. Variables renamed to begin with SchCat_ to distinguish from Transcript Catalog. Cannot be merged with any Student data.

  • Student (student.dat) - Student level variables. Primary analysis unit, all merged/cached data must be at this level.

    • NAEP Math (naepmath.dat) - Subset of students containing NAEP Math variables. Variables begin with math_ to ensure they are unique from the NAEP Science variables.

    • NAEP Science (naepsci.dat) - Subset of students containing NAEP Science variables. Variables begin with sci_ to ensure they are unique from the NAEP Math variables.

    • Tests (tests.dat) - Students may have many test records. Contains ACT/SAT testing score details for students. Cannot be merged together with any Transcript or Transcript Catalog data.

    • Transcripts (trnscrpt.dat) - Students may have many transcript records. Contains transcript level details. Cannot be merged together with Test data.

      • Transcript Catalog (catalog.dat) - Each transcript record is associated with a catalog record that gives context to the transcript record. HSTS 2019 uses SCED codes to categorize courses.
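The file names in the layout above can be used for a quick check that an extracted directory is complete. This is a sketch under stated assumptions: missingHstsFiles is a hypothetical helper, not part of EdSurvey, and the demo directory and its empty files exist only to make the example runnable.

```r
# File names taken from the HSTS 2019 layout listed above.
expectedFiles <- c("school.dat", "catalog.dat", "student.dat", "naepmath.dat",
                   "naepsci.dat", "tests.dat", "trnscrpt.dat")

# Hypothetical helper (not an EdSurvey function): returns the expected
# file names that are not found anywhere under rootDir.
missingHstsFiles <- function(rootDir, expected = expectedFiles) {
  found <- tolower(basename(list.files(rootDir, recursive = TRUE)))
  setdiff(expected, found)
}

# Demo with a throwaway directory that holds only two of the files.
demoDir <- file.path(tempdir(), "hsts_layout_demo")
unlink(demoDir, recursive = TRUE)
dir.create(file.path(demoDir, "sub"), recursive = TRUE)
file.create(file.path(demoDir, "sub", c("student.dat", "school.dat")))

stillMissing <- missingHstsFiles(demoDir)
```

Here stillMissing lists the five files absent from the demo directory; an empty result would mean every file named in the layout was found.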

Value

an edsurvey.data.frame for the HSTS dataset.
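A call might look like the following sketch. The directory "~/HSTS/2019" is a hypothetical location, and the search term "math" is only an example; the block does nothing unless EdSurvey is installed and that directory exists.

```r
# Hypothetical location of the extracted HSTS 2019 files; adjust as needed.
hstsDir <- "~/HSTS/2019"

# Guarded so the sketch is safe to run where the data are unavailable.
if (requireNamespace("EdSurvey", quietly = TRUE) && dir.exists(hstsDir)) {
  hsts <- EdSurvey::readHSTS(
    dataFilePath = hstsDir,
    spssPrgPath = hstsDir, # same directory, matching the default
    year = "2019",
    verbose = FALSE
  )

  # Search the variable names and labels of the returned
  # edsurvey.data.frame for a term.
  EdSurvey::searchSDF("math", hsts)
}
```

Passing the same directory for dataFilePath and spssPrgPath mirrors the function's default, where spssPrgPath falls back to dataFilePath.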

Author(s)

Tom Fink

See Also

showCodebook, searchSDF, edsurvey.data.frame, merge.edsurvey.data.frame, and getData


EdSurvey documentation built on Nov. 2, 2023, 6:25 p.m.