dct.parser: Parse a Stata dictionary file for use in R


Description

R cannot read Stata's dictionary files directly. This function parses a dictionary file into a data.frame that can then be used to process the accompanying data files and make them usable in R.

Usage

  dct.parser(dct,
    includes = c("StartPos", "StorageType", "ColName", "ColWidth", "VarLabel"),
    preview = FALSE)

Arguments

dct

Stata dictionary file, most often with a .dct extension.

includes

A character vector naming the fields present in each dictionary entry. A complete dictionary file includes, usually in this order, the column starting position, the storage type of the variable, the variable name, the width of the column, and the variable label. Delete any that are not relevant to your dictionary file.

preview

If you are not sure which values to select for includes, set preview = TRUE to see the first few lines of the relevant portion of the dictionary file and work out its structure.

Details

Many datasets are distributed as a combination of Stata .dat (data, usually in fixed-width format), .dct (dictionary), and .do (other Stata commands, for example to recode the data) files. The dictionary file tells Stata which column in the data file is the starting position for a given variable, how many columns should be read for that variable, what the storage type of the variable is, and what the variable's name and label should be.
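To make the five fields concrete, here is a minimal base R sketch that pulls them out of a few dictionary-style lines with a regular expression. The lines are invented for illustration, and this is not how dct.parser itself is implemented; real dictionary files may omit fields or lay them out differently.

```r
# Hypothetical dictionary entries, invented for illustration only
dct_lines <- c(
  '_column(1)  long  id     %5f  "Person identifier"',
  '_column(6)  byte  age    %2f  "Age in years"',
  '_column(8)  str2  state  %2s  "State code"'
)

# One capture group per field:
# start position, storage type, variable name, column width, variable label
pattern <- paste0(
  '^[[:space:]]*_column\\(([[:digit:]]+)\\)[[:space:]]+',
  '([^[:space:]]+)[[:space:]]+([^[:space:]]+)[[:space:]]+',
  '%([[:digit:]]+)[a-z][[:space:]]+"(.*)"[[:space:]]*$'
)
fields <- regmatches(dct_lines, regexec(pattern, dct_lines))

# Each element of 'fields' holds the full match followed by the five groups
parsed <- do.call(rbind, lapply(fields, function(m) {
  data.frame(StartPos = as.integer(m[2]), StorageType = m[3],
             ColName = m[4], ColWidth = as.integer(m[5]),
             VarLabel = m[6], stringsAsFactors = FALSE)
}))
print(parsed)
```

The column names in the result follow the default values of the includes argument.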

The expected workflow might include (1) parsing the dictionary file using dct.parser, (2) generating a csvkit schema file using csvkit.schema and then converting the fixed-width data file to a CSV file using csvkit, (3) reading in the CSV file using your preferred method (for example, fread, sqldf, or read.csv), and (4) re-assigning some of the metadata extracted from the dictionary file to your newly imported dataset.
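For small files, steps (2) to (4) can be approximated in base R alone. The sketch below swaps csvkit for read.fwf and uses a hand-built data.frame in place of real dct.parser output; the column names StartPos, ColName, ColWidth, and VarLabel are assumed to match the includes values, and the data records are invented. Storing labels in a "label" attribute is one possible convention, not something the package prescribes.

```r
# Hypothetical parsed dictionary, standing in for dct.parser output
dict <- data.frame(
  StartPos = c(1L, 6L, 8L),
  ColName  = c("id", "age", "state"),
  ColWidth = c(5L, 2L, 2L),
  VarLabel = c("Person identifier", "Age in years", "State code"),
  stringsAsFactors = FALSE
)

# Invented fixed-width records that follow the layout above
dat_file <- tempfile(fileext = ".dat")
writeLines(c("0000134NY", "0000258CA"), dat_file)

# read.fwf takes widths only, so check that the columns are contiguous
# and start at position 1 (gaps would need negative widths to skip them)
stopifnot(all(dict$StartPos == cumsum(c(1, head(dict$ColWidth, -1)))))

# Read the data using the widths and names from the dictionary
dat <- read.fwf(dat_file, widths = dict$ColWidth, col.names = dict$ColName,
                colClasses = c("integer", "integer", "character"))

# Re-attach each variable label as a "label" attribute on its column
for (i in seq_len(nrow(dict))) {
  attr(dat[[dict$ColName[i]]], "label") <- dict$VarLabel[i]
}
str(dat)
```

read.fwf is slow on large files, which is one reason to prefer the csvkit route for the datasets this workflow targets.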

Author(s)

Ananda Mahto

See Also

read.dta

Examples

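## Load the sample data used in this example
data(sampleDctData)
## Write the sample dictionary to a file in a temporary directory
currentdir <- getwd()
setwd(tempdir())
writeLines(sipp84fp_dct, "sipp84fp.dct")
## Preview the relevant portion of the dictionary file
dct.parser("sipp84fp.dct", preview = TRUE)
## Parse the dictionary file and inspect the result
sipp84_R_dict <- dct.parser("sipp84fp.dct")
head(sipp84_R_dict)
## Restore the original working directory
setwd(currentdir)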

mrdwab/StataDCTutils documentation built on May 23, 2019, 7:15 a.m.