clinical.import: Import multiple files exported from SAS


View source: R/clinical.R

Description

Provides a convenience function to build a list of data frames, where each data frame is generated by reading a .txt file assumed to have been generated by SAS proc export.

Usage

clinical.import(d, pattern = "^[a-zA-Z][a-zA-Z1-9]*\\.txt", 
                usubjid = getOption("gtx.usubjid", "USUBJID"), 
                verbose = TRUE, convert.YN = TRUE, convert.Date = TRUE, 
                only)

Arguments

d

Path to directory containing files exported from SAS

pattern

Regular expression for files to be imported

usubjid

Name of variable used for unique subject identifier

verbose

Whether to print progress messages

convert.YN

Whether to convert columns of Y and N to logical type

convert.Date

Whether to convert ddMMMyyyy columns to R Date type

only

Optional character vector naming a subset of files to import, given as file names without the .txt extension

Details

clinical.import provides a convenient method to read in clinical data from one or more .txt files generated by SAS proc export.

The convenience features are that (i) data from multiple files are imported into a single object, a list of data frames, with one data frame for each imported file; (ii) data coded as Y/N and as text dates are converted to appropriate R types; (iii) text data are imported as factors, except for subject identifiers, which are imported as text; (iv) factor levels for text data are converted when necessary from latin-9 encoding (used by SAS proc export) to UTF-8 encoding.

These features are intended to make the imported data structure work well with other gtx functions, including clinical.derive() and demographics().
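The conversions listed in (ii) and (iv) can be approximated in base R as follows. This is a minimal sketch on made-up values, not the code used inside clinical.import; the variable names are hypothetical, and the date parsing assumes an English (or C) LC_TIME locale.

```r
# Illustrative values only; this is not the gtx implementation.
yn  <- c("Y", "N", "Y", NA)
dts <- c("01JAN2015", "15MAR2016", NA)

# Y/N to logical; assumes the column holds only "Y", "N" or NA
yn_logical <- yn == "Y"

# ddMMMyyyy text to Date; %b matches month abbreviations in the
# current locale, so an English (or C) LC_TIME setting is assumed
dts_date <- as.Date(dts, format = "%d%b%Y")

# latin-9 (ISO-8859-15) bytes to UTF-8; byte 0xA4 is the euro sign in latin-9
x_latin9 <- rawToChar(as.raw(c(0x35, 0xa4)))
x_utf8   <- iconv(x_latin9, from = "ISO-8859-15", to = "UTF-8")

str(list(yn = yn_logical, date = dts_date, text = x_utf8))
```

Setting convert.YN = FALSE or convert.Date = FALSE in the call to clinical.import switches off the corresponding conversion.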

By default, all files in directory d whose names match pattern are imported into the list of data frames. The default pattern excludes files with underscores in their names because the authors work in an environment where the SAS exported data in .txt files are accompanied by metadata in corresponding _spec.txt files. The pattern can be changed, but the function only works if all the targeted files have .txt extensions. The only argument allows a subset of files to be read; its value should be the names of the files without the .txt extensions.
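To see which file names the default pattern selects, it can be tested directly with grepl(). The file names below are hypothetical and chosen only to show the behaviour of the regular expression as written in Usage; how clinical.import applies the pattern internally is not shown here.

```r
# Default pattern copied from the Usage section
pattern <- "^[a-zA-Z][a-zA-Z1-9]*\\.txt"

# Hypothetical file names, for illustration only
files <- c("dm.txt", "dm_spec.txt", "ae.txt", "lb2.txt", "vs10.txt", "notes.csv")

# TRUE for names the pattern accepts. Underscores are rejected, and so is
# the digit 0, because the character class is written as 1-9.
matched <- grepl(pattern, files)
setNames(matched, files)

# Names in the form expected by `only`: matching files without .txt
sub("\\.txt$", "", files[matched])
```

With files like these, a call such as clinical.import(d, only = c("dm", "ae")) would restrict the import to two of the matching files.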

The variable named by usubjid is imported as class character; all other character variables are imported as factors.
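The effect of this rule on a single file can be sketched in base R. The example assumes a tab-delimited export and uses made-up data. The column names other than USUBJID, and the choice to name the list element after the file, are assumptions for illustration rather than documented behaviour.

```r
# Write a small tab-delimited file to a temporary directory (made-up data)
d <- tempfile("export")
dir.create(d)
writeLines(c("USUBJID\tSEX\tAGE", "001\tF\t34", "002\tM\t51"),
           file.path(d, "dm.txt"))

# Read every column as text first, so identifiers such as "001"
# keep their leading zeros
dm <- read.delim(file.path(d, "dm.txt"), colClasses = "character")

usubjid <- "USUBJID"
for (v in setdiff(names(dm), usubjid)) {
  # type.convert() turns numeric-looking text into numbers and,
  # with as.is = FALSE, the remaining text into factors
  dm[[v]] <- type.convert(dm[[v]], as.is = FALSE)
}

# One data frame per imported file; the element name is an assumption
clindata <- list(dm = dm)
sapply(clindata$dm, class)
```

Keeping the subject identifier as character avoids factor-level mismatches when the same subjects appear in several data frames.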

Value

A list of data frames, one for each imported file.

Author(s)

Toby Johnson Toby.x.Johnson@gsk.com

Examples

## Not run: 
clindata <- clinical.import("path/to/clinical/export/")
data(derivations.standard)
gxvars <- clinical.derive(clindata, derivations.standard)
summary(gxvars)

## End(Not run)

tobyjohnson/gtx documentation built on Aug. 30, 2019, 8:07 p.m.