read_ds_file: Read in a data file

View source: R/read_functions.R

Description

Reads a tab-delimited (.txt) data file into a data frame.

Usage

read_ds_file(filename, dd = FALSE, na_vals = c("NA", "N/A", "na",
  "n/a"), remove_empty_row = TRUE, remove_empty_col = FALSE)

Arguments

filename

The path to the file on disk

dd

Logical, where TRUE indicates a data dictionary file

na_vals

Vector of strings that should be read in as NA/missing (see details)

remove_empty_row

Logical indicating whether to exclude empty (i.e. all values missing) rows. Defaults to TRUE

remove_empty_col

Logical indicating whether to exclude empty (i.e. all values missing) columns. Defaults to FALSE
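To illustrate what "empty" means for remove_empty_row and remove_empty_col, the following is a minimal base R sketch on a toy data frame. It is an assumption about the effect of these arguments, not the package's own implementation; the object names (df, no_empty_rows, no_empty_cols) are hypothetical.

```r
# Toy data frame: row 2 and the "notes" column contain only missing values.
df <- data.frame(
  id    = c("S1", NA, "S3"),
  age   = c(34L, NA, 51L),
  notes = c(NA_character_, NA_character_, NA_character_),
  stringsAsFactors = FALSE
)

# Keep rows with at least one non-missing value
# (the assumed effect of remove_empty_row = TRUE).
no_empty_rows <- df[rowSums(!is.na(df)) > 0, , drop = FALSE]

# Keep columns with at least one non-missing value
# (the assumed effect of remove_empty_col = TRUE).
no_empty_cols <- df[, colSums(!is.na(df)) > 0, drop = FALSE]

dim(no_empty_rows)
dim(no_empty_cols)
```

With the defaults, only the row filter would apply, so an all-missing column such as notes would be kept.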

Details

Missing values: The blank string "" is always treated as an NA or missing value. Additional strings that should be read in as missing values can be specified in the na_vals argument. The default set of additional NA values is "NA", "N/A", "na", "n/a". Users should change the default if any of these values represents something other than missing; for example, "NA" could be an encoded value meaning "North America". Users may also wish to add a value to the list, e.g. na_vals = c("NA", "N/A", "na", "n/a", "9999").
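The effect of changing na_vals can be sketched with base R's read.delim, whose na.strings argument plays a similar role. This is an illustration under that assumption rather than a call to read_ds_file itself; the file contents and the read_with helper are made up for the example.

```r
# Write a small tab-delimited file to a temporary location.
tmp <- tempfile(fileext = ".txt")
writeLines(c(
  "subject_id\tregion\tage",
  "S1\tNA\t34",
  "S2\tEU\tn/a",
  "S3\t\t9999"
), tmp)

# Hypothetical helper: the blank string is always treated as missing,
# plus whatever additional strings are passed in na_vals.
read_with <- function(path, na_vals) {
  read.delim(path, na.strings = c("", na_vals), stringsAsFactors = FALSE)
}

# Default set: "NA" in the region column is read as missing.
default_df <- read_with(tmp, c("NA", "N/A", "na", "n/a"))

# Custom set: "NA" is kept as a real code (e.g. North America),
# and "9999" is treated as missing instead.
custom_df <- read_with(tmp, c("N/A", "na", "n/a", "9999"))

default_df
custom_df
```

In the first data frame the region for S1 is missing and the age 9999 is kept as a number; in the second, the region for S1 is the string "NA" and the age for S3 is missing. The blank region for S3 is missing in both.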

dbGaP dataset files should have column headers as the first row. If the input violates this, e.g. because additional header rows are present, a warning is issued but the file is still read in.

Value

A data frame containing the contents of the file


UW-GAC/dbgaptools documentation built on Nov. 3, 2020, 12:19 a.m.