new_file_definition_sas_: Helper function for 'new_file_definition_sas()'

View source: R/file_definition.R

Description

Helper function for new_file_definition_sas()

Usage

new_file_definition_sas_(
  file_path,
  specification_files = NULL,
  file_meta = NULL,
  skip_rows,
  n_max,
  encoding = NULL,
  to_lower,
  rename_cols,
  retype_cols,
  adapters = new_adapters(),
  cols_keep,
  extra_col_name = NULL,
  extra_col_val = NULL,
  extra_col_file_path,
  err_h,
  ...
)

Arguments

file_path

A string holding the path to the data file.

specification_files

An optional character vector holding the paths to the files, where the file structure is described.

file_meta

An optional file_meta class object, holding some meta information for each data column (column description, possible column values + descriptions of possible column values). For details see section meta information. If the argument cols is not NULL, then the argument file_meta must be omitted.

skip_rows

The number of rows to be skipped. In the case of DSV or EXCEL files: If the argument header is set to TRUE, then the first row is always assumed to be the header row.

n_max

A number, defining the maximum number of rows to be read. If n_max = Inf, then all available rows will be read.

encoding

A string, defining which encoding should be assumed when reading the data file. The following values are allowed:

  • "UTF-8": For UTF-8 encoded files.

  • "latin1": For ISO 8859-1 (also called Latin-1) encoded files. This encoding is almost the same as Windows-1252 (also called ANSI). They differ only in 32 symbol codes (special symbols that are rarely used).

In the case of SAS files, it is possible to set encoding = NULL. In this case, the encoding defined in the SAS data file header will be used.
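To illustrate the difference between the two encodings, the following base R sketch (independent of this package, and not how readall reads files) compares the byte representation of the same character in UTF-8 and Latin-1. The variable names are chosen for illustration only.

```r
# Base R sketch, not part of readall: the same character has a
# different byte representation in UTF-8 and in Latin-1.
x_utf8 <- enc2utf8("\u00e4")  # the letter a-umlaut, stored as UTF-8
x_latin1 <- iconv(x_utf8, from = "UTF-8", to = "latin1")

bytes_utf8 <- charToRaw(x_utf8)      # two bytes in UTF-8
bytes_latin1 <- charToRaw(x_latin1)  # one byte in Latin-1

print(bytes_utf8)
print(bytes_latin1)
```

Reading a Latin-1 file as if it were UTF-8 (or the reverse) typically produces garbled non-ASCII characters, which is why the encoding has to match the file.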

to_lower

A logical flag, defining if the names of the columns should be transformed to lower case after reading the data set (by calling read_data()). This transformation will be applied before comparing the column names (in the case of SAS files, or DSV and EXCEL files with header = TRUE). In the case of new_file_definition(), the to_lower argument overrides the to_lower argument in the file_structure class object given in file_structure. If to_lower is omitted, then the file_structure class object remains unchanged. In the case of new_file_definition_fwf(), new_file_definition_dsv(), new_file_definition_excel() or new_file_definition_sas(), the argument to_lower must be either TRUE or FALSE.
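The effect of the lower-case transformation can be sketched in base R as follows. The data frame dat is a hypothetical stand-in for a freshly read data set; this is an illustration of the idea, not the package's internal code.

```r
# Hypothetical data frame standing in for a data set as read from a SAS file.
dat <- data.frame(ID = 1:2, Name = c("a", "b"))

# Transform all column names to lower case before any name comparison.
names(dat) <- tolower(names(dat))

print(names(dat))
```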

rename_cols

A logical value, which defines if the column names given in the data file should be overwritten by the names given in the col_names argument. If col_names is not given, then rename_cols has no effect.

retype_cols

A logical value, which defines if the types of the columns given in the SAS file should be changed to the types given in the col_types argument. If col_types is not given, then retype_cols has no effect.

adapters

An optional list argument, holding a list of adapter functions (See section adapters).

cols_keep

Either TRUE or a character vector. If set to TRUE, then all columns of the data are kept when calling read_data(). If cols_keep is a character vector, then the values in cols_keep represent the names of the columns that are kept when calling read_data().
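A minimal base R sketch of the two cases described above. The helper keep_cols() is hypothetical and only mirrors the documented behaviour; it is not a function of this package.

```r
# Hypothetical helper mirroring the documented cols_keep semantics.
keep_cols <- function(dat, cols_keep = TRUE) {
  # TRUE means: keep every column unchanged.
  if (isTRUE(cols_keep)) {
    return(dat)
  }
  # Otherwise cols_keep is a character vector of column names to keep.
  dat[, cols_keep, drop = FALSE]
}

dat <- data.frame(id = 1:2, name = c("a", "b"), score = c(0.5, 0.7))
all_cols <- keep_cols(dat, TRUE)
some_cols <- keep_cols(dat, c("id", "score"))
```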

extra_col_name

An optional string, defining the name of a column that will be added to the data set (after reading it with the function read_data()). Each entry of the column will hold the single value given in extra_col_val. For example, this column is useful when reading similar data files for separate years (one could pass the current data set year to extra_col_val and set extra_col_name = "year"). If extra_col_name is omitted, then no column will be added to the data set and extra_col_val must be omitted as well.

extra_col_val

An optional value (any atomic type), which will be added (after reading the data set with the function read_data()) as an additional column with the column name given in extra_col_name. For example, this column is useful when reading similar data files for separate years (one could pass the current data set year to extra_col_val and set extra_col_name = "year"). If omitted, then no column will be added to the data set and the argument extra_col_name must be omitted as well.
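The year example can be sketched in base R as follows. The data frame and the value 2021 are made up for illustration, and the assignment only mimics the documented result; it is not the package's implementation.

```r
# Hypothetical data set for one year, as it might look after reading.
dat <- data.frame(id = 1:3)

extra_col_name <- "year"  # name of the additional column
extra_col_val <- 2021     # single value, recycled for every row

# Add a constant column holding the same value in each row.
dat[[extra_col_name]] <- extra_col_val

print(dat)
```

Data sets for several years that were tagged in this way can later be stacked without losing track of the year each row came from.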

extra_col_file_path

Either FALSE or a string. If set to FALSE, no file path column will be added to the data set when calling read_data(). If extra_col_file_path is a string, then a column holding the file path of the data file will be added to the data set when calling read_data(). The string given in extra_col_file_path will be used as the name of this additional column.
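The two cases can be sketched in base R like this. The helper add_file_path_col(), the file path and the column name source_file are hypothetical and serve only to illustrate the documented behaviour.

```r
# Hypothetical helper mirroring the documented extra_col_file_path semantics.
add_file_path_col <- function(dat, file_path, extra_col_file_path = FALSE) {
  # FALSE means: do not add a file path column.
  if (isFALSE(extra_col_file_path)) {
    return(dat)
  }
  # Otherwise the string is used as the name of the new column.
  dat[[extra_col_file_path]] <- rep(file_path, nrow(dat))
  dat
}

raw <- data.frame(id = 1:2)
unchanged <- add_file_path_col(raw, "data/pop_2021.sas7bdat", FALSE)
with_path <- add_file_path_col(raw, "data/pop_2021.sas7bdat", "source_file")
```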

err_h

An error handler

...

Additional function arguments for

  • readr::read_fwf() in case of FWF files

  • utils::read.delim() in case of DSV files

  • readxl::read_excel() in case of EXCEL files


a-maldet/readall documentation built on Dec. 18, 2021, 9:23 p.m.