readSheet: Read data from a DataPack object

View source: R/loadDataPack.R

readSheet    R Documentation

Read data from a DataPack object

Description

Reads data from a sheet in a DataPack object. This function is essentially a wrapper for readxl's read_excel function, with default argument values chosen to suit DataPack files (for example, col_types defaults to "text" and .name_repair to "minimal").

Usage

readSheet(
  d,
  sheet = 1,
  range = NULL,
  col_names = TRUE,
  col_types = "text",
  na = "",
  guess_max = 1000,
  progress = readxl::readxl_progress(),
  .name_repair = "minimal"
)
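
Because readSheet needs a DataPack object from loadDataPack, the following is only a sketch of what these defaults do, written against readxl::read_excel directly. It assumes the wrapper forwards its arguments unchanged, and it uses an example workbook shipped with readxl as a stand-in for a DataPack file.

```r
library(readxl)

# Example workbook shipped with readxl; a stand-in for a DataPack file.
path <- readxl_example("datasets.xlsx")

# read_excel called with the same defaults that readSheet lists in Usage.
df <- read_excel(
  path,
  sheet = 1,
  range = NULL,
  col_names = TRUE,
  col_types = "text",
  na = "",
  guess_max = 1000,
  .name_repair = "minimal"
)

# A single col_types value is recycled, so every column is read as character.
all_text <- all(vapply(df, is.character, logical(1)))
```

With a DataPack object d, the corresponding call would presumably be readSheet(d, sheet = 1).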

Arguments

d

DataPack object, created via loadDataPack.

sheet

Sheet to read. Either a string (the name of a sheet), or an integer (the position of the sheet). Ignored if the sheet is specified via range. If neither argument specifies the sheet, defaults to the first sheet.

range

A cell range to read from, as described in cell-specification. Includes typical Excel ranges like "B3:D87", possibly including the sheet name like "Budget!B2:G14", and more. Interpreted strictly, even if the range forces the inclusion of leading or trailing empty rows or columns. Takes precedence over sheet, and over read_excel's skip and n_max arguments, which readSheet does not expose.
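
As a rough illustration of a range that names its own sheet, the sketch below calls readxl::read_excel directly on an example workbook shipped with readxl. Passing the same range string to readSheet is assumed to behave the same way.

```r
library(readxl)

# Example workbook shipped with readxl; it contains a sheet named "mtcars".
path <- readxl_example("datasets.xlsx")

# The sheet is named inside the range, so no separate sheet argument is needed.
# Row 1 of the range supplies the column names; rows 2 to 4 are data.
rng <- read_excel(path, range = "mtcars!A1:C4", col_types = "text")

dim(rng)
```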

col_names

TRUE to use the first row as column names, FALSE to get default names, or a character vector giving a name for each column. If the user provides col_types as a vector, col_names can have one entry per column (that is, the same length as col_types) or one entry per unskipped column.

col_types

Either NULL to guess all from the spreadsheet or a character vector containing one entry per column from these options: "skip", "guess", "logical", "numeric", "date", "text" or "list". If exactly one col_type is specified, it will be recycled. The content of a cell in a skipped column is never read and that column will not appear in the data frame output. A list cell loads a column as a list of length 1 vectors, which are typed using the type guessing logic from col_types = NULL, but on a cell-by-cell basis.
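
The recycling and "skip" behavior can be sketched with readxl::read_excel on an example workbook shipped with readxl. The file and its five-column "iris" sheet are illustrative stand-ins, not part of a DataPack.

```r
library(readxl)

path <- readxl_example("datasets.xlsx")

# A single type is recycled across all five columns of the sheet.
one_type <- read_excel(path, sheet = "iris", col_types = "text")

# One entry per column: keep the first as numeric, skip the middle three,
# and keep the last as text. Skipped columns do not appear in the result.
mixed <- read_excel(
  path,
  sheet = "iris",
  col_types = c("numeric", "skip", "skip", "skip", "text")
)

ncol(one_type)
ncol(mixed)
```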

na

Character vector of strings to interpret as missing values. By default, readxl treats blank cells as missing data.

guess_max

Maximum number of data rows to use for guessing column types.

progress

Display a progress spinner? By default, the spinner appears only in an interactive session, outside the context of knitting a document, and when the call is likely to run for several seconds or more. See readxl_progress() for more details.

.name_repair

Handling of column names. Passed along to tibble::as_tibble(). readxl's default is .name_repair = "unique", which ensures column names are not empty and are unique; readSheet defaults to "minimal" instead.
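
The difference between the "minimal" and "unique" repair strategies can be seen with tibble::as_tibble() alone. This small sketch uses a made-up list with a duplicated name in place of a spreadsheet header row.

```r
library(tibble)

# Hypothetical input with a duplicated column name.
cols <- list(a = 1, a = 2)

# "minimal" leaves the names untouched, duplicates included.
minimal <- as_tibble(cols, .name_repair = "minimal")

# "unique" renames the columns so that no two names collide.
unique_names <- suppressMessages(as_tibble(cols, .name_repair = "unique"))

names(minimal)
names(unique_names)
```

Keeping names as they appear in the sheet means duplicated or empty headers survive the read, so downstream code has to handle them itself.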

Value

A tibble

Author(s)

Scott Jackson


pepfar-datim/datapackr documentation built on April 14, 2024, 10:35 p.m.