qread: Read xls and xlsx worksheet
In epicentre-msf/qxl: Quick Customized Excel Files

View source: R/qread.R

qread

R Documentation

Read xls and xlsx worksheet

Description

Wrapper to readxl::read_excel with minor changes to default settings:

columns of dates with no time component have class "Date" rather than "POSIX"
empty columns are read in as class "character" rather than "logical"
the max number of rows used to guess column types is 10k rather than 1k

Usage

qread(
  path,
  sheet = NULL,
  range = NULL,
  col_names = TRUE,
  col_types = NULL,
  simplify_dates = TRUE,
  empty_cols_to_chr = TRUE,
  na = "",
  trim_ws = TRUE,
  skip = 0,
  n_max = Inf,
  guess_max = min(10000, n_max),
  progress = FALSE,
  .name_repair = "unique"
)

Arguments

`path`	Path to the xls/xlsx file.
`sheet`	Sheet to read. Either a string (the name of a sheet), or an integer (the position of the sheet). Ignored if the sheet is specified via `range`. If neither argument specifies the sheet, defaults to the first sheet.
`range`	A cell range to read from, as described in cell-specification. Includes typical Excel ranges like "B3:D87", possibly including the sheet name like "Budget!B2:G14", and more. Interpreted strictly, even if the range forces the inclusion of leading or trailing empty rows or columns. Takes precedence over `skip`, `n_max` and `sheet`.
`col_names`	`TRUE` to use the first row as column names, `FALSE` to get default names, or a character vector giving a name for each column. If user provides `col_types` as a vector, `col_names` can have one entry per column, i.e. have the same length as `col_types`, or one entry per unskipped column.
`col_types`	Either `NULL` to guess all from the spreadsheet or a character vector containing one entry per column from these options: "skip", "guess", "logical", "numeric", "date", "text" or "list". If exactly one `col_type` is specified, it will be recycled. The content of a cell in a skipped column is never read and that column will not appear in the data frame output. A list cell loads a column as a list of length 1 vectors, which are typed using the type guessing logic from `col_types = NULL`, but on a cell-by-cell basis.
`simplify_dates`	Logical indicating whether to convert date columns lacking a time component to class "Date". By default readxl::read_excel reads columns containing dates or datetimes as class POSIX, even if there is no time component (i.e. in which case the times will all be "00:00:00"). If `simplify_posix` is TRUE (the default), columns containing dates with no nonzero time values are converted to class "Date" using lubridate::as_date.
`empty_cols_to_chr`	Logical indicating whether columns of class "logical" containing all missing values should be converted to class "character". If argument `col_types` is NULL (the default), columns containing all missing values are read in by readxl::read_excel as class "logical". If empty_cols_to_chr is TRUE (the default), such columns are converted to class "character".
`na`	Character vector of strings to interpret as missing values. By default, readxl treats blank cells as missing data.
`trim_ws`	Should leading and trailing whitespace be trimmed?
`skip`	Minimum number of rows to skip before reading anything, be it column names or data. Leading empty rows are automatically skipped, so this is a lower bound. Ignored if `range` is given.
`n_max`	Maximum number of data rows to read. Trailing empty rows are automatically skipped, so this is an upper bound on the number of rows in the returned tibble. Ignored if `range` is given.
`guess_max`	Maximum number of data rows to use for guessing column types.
`progress`	Display a progress spinner? By default, the spinner appears only in an interactive session, outside the context of knitting a document, and when the call is likely to run for several seconds or more. See `readxl_progress()` for more details.
`.name_repair`	Handling of column names. Passed along to `tibble::as_tibble()`. readxl's default is '.name_repair = "unique", which ensures column names are not empty and are unique.