read.xlsx: Read the contents of an Excel '07 workbook

read.xlsx    R Documentation

Read the contents of an Excel '07 workbook


This function and its methods provide a high-level mechanism for reading the contents of a modern (xlsx) Excel document and all of its worksheets. It is similar to read.table but is capable of reading multiple data frames from a single file. For this reason, it does not make sense to specify classes for the columns as they do not necessarily apply to all of the worksheets.


read.xlsx(doc, which = NA, na = logical(), header = NA, skip = 0L, ..., as.list = FALSE)



doc: the Excel document. This can be the name of the xlsx file, the ExcelArchive object created via excelDoc, or the Workbook object.


which: an optional vector that specifies a subset of the worksheets to be read. This allows the caller to skip worksheets that are not of interest.


na: an optional value or vector of values that is used to identify cell values that should be mapped to NA values in R. For example, if the author of the spreadsheet used the string NA or a value -999 to represent a missing value, we would specify that value as the value of na and such cells would be returned as NA in R. If -999 and "Not Available" indicated missing values, we could specify these as a vector c(-999, "Not Available").


header: a logical vector with an element for each sheet to be read (or else it is recycled) that indicates whether a particular sheet has column names in the first row of the actual sheet data/cells. The concept of "first row" is further controlled by skip.


skip: a number for each sheet to be read (see which) that indicates how many rows to skip before the data start. The functions automatically skip empty rows, so these rows are not counted in skip. The purpose of this argument is to allow us to ignore rows that contain, e.g., a title or footnotes before the data. Note that instead of using skip, one can subset the worksheet directly, e.g. workbook(file)[[2]][ 4:10, ] to start at row 4.

skip is recycled to have the same length as the number of sheets being read.


...: additional parameters for the methods.


as.list: a logical value. This controls how the contents of a workbook with a single worksheet are returned. If this is TRUE, the function returns a list with the data frame as its single element. If this is FALSE, the data frame itself is returned. This has no effect unless the workbook has exactly one worksheet. The default (FALSE) is convenient for interactive use, while as.list = TRUE lets programmatic callers ensure that the return type is always a list and does not depend on the contents of the xlsx file.
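The mapping that na describes can be illustrated with a small base R sketch. The helper mapToNA below is hypothetical and is not part of RExcelXML; it assumes that cell values are compared as strings, which may differ from how the package actually matches them.

```r
# Hypothetical helper (not part of RExcelXML): replace sentinel values with NA.
# Assumption: values are compared after conversion to character.
mapToNA = function(x, na = logical()) {
    if (length(na) == 0L)
        return(x)                       # nothing to map, as with na = logical()
    x[as.character(x) %in% as.character(na)] = NA
    x
}

cells = c("12", "-999", "Not Available", "7")

# c(-999, "Not Available") is coerced by R to the character vector
# c("-999", "Not Available"), so both sentinels are matched as strings.
cleaned = mapToNA(cells, c(-999, "Not Available"))
numbers = as.numeric(cleaned)
```

Because mixing a number and a string in c() yields a character vector, sentinels of different types end up being compared as text in this sketch.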


Typically a list with as many elements as there are worksheets within the workbook. If as.list is FALSE and there is a single worksheet in the workbook, the data frame for that worksheet is returned directly.
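Code that must work for any workbook can either pass as.list = TRUE or normalize the result afterwards. The sketch below shows the second approach in base R; asSheetList is a hypothetical helper, not a function in RExcelXML, and the data frames stand in for values that read.xlsx might return.

```r
# Hypothetical wrapper (not part of RExcelXML): always hand back a list of
# data frames. A data frame is itself a list in R, so is.list() alone cannot
# distinguish the two cases; is.data.frame() is tested instead.
asSheetList = function(ans) {
    if (is.data.frame(ans)) list(ans) else ans
}

# Stand-ins for the two possible return shapes.
single = data.frame(x = 1:3)
multi = list(A = data.frame(x = 1:3), B = data.frame(y = letters[1:2]))

# The same downstream code now works for both shapes.
rowsSingle = sapply(asSheetList(single), nrow)
rowsMulti = sapply(asSheetList(multi), nrow)
```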


Duncan Temple Lang


  f = system.file("SampleDocs", "Workbook1.xlsx", package = "RExcelXML")
  dfs = read.xlsx(f)   # read every worksheet in the sample workbook
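The effect of skip and header, and the recycling of both across sheets, can be mimicked in base R. This is only a sketch of the behaviour described in the arguments above: applySkipHeader and the raw data frame are hypothetical, and the package's own handling (for instance, of empty rows) is not reproduced here.

```r
# Hypothetical helper (not part of RExcelXML): drop `skip` leading rows,
# then optionally promote the next row to column names.
applySkipHeader = function(rows, skip = 0L, header = FALSE) {
    if (skip > 0L)
        rows = rows[-seq_len(skip), , drop = FALSE]
    if (header) {
        names(rows) = as.character(unlist(rows[1L, ]))
        rows = rows[-1L, , drop = FALSE]
    }
    rownames(rows) = NULL
    rows
}

# Stand-in for the raw cells of a sheet with a title row above the data.
raw = data.frame(V1 = c("Quarterly results", "id", "1", "2"),
                 V2 = c("", "value", "10", "20"),
                 stringsAsFactors = FALSE)

# Per-sheet settings are recycled to the number of sheets being read.
nsheets = 3L
skip = rep_len(c(1L, 0L), nsheets)
header = rep_len(TRUE, nsheets)

sheet1 = applySkipHeader(raw, skip[1], header[1])
```

With the package itself, the corresponding call would presumably take the form read.xlsx(f, skip = 1L, header = TRUE), using the arguments shown in the usage line.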

duncantl/RExcelXML documentation built on Nov. 23, 2023, 4:21 p.m.