readOpenxlsx | R Documentation |
Import one or more data.frame from 'Excel' 'xlsx' format
Description
Import one or more data.frame from 'Excel' 'xlsx' format
Usage
readOpenxlsx(
xlsx,
sheet = NULL,
startRow = 1,
startCol = 1,
rows = NULL,
cols = NULL,
check.names = FALSE,
check_header = FALSE,
check_header_n = 10,
verbose = FALSE,
...
)
Arguments
xlsx |
character path to an 'Excel' file in xlsx format,
compatible with openxlsx::read.xlsx() .
|
sheet |
one of NULL , character , or integer vector,
where: sheet=NULL will import every sheet; character is
a vector of sheet names; and integer is a vector of sheet
index values. The sheet names are determined with
openxlsx::getSheetNames() .
|
startRow |
integer indicating the row number to start
importing each sheet .
Note startRow can be a vector
with length length(sheet) , to specify the startRow for
each sheet .
Note startRow is ignored when rows is defined for the same sheet,
to minimize confusion about using both togetheer.
|
startCol |
integer indicating the first column number to retain
after importing each sheet .
Note startCol can be a vector with length length(sheet) ,
to specify the startCol for each sheet .
Note startCol is ignored when cols is defined for the same sheet,
to minimize confusion about using both togetheer.
|
rows |
integer vector indicating specific rows to import
for each sheet .
To specify different rows for each sheet ,
supply rows as a list of integer vectors.
Note that when rows is defined for a sheet, it will be used
and startRow will be ignored for that same sheet.
|
cols |
integer vector indicating specific column numbers to import
for each sheet .
To specify different cols for each sheet , supply cols
as a list of integer vectors.
Note that when cols is defined for a sheet, it will be used
and startCol will be ignored for that same sheet.
|
check.names |
logical indicating whether to call make.names()
on the colnames of each data.frame .
Note that openxlsx::read.xlsx() does not honor check.names=FALSE ,
so a workaround is applied which loads a single line
without column headers, in order to obtain the same data without
mangling column headers. If this process fails, another workaround
is to use startRow=2 (one higher than previous) and colNames=FALSE .
|
|
logical indicating whether to test for presence
of header rows, which may be multi-line column headers.
When check_header=TRUE , this
method simply tests for the presence of rows that have ncol
different than the remaining rows of data in the given sheet.
When header rows are detected, the values are assigned to column
dimnames of the data.frame .
|
|
integer number of rows to test for header rows,
only used when check_header=TRUE . This step is intended when
the top row(s) contain fewer columns with headers, above actual
column headers, for example the first row c("Sample", "", "", "Lane", "") ,
and the second row c("Name", "Type", "Label", "Name", "Type") .
In this case the desired output is
"Sample_Name","Sample_Type","Sample_Label","Lane_Name","Lane_Type") .
This option default is FALSE due to the number of exceptions
seen in real data.
|
verbose |
logical indicating whether to print verbose output.
|
... |
additional arguments are passed to openxlsx::read.xlsx() .
|
Details
This function is equivalent to openxlsx::read.xlsx()
with a few minor additions:
It returns a list
of data.frame
objects, one per sheet
.
It properly reads the colnames
with check.names=FALSE
.
By default this function returns every sheet
for a given
xlsx
file.
Some useful details:
Empty columns are not skipped during loading, which means a worksheet
whose data starts at column 3 will be returned with two empty columns,
followed by data from that worksheet. Similarly, any empty columns
in the middle of the data in that worksheet will be included in the
output.
When both startRow
and rows
are applied, rows
takes priority
and will be used instead of startRows
. In fact startRows
will be
defined startRows <- min(rows)
for each relevant worksheet. However,
for each worksheet either argument can be NULL
.
Value
list
of data.frame
objects, one per sheet in xlsx
.
See Also
Other jam export functions:
applyXlsxCategoricalFormat()
,
applyXlsxConditionalFormat()
,
set_xlsx_colwidths()
,
set_xlsx_rowheights()
,
writeOpenxlsx()
Examples
# set up a test data.frame
set.seed(123);
lfc <- -3:3 + stats::rnorm(7)/3;
colorSub <- nameVector(
rainbow2(7),
LETTERS[1:7])
df <- data.frame(name=LETTERS[1:7],
int=round(4^(1:7)),
num=(1:7)*4-2 + stats::rnorm(7),
fold=2^abs(lfc)*sign(lfc),
lfc=lfc,
pvalue=10^(-1:-7 + stats::rnorm(7)),
hit=sample(c(-1,0,0,1,1), replace=TRUE, size=7));
df;
# write to tempfile for examples
if (check_pkg_installed("openxlsx")) {
out_xlsx <- tempfile(pattern="writeOpenxlsx_", fileext=".xlsx")
writeOpenxlsx(x=df,
file=out_xlsx,
sheetName="jamba_test",
append=FALSE);
# now read it back
df_list <- readOpenxlsx(xlsx=out_xlsx);
df_list[[1]]
}