import_list: Import list of data frames

import_list    R Documentation

Import list of data frames

Description

Use import() to import a list of data frames from a vector of file names or from a multi-object file (Excel workbook, .Rdata file, compressed directory in a zip file or tar archive, or HTML file).

Usage

import_list(
  file,
  setclass = getOption("rio.import.class", "data.frame"),
  which,
  rbind = FALSE,
  rbind_label = "_file",
  rbind_fill = TRUE,
  ...
)

Arguments

file

A character string containing a single file name for a multi-object file (e.g., Excel workbook, zip file, tar archive, or HTML file), or a vector of file paths for multiple files to be imported.

setclass

An optional character vector specifying one or more classes to set on the import. By default, the return object is always a “data.frame”. Allowed values include “tbl_df”, “tbl”, or “tibble” (if using tibble); “arrow” or “arrow_table” (if using an arrow table; the suggested package arrow must be installed); or “data.table” (if using data.table). Other values are ignored, such that a data.frame is returned. This parameter takes precedence over parameters in ... which set a different class.

which

If file is a single file path, this specifies which objects should be extracted (passed to import()'s which argument). Ignored otherwise.

rbind

A logical indicating whether to pass the imported list of data frames through data.table::rbindlist().

rbind_label

If rbind = TRUE, a character string specifying the name of a column to add to the data frame indicating its source file.

rbind_fill

If rbind = TRUE, a logical indicating whether to set fill = TRUE in data.table::rbindlist() (so that missing columns are filled with NA).

...

Additional arguments passed to import(). Behavior may be unexpected if files are of different formats.

Details

When file is a vector of file paths and any files are missing, those files are ignored (with warnings) and this function will not raise an error. For compressed files to work with this function, the file name must also indicate the file format of all the compressed files, e.g. files.csv.zip.
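As a minimal sketch of the multi-file case, the following writes two CSV files to temporary paths and imports them together. The data slices and the column name "source" are chosen here purely for illustration.

```r
library(rio)

# Two temporary CSV files holding different slices of mtcars (illustrative data)
csv1 <- tempfile(fileext = ".csv")
csv2 <- tempfile(fileext = ".csv")
export(mtcars[1:16, ], csv1)
export(mtcars[17:32, ], csv2)

# A vector of file paths yields a list with one data frame per file
cars_list <- import_list(c(csv1, csv2))
length(cars_list)

# With rbind = TRUE the data frames are stacked; rbind_label names the
# column that records the source file of each row
cars_all <- import_list(c(csv1, csv2), rbind = TRUE, rbind_label = "source")
nrow(cars_all)
```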

Value

If rbind = FALSE (the default), a list of data frames. Otherwise, that list is passed to data.table::rbindlist() (with fill = TRUE by default, as controlled by rbind_fill), and a data frame object of the class set by the setclass argument is returned; if this operation fails, the list is returned.
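The difference between the two return types can be sketched with files whose columns only partly overlap. This is an illustrative example under assumed data: the two files below are made up for the demonstration, and the expectation that unmatched columns come back as NA follows from the fill behaviour of data.table::rbindlist().

```r
library(rio)

# Two CSV files that share "mpg" but differ in their second column
part1 <- tempfile(fileext = ".csv")
part2 <- tempfile(fileext = ".csv")
export(mtcars[1:5, c("mpg", "cyl")], part1)
export(mtcars[6:10, c("mpg", "hp")], part2)

# Default: a list of data frames, one per file
as_list <- import_list(c(part1, part2))

# rbind = TRUE with rbind_fill = TRUE: one data frame, with NA where a
# file did not provide a column
combined <- import_list(c(part1, part2), rbind = TRUE, rbind_fill = TRUE)
colSums(is.na(combined[, c("mpg", "cyl", "hp")]))
```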

Trust

For serialization formats (.R, .RDS, and .RData), please note that you should only load these files from trusted sources. This is because these formats are not necessarily for storing rectangular data and can also be used to store many other things, e.g. code. Importing these files could lead to arbitrary code execution. Please read the security principles by the R Project (Plummer, 2024). When importing these files via rio, you should affirm that you trust these files, i.e. trust = TRUE. See example below. If this affirmation is missing, the current version assumes trust to be true for backward compatibility and a deprecation notice will be printed. In the next major release (2.0.0), you must explicitly affirm your trust when importing these files.
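A minimal sketch of affirming trust, assuming an installed version of rio that accepts the trust argument described above. The .rds file is created locally in the same session, which is why trusting it is reasonable here.

```r
library(rio)

# Write a data frame to a temporary .rds file (created locally, so it is trusted)
rds_file <- tempfile(fileext = ".rds")
export(mtcars, rds_file)

# Explicitly affirm trust when reading a serialization format
restored <- import(rds_file, trust = TRUE)
dim(restored)
```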

Which

For compressed archives (zip and tar, where a compressed file can contain multiple files), the parameter which can end up being used to indicate two different concepts. For example, for .xlsx.zip it is unclear whether which refers to the selection of an exact file in the archive or the selection of an exact sheet in the decompressed Excel file. In these cases, rio assumes that which is only used for the selection of the file. After the file has been selected with which, rio will return the first item, e.g. the first sheet.

Please note, however, that .gz and .bz2 (e.g. .xlsx.gz) are compressed formats, but not archive formats. In those cases, which is used the same way as for the non-compressed format, e.g. selection of a sheet for Excel.
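For the non-archive case, a short sketch of selecting sheets with which in a plain Excel workbook may help. The sheet names below are invented for the demonstration, and the example assumes that Excel reading and writing support is available in the installed rio.

```r
library(rio)

# A temporary workbook with three sheets (sheet names are illustrative)
wb <- tempfile(fileext = ".xlsx")
export(
    list(
        first = mtcars[1:10, ],
        second = mtcars[11:20, ],
        third = mtcars[21:32, ]
    ),
    wb
)

# which selects sheets by position: here the first and the third sheet
picked <- import_list(wb, which = c(1, 3))
length(picked)
```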

References

Plummer, M. (2024). Statement on CVE-2024-27322. https://blog.r-project.org/2024/05/10/statement-on-cve-2024-27322/

See Also

import(), export_list(), export()

Examples

## For demo, a temp. file path is created with the file extension .xlsx
xlsx_file <- tempfile(fileext = ".xlsx")
export(
    list(
        mtcars1 = mtcars[1:10, ],
        mtcars2 = mtcars[11:20, ],
        mtcars3 = mtcars[21:32, ]
    ),
    xlsx_file
)

# import a single worksheet from a multi-object workbook
import(xlsx_file, sheet = "mtcars1")
# import all worksheets, the return value is a list
import_list(xlsx_file)

# import and rbind all worksheets, the return value is a data frame
import_list(xlsx_file, rbind = TRUE)

rio documentation built on Sept. 26, 2024, 1:07 a.m.