The main motivation for developing the libr package is to create and use
data libraries and data dictionaries. These concepts are useful when
dealing with sets of related data files. The libname() function allows
you to define a library for an entire directory of data files. The library
can then be manipulated as a whole using the lib_* functions in the libr
package.
There are four main libr functions for creating and using a data library:
libname()
lib_load()
lib_unload()
lib_write()
The libname() function creates a data library. The function has parameters
for the library name and a directory to associate it with. If the directory
has existing data files, those data files will be automatically loaded
into the library. Once in the library, the data can be accessed using list
syntax.
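As a minimal sketch of list syntax, assume the built-in trees data frame has been saved to a scratch subdirectory of the temp directory (the subdirectory name libr_list_demo is hypothetical and chosen only for this illustration):

```r
library(libr)

# Hypothetical scratch directory, used only for this sketch
tmp <- file.path(tempdir(), "libr_list_demo")
dir.create(tmp, showWarnings = FALSE)

# Save one sample dataset so the library has something to pick up
saveRDS(trees, file.path(tmp, "trees.rds"))

# Create the library; the existing rds file is loaded automatically
libname(dat, tmp)

# Access the dataset with list syntax, either form should work
is_df  <- is.data.frame(dat$trees)
n_rows <- nrow(dat[["trees"]])

# Remove the library and its files
lib_delete(dat)
```

Here dat$trees and dat[["trees"]] refer to the same data frame, so the library can be used like any other named list without loading it into the workspace.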
You may create a data library for several different types of files: 'rds', 'Rdata',
'rda', 'csv', 'xlsx', 'xls', 'sas7bdat', 'xpt', and 'dbf'. The type of library is
defined using the engine parameter on the libname() function. The default
data engine is 'rds'. The data engines will attempt to identify the correct
data type for each column of data. You may also control the data type of
the columns using the import_specs parameter and the specs() and
import_spec() functions.
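The sketch below shows one way the engine and import_specs parameters might be combined for a csv library. It assumes that each argument to specs() is named after a dataset, that each argument to import_spec() is named after a column, and that "integer" and "character" are accepted type keywords. The directory name libr_csv_demo and the file cars.csv are hypothetical.

```r
library(libr)

# Hypothetical scratch directory, used only for this sketch
tmp <- file.path(tempdir(), "libr_csv_demo")
dir.create(tmp, showWarnings = FALSE)

# Write a small csv file to import
write.csv(mtcars[, c("mpg", "cyl", "hp")],
          file.path(tmp, "cars.csv"), row.names = FALSE)

# Assumption: one import_spec() per dataset, one type keyword per column.
# Columns not listed (mpg here) are left for the engine to guess.
spcs <- specs(cars = import_spec(cyl = "integer", hp = "character"))

# Create a csv library that applies the import specs
libname(dat, tmp, engine = "csv", import_specs = spcs)

# Record the first class of each imported column
col_classes <- sapply(dat$cars, function(col) class(col)[1])

# Remove the library and its files
lib_delete(dat)
```

Inspecting col_classes is a quick way to confirm that the requested types were applied and to see what the engine guessed for the remaining columns.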
If you prefer to access the data via the workspace, call
the lib_load() function on the library. This function will load the
library data into the parent frame, where it can be accessed using a two-level
(<library>.<dataset>) name.
When you are done with the data, call the lib_unload() function to remove
the data from the parent frame and put it back in the library list. To write
any added or modified data to disk, call the lib_write() function. The
lib_write() function will only write data that has changed since the last
write.
The following example illustrates some basic functionality of the libr package: creating a libname and using its data dictionary. The example first places some sample data in a temp directory for illustration purposes. It then creates a libname from the temp directory, loads it into memory, adds data to it, and finally unloads the library and writes everything to disk:
    library(libr)

    # Create temp directory
    tmp <- tempdir()

    # Save some data to temp directory
    # for illustration purposes
    saveRDS(trees, file.path(tmp, "trees.rds"))
    saveRDS(rock, file.path(tmp, "rocks.rds"))

    # Create library
    libname(dat, tmp)
    # library 'dat': 2 items
    # - attributes: not loaded
    # - path: C:\Users\User\AppData\Local\Temp\RtmpCSJ6Gc
    # - items:
    #    Name Extension Rows Cols   Size        LastModified
    # 1 rocks       rds   48    4 3.1 Kb 2020-11-05 23:25:34
    # 2 trees       rds   31    3 2.4 Kb 2020-11-05 23:25:34

    # Examine data dictionary for library
    dictionary(dat)
    # A tibble: 7 x 9
    #   Name  Column Class   Label Description Format Width  Rows   NAs
    #   <chr> <chr>  <chr>   <lgl> <lgl>       <lgl>  <lgl> <int> <int>
    # 1 rocks area   integer NA    NA          NA     NA       48     0
    # 2 rocks peri   numeric NA    NA          NA     NA       48     0
    # 3 rocks shape  numeric NA    NA          NA     NA       48     0
    # 4 rocks perm   numeric NA    NA          NA     NA       48     0
    # 5 trees Girth  numeric NA    NA          NA     NA       31     0
    # 6 trees Height numeric NA    NA          NA     NA       31     0
    # 7 trees Volume numeric NA    NA          NA     NA       31     0

    # Load library
    lib_load(dat)

    # Examine workspace
    ls()
    # [1] "dat"       "dat.rocks" "dat.trees" "tmp"

    # Use data from the library
    summary(dat.rocks)
    #       area            peri            shape              perm
    #  Min.   : 1016   Min.   : 308.6   Min.   :0.09033   Min.   :   6.30
    #  1st Qu.: 5305   1st Qu.:1414.9   1st Qu.:0.16226   1st Qu.:  76.45
    #  Median : 7487   Median :2536.2   Median :0.19886   Median : 130.50
    #  Mean   : 7188   Mean   :2682.2   Mean   :0.21811   Mean   : 415.45
    #  3rd Qu.: 8870   3rd Qu.:3989.5   3rd Qu.:0.26267   3rd Qu.: 777.50
    #  Max.   :12212   Max.   :4864.2   Max.   :0.46413   Max.   :1300.00

    # Add data to the library
    dat.trees_subset <- subset(dat.trees, Girth > 11)

    # Add more data to the library
    dat.cars <- mtcars

    # Unload the library from memory
    lib_unload(dat)

    # Examine workspace again
    ls()
    # [1] "dat" "tmp"

    # Write the library to disk
    lib_write(dat)
    # library 'dat': 4 items
    # - attributes: not loaded
    # - path: C:\Users\User\AppData\Local\Temp\RtmpCSJ6Gc
    # - items:
    #           Name Extension Rows Cols   Size        LastModified
    # 1        rocks       rds   48    4 3.1 Kb 2020-11-05 23:37:45
    # 2        trees       rds   31    3 2.4 Kb 2020-11-05 23:37:45
    # 3         cars       rds   32   11 7.3 Kb 2020-11-05 23:37:45
    # 4 trees_subset       rds   23    3 1.8 Kb 2020-11-05 23:37:45

    # Clean up
    lib_delete(dat)

    # Examine workspace again
    ls()
    # [1] "tmp"
Next: Library Management