read_fsd | R Documentation |
Efficiently read fusion output that was written to disk, optionally returning a subset of rows and/or columns. Since a .fsd file is simply an fst file under the hood, this function also works on any .fst file.
read_fsd(
path,
columns = NULL,
M = 1,
df = NULL,
cores = max(1, parallel::detectCores(logical = FALSE) - 1)
)
path |
Character. Path to a |
columns |
Character. Column names to read. The default is to return all columns. |
M |
Integer. The first |
df |
Data frame. Data frame used to identify a subset of rows to return. Default is to return all rows. |
cores |
Integer. Number of cores used by |
If df is provided and the file size on disk is less than 100 MB, then a full read and inner join is performed. For larger files, a manual read of the required rows is performed, using fmatch for the matching operation.
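The two strategies described above can be sketched roughly as follows. This is an illustrative sketch only, not the package's actual implementation: the toy data, the column names `hid` and `electricity`, and the object names `small` and `large` are hypothetical, and it assumes the fst, data.table, and fastmatch packages are installed.

```r
library(data.table)
library(fst)
library(fastmatch)

# Hypothetical on-disk data: two implicates (M) for five households
full <- data.table(
  M = rep(1:2, each = 5),
  hid = rep(1:5, times = 2),
  electricity = as.numeric(1:10)
)
path <- tempfile(fileext = ".fst")
write_fst(full, path)

# Hypothetical 'df' identifying the rows to return
df <- data.frame(hid = c(2L, 4L))

# Strategy for small files: read everything, then inner join on shared columns
small <- merge(read_fst(path, as.data.table = TRUE), as.data.table(df), by = "hid")

# Strategy for large files: read only the matching column, locate the
# required rows with fmatch(), then read just the span that contains them
key <- read_fst(path, columns = "hid")$hid
keep <- which(!is.na(fmatch(key, df$hid)))
large <- read_fst(path, from = min(keep), to = max(keep), as.data.table = TRUE)
large <- large[keep - min(keep) + 1L]

# The size check that would choose between the two (threshold from the text above)
use.full.read <- file.size(path) < 100 * 1e6
```

Both approaches return the same rows; the second avoids loading columns for rows that are discarded anyway, which matters more as the file grows.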
A data.table; keys are preserved if present in the on-disk data. When path points to a .fsd file, the result includes an integer column "M" indicating the implicate assignment of each observation (unless that column is explicitly excluded via columns).
# Build a fusion model using RECS microdata
# Note that "fusion_model.fsn" will be written to working directory
?recs
fusion.vars <- c("electricity", "natural_gas", "aircon")
predictor.vars <- names(recs)[2:12]
fsn.path <- train(data = recs, y = fusion.vars, x = predictor.vars)
# Write fusion output directly to disk
# Note that "results.fsd" will be written to working directory
recipient <- recs[predictor.vars]
sim <- fuse(data = recipient, fsn = fsn.path, M = 5, fsd = "results.fsd")
# Read the fusion output saved to disk
sim <- read_fsd(sim)
head(sim)