grpoutputs: Load and group outputs from files

View source: R/grpoutputs.R

grpoutputs    R Documentation

Load and group outputs from files

Description

Load and group outputs from files containing multiple observations of the groups to be compared.

Usage

grpoutputs(
  outputs,
  folders,
  files,
  lvls = NULL,
  concat = F,
  centscal = "range",
  ...
)

Arguments

outputs

A vector with the labels of each output, or an integer with the number of outputs (in which case output labels are assigned automatically). In either case, when concat is TRUE the number of outputs must include the additional concatenated output.

folders

Vector of folder names from which to read files. These are recycled if length(folders) < length(files).

files

Vector of filenames or file sets to load in each folder. File sets can be given as regular expressions, or as wildcards by wrapping them with glob2rx.

lvls

Vector of factor levels (groups). Must be the same length as files, i.e. each file set will be associated with a different level or group. If not given, default group names will be used.

concat

If TRUE, add an additional output corresponding to the concatenation of all outputs, properly centered and scaled.

centscal

Method for centering and scaling outputs if concat is TRUE. It can be one of "center", "auto", "range" (default), "iqrange", "vast", "pareto" or "level". Centering and scaling is performed by the centerscale function.

...

Options passed to read.table, which is used to read the files specified in the files parameter.
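
As a rough illustration of how a file set can be written as a wildcard, the sketch below uses only base R to show the regular expression that glob2rx produces and which files such a pattern would select in a folder. The temporary folder and the file names are hypothetical, and matching with list.files is an assumption made for illustration, not a description of how grpoutputs resolves file sets internally.

```r
# Hypothetical folder with three files (names are made up for illustration)
demo_dir <- file.path(tempdir(), "grpoutputs_glob_demo")
dir.create(demo_dir, showWarnings = FALSE)
file.create(file.path(demo_dir, c("stats400v1r1.tsv", "stats400v1r2.tsv",
                                  "stats800v1r1.tsv")))

# glob2rx() converts a wildcard into an anchored regular expression
pattern <- glob2rx("stats400v1*.tsv")
print(pattern)

# Assumption: select the files in the folder whose names match the pattern
matched <- list.files(demo_dir, pattern = pattern)
print(matched)
```

Only the two stats400v1 files match here; the stats800v1 file is excluded because the pattern is anchored at the start of the file name.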

Details

Each file corresponds to one observation and should be in tabular format, with columns corresponding to outputs and rows to variables or dimensions. Observations (files) are grouped by factor levels, one per file set given in the files parameter; these levels distinguish observations from distinct groups.
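
The file layout described above might look like the following minimal sketch, which writes one hypothetical observation to a temporary file and reads it back with read.table. The file name, the values, the tab separator and the absence of a header row are all assumptions made for illustration.

```r
# Hypothetical observation: 3 outputs (columns) x 4 variables (rows)
obs <- data.frame(out1 = c(1, 2, 3, 4),
                  out2 = c(10, 20, 30, 40),
                  out3 = c(0.1, 0.2, 0.3, 0.4))

# Write it as a tab-separated file without header or row names
obs_file <- file.path(tempdir(), "obs_demo_r1.tsv")
write.table(obs, obs_file, sep = "\t", row.names = FALSE, col.names = FALSE)

# Read it back with read.table(), the reader that receives the '...' options
tbl <- read.table(obs_file)
print(dim(tbl))  # rows = variables, columns = outputs

# Each column holds one output of this observation
first_output <- tbl[, 1]
print(first_output)
```

If the files did have a header row, an option such as header = TRUE could be forwarded to read.table through the ... argument.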

Value

Object of class grpoutputs containing the following data:

data

List of all outputs, each one grouped into an n x m matrix, where n is the total number of output observations and m is the number of variables or dimensions (i.e. output length).

groupsize

Vector containing the number of observations for each level or group.

obs_lvls

Factor vector of levels or groups associated with each observation.

lvls

Vector of factor levels in the order they occur (as given in the parameter of the same name).

concat

Boolean indicating if this object was created with an additional concatenated output.
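
To make the shape of these elements concrete, the following base R sketch builds a comparable structure by hand from simulated observations. It does not call grpoutputs. The values, the output names, the per-observation scaling and the "range" formula (centering on the mean and dividing by the range) are assumptions made for illustration; the centerscale function may differ in detail.

```r
set.seed(1)
n_per_group <- c(NL = 3, JEX = 2)  # hypothetical group sizes
m <- 5                             # variables (output length)
n <- sum(n_per_group)              # total number of observations

# One n x m matrix per output: rows are observations, columns are variables
data <- list(
  out1 = matrix(rnorm(n * m, mean = 100, sd = 10), nrow = n, ncol = m),
  out2 = matrix(rnorm(n * m, mean = 1, sd = 0.1), nrow = n, ncol = m)
)

# Group bookkeeping analogous to the groupsize, lvls and obs_lvls elements
groupsize <- as.vector(n_per_group)
lvls <- names(n_per_group)
obs_lvls <- factor(rep(lvls, times = groupsize), levels = lvls)

# Assumed "range" scaling: center on the mean, divide by (max - min)
range_scale <- function(v) (v - mean(v)) / (max(v) - min(v))

# Concatenated output: scale each observation of each output, then bind columns
scaled <- lapply(data, function(mat) t(apply(mat, 1, range_scale)))
data$concat <- do.call(cbind, scaled)

print(dim(data$concat))  # n rows, 2 * m columns
print(table(obs_lvls))
```

Scaling before concatenation keeps outputs with large magnitudes (out1 here) from dominating those with small magnitudes (out2) in the combined output.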

Examples

# Determine paths for data folders, each containing outputs for 10 runs of
# the PPHPC model
dir_nl_ok <- system.file("extdata", "nl_ok", package = "micompr")
dir_jex_ok <- system.file("extdata", "j_ex_ok", package = "micompr")
files <- glob2rx("stats400v1*.tsv")

# Create a grouped outputs object using outputs from NetLogo and Java
# implementations of the PPHPC model
go <- grpoutputs(7, c(dir_nl_ok, dir_jex_ok), c(files, files),
                 lvls = c("NL", "JEX"), concat = TRUE)

# Do the same, but specify output names and don't specify levels
go <- grpoutputs(c("a", "b", "c", "d", "e", "f"),
                 c(dir_nl_ok, dir_jex_ok), c(files, files))
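
As a follow-up to the examples above, the elements listed under Value can be inspected directly on the returned object. This is a sketch that assumes the micompr package and its bundled example data are installed; the first output is accessed by position (go$data[[1]]) because the automatically assigned output labels are not stated on this page.

```r
# Runs only if micompr is available (assumption: installed with its extdata)
if (requireNamespace("micompr", quietly = TRUE)) {
  library(micompr)

  dir_nl_ok <- system.file("extdata", "nl_ok", package = "micompr")
  dir_jex_ok <- system.file("extdata", "j_ex_ok", package = "micompr")
  files <- glob2rx("stats400v1*.tsv")

  go <- grpoutputs(7, c(dir_nl_ok, dir_jex_ok), c(files, files),
                   lvls = c("NL", "JEX"), concat = TRUE)

  print(class(go))           # class of the returned object
  print(go$lvls)             # factor levels in the order given
  print(go$groupsize)        # observations per level
  print(table(go$obs_lvls))  # level associated with each observation
  print(dim(go$data[[1]]))   # n observations x m variables, first output
  print(go$concat)           # whether a concatenated output was added
}
```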

fakenmc/micompr documentation built on Aug. 6, 2024, 8:29 p.m.