nm_join: Return a single data frame with model output and input data

View source: R/nm-join.R

Description

For NONMEM models, when a unique row identifier (e.g., an integer numbering the rows) is included in the input data set (i.e., the file in $DATA) and carried into each table output, nm_join() can read in all output table files and join them back to the input data set. By default, the input data is joined to the table files, so the number of rows in the result matches the number of rows in the table files (i.e., the number of rows not bypassed via $IGNORE). Use the .superset argument to instead join the table outputs to the complete input data set. This function prints the number of rows and columns as each file is loaded, along with some information about the joins. Set options(bbr.verbose = FALSE) to suppress this printing.

Usage

nm_join(
  .mod,
  .join_col = "NUM",
  .files = nm_table_files(.mod),
  .superset = FALSE,
  .bbi_args = list(no_grd_file = TRUE, no_shk_file = TRUE)
)

Arguments

.mod

A bbi_nonmem_model or bbi_nonmem_summary object, or a path to a NONMEM run.

.join_col

Character column name to use to join table files. Defaults to NUM. See Details.

.files

Character vector of file paths to table files to read in. Defaults to calling nm_table_files() on .mod, which parses all file names from $TABLE blocks in the control stream. If passing paths manually, they should be either absolute or relative to get_output_dir(.mod).

.superset

If FALSE (the default), the input data is joined to the NONMEM output, so the result has the same number of rows as the table files. If TRUE, the NONMEM output is joined to the input data, so the result has the same number of rows as the input data, and NONMEM output columns such as PRED and CWRES are filled with NA for input rows that have no match in the table files.
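The difference between the two settings can be illustrated with a toy base R sketch. This is not the bbr implementation: the data frames `input` and `tab` are hypothetical stand-ins for the input data set and one table file, and merge() is used only to mimic the row counts described above.

```r
# Hypothetical input data: 5 rows, of which NUM 3 is assumed to be
# excluded from the run (e.g. via $IGNORE).
input <- data.frame(
  NUM = 1:5,
  ID  = c(1L, 1L, 1L, 2L, 2L),
  DV  = c(0.5, 1.2, NA, 0.8, 1.1)
)

# Hypothetical table output: only the 4 rows NONMEM actually used.
tab <- data.frame(
  NUM  = c(1L, 2L, 4L, 5L),
  PRED = c(0.6, 1.0, 0.9, 1.0)
)

# Analogous to .superset = FALSE: only rows present in the table output.
default_join <- merge(input, tab, by = "NUM")

# Analogous to .superset = TRUE: every input row is kept, and the
# table columns are NA where there is no matching table row.
superset_join <- merge(input, tab, by = "NUM", all.x = TRUE)

nrow(default_join)
nrow(superset_join)
superset_join[superset_join$NUM == 3, ]
```

In the superset result, the row with NUM 3 keeps its input columns but has NA in PRED.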

.bbi_args

Named list passed to model_summary(.bbi_args). See print_bbi_args() for valid options. Defaults to list(no_grd_file = TRUE, no_shk_file = TRUE) because model_summary() is only called internally to extract the number of records and individuals, so those files are irrelevant.

Details

Join column

The .join_col is the name of a single column that should appear in both the input data set and any tables you want to join. We recommend you make this column a simple integer numbering the rows in the input data set (for example NUM). When this column is carried into the output table files, there will be unambiguous matching from the table file back to the input data set.

The one exception to this is FIRSTONLY tables. If a table file has the same number of rows as there are individuals in the input data set (accounting for any filtering of data in the NONMEM control stream), it is assumed to be a FIRSTONLY table. In this case, the table is joined to the input data by the ID column. If ID is not present in the table, it is joined using .join_col instead. Note that if neither ID nor the column passed to .join_col is present in the table, the join will fail.
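The key-selection rule for FIRSTONLY tables can be sketched in base R as follows. This is an illustration under stated assumptions, not the code bbr runs: `input`, `first_only`, and the ETA1 values are made up, and the row-count check simply mirrors the heuristic described above.

```r
# Hypothetical input data with two individuals.
input <- data.frame(
  NUM  = 1:5,
  ID   = c(1L, 1L, 1L, 2L, 2L),
  TIME = c(0, 1, 2, 0, 1)
)

# Hypothetical FIRSTONLY-style table: one row per individual.
first_only <- data.frame(ID = c(1L, 2L), ETA1 = c(-0.12, 0.30))

join_col <- "NUM"

# Heuristic from the text: as many rows as individuals suggests FIRSTONLY.
n_individuals    <- length(unique(input$ID))
looks_first_only <- nrow(first_only) == n_individuals

# Prefer ID for FIRSTONLY tables, otherwise fall back to the join column.
key <- if (looks_first_only && "ID" %in% names(first_only)) "ID" else join_col

# The join cannot proceed if the chosen key is missing from the table.
stopifnot(key %in% names(first_only))

joined <- merge(input, first_only, by = key)
joined
```

Each per-individual value is repeated across all of that individual's rows in the joined result.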

Note also that, when .join_col is carried into table outputs, there is no need to table any other columns from the input data as long as the nm_join() approach is used; any column in the input data set, regardless of whether it is listed in $INPUT or not, will be carried through from the input data and therefore available in the joined result.

Duplicate columns are dropped

If a table has columns with the same name as columns in the input data set, or in a table that has already been joined, those columns are dropped from the joined data. If getOption("bbr.verbose") is TRUE, a message is printed listing any columns dropped this way.

The one exception to this is the DV column. If DV is present in the input data and at least one of the table files, the DV column from the input data will be renamed to DV.DATA and the column from the table file kept as DV.
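A toy base R sketch of the duplicate-column handling, including the DV exception, might look like the following. The data frames are hypothetical and the logic is a simplified imitation of the behavior described here, not the package's own code.

```r
# Hypothetical input data and table output that share DV and WT.
input <- data.frame(
  NUM = 1:3,
  ID  = c(1L, 1L, 2L),
  DV  = c(0.5, 1.2, 0.8),
  WT  = c(70, 70, 82)
)
tab <- data.frame(
  NUM  = 1:3,
  DV   = c(0.5, 1.2, 0.8),
  WT   = c(70, 70, 82),
  PRED = c(0.6, 1.0, 0.9)
)
join_col <- "NUM"

# DV exception: keep the table's DV and rename the input copy.
if ("DV" %in% names(input) && "DV" %in% names(tab)) {
  names(input)[names(input) == "DV"] <- "DV.DATA"
}

# Any other table column already present in the input (apart from the
# join column) is dropped from the table before joining.
dropped  <- setdiff(intersect(names(tab), names(input)), join_col)
tab_keep <- tab[, setdiff(names(tab), dropped), drop = FALSE]

joined <- merge(input, tab_keep, by = join_col)
names(joined)
dropped
```

Here WT is the only column dropped from the table, while DV survives from the table and the input copy is available as DV.DATA.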

The origin of each column is attached to the return value via the "nm_join_origin" attribute, a list that maps each source (as named by nm_tables()) to the columns that came from that source.

Duplicate Rows Warning for Join Column

If duplicate values are found in the specified .join_col, a warning is raised that lists a subset of the repeated rows. Duplicates may be caused by insufficient output width: a FORMAT option may need to be specified in the control stream so that the output is wide enough to avoid truncating .join_col.
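To see how a narrow output format can create duplicates, consider this base R sketch. It assumes, purely for illustration, that the table was written in scientific notation with five significant digits; the format actually in effect depends on the control stream, and formatC() is used here only to imitate the rounding.

```r
# Hypothetical row numbers that need six digits.
num <- c(99999, 100000, 100001, 100002)

# Assumed narrow format: 4 decimal places in scientific notation,
# i.e. 5 significant digits, so the last digit of 100001 is lost.
narrow <- as.numeric(formatC(num, format = "E", digits = 4))

# A wider fixed format that preserves every digit of these values.
wide <- as.numeric(formatC(num, format = "f", digits = 0))

# Distinct input rows collapse onto the same value in the narrow case.
anyDuplicated(narrow) > 0
anyDuplicated(wide) > 0
```

A check like anyDuplicated() on the join column of a table is a quick way to confirm whether truncation is the cause of the warning.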

Multiple tables per file incompatibility

Because nm_tables() calls nm_file() internally, it is not compatible with multiple tables written to a single file. See "Details" in nm_file() for alternatives.

See Also

nm_tables(), nm_table_files(), nm_file()


metrumresearchgroup/rbabylon documentation built on April 21, 2024, 3:26 a.m.