interact_with_tool: IARC CRG Tools R Interface

interact_with_tool R Documentation

IARC CRG Tools R Interface

Description

Write data for IARC CRG Tools and read it back into R after the run has finished.

Usage

interact_with_tool(data, tool.name, clean = FALSE, verbose = FALSE)

connect_tool_results_to_observations(record.ids, tool.results)

Arguments

data

a data.frame; required columns depend on the IARC CRG Tools tool used

tool.name

[character] (mandatory, no default)

name of tool to run; see output of tool_clean_names() for options

clean

[logical] (optional, default FALSE)

  • TRUE: all input and output files for IARC CRG Tools are removed from disk after the results have been read into R

  • FALSE: all files are left on disk

verbose

if TRUE, this function is chatty and emits messages during its run; if FALSE, you will only see necessary messages

record.ids

[integer] (mandatory, no default)

IDs of records for which to retrieve any record-specific results from tool.results

tool.results

[list] (mandatory, no default)

list of tables and/or log texts as output by one of the interface functions to IARC CRG Tools (e.g. interact_with_tool)

Details

iarccrgtools::interact_with_tool performs the following steps. First, a subset of columns from data is collected based on tool.name. Then the cache is checked for pre-existing results for the given data and tool.name; see e.g. [iarccrgtools::cache_metadata_read]. If pre-existing results are found, the user is asked whether to use those results from disk and skip IARC CRG Tools altogether, or to run IARC CRG Tools again. If there are no cached results, or the user chooses not to read them, [iarccrgtools::iarc_input_write] is called. Cache metadata is then updated by calling [iarccrgtools::cache_metadata_append_or_replace].

iarccrgtools then attempts to write sensible default parameters (e.g. paths to the input and output files) for use by IARC CRG Tools. It is the user's responsibility to make sure that the parameters are correct for their dataset. You will see them when you run IARC CRG Tools.

Locating the parameter file is fairly involved because IARC CRG Tools is an older programme. Newer versions of Windows do not allow the user to write into the directory where IARC CRG Tools is installed, which IARC CRG Tools nevertheless wants to do. Microsoft has solved this by creating a "virtual" directory into which the user can write. However, in some situations (e.g. with admin permissions) you are able to write into the IARC CRG Tools installation directory. Therefore the subdirectory pgm of the installation directory is first tested for writability. If it is writable, the parameter file is stored there. If not, an attempt is made to use the virtual directory, which is assumed to be at %LOCALAPPDATA%/VirtualStore/Program Files (x86)/IARCcrgTools/pgm/. An attempt is made to create it if it does not exist. On some versions of Windows this may not work, either because the virtual directory has a different location or because virtual directories are not used at all. If all else fails, IARC CRG Tools works best with admin permissions.
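
The order in which the parameter file location is chosen can be illustrated with a minimal base R sketch. This is not the package's actual implementation: the function resolve_param_dir and its arguments install_dir and local_app_data are hypothetical names introduced here for illustration only, and the demo directory is a throwaway folder under tempdir().

```r
# Hypothetical helper, NOT part of iarccrgtools: chooses a directory for the
# parameter file in the order described in Details.
resolve_param_dir <- function(install_dir,
                              local_app_data = Sys.getenv("LOCALAPPDATA")) {
  pgm_dir <- file.path(install_dir, "pgm")
  # file.access(mode = 2) returns 0 when the path is writable
  # (the check is known to be imperfect on some Windows setups).
  if (dir.exists(pgm_dir) && file.access(pgm_dir, mode = 2L) == 0L) {
    return(pgm_dir)
  }
  # Fall back to the assumed location of the Windows "virtual" directory.
  virtual_dir <- file.path(
    local_app_data, "VirtualStore", "Program Files (x86)",
    "IARCcrgTools", "pgm"
  )
  if (!dir.exists(virtual_dir)) {
    dir.create(virtual_dir, recursive = TRUE, showWarnings = FALSE)
  }
  if (!dir.exists(virtual_dir)) {
    stop("No usable directory for the parameter file was found.")
  }
  virtual_dir
}

# Demo with a made-up installation directory that has a writable pgm folder.
demo_install <- file.path(tempdir(), "IARCcrgTools_demo")
dir.create(file.path(demo_install, "pgm"),
           recursive = TRUE, showWarnings = FALSE)
param_dir <- resolve_param_dir(demo_install)
```

Because the writable pgm subdirectory is preferred, the virtual directory is only touched when the installation directory cannot be used.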

If the R package iarccrgtools contains pre-defined (default) settings (e.g. positions of specific columns in the file on disk), those are written into the directory given by [iarc_toolworkdir_get]. It is the user's responsibility to make sure that the settings are correct for their dataset. You will see them when you run IARC CRG Tools.

With the data and parameters in place, the user is next told what they need to do in IARC CRG Tools. When the user gives permission to read the data into R, [iarccrgtools::iarc_output_read] is called. Finally, if clean = TRUE, [iarccrgtools::cache_clean_hash] is called for the hash of the dataset given by the user.

iarccrgtools::connect_tool_results_to_observations returns a data.table with length(record.ids) rows. It has the column record_id and additional columns that depend on the results in tool.results. The function goes through each object in tool.results. If an object is a data.table with columns record_id and tool_text, each record appearing in that data.table is marked in the output data.table in a logical column (e.g. in_multiple_primary_input.exl), and any text in tool_text is collected into a separate column (e.g. multiple_primary_input.exl). The columns in the output of connect_tool_results_to_observations therefore vary by the tool used.
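
The shape of that output can be mimicked with a small base R sketch. It uses plain data.frames instead of data.table, and the function connect_results_sketch, the element name run_log.txt, and all texts in the toy input are invented for illustration; it is not the package's implementation. In particular, keeping only the first tool_text per record is an assumption of this sketch.

```r
# Hypothetical stand-in, NOT the package function: builds one logical column
# and one text column per record-level table found in tool.results.
connect_results_sketch <- function(record.ids, tool.results) {
  out <- data.frame(record_id = record.ids)
  for (obj_name in names(tool.results)) {
    obj <- tool.results[[obj_name]]
    is_record_table <- is.data.frame(obj) &&
      all(c("record_id", "tool_text") %in% names(obj))
    if (!is_record_table) next  # e.g. log texts are skipped
    # Logical marker: does the record appear in this table?
    out[[paste0("in_", obj_name)]] <-
      out[["record_id"]] %in% obj[["record_id"]]
    # Text column: first matching tool_text per record, NA if none
    # (assumption made for this sketch only).
    idx <- match(out[["record_id"]], obj[["record_id"]])
    out[[obj_name]] <- obj[["tool_text"]][idx]
  }
  out
}

# Toy input: one record-level table and one log text (all values made up).
tool_results <- list(
  multiple_primary_input.exl = data.frame(
    record_id = c(2L, 4L),
    tool_text = c("text for record 2", "text for record 4"),
    stringsAsFactors = FALSE
  ),
  run_log.txt = c("a log line", "another log line")
)
connected <- connect_results_sketch(1:5, tool_results)
```

Here connected has one row per requested record ID, and the log text contributes no columns because it is not a table with record_id and tool_text.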

Examples


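# iarccrgtools::interact_with_tool
## Not run: 
# Use a temporary directory as the working directory for this session.
dir_path <- tempdir()
iarccrgtools::iarc_workdir_set(dir_path)

# Build a small example dataset for the "mandatory_check" column name set.
tool_name <- "check"
subset <- "mandatory"
iarc_df <- iarccrgtools::tool_colnameset_example_dataset(
  paste0(subset, "_", tool_name), n.rows = 10L
)
# Write the input, let the user run IARC CRG Tools, then read the results;
# clean = TRUE removes the input and output files afterwards.
results <- iarccrgtools::interact_with_tool(
  iarc_df, tool.name = tool_name, clean = TRUE
)

# Attach record-specific results to the original record IDs.
result_df <- iarccrgtools::connect_tool_results_to_observations(
  record.ids = iarc_df[["record_id"]], tool.results = results
)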

## End(Not run)



WetRobot/iarccrgtools documentation built on Feb. 1, 2024, 6:33 a.m.