```{r setup, include=FALSE}
library(knitr)
opts_chunk$set(fig.path='plots/', fig.align='center', fig.show='hold',
               size='footnotesize', cache=FALSE)
library(ggplot2)
theme_set(theme_bw(base_size=14)+
          theme(legend.position="bottom")+
          ## purple facet strip label background and white text
          theme(strip.background=element_rect(fill="#52247F"),
                strip.text=element_text(color="#ffffff"))
          )
library(data.table)
unloadNamespace("NMdata")
## some setup
options(width=60)  # make the printing fit on the page
set.seed(1121)     # make the results repeatable
### shortcuts to examples in NMdata
file.data <- function(...) system.file("examples/data", ..., package="NMdata")
file.nm <- function(...) system.file("examples/nonmem", ..., package="NMdata")
```
```{css, echo=FALSE}
.watch-out {
  background-color: lightpink;
  border: 3px solid red;
  font-weight: bold;
}
.smaller {
  background-color: lightgreen;
  font-size: 4pt;
}
```
## Outline
\tableofcontents[hideallsubsections]

# Introduction

## What is NMdata?

::: columns
:::: column
### NMdata is
An R package that can help

* Create and check event-based data sets for PK/PD modeling
* Keep Nonmem code updated to match the contents of datasets
* Read all output data and combine it with input data from Nonmem runs
  - supply the output list file (.lst); the reader is very flexible and automated

Designed to fit into the user's setup and coding preferences

* NMdata comes with a configuration tool that can be used to tailor default behaviour to the user's system configuration and preferences.
::::
:::: column
### NMdata is not
* A plotting package
* A tool to retrieve details about model runs
* A calculation or simulation toolbox
* A "silo" that requires you to do things in a certain way
  - No tool in NMdata requires other NMdata tools to be used
::::
:::

$$\vspace{.01in}$$

* The data creation tools should be relevant independently of the estimation/simulation tool.
* The latest stable release is 0.0.12 and is available on CRAN and MPN (starting from the 2022-06-15 snapshot).

## NMdata 0.0.12 on MPN
\includegraphics[width=3.5in]{figures/nmdata_mpn_2022-06-27 22-03-35.png}

<!-- ## Who can find NMdata useful? -->
<!-- * The data set creation tools are relevant no matter the estimation and simulation tools. -->
<!-- * Nonmem users will find additional tools for handling the exchange of data between R and Nonmem. -->

<!-- ## About the author -->
<!-- * Pharmacometrician with experience from biostatistics -->
<!-- * Background in engineering, experience as system administrator, 15 years of R experience -->
<!-- * Very concerned with code robustness and ensuring data consistency. -->
<!-- * Authored an R package on safe data transfer from SAS to R and one on survival analysis. -->
<!-- I hate being stuck in leg work and having too little time for modeling, -->
<!-- reflection, and understanding key questions. `NMdata` is a big help for -->
<!-- me personally in freeing time to more high-level tasks. -->

<!-- Lots of work missing on this one -->
<!-- ## Motivation -->
<!-- PK/PD modeling is technically extremely heavy. We want to provide clarity to decision making, but spend a lot of our time in deep mud. -->
<!-- `NMdata` is my humble experience collected in efficient functions that fill some holes and help with some of the most annoying design -->

## How to update to recent MPN snapshot

Update the `pkgr.yml` file (example: `prod_vx123_001_analysis/trunk/analysis/vx_123_001_project/pkgr.yml`):
```yaml
Version: 1
Threads: 1
Packages:
  - NMdata
Cache: /data/prod_vx708_001_pkgcache-2022-06-15
Repos:
  - MPN: https://mpn.metworx.com/snapshots/stable/2022-06-15
Lockfile:
  Type: renv
```
Then go to `prod_vx123_001_analysis/trunk/analysis/vx_123_001_project` and install/update packages from the Linux terminal (not R):
```sh
$ cd /data/prod_vx123_001_analysis/trunk/analysis/vx_123_001_project
$ pkgr --update install
```
## Motivation

* The workflow of a pharmacometrician is very technical, with many risks of errors.
* Technical workload takes time from modeling, reflection, and understanding key questions.
* During the first 2-3 years I spent in pharmacometrics, I must have spent half the time coding, desperately trying to get Nonmem to behave and to understand the properties of the estimates I obtained.
* Most of us develop our own ways to avoid some of the many difficulties in this process. This takes a lot of time, most often only because we don't have adequate tools at hand - or don't know them.
* I generalized some of my solutions and collected them in `NMdata`.
* Almost every single line of code in the package is motivated by bad experiences: errors, fear of errors, time wasted on debugging and double checking.
* I have no intention of pushing these approaches on others. But if you find something interesting, feel free to take advantage.

<!-- This could become a good slide, but so far not ready at all -->
<!-- ## Overview of NMdata functionality -->
<!-- * Data creation -->
<!-- - Checking of compatibility of data.frames. -->
<!-- - Merge with automated checks -->
<!-- * Nonmem control stream editing -->
<!-- * Retrieve data from Nonmem -->

## Getting started

Install from `CRAN` or from `MPN` using `pkgr`.

```r
library(NMdata)
NMdataConf(check.time=FALSE)
NMdataConf(as.fun="data.table")
```
Three vignettes are available so far (see "Vignettes" tab when visiting URL above):
For a quick overview (after installation), do:

```r
help(package="NMdata")
```

All functions and their arguments are documented in their manual pages (`?function`).
```r
pk <- readRDS(file=system.file("examples/data/xgxr2.rds",package="NMdata"))
pk[,trtact:=NULL] ## will create this in the example
pk[,ROW:=NULL]
pk.reduced <- copy(pk)
pk.reduced <- pk.reduced[1:(.N%/%2)]
pk.reduced[,CYCLE:=NULL]
pk.reduced[,AMT:=as.character(AMT)]
```
## compareCols

::: columns
:::: column
* `compareCols` provides an overview of these properties for any number of data sets.
* `diff.only=FALSE` will give the complete list of columns in the two datasets.
* A slightly modified version of the `pk` dataset has been created: `CYCLE` has been removed, and `AMT` has been recoded to character.

```r
opt.old <- options(width=50)
```
::::
:::: column
```r
compareCols(pk,pk.reduced)
```

```r
options(opt.old)
```

\vspace{12pt}

Before merging or stacking, we may want to

* recode `AMT` in one of the datasets to get the class we need
* add `CYCLE` in one of the datasets
::::
:::
```r
missings <- listMissings(pk)
head(missings)
```

* You can specify which columns to check (`cols`), grouping (`by`), and which strings count as missing (`na.strings`).

From `?listMissings`:

```
Usage:
     listMissings(data, cols, by, na.strings = c("", "."), quiet = FALSE, as.fun)
```
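What such a long-format listing of missings amounts to can be sketched in base R. This is an illustration only, not `NMdata`'s implementation; the function name `listNAs` is made up for the example:

```r
## Base-R sketch (not NMdata's listMissings): one row per missing value,
## where NA and the strings in na.strings count as missing.
listNAs <- function(data, na.strings = c("", ".")) {
  found <- lapply(names(data), function(col) {
    idx <- which(is.na(data[[col]]) | data[[col]] %in% na.strings)
    if (length(idx) == 0) return(NULL)
    data.frame(column = col, row = idx)
  })
  do.call(rbind, found)
}

dat <- data.frame(ID = c(1, 2, 3), DV = c("1.2", ".", NA))
listNAs(dat)
```

The long format is what makes it easy to subset the original data for inspection afterwards.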
::: columns
:::: column
* `renameByContents` renames columns if a function of their contents returns `TRUE`.
* `NMisNumeric` is a function that tests if the contents are numeric to Nonmem. `"1039"` (character class) will be numeric in Nonmem, `"1-039"` will not.
::::
:::: column
All column names are capital case. We rename to lowercase those that Nonmem will not be able to interpret as numeric.

\footnotesize
```r
pk.old <- copy(pk)
pk <- renameByContents(data=pk,
                       fun.test=NMisNumeric,
                       fun.rename=tolower,
                       invert.test=TRUE)
```

`compareCols` shows that four columns were renamed:

```r
compareCols(pk.old,pk)
```
\normalsize
::::
:::
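The idea of "numeric to Nonmem" can be illustrated with a small base-R sketch. This is an assumption for illustration, not `NMdata`'s actual `NMisNumeric`:

```r
## Base-R sketch (assumption, not NMdata's NMisNumeric): a vector is
## "numeric to Nonmem" if all its non-missing values convert to numeric.
looksNumericToNonmem <- function(x) {
  x <- x[!is.na(x) & x != ""]
  conv <- suppressWarnings(as.numeric(x))
  !any(is.na(conv))
}

looksNumericToNonmem(c("1039", "2.5"))  # character class, but numeric contents
looksNumericToNonmem(c("1-039"))        # contains a value Nonmem cannot read
```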
`mergeCheck` is a wrapper of `merge` which only accepts the result if the rows that come out of the merge are exactly the same as in one of the existing datasets, only with columns added from the second dataset.

* This limitation of the scope of the merge allows for a high degree of automated checking of the consistency of the results.
* This is not to say that merges beyond the scope of `mergeCheck` are irrelevant or unnecessary. But if `mergeCheck` covers your needs, it's a real time saver in terms of automated checks.
* `mergeCheck` is not a new implementation of merge. It's an implementation of checks.
* `mergeCheck` uses `merge.data.table`. The contribution is the checks that no rows are lost, duplicated or added.
* The order of rows in the resulting data is always the same as in the first dataset supplied.
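The checks described above can be sketched in base R. This illustrates the principle only; it is not `NMdata`'s code, and the function name `checkedMerge` is made up:

```r
## Base-R sketch of the mergeCheck principle (not NMdata's implementation):
## accept a merge only if it returns exactly the rows of the first dataset,
## with columns added - no rows lost, duplicated or created.
checkedMerge <- function(df1, df2, by) {
  df1$..row <- seq_len(nrow(df1))
  res <- merge(df1, df2, by = by, all.x = TRUE)
  if (nrow(res) != nrow(df1) || anyDuplicated(res$..row))
    stop("merge created or duplicated rows")
  res <- res[order(res$..row), ]   # restore the row order of df1
  res$..row <- NULL
  res
}

pk  <- data.frame(ID = c(1, 1, 2), TIME = c(0, 1, 0))
cov <- data.frame(ID = c(1, 2), WT = c(70, 80))
checkedMerge(pk, cov, by = "ID")            # OK: 3 rows, WT added
covdup <- rbind(cov, data.frame(ID = 1, WT = 75))
try(checkedMerge(pk, covdup, by = "ID"))    # fails: ID 1 duplicated in covdup
```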
* Is `mergeCheck` slower? `mergeCheck` is likely to be way faster than what you use already.

\framesubtitle{Example: Would your standard checks of merges capture this?}
```r
dt.cov <- pk[,.(ID=unique(ID))]
dt.cov[,COV:=sample(1:5,size=.N,replace=TRUE)]
dt.cov <- dt.cov[c(1,1:(.N-1))]
```
Say we want to add a covariate from a dataset `dt.cov`. We expect the number of rows to be unchanged from `pk`. `mergeCheck` more strictly requires that we get all and only the same rows:
::: columns
:::: column
\footnotesize
```r
## The resulting dimensions are correct
pkmerge <- merge(pk,dt.cov,by="ID")
dims(pk,dt.cov,pkmerge)
## But we now have twice as many rows for this subject
dims(pk[ID==31],pkmerge[ID==31])
```
::::
:::: column
`mergeCheck` throws an error - and suggests what is wrong.
\footnotesize
```r
try(mergeCheck(pk,dt.cov,by="ID"))
```
::::
\normalsize
:::
If you only want to add columns by a merge, `mergeCheck` does all the necessary checks for you.
\framesubtitle{Keep track of data exclusions - don't discard!}
* It is good practice not to discard unwanted records from a dataset but to flag them and omit them in model estimation.
* When reporting the analysis, we need to account for how many data records were discarded due to which criteria.
* The implementation in `NMdata` is based on sequentially checking exclusion conditions.
* The information is represented in one numerical column for Nonmem, and one (value-to-value corresponding) character column for the rest of us.
::: columns
:::: column
* `flagsAssign` applies the conditions sequentially, by increasing or decreasing value of `FLAG`.
* `BLQ` has to exist in `pk`.
* `FLAG=0` means that none of the conditions were met and the row is kept in the analysis. This cannot be customized.
* In `Nonmem`, you can include `IGNORE=(FLAG.NE.0)` in `$DATA` or `$INFILE`.
::::
:::: column
\footnotesize
```r
pk[,`:=`(FLAG=NULL,flag=NULL)]
```

```r
dt.flags <- fread(text="FLAG,flag,condition
10,Below LLOQ,BLQ==1
100,Negative time,TIME<0")
pk <- flagsAssign(pk,tab.flags=dt.flags,subset.data="EVID==0")
pk <- flagsAssign(pk,subset.data="EVID==1",flagc.0="Dosing")
```
::::
:::
## flagsCount

An overview of the number of observations disregarded due to the different conditions is then obtained using `flagsCount`:

* `flagsCount` includes a `file` argument to save the table right away.

```r
opts <- options(width=100)
```

\footnotesize
```r
flagsCount(data=pk[EVID==0],tab.flags=dt.flags)
```

```r
options(opts)
```
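The sequential counting behind such a table can be sketched in base R. This is an illustration of the principle only, not `NMdata`'s `flagsCount`; the function name `flagCounts` is made up:

```r
## Base-R sketch of sequential exclusion counting (not NMdata's code):
## each condition only counts records not already excluded by an earlier one.
flagCounts <- function(data, conds) {
  left <- rep(TRUE, nrow(data))
  counts <- integer(0)
  for (nm in names(conds)) {
    hit <- left & eval(conds[[nm]], data)
    counts[nm] <- sum(hit)
    left <- left & !hit
  }
  c(counts, "Analysis set" = sum(left))
}

obs <- data.frame(BLQ = c(0, 1, 0, 0), TIME = c(-1, 2, 3, 4))
flagCounts(obs, list("Below LLOQ"   = quote(BLQ == 1),
                     "Negative time" = quote(TIME < 0)))
```

Because the conditions are applied in order, each record is counted under at most one exclusion reason, so the counts add up to the full dataset.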
::: columns
:::: column
A unique identifier is needed in order to

* Track rows in analysis data back to source data
* Reliably combine (by merge) output with input data

* It is not a problem if represented as a `double` in R
* An increasing row counter (as in the examples) is a simple choice
::::
:::: column
`data.table`:

```r
## order
setorder(pk,ID,TIME,EVID)
## add counter
pk[,ROW:=.I]
```

`dplyr` (I'm not very familiar with `dplyr`):

```r
pk <- pk %>% arrange(ID,TIME,EVID) %>% mutate(ROW=1:n())
```
::::
:::
::: columns
:::: column
\vspace{12pt}
* The order of columns in Nonmem is important for two reasons:
  - The number of variables you can read into Nonmem is restricted (may not apply to recent Nonmem versions)
  - Nonmem can only interpret numeric values, so columns it cannot read are best placed last
* `NMorderColumns` uses a mix of recognition of column names and analysis of the column contents to sort the columns.
  - Standard columns (`ID`, `TIME`, `EVID`, etc.) and usable columns first
  - Columns that cannot be converted to numeric are put in the back
* Additional columns to place early (argument `first`) or late (`last`) can be specified.
* See `?NMorderColumns` for more options.
* `NMorderColumns` does not sort rows, nor does it modify any contents of columns.
::::
:::: column
\footnotesize
```r
pk.old <- copy(pk)
pk <- NMorderColumns(pk,first="WEIGHTB")
```
\normalsize

We may want to add `MDV` and rerun `NMorderColumns`.

\footnotesize
```r
data.table(old=colnames(pk.old),new=colnames(pk))
```
\normalsize
::::
:::
## NMcheckData: Check data syntax for Nonmem compatibility

Aim: check data for all potential Nonmem compatibility issues and other obvious errors. Findings are returned in a structure so related subsets of the data can easily be identified for further inspection.

::: columns
:::: column
* `NMcheckData` contains a very long list of checks, especially of the standard Nonmem columns (`ID`, `TIME`, `EVID`, `AMT`, `DV`, `MDV`, `RATE`, `SS`, etc.). They are all checked for allowed values (e.g. `TIME` must be non-negative, `EVID` must be one of `0:4`, etc.).
* ID-level checks (e.g. did all IDs receive doses, is time increasing, are rows disjoint?)
* All used columns are checked for Nonmem compatibility in terms of how Nonmem translates values to numeric.
* Column names are checked for uniqueness and for non-allowed characters.
* If you supply the `col.usubjid` column, the `ID` column is checked to align with `col.usubjid`.
* `NMcheckData` is based on a simple framework, making it simple to define new checks.
::::
:::: column
\scriptsize
```r
pk <- pk[ID>59]
res.check <- NMcheckData(pk)
res.check
```

```r
pkmod <- copy(pk)
pkmod[,MDV:=as.numeric(is.na(DV))]
pkmod[ID==60&EVID==1,CMT:=NA]
res.check <- NMcheckData(pkmod)
res.check
```
::::
:::
::: columns
:::: column
For the final step of writing the dataset, `NMwriteData` is provided.

* `NMwriteData` never modifies the data.
* It writes a `csv` file with appropriate options for Nonmem compatibility, and an `rds` file for R. If you use `NMscanData` to read Nonmem results, this information can be used automatically.
* It provides a proposal for text to include in the `$INPUT` and `$DATA` sections of the Nonmem control streams.
* These are the only steps involved between the supplied data set and the written csv. `scipen` is small to maximize precision.

\footnotesize
```r
file.csv <- fnExtension(file,".csv")
fwrite(data,na=".",quote=FALSE,row.names=FALSE,scipen=0,file=file.csv)
```
\normalsize

All arguments to `fwrite` can be modified using the `args.fwrite` argument.
::::
:::: column
\footnotesize
```r
NMwriteData(pk,file="derived/pk.csv")
```
\normalsize
\vspace{12pt}

* `eff0` is the last column in `pk` that `Nonmem` can make use of (remember `NMisNumeric` from earlier?)
* `NMwriteData` detected the exclusion flag and suggests to include it in `$DATA`.
::::
:::
::: columns
:::: column
* `NMwriteSection` is a function that replaces sections (like `$DATA` or `$TABLE`) of Nonmem control streams.
* `NMwriteData` returns a list that can be directly processed by `NMwriteSection`.
* In `NMwriteData`, several arguments modify the proposed text for the Nonmem run; see `?NMwriteData`.
* `NMwriteSection` is very useful for many other sections, like `$TABLE`, or even `$PK`. But not `$THETA` and `$OMEGA` (because they are specific to each model).
* `NMwriteSection` by default saves a backup of the overwritten control streams.
* `NMwriteSection` has a section reader counterpart in `NMreadSection`.
* `NMextractDataFile` takes a control stream/list file and extracts the input data file name/path. You can use this to identify the model runs in which to update `$DATA`.
::::
:::: column
\footnotesize
```r
nmCode <- NMwriteData(pk,file="derived/pk.csv",
                      write.csv=FALSE,
                      ### arguments that tailor text for Nonmem
                      nmdir.data="../derived",
                      nm.drop="PROFDAY",
                      nm.copy=c(CONC="DV"),
                      nm.rename=c(BBW="WEIGHTB"),
                      ## PSN compatibility
                      nm.capitalize=TRUE)
```

```r
## example: pick run1*.mod
models <- list.files("../models",
                     pattern="run1.+\\.mod$",
                     full.names=TRUE)
## update $INPUT and $DATA
lapply(models,NMwriteSection,list.sections=nmCode)
## update $INPUT only
lapply(models,NMwriteSection,section="INPUT",newlines=nmCode$INPUT)
```

```r
## example: pick run1*.mod
NMwriteSection(dir="../models",
               file.pattern="run1.+\\.mod$",
               section="INPUT",
               newlines=nmCode$INPUT)
```
\normalsize
::::
:::
\framesubtitle{Ensure that the data can be traced back to the data generation script}

::: columns
:::: column
* If the argument `script` is supplied to `NMwriteData`, a little meta information is saved together with the output file(s).
  - For csv files, the meta data is written to a txt file next to the csv file.
  - For rds files, the meta data is attached to the object saved in the `rds` file.
* `NMstamp` is used under the hood. You can use `NMstamp` on any R object to attach similar meta information. Additional arguments (essentially anything) can be passed from `NMwriteData` to `NMstamp` using the argument `args.stamp`.
* `NMstamp` and `NMinfo` write and read an "attribute" called `NMdata`.
::::
:::: column
\footnotesize
```r
NMwriteData(pk,file="derived/pk.csv",
            script="NMdata_Rpackage.Rmd",quiet=TRUE)
list.files("derived")
```

```r
## NMreadCsv reads the metadata .txt file if found
pknm <- NMreadCsv("derived/pk.csv")
NMinfo(pknm)
## The .rds file contains the metadata already
pknm2 <- readRDS("derived/pk.rds")
NMinfo(pknm2)
```
::::
:::
\normalsize
`NMscanData` is an automated and general reader of Nonmem results. Based on the list file (`.lst`) it will:

* read and combine the output tables from the `Nonmem` model
* combine the output with the input data, optionally including rows not processed by `Nonmem` (e.g. observations or subjects that are not part of the analysis)

\pause
\footnotesize
::: columns
:::: column
```r
file1.lst <- system.file("examples/nonmem/xgxr003.lst",
                         package="NMdata")
res0 <- NMscanData(file1.lst,merge.by.row=FALSE)
```
::::
\pause
:::: column
```r
class(res0)
dims(res0)
head(res0,n=2)
```
\normalsize
::::
:::

Using a unique row identifier for merging data is highly recommended:

\footnotesize
```r
res1 <- NMscanData(file.nm("xgxr001.lst"),merge.by.row=TRUE)
class(res1)
```
\normalsize

* `merge.by.row=TRUE` merges input and output by the column `col.row` if found.
* The default `col.row` is `ROW`. We shall see later how to modify this.
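Why a unique row identifier makes this combination robust can be seen in a tiny base-R sketch. This illustrates the principle only, not `NMscanData`'s internals; the data values are made up:

```r
## Base-R sketch (not NMscanData's internals): with a unique ROW column in
## both input and output tables, a plain merge cannot mispair rows, even if
## Nonmem dropped some rows (e.g. by IGNORE).
input  <- data.frame(ROW = 1:4, ID = c(1, 1, 2, 2), DV = c(NA, 1.2, NA, 2.1))
output <- data.frame(ROW = c(2, 4), PRED = c(1.1, 2.0))  # only analyzed rows
res <- merge(input, output, by = "ROW", all.x = TRUE)
res  # PRED is NA for rows Nonmem did not process
```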
. We shall see later how to modify this.\framesubtitle{Example: quickly get from a list file to looking at the model}
\footnotesize :::::::::::::: {.columns} ::: {.column width="45%"}
## Using data.table for easy summarize res1 <- NMscanData(file1.lst,merge.by.row=TRUE, as.fun="data.table",quiet=TRUE) ## Derive geometric mean pop predictions by ## treatment and nominal sample time. Only ## use sample records. res1[EVID==0, gmPRED:=exp(mean(log(PRED))), by=.(trtact,NOMTIME)]
::: ::: {.column width="55%"}
\normalsize
## plot individual observations and geometric ## mean pop predictions. Split (facet) by treatment. ggplot(subset(res1,EVID==0))+ geom_point(aes(TIME,DV))+ geom_line(aes(NOMTIME,gmPRED),colour="red")+ scale_y_log10()+ facet_wrap(~trtact,scales="free_y",ncol=2)+ labs(x="Hours since administration", y="Concentration (ng/mL)")
::: ::::::::::::::
:::::::::::::: {.columns}
::: {.column width="45%"}
```r
NMdataConf(as.fun="data.table")
system.file("examples/nonmem/xgxr014.lst",package="NMdata")
```
\footnotesize
```r
res2 <- NMscanData(file1.lst,
                   merge.by.row=TRUE,recover.rows=TRUE)
```
:::
::: {.column width="55%"}
```r
## Derive another data.table with geometric mean pop
## predictions by treatment and nominal sample time.
## Only use sample records.
res2[EVID==0&nmout==TRUE,
     gmPRED:=exp(mean(log(PRED))),
     by=.(trtact,NOMTIME)]
## plot individual observations and geometric mean pop
## predictions. Split by treatment.
ggplot(res2[EVID==0])+
  geom_point(aes(TIME,DV,colour=flag))+
  geom_line(aes(NOMTIME,gmPRED))+
  scale_y_log10()+
  facet_wrap(~trtact,scales="free_y",ncol=2)+
  labs(x="Hours since administration",
       y="Concentration (ng/mL)")
```
:::
::::::::::::::
## NMscanMultiple

A wrapper of `NMscanData` that reads and stacks multiple models.

:::::::::::::: {.columns}
::: {.column width="45%"}
\footnotesize
```r
NMdataConf(as.fun="data.table")
NMdataConf(col.row="ROW")
NMdataConf(merge.by.row=TRUE)
```

```r
## notice fill is an option to rbind with data.table
lst.1 <- system.file("examples/nonmem/xgxr001.lst",
                     package="NMdata")
lst.2 <- system.file("examples/nonmem/xgxr014.lst",
                     package="NMdata")
res1.m <- NMscanData(lst.1,quiet=TRUE)
res2.m <- NMscanData(lst.2,quiet=TRUE,
                     modelname="single-compartment")
res.mult <- rbind(res1.m,res2.m,fill=TRUE)
res.mult[EVID==0&nmout==TRUE,
         gmPRED:=exp(mean(log(PRED))),
         by=.(model,trtact,NOMTIME)]
## NMdata class gone because of rbind
class(res.mult)
```

```r
models <- file.nm(c("xgxr001.lst","xgxr014.lst"))
res.mult <- NMscanMultiple(files=models,quiet=TRUE)
## Deriving geometric mean PRED vs time for each
## model and treatment
res.mult[EVID==0&nmout==TRUE,
         gmPRED:=exp(mean(log(PRED))),
         by=.(model,trtact,NOMTIME)]
```

`NMscanMultiple` can search for models by matching file names to a regular expression, similarly to `NMwriteSection`.
:::
::: {.column width="55%"}
\normalsize
```r
ggplot(res.mult,aes(NOMTIME,gmPRED,colour=model))+
  geom_point(aes(TIME,DV),alpha=.5,colour="grey")+
  geom_line(size=1.1)+
  scale_y_log10()+
  labs(x="Hours since administration",
       y="Concentration (ng/mL)")+
  facet_wrap(~trtact,scales="free_y",ncol=2)
```
:::
::::::::::::::
::: columns
:::: column
* By default, `NMscanData` will look for an rds file next to the csv file (same file name, only the extension `.rds` differs).
* If this is found, it will be read, providing an enriched dataset (e.g. conserving factor levels and any other information).
* There are no checks of consistency of the `rds` file against the delimited file read by `Nonmem`.
  - I am interested in ideas on how to do this. If we can avoid reading the csv file, it would be highly preferred.
* You get the rds automatically if using `NMwriteData`.
* Disable looking for the rds with the argument `use.rds=FALSE`.
* The default value of `use.rds` can be modified with `NMdataConf`.
::::
:::: column
The plots are correctly ordered by doses - because they are ordered by factor levels as in the `rds` input data.

\footnotesize
```r
lst <- system.file("examples/nonmem/xgxr014.lst",
                   package="NMdata")
res14 <- NMscanData(lst,quiet=TRUE)
```

```r
## Derive another data.table with geometric mean pop
## predictions by treatment and nominal sample time.
## Only use sample records.
res14[EVID==0&nmout==TRUE,
      gmPRED:=exp(mean(log(PRED))),
      by=.(trtact,NOMTIME)]
## plot individual observations and geometric mean pop
## predictions. Split by treatment.
ggplot(res14[EVID==0])+
  geom_point(aes(TIME,DV,colour=flag))+
  geom_line(aes(NOMTIME,gmPRED))+
  scale_y_log10()+
  facet_wrap(~trtact,scales="free_y",ncol=2)+
  labs(x="Hours since administration",
       y="Concentration (ng/mL)")
```
::::
:::
\normalsize
::: columns
:::: column
* Most important message: an `NMdata` object can be used as if it weren't one.
* Methods defined for `NMdata`:
  - `summary`: the information that is written to the console if `quiet=FALSE`.
  - Simple other methods like `rbind` and similar are defined by dropping the `NMdata` class and then performing the operation.
* `NMinfo` lists metadata from `NMdata` objects and only works on `NMdata` objects. Components in the metadata are (as available):
  - `NMinfo(res1,"details")`: How was the data read and combined?
  - `NMinfo(res1,"dataCreate")`: Meta data found attached to the input data file.
  - `NMinfo(res1,"input.colnames")`: The translation table of column names from input to output.
  - `NMinfo(res1,"input.filters")`: The "filters" (IGNORE/ACCEPT) from Nonmem and how they are applied in R.
  - `NMinfo(res1,"tables")`: What tables were read and how?
  - `NMinfo(res1,"columns")`: What columns were read from what tables?
::::
:::: column
\tiny
```r
class(res1)
NMinfo(res1,"details")
```
::::
:::
\framesubtitle{What data was read?}

::: columns
:::: column
\scriptsize
```r
NMinfo(res1,"tables")
```
::::
:::: column
(The `nrows` and `topn` arguments are arguments to `print.data.table` to get a top and bottom snip of the table.)
\scriptsize
```r
print(NMinfo(res1,"columns"),nrows=20,topn=10)
```
::::
:::
\framesubtitle{Check of usual suspect: DATA}

::: columns
:::: column
* `NMcheckColnames` lists column names
  - As in input data set
  - As in Nonmem `$DATA`
  - As inferred by `NMscanInput` (and `NMscanData`)
* This will help you easily check if `$DATA` matches the input data file.
* This is a new function that will be available in the next `NMdata` release.
* A more advanced idea is some automated guessing of whether mistakes were made. This is currently not on the todo list.
::::
:::: column
In this case, input column names are aligned with `$DATA`:

\footnotesize
```r
NMcheckColnames(lst)
```
\normalsize
::::
:::
## What do I need to remember for `NMscanData`?

The answer should be as close to "nothing" as possible - that's more or less the aim of the function.

* (As always) you just have to make sure that the information you need is present in input data and output data.
  - No need to output information that is unchanged from input, but make sure to output what you need (like `IPRED`, `CWRES`, `CL`, `ETA1`, etc., which cannot be found in input). Always output the row identifier!
  - Some of these values can be found in other files generated by `Nonmem`, but notice: `NMscanData` only uses input and output data.
* Including a unique row identifier in both input and output data is the most robust way to combine the tables. I would not take "most likely" when robustness is available.
* In `firstonly` tables, include the subject ID or the row identifier.
## NMscanData limitations

* The most important limitation to have in mind is not related to `NMscanData` itself: if input data or model files change after the model was run, results can no longer be reproduced reliably.
  - `NMfreezeModels` does that and will be included in `NMdata` after a little more testing.
  - `Nonmem` can be run in a wrapper script that either copies the input data, or runs `NMscanData` and saves the output in a compressed file format (like `rds`).
* Even if the limitations of `NMscanData` may be several, they are all rare. There is a very good chance you will never run into any of them.
* Not all data filter statements are implemented. Nested `ACCEPT` and `IGNORE` statements are not supported at this point. The resulting number of rows after applying filters is checked against row-level output table dimensions (if any available).
* Disjoint rows with common `ID` values are currently not supported together with `firstonly` or `lastonly` tables. This is on the todo list.
* The `RECORDS` and `NULL` options in `$DATA` are not implemented. If using `RECORDS`, please use the `col.row` option to merge by a unique row identifier.
* Character time variables are not interpreted. If you need this, we can implement it relatively easily.
* Only output tables returning either all rows or one row per subject can be merged with input. Tables written with options like `FIRSTLASTONLY` (two rows per subject) and `OBSONLY` are disregarded with a warning (you can read them with `NMscanTables`). `LASTONLY` is treated like `FIRSTONLY`, i.e. as ID-level information if not available elsewhere.
`NMscanData` uses a few simpler functions to read all the data it can find. These functions may be useful when you don't want the full automatic package provided by `NMscanData`.

* `NMreadTab`: reads an output table, handling the "`TABLE NO.`" counter
* `NMscanTables` (uses `NMreadTab`)
* `NMreadCsv`
* `NMscanInput` (uses `NMreadCsv`)

## NMdataConf
\framesubtitle{Tailor `NMdata` default behavior to your setup and preferences}
::: columns
:::: column
* `NMdataConf` supports changing many default argument values, simplifying coding.
* Notice, values are reset when `library(NMdata)` or `NMdataConf(reset=TRUE)` are called.
* See all currently used values with `NMdataConf()`.
::::
:::: column
My initialization of scripts often contains this:

```r
library(NMdata)
NMdataConf(as.fun="data.table" ### this is the default value
          ,col.row="ROW"       ### Recommended but _right now_ not default
          ,merge.by.row=TRUE   ### You can switch this when script is final
          ,quiet=FALSE)
```
::::
:::
Other commonly used settings in `NMdataConf` are

* `as.fun`: a function to apply to all objects before returning them from `NMdata` functions. If you use `dplyr`/`tidyverse`, do (notice, no quotes!):

```r
library(tibble)
NMdataConf(as.fun=tibble::as_tibble)
```

* `use.input`: Should `NMscanData` combine output data with input data? (default `TRUE`)
* `recover.rows`: Should `NMscanData` include rows not processed by Nonmem? (default `FALSE`)
* `file.mod`: A function that translates the list file path to the input control stream file path. Default is to replace the extension with `.mod`.
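For example, if your control streams use a different extension, a small translation function can be supplied. The `.ctl` convention and the name `lst2ctl` here are hypothetical, for illustration only:

```r
## Hypothetical example: control streams named run1.ctl instead of run1.mod.
## The function maps a list file path to the control stream path.
lst2ctl <- function(file) sub("\\.lst$", ".ctl", file)
lst2ctl("models/run1.lst")

## Then tell NMdata to use it (requires the NMdata package):
## NMdataConf(file.mod=lst2ctl)
```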
* `check.time`: Default is `TRUE`, meaning that output (list file and tables) is expected to be newer than input (control stream and input data). If, say, you copy files between systems, this check may not make sense.
## Why does `NMdata` not use `options()`?

* `R` has a system for handling settings (`options()`). `NMdata` does not use that.
* `NMdataConf` can check both setting/argument names and values for consistency:

```r
try(NMdataConf(asfun=tibble::as_tibble))
try(NMdataConf(use.input="FALSE"))
```

* Resetting options with `NMdataConf`:

```r
NMdataConf(reset=TRUE)
NMdataConf(use.input=NULL, as.fun=NULL)
NMdataConf()
```
## Is `NMdata` qualified?

```{r, include=FALSE}
library(devtools)
res.test <- test()
Ntests <- sum(sapply(res.test,function(x)length(x$results)))
```

\includegraphics[width=.8\textwidth]{badges_snip_210623}

* `NMdata` contains very little calculation (the only exception may be `flagsAssign`/`flagsCount`).
* Historic bugs have mostly resulted in uninformative errors due to e.g. failure in processing text - never a wrong data set.
* `NMdata` includes `r Ntests` "unit tests" where results of function calls with different datasets and arguments are compared to expected results.
* Tests are consistently run before any release of the package.
* The tests are crucial in making sure that fixing one bug or introducing a new feature does not introduce new bugs.
* The testing approach is as recommended in "R packages" by Hadley Wickham and Jennifer Bryan, https://r-pkgs.org/tests.html.
* If you have a specific example you want to make sure is tested in the package, we will include the test in the package.
## Making `NMdata` more accessible and useful

* If you have ideas you want to contribute, let's discuss!

Additional features:

* `NMfreezeModels`: Save Nonmem models with input data and all results to ensure reproducibility of output
::: columns
:::: column
Data creation

* `renameByContents`
* `compareCols`
* `mergeCheck`
* `flagsAssign`/`flagsCount`
* `NMorderColumns`
* `NMcheckData`
* `NMwriteData`
* `NMstamp`/`NMinfo`

Read/write Nonmem control streams

* `NMreadSection`/`NMwriteSection`
::::
:::: column
Retrieve data from Nonmem runs

* `NMscanData`
* `NMscanMultiple`
* `summary`, `NMinfo`
* `NMscanInput`, `NMreadCsv`
* `NMscanTables`, `NMreadTab`
* `NMcheckColnames`

Adjust behavior to your preferences

* `NMdataConf`

Other

* `NMfreezeModels`
::::
:::
## `NMdata` functions under development

:::::::::::::: {.columns}
::: {.column width="85%"}
In order to ensure reproducibility, any output has to be produced based on archived/frozen Nonmem models.
:::
::: {.column width="15%"}
\includegraphics[width=.5in]{figures/worksign.png}
:::
::::::::::::::

* The components that need to be "frozen" are
* `NMfreezeModels` does freeze
* Limitations

:::::::::::::: {.columns}
::: {.column width="85%"}
* A function to read frozen Nonmem results and mrgsolve code to ensure that the right simulation model and parameter values are used
:::
::::::::::::::
## `ggwrite`: Flexible saving of traceable output

Saves images in sizes made for powerpoint, including stamps (time, source, output filename). It can save multiple plots at once as one file (pdf) or multiple files.
::: columns
:::: column
`ggwrite` is a wrapper of `png` and `pdf` (and `dev.off`) with convenience features such as

* predefined canvas sizes (see `?canvasSize`)
* `save` and `show` arguments for very simple conditional behavior
  - `save` defaults to `TRUE` if a filename is given
  - `show` defaults to the inverse of `save`
::::
:::: column
```r
## install.packages("tracee",repos="https://cloud.r-project.org")
library(tracee)
```

\footnotesize
```r
writeOutput <- TRUE
script <- "path/to/script.R"
p1 <- ggplot(res1,aes(PRED,DV,colour=TRTACT))+geom_point()+
  geom_abline(slope=1)+
  scale_x_log10()+scale_y_log10()
ggwrite(p1,file="results/pred_dv.png",
        script=script,
        save=writeOutput)
```
::::
:::
## `execSafe`: Save input data with each Nonmem run