```r
library(knitr)
opts_chunk$set(fig.path='plots/',fig.align='center',fig.show='hold',size='footnotesize',cache=F)

library(ggplot2)
theme_set(theme_bw(base_size=14)+
          theme(legend.position="bottom")+
          ## purple facet strip label background and white text
          theme(strip.background =element_rect(fill="#52247F")
               ,strip.text=element_text(color = "#ffffff"))
          )
## theme_set(theme_bw(base_size=14)+
##           theme(legend.position="bottom")+
##           theme(strip.background =element_rect(fill="#ff9dff"))
##           )
library(data.table)
unloadNamespace("NMdata")

## some setup
options(width=60)  # make the printing fit on the page
set.seed(1121)   # make the results repeatable

### shortcuts to examples in NMdata
file.data <- function(...) system.file("examples/data",..., package="NMdata")
file.nm <- function(...) system.file("examples/nonmem",..., package="NMdata")
```

```{css, echo=FALSE}
.watch-out {
  background-color: lightpink;
  border: 3px solid red;
  font-weight: bold;
}
.smaller {
  background-color: lightgreen;
  font-size: 4pt;
}
```

## Outline
\tableofcontents[hideallsubsections]


# Introduction

## What is NMdata?
::: columns
:::: column
### NMdata is 

An R package that can help

*  Create and check event-based data sets for PK/PD modeling
*  Keep Nonmem code updated to match the contents of data sets
*  Read all output data from Nonmem runs and combine it with input data
   - supply the output list file (`.lst`); the reader is flexible and automated

Designed to fit into the user's setup and coding preferences

* NMdata comes with a configuration tool that can be used to tailor default behaviour to the user's system configuration and preferences.
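
A one-line example (the `NMdataConf` call is covered in detail later in this deck):

```r
## make all NMdata functions return data.tables instead of data.frames
NMdataConf(as.fun="data.table")
```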

::::

:::: column

### NMdata is not

* A plotting package

* A tool to retrieve details about model runs

* A calculation or simulation toolbox

* A "silo" that requires you to do things in a certain way
- No tools in NMdata requires other NMdata tools to be used

::::
:::

$$\vspace{.01in}$$

* The data creation tools are relevant independently of the estimation/simulation tool.
* Latest stable release is 0.0.12 and is available on CRAN and MPN (starting from 2022-06-15 snapshot).

## NMdata 0.0.12 on MPN
\includegraphics[width=3.5in]{figures/nmdata_mpn_2022-06-27 22-03-35.png}


<!-- ## Who can find NMdata useful? -->

<!-- * The data set creation tools are relevant no matter the estimation and simulations tools. -->

<!-- * Nonmem users will find additional tools for handling the exchange of data between R and Nonmem. -->


<!-- ## About the author -->
<!-- * Pharmacometrician with experience from biostatistics -->

<!-- * Background in engineering, experience as system administrator, 15 years of R experience -->

<!-- * Very concerned with code robustness and ensuring data consistency. -->

<!-- * Authored an R package on safe data transfer from SAS to R and one on survival analysis.  -->

<!-- I hate being stuck in leg work and having too little time for modeling, -->
<!-- reflection, and understanding key questions. `NMdata` is a big help for -->
<!-- me personally in freeing time to more high-level tasks. -->


<!-- Lots of work missing on this one -->
<!-- ## Motivation -->

<!-- PK/PD modeling is technically extremely heavy. We want to do provide clarity to decision making, but spend a lot of our time in deep mud. -->

<!-- `NMdata` is my humble experience collected in efficient functions that fill some holes and help with some of the most annoying design -->

## How to update to recent MPN snapshot
Update the `pkgr.yml` file
(example: `prod_vx123_001_analysis/trunk/analysis/vx_123_001_project/pkgr.yml`):

```yaml
Version: 1
Threads: 1
Packages:
  - NMdata
# caching allows packages to be shared across projects
# for faster installs on new projects
Cache: /data/prod_vx708_001_pkgcache-2022-06-15
Repos:
  - MPN: https://mpn.metworx.com/snapshots/stable/2022-06-15
Lockfile:
  Type: renv
```

Then go to `prod_vx123_001_analysis/trunk/analysis/vx_123_001_project` and install/update packages from the linux terminal (not R):

```sh
$ cd /data/prod_vx123_001_analysis/trunk/analysis/vx_123_001_project
$ pkgr --update install
```

## Motivation
* The workflow of a pharmacometrician is very technical, with many risks of errors.

* Technical workload takes time from modeling, reflection, and
understanding key questions. 

* During my first 2-3 years in pharmacometrics, I must have spent half the time coding, desperately trying to get Nonmem to behave and to understand the properties of the estimates I obtained.

* Most of us develop our own ways around some of the many
difficulties in this process. That takes a lot of time, and most
often only because we don't have adequate tools at hand - or don't know them.

* I generalized some of my solutions and collected them in `NMdata`.

* Almost every single line of code in the package is motivated by bad
experiences: errors, fear of errors, time wasted on debugging and
double checking.

* I have no intention of pushing these approaches on others. But if
you find something interesting, feel free to take advantage.


<!-- This could become a good slide, but so far not ready at all -->
<!-- ## Overview of NMdata functionality -->
<!-- * Data creation -->
<!-- - Checking of compatibility of data.frames. -->
<!-- - Merge with automated checks  -->

<!-- * Nonmem control stream editing -->

<!-- * Retrieve data from Nonmem -->




## Getting started
Install from `CRAN` or from `MPN` using `pkgr`.
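
If NMdata is not yet installed, installation from CRAN is one line (a minimal sketch; the `pkgr` route is shown on the MPN slides above):

```r
install.packages("NMdata")
```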
```r
library(NMdata)
## while developing NMdata itself, the package is loaded with devtools:
## library(devtools); load_all()
NMdataConf(check.time=FALSE)
NMdataConf(as.fun="data.table")
```

Three vignettes are available so far (see the "Vignettes" tab on the package website).

For a quick overview (after installation), do:

```r
help(package="NMdata")
```

All functions and their arguments are documented in their manual (`?`) pages.

```r
## load the example data used throughout
pk <- readRDS(file=system.file("examples/data/xgxr2.rds",package="NMdata"))
pk[,trtact:=NULL]
## will create this in the example
pk[,ROW:=NULL]

## a reduced version to compare against
pk.reduced <- copy(pk)
pk.reduced <- pk.reduced[1:(.N%/%2)]
pk.reduced[,CYCLE:=NULL]
pk.reduced[,AMT:=as.character(AMT)]
```

# Data set creation

## Compare compatibility of data sets for rbind and merge: compareCols

::: columns
:::: column

A slightly modified version of the `pk` dataset has been created.

```r
opt.old <- options(width=50)
```

::::
:::: column

```r
compareCols(pk,pk.reduced)
options(opt.old)
```

\vspace{12pt}

Before merging or stacking, we may want to check that columns and their classes are compatible.

::::
:::

## Keep track of missing values: listMissings

```r
missings <- listMissings(pk)
head(missings)
```

You can specify which columns to check (`cols`), columns to list the missings by (`by`), and which strings count as missing (`na.strings`).

From `?listMissings`:

```
Usage:

     listMissings(data, cols, by, na.strings = c("", "."), quiet = FALSE, as.fun)
```
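
A minimal sketch using those arguments (the column choices here are only for illustration):

```r
## check only DV and AMT, and list missings by subject
missings.id <- listMissings(pk, cols=c("DV","AMT"), by="ID")
```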

## Rename columns based on contents: renameByContents

::: columns
:::: column

* `renameByContents`
* `NMisNumeric`

::::

:::: column
All column names are capital case. We rename to lowercase those that Nonmem will not be able to interpret as numeric.
\footnotesize

```r
pk.old <- copy(pk)
pk <- renameByContents(data=pk,
                       fun.test=NMisNumeric,
                       fun.rename = tolower,
                       invert.test = TRUE)
```

`compareCols` shows that four columns were renamed:

```r
compareCols(pk.old,pk)
```

\normalsize
::::

:::

## Automated checking of merges: mergeCheck

* The rows that come out of the merge must be exactly the same as in one of the existing datasets; only columns are added from the second dataset.
* `mergeCheck` is not a new implementation of merge. It is an implementation of checks.
* Is `mergeCheck` slower?

## mergeCheck

\framesubtitle{Example: Would your standard checks of merges capture this?}

```r
dt.cov <- pk[,.(ID=unique(ID))]
dt.cov[,COV:=sample(1:5,size=.N,replace=TRUE)]
## deliberately duplicate one ID to create a subtle merge problem
dt.cov <- dt.cov[c(1,1:(.N-1))]
```

Say we want to add a covariate from `dt.cov`. We expect the number of rows to be unchanged from `pk`. `mergeCheck` more strictly requires that we get all, and only, the same rows:

::: columns
:::: column

### Without mergeCheck

\footnotesize

```r
## The resulting dimensions are correct
pkmerge <- merge(pk,dt.cov,by="ID")
dims(pk,dt.cov,pkmerge)

## But we now have twice as many rows for this subject
dims(pk[ID==31],pkmerge[ID==31])
```

::::
:::: column

### mergeCheck throws an error

...and suggests what is wrong.
\footnotesize

```r
try(mergeCheck(pk,dt.cov,by="ID"))
```

::::
\normalsize
:::

### Conclusion

If you only want to add columns by a merge, `mergeCheck` does all the necessary checks for you.
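
As a minimal sketch, rebuilding the covariate table with exactly one row per subject lets `mergeCheck` pass and return the merged data:

```r
## a covariate table with exactly one row per ID merges cleanly
dt.cov.ok <- pk[,.(ID=unique(ID))]
dt.cov.ok[,COV:=sample(1:5,size=.N,replace=TRUE)]
pk.cov <- mergeCheck(pk,dt.cov.ok,by="ID")
dims(pk,pk.cov)  ## same number of rows, one column added
```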

## Exclusion flags

\framesubtitle{Keep track of data exclusions - don't discard!}

### flagsAssign

::: columns
:::: column

::::

:::: column
\footnotesize

```r
pk[,`:=`(FLAG=NULL,flag=NULL)]
dt.flags <- fread(text="FLAG,flag,condition
10,Below LLOQ,BLQ==1
100,Negative time,TIME<0")

pk <- flagsAssign(pk,tab.flags=dt.flags,subset.data="EVID==0")
pk <- flagsAssign(pk,subset.data="EVID==1",flagc.0="Dosing")
```

::::

:::

### flagsCount

```r
opts <- options(width=100)
```

\footnotesize

```r
flagsCount(data=pk[EVID==0],tab.flags=dt.flags)
options(opts)
```

# Finalize data for Nonmem

## Advice: always include a unique row identifier

::: columns
:::: column

### Why

A unique identifier is needed in order to merge Nonmem output tables reliably back onto the input data.

The identifier should be a numeric column that Nonmem can read and output unchanged.

::::
:::: column

### Sort rows and add a row counter

```r
## data.table: order, then add counter
setorder(pk,ID,TIME,EVID)
pk[,ROW:=.I]
```

```r
## dplyr equivalent
pk <- pk %>%
    arrange(ID,TIME,EVID) %>%
    mutate(ROW=1:n())
```

::::
:::

## NMorderColumns

::: columns
:::: column
\vspace{12pt}

* The order of columns in the Nonmem data set is important for two reasons: `$INPUT` must match the column order of the data file, and columns that Nonmem cannot interpret as numeric are best placed last.

::::
:::: column
\footnotesize

```r
pk.old <- copy(pk)
pk <- NMorderColumns(pk,first="WEIGHTB")
```

\normalsize
We may want to add MDV and rerun `NMorderColumns`.
\footnotesize

```r
data.table(old=colnames(pk.old),new=colnames(pk))
```

::::
\normalsize
:::

## NMcheckData: Check data syntax for Nonmem compatibility

::: columns
:::: column

* `NMcheckData` runs a long list of checks, especially of the standard Nonmem columns (`ID`, `TIME`, `EVID`, `AMT`, `DV`, `MDV`, `RATE`, `SS`, etc.). They are all checked for allowed values (e.g. `TIME` must be non-negative, `EVID` must be one of 0:4, etc.).

::::
:::: column
\scriptsize

```r
pk <- pk[ID>59]
res.check <- NMcheckData(pk)
res.check
## introduce problems to see how they are reported
pkmod <- copy(pk)
pkmod[,MDV:=as.numeric(is.na(DV))]
pkmod[ID==60&EVID==1,CMT:=NA]
res.check <- NMcheckData(pkmod)
res.check
```

::::
:::

## NMwriteData

::: columns
:::: column

For the final step of writing the dataset, `NMwriteData` is provided.

The csv writer is very simple. These are the only steps involved between the supplied data set and the written csv:

\footnotesize

```r
file.csv <- fnExtension(file,".csv")
fwrite(data,na=".",quote=FALSE,row.names=FALSE,scipen=0,file=file.csv)
```

\normalsize

All arguments to `fwrite` can be modified using the `args.fwrite` argument.
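
For instance, a minimal sketch (assuming the `args.fwrite` list is passed on to `fwrite`) overriding the string written for missing values:

```r
## hypothetical override of the NA string in the written csv
NMwriteData(pk,file="derived/pk.csv",
            args.fwrite=list(na="-99"))
```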

::::
:::: column
\footnotesize

```r
NMwriteData(pk,file="derived/pk.csv")
```

\normalsize

\vspace{12pt}

::::
:::

## Update Nonmem control streams: NMwriteSection

::: columns
:::: column

### Tips

::::
:::: column

\footnotesize

```r
nmCode <- NMwriteData(pk,file="derived/pk.csv",
                      write.csv=FALSE,
### arguments that tailor text for Nonmem
                      nmdir.data="../derived",
                      nm.drop="PROFDAY",
                      nm.copy=c(CONC="DV"),
                      nm.rename=c(BBW="WEIGHTB"),
                      ## PSN compatibility
                      nm.capitalize=TRUE)
## example: pick run1*.mod
models <- list.files("../models",
                     pattern="run1.+\\.mod$",
                     full.names=TRUE)
## update $INPUT and $DATA
lapply(models,NMwriteSection,list.sections=nmCode)
## update $INPUT only
lapply(models,
       NMwriteSection,section="INPUT",newlines=nmCode$INPUT)
## or let NMwriteSection find the models itself
NMwriteSection(dir="../models",
               file.pattern="run1.+\\.mod$",
               section="INPUT",
               newlines=nmCode$INPUT)
```

\normalsize

::::
:::

## Automated documentation of data

\framesubtitle{Ensure that the data can be traced back to the data generation script}

::: columns
:::: column

* If the argument `script` is supplied to `NMwriteData`, a little meta information is saved together with the output file(s).

::::
:::: column

\footnotesize

```r
NMwriteData(pk,file="derived/pk.csv",
            script = "NMdata_Rpackage.Rmd",quiet=TRUE)
list.files("derived")
## NMreadCsv reads the metadata .txt file if found
pknm <- NMreadCsv("derived/pk.csv")
NMinfo(pknm)
## The .rds file contains the metadata already
pknm2 <- readRDS("derived/pk.rds")
NMinfo(pknm2)
```

::::
:::

\normalsize

# Retrieving data from Nonmem runs

## NMscanData

`NMscanData` is an automated and general reader of Nonmem results.

Based on the list file (`.lst`), it finds and reads all output tables, reads the input data, and combines them:

\pause
\footnotesize


::: columns
:::: column

```r
file1.lst <- system.file("examples/nonmem/xgxr003.lst",
                         package="NMdata")
res0 <- NMscanData(file1.lst,merge.by.row=FALSE)
```

::::
\pause
:::: column

```r
class(res0)
dims(res0)
head(res0,n=2)
```

\normalsize
::::
:::

## Remember the unique row identifier

Using a unique row identifier for merging data is highly recommended:

\footnotesize

```r
res1 <- NMscanData(file.nm("xgxr001.lst"),merge.by.row=TRUE)
class(res1)
```

\normalsize

## NMscanData

\framesubtitle{Example: quickly get from a list file to looking at the model}

\footnotesize
:::::::::::::: {.columns}
::: {.column width="45%"}

```r
## Using data.table for easy summaries
res1 <- NMscanData(file1.lst,merge.by.row=TRUE,
                   as.fun="data.table",quiet=TRUE)
## Derive geometric mean pop predictions by
## treatment and nominal sample time. Only
## use sample records.
res1[EVID==0,
     gmPRED:=exp(mean(log(PRED))),
     by=.(trtact,NOMTIME)]
```

:::
::: {.column width="55%"}

\normalsize

```r
## plot individual observations and geometric
## mean pop predictions. Split (facet) by treatment.
ggplot(subset(res1,EVID==0))+
    geom_point(aes(TIME,DV))+
    geom_line(aes(NOMTIME,gmPRED),colour="red")+
    scale_y_log10()+
    facet_wrap(~trtact,scales="free_y",ncol=2)+
    labs(x="Hours since administration",
         y="Concentration (ng/mL)")
```

:::
::::::::::::::

## Recover discarded rows

:::::::::::::: {.columns}
::: {.column width="45%"}

```r
NMdataConf(as.fun="data.table")
system.file("examples/nonmem/xgxr014.lst", package="NMdata")
```

\footnotesize

```r
res2 <- NMscanData(file1.lst,
                   merge.by.row=TRUE,recover.rows=TRUE)
```

:::

::: {.column width="55%"}

## Derive another data.table with geometric mean pop predictions by
## treatment and nominal sample time. Only use sample records.
res2[EVID==0&nmout==TRUE,
                  gmPRED:=exp(mean(log(PRED))),
                  by=.(trtact,NOMTIME)]
## plot individual observations and geometric mean pop
## predictions. Split by treatment.
ggplot(res2[EVID==0])+
    geom_point(aes(TIME,DV,colour=flag))+
    geom_line(aes(NOMTIME,gmPRED))+
    scale_y_log10()+
    facet_wrap(~trtact,scales="free_y",ncol=2)+
    labs(x="Hours since administration",y="Concentration (ng/mL)")

::: ::::::::::::::

## Compare models using NMscanMultiple

A wrapper of `NMscanData` that reads and stacks multiple models.

:::::::::::::: {.columns}
::: {.column width="45%"}
\footnotesize

```r
NMdataConf(as.fun="data.table")
NMdataConf(col.row="ROW")
NMdataConf(merge.by.row=TRUE)
## manual approach: read models one by one, then stack.
## notice fill is an option to rbind with data.table
lst.1 <- system.file("examples/nonmem/xgxr001.lst",
                     package="NMdata")
lst.2 <- system.file("examples/nonmem/xgxr014.lst",
                     package="NMdata")
res1.m <- NMscanData(lst.1,quiet=TRUE)
res2.m <- NMscanData(lst.2,quiet=TRUE,
                     modelname="single-compartment")

res.mult <- rbind(res1.m,res2.m,fill=TRUE)
res.mult[EVID==0&nmout==TRUE,
         gmPRED:=exp(mean(log(PRED))),
         by=.(model,trtact,NOMTIME)]
## NMdata class gone because of rbind
class(res.mult)

## NMscanMultiple does the same in one call
models <- file.nm(c("xgxr001.lst","xgxr014.lst"))
res.mult <- NMscanMultiple(files=models,quiet=TRUE)
## Deriving geometric mean PRED vs time for each
## model and treatment
res.mult[EVID==0&nmout==TRUE,
         gmPRED:=exp(mean(log(PRED))),
         by=.(model,trtact,NOMTIME)]
```

:::
::: {.column width="55%"}
\normalsize

```r
ggplot(res.mult,aes(NOMTIME,gmPRED,colour=model))+
    geom_point(aes(TIME,DV),
               alpha=.5,colour="grey")+
    geom_line(size=1.1)+
    scale_y_log10()+
    labs(x="Hours since administration",y="Concentration (ng/mL)")+
    facet_wrap(~trtact,scales="free_y",ncol=2)
```

:::
::::::::::::::

## Preserve all input data properties

::: columns
:::: column

By default, `NMscanData` will look for an rds file next to the csv file (same file name, only the extension `.rds` differs).

::::
:::: column

The plots are correctly ordered by dose - because they are ordered by the factor levels in the rds input data.

\footnotesize

```r
lst <- system.file("examples/nonmem/xgxr014.lst",
                   package="NMdata")
res14 <- NMscanData(lst,quiet=TRUE)
## Derive another data.table with geometric mean pop predictions by
## treatment and nominal sample time. Only use sample records.
res14[EVID==0&nmout==TRUE,
      gmPRED:=exp(mean(log(PRED))),
      by=.(trtact,NOMTIME)]
## plot individual observations and geometric mean pop
## predictions. Split by treatment.
ggplot(res14[EVID==0])+
    geom_point(aes(TIME,DV,colour=flag))+
    geom_line(aes(NOMTIME,gmPRED))+
    scale_y_log10()+
    facet_wrap(~trtact,scales="free_y",ncol=2)+
    labs(x="Hours since administration",y="Concentration (ng/mL)")
```

::::
:::

\normalsize

## The NMdata class

::: columns
:::: column

Most important message: an NMdata object can be used as if it weren't one.

Methods defined for NMdata:

Other simple methods, like rbind, are defined by dropping the NMdata class and then performing the operation.

::::
:::: column
\tiny

```r
class(res1)
NMinfo(res1,"details")
```

::::

:::

## The NMdata class

\framesubtitle{What data was read?}

::: columns
:::: column

### Table-specific information

\scriptsize

```r
NMinfo(res1,"tables")
```

::::
:::: column

### Column-specific information

(The `nrows` and `topn` arguments are passed to `print.data.table` to get a top and bottom snip of the table.)
\scriptsize

```r
print(NMinfo(res1,"columns"),nrows=20,topn=10)
```

::::
:::

## What to do when Nonmem results seem meaningless?

\framesubtitle{Check the usual suspect: DATA}

::: columns
:::: column

`NMcheckColnames` lists column names

- As in the input data set
- As in Nonmem `$DATA`
- As inferred by `NMscanInput` (and `NMscanData`)

This will help you easily check whether `$DATA` matches the input data file. This is a new function that will be available in the next NMdata release. A more advanced idea is automated guessing of whether mistakes were made; this is currently not on the todo list.

::::
:::: column

In this case, input column names are aligned with `$DATA`.
\footnotesize

```r
NMcheckColnames(lst)
```

\normalsize
::::
:::

## What should I do for my models to be compatible with NMscanData?

## NMscanData limitations

The most important limitation to have in mind is not related to `NMscanData` itself.

Even if the limitations of `NMscanData` may be several, they are all rare. There is a very good chance you will never run into any of them.

## Data read building blocks

`NMscanData` uses a few simpler functions to read all the data it can find. These functions may be useful when you don't want the fully automated package provided by `NMscanData`; see the sketch below.
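
A minimal sketch of the lower-level readers (reusing `file1.lst` from the earlier examples; check your NMdata version for the exact set of helpers):

```r
## read all output tables from a run
tabs <- NMscanTables(file1.lst)
## read the input data as Nonmem sees it (interpreting $DATA and $INPUT)
inp <- NMscanInput(file1.lst)
```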

# Configuration of NMdata defaults

## NMdataConf

\framesubtitle{Tailor NMdata default behavior to your setup and preferences}

::: columns

:::: column

::::

:::: column

My initialization of scripts often contains this:

```r
library(NMdata)
NMdataConf(as.fun="data.table"
### this is the default value
          ,col.row="ROW"
### Recommended but _right now_ not default
          ,merge.by.row=TRUE
### You can switch this when the script is final
          ,quiet=FALSE)
```

::::
:::

Other commonly used settings in `NMdataConf` are:

```r
library(tibble)
NMdataConf(as.fun=tibble::as_tibble)
```

## Why does NMdata not use options()?

R has a built-in system for handling settings (`options()`). NMdata does not use it: `NMdataConf` checks both argument names and values, so typos and invalid values fail immediately:

```r
try(NMdataConf(asfun=tibble::as_tibble))
try(NMdataConf(use.input="FALSE"))
```

## How is NMdata qualified?

```r
library(devtools)
res.test <- test()
Ntests <- sum(sapply(res.test,function(x)length(x$results)))
```

\includegraphics[width=.8\textwidth]{badges_snip_210623}

# Next steps for NMdata

## Next steps for NMdata

## Summary

::: columns
:::: column

### Data creation

### Read/write Nonmem control streams

### Adjust behavior to your preferences

### Other

::::
:::

# NMdata functions under development

## NMfreezeModels

:::::::::::::: {.columns}
::: {.column width="85%"}
In order to ensure reproducibility, any output has to be produced based on archived/frozen Nonmem models.
:::
::: {.column width="15%"}
\includegraphics[width=.5in]{figures/worksign.png}
:::
::::::::::::::

The components that need to be "frozen" are:

NMfreezeModels does freeze:

### Limitations

## Safe model reader

:::::::::::::: {.columns}
::: {.column width="85%"}

* A function to read frozen Nonmem results and mrgsolve code, to ensure that the right simulation model and parameter values are used

:::
::::::::::::::

# Other tools

## tracee

### ggwrite: Flexible saving of traceable output

Saves images in sizes made for powerpoint, including stamps (time, source, output filename). It can save multiple plots at once, as one file (pdf) or as multiple files.

::: columns
:::: column

`ggwrite` is a wrapper of `png` and `pdf` (and `dev.off`) with convenience features such as stamping and batch saving.

```r
## install.packages("tracee",repos="https://cloud.r-project.org")
library(tracee)
```

\footnotesize

```r
writeOutput <- TRUE
script <- "path/to/script.R"
p1 <- ggplot(res1,aes(PRED,DV,colour=TRTACT))+geom_point()+
    geom_abline(slope=1)+
    scale_x_log10()+scale_y_log10()
ggwrite(p1,file="results/pred_dv.png",
        script=script,
        save=writeOutput)
```

::::
:::

## execSafe: Save input data with each Nonmem run


