Overview

Running R scripts automatically at scheduled times is useful for routine reports that must be produced regularly, for example outbreak reports, and for extracting data for statistics such as the "Skrantesykestatikk". This vignette describes the steps to set up an automatic batch job in Windows:

  1. Set up Windows Task Scheduler to run a batch script
  2. Write the bat file that runs the R scripts
  3. Write the main R script
  4. Write the R script to check the log

Set up Windows Task Scheduler

You can use the Task Scheduler in Windows to run an R script automatically. To set up the Task Scheduler, see Task Scheduler help. The most important points are mentioned below.

Choose "Create task" ("Opprett oppgave") in the action tab and fill in the name of the task. Use a descriptive name; we suggest using similar names for the task and the bat file (see below). Also fill in a description of the task. For the remaining settings you may use the standard selections, i.e. the common user for batch tasks and "run only when the user is logged on". Be aware that this means you cannot log off when you leave the PC running the batch scripts; instead, just disconnect from the remote desktop.

In the "Triggers" ("Utløsere") tab, choose a new trigger and set when the program should run. In the "Actions" ("Handlinger") tab, choose new and select the bat file that starts the R scripts (see below). We have chosen to put all the batch scripts in a common subdirectory at C:. There is usually no need to change the standard settings in the "Conditions" ("Betingelser") and "Settings" ("Innstillinger") tabs.

We have a dedicated server to run the batch scripts and we use a dedicated user account to set up and run the tasks. Contact epi to get access to a server to host the script.

Write the bat file

The Task Scheduler calls a bat file to run the R scripts. We call two R scripts: the main script, which sources several R scripts to perform various tasks, and an R script that checks the log of the main script and reports the status and any error or warning messages.

A typical bat file can look like this:

"C:\Program Files\R\R-4.0.3\bin\x64\R.exe" CMD BATCH "\\fullpath\batch_outbreak.R"
"C:\Program Files\R\R-4.0.3\bin\x64\R.exe" CMD BATCH "\\fullpath\batch_check_log.R"

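Be aware that a path name including spaces must be enclosed in quotes. National characters should be avoided in the path name. If national characters are included, the bat file may not find the R scripts. If they cannot be avoided, the problem can be worked around by temporarily changing the code page: save the original code page, change to the code page of the server with the scripts, run the R scripts, and reset the code page afterwards.

This can be done automatically in the bat file, as in the code that follows. Do not use this code unless necessary. If something goes wrong, you may have changed the code page at the batch server unintentionally.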

:: Save the original code page at the server running the script
for /f "tokens=4" %%i in ('chcp') do set codepage=%%i
:: Change the code page to the code page of the server with the scripts, for example 850.
chcp 850
:: Run the R scripts
"C:\Program Files\R\R-4.0.3\bin\x64\R.exe" CMD BATCH "\\fullpath\batch_outbreak.R"
"C:\Program Files\R\R-4.0.3\bin\x64\R.exe" CMD BATCH "\\fullpath\batch_check_log.R"
:: Reset the code page at the server running the script
chcp %codepage%
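
The bat files above hard-code the path to R.exe, and that path changes whenever R is upgraded. As a small aid, and assuming you can start R interactively on the batch server, the following sketch prints the path to the executable of the running R installation so it can be copied into the bat file. The variable names r_bin and r_exe are only illustrative.

```r
# Directory holding the executables of the running R installation.
r_bin <- R.home("bin")

# Full path to the executable; on Windows the file is named R.exe.
r_exe <- file.path(r_bin, if (.Platform$OS.type == "windows") "R.exe" else "R")

# On Windows, show the path with backslashes as used in a bat file.
print(normalizePath(r_exe, winslash = "\\", mustWork = FALSE))
```

Remember to enclose the path in quotes in the bat file if it contains spaces.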

Write the main R script

Give the main script a short, descriptive name. The name of the main script is included in the bat file command. To avoid having to edit the bat file, you should choose a name that you expect will not change. The main script for generating an outbreak report is by convention named batch_outbreak.R.

The main script sources one or more R scripts that together perform the tasks to generate a product, for example a report. The batch script will usually consist of four parts:

Set up the R environment

In this part, you set up the R environment for all scripts that are called (sourced) by the main script. That includes attaching all packages needed and setting up global variables and functions.

Attach packages

In base R, one uses library or require to attach packages. Of these, library is preferred because it stops with an error if a package cannot be attached, whereas require only returns FALSE with a warning and lets the script continue.
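
The difference matters in a batch job, where nobody is watching the console. The sketch below illustrates it in base R, using a made-up package name (missing_pkg) that is assumed not to be installed.

```r
# A package name that is assumed not to be installed (hypothetical).
missing_pkg <- "aPackageThatDoesNotExist123"

# require() returns FALSE and only gives a warning, so the script continues
# and will fail later at the first call to a function from the package.
ok <- suppressWarnings(
  require(missing_pkg, character.only = TRUE, quietly = TRUE)
)

# library() throws an error, so the batch job stops at the point of failure.
# tryCatch() is used here only to capture the error message for inspection.
err <- tryCatch(
  library(missing_pkg, character.only = TRUE),
  error = function(e) conditionMessage(e)
)

print(ok)
print(err)
```

With library, the error appears in the Rout file at the line where the package was requested, which makes the log easier to interpret.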

An alternative is to use use_pkg and use_NVIverse from the package NVIbatch. These functions accept a vector of package names and attach the packages, installing them first if they are not already installed. Use use_pkg for packages on CRAN and use_NVIverse for packages within the "NVIverse". If packages need to be installed, the user running the batch needs write access to the R package directories.

# SET UP R ENVIRONMENT ----
## Attach R packages ----
library(NVIbatch)
use_pkg(pkg = c("dplyr", "RODBC"))
use_NVIverse("NVIdb")

Set global variables

The global variables are put in a separate section of the main script. In this example, there are two global variables: the path to the R scripts and today's date. The path to the R scripts at the batch server cannot be set automatically by the package here.

## Paths and filenames ----
# The path to the R scripts.
script_path <- paste0(set_dir_NVI("Hendelser"), "outbreak_123_2022/.../rscripts/")

# Today's date in the format yyyymmdd for use in filenames.
today <- format(Sys.Date(), "%Y%m%d")

Import support data

Very often, you will combine data with information in other registers, for example to translate PJS-codes to descriptive text, or merge with fylke (county) and kommune (municipality) registers. Currently such registers must be loaded/imported before they can be used. This may change when such information becomes stored in databases.

## Import support data ----
# Read translation table for komnr
kommune_fylke <- read_kommune_fylke()

Source the R scripts

These scripts perform the tasks that generate the products and reports. Below is an example with some of the scripts that are usually part of preparing an outbreak report. For simplicity, the names of the scripts in the outbreak report have been standardized.

#### Create population
source(paste0(script_path, "source/", "prepare_population.R"), encoding = "UTF-8")

# Creates PJSradata
source(paste0(script_path, "source/", "extract_pjs_data.R"), encoding = "UTF-8")
source(paste0(script_path, "source/", "clean_pjs_data.R"), encoding = "UTF-8")

# Creates PJSsakprover
source(paste0(script_path, "source/", "prepare_pjs_sakprover.R"), encoding = "UTF-8")

# Creates PJSsaker
source(paste0(script_path, "source/", "prepare_pjs_saker.R"), encoding = "UTF-8")
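
When many scripts are sourced, it can be convenient to keep their names in a vector and source them in a loop that first checks that each file exists. The following is only a sketch under stated assumptions: a temporary directory with two dummy scripts (prepare_population.R and the made-up count_population.R) stands in for the real "source" directory, and file.path is used instead of paste0.

```r
# Hypothetical setup: a temporary directory with two dummy scripts stands in
# for the real "source" directory.
script_path <- file.path(tempdir(), "rscripts_demo")
dir.create(file.path(script_path, "source"),
           recursive = TRUE, showWarnings = FALSE)
writeLines("population <- data.frame(id = 1:3)",
           file.path(script_path, "source", "prepare_population.R"))
writeLines("n_population <- nrow(population)",
           file.path(script_path, "source", "count_population.R"))

# The scripts are sourced in the order given here.
scripts <- c("prepare_population.R", "count_population.R")

for (script in scripts) {
  script_file <- file.path(script_path, "source", script)
  # Stop with a clear message in the Rout file if a script is missing.
  if (!file.exists(script_file)) {
    stop("Cannot find script: ", script_file)
  }
  source(script_file, encoding = "UTF-8")
}
```

Because source stops at the first error, a missing or failing script prevents the later scripts from running on incomplete data.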

Mark the end of the main script

Depending on the setup of the batch job, the output log can end with error messages due to lack of write access. These errors have no importance for the output from the R scripts, so there is no need to include them in a check of the log (see below). They can be skipped by printing a message at the end of the main script to mark the end. The check of the log then looks for error and warning messages up to the end-mark and skips any messages after it. There is no need for this step unless you run the check log step.

print("Ferdig batch_outbreak")

Write the R script to check the log

When running R in batch mode, an Rout file is automatically created with the output of the log. The aim of this script is to perform a check of the log output after running the main script, and to report any error or warning message. The script generates an email that is sent to specified email addresses.

By convention, the script is named "batch_check_log.R" (or previously "batch_save_log.R").

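The check log script will usually consist of four parts:

  1. Set up the R environment
  2. Save the log in the archive
  3. Check the log for error and warning messages
  4. Send an email with the results of the check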

There is a need for a routine that deletes old log files from the log archive, as there is no need to keep more than the most recent log files. Such a routine hasn't been developed yet. Therefore, old log files should be deleted manually.
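
As a starting point for such a routine, a minimal sketch in base R could look like the following. The function delete_old_logs and its arguments are hypothetical and not part of NVIbatch. It assumes that the archived logs are Rout files in one directory and that the modification time reflects their age.

```r
# Hypothetical helper, not part of NVIbatch: keeps the 'keep' most recently
# modified Rout files in 'log_dir' and deletes the rest.
delete_old_logs <- function(log_dir, keep = 10, pattern = "\\.Rout$") {
  files <- list.files(log_dir, pattern = pattern, full.names = TRUE)
  if (length(files) <= keep) {
    return(invisible(character(0)))
  }
  # Sort by modification time, newest first.
  files <- files[order(file.info(files)$mtime, decreasing = TRUE)]
  # Everything after the first 'keep' files is considered old.
  old <- files[seq_along(files) > keep]
  file.remove(old)
  # Return the names of the deleted files invisibly.
  invisible(old)
}
```

A call such as delete_old_logs(paste0(domene, "batchlog/"), keep = 10) could then be placed in the check log script after the log has been archived. Test it on a copy of the archive first, since deleted files cannot be recovered.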

Set up the R environment

The R environment is set up in the same way as for the main script. An example is given below. The packages in this example are the ones you will need for this script.

# SET UP R ENVIRONMENT ----
## Attach packages
library(NVIbatch)
use_pkg(pkg = c("sendmailR"))
use_NVIverse(pkg = c("NVIdb"))

## Global variables
# Paths and file names
# The path to the outbreak directory.
domene <- paste0(set_dir_NVI("Hendelser"), "outbreak_123_2022/.../")

# The name of the main script without extension.
# This will also be the name of the Rout file.
batch <- "batch_outbreak"

# Today's date in the format yyyymmdd for use in filenames.
today <- format(Sys.Date(), "%Y%m%d")

Save the log in the archive

The Rout file is saved with a new name that includes today's date before the file extension. By convention, the file is saved in the directory "batchlog".

# Save the Rout file ----
file.copy(from = paste0(domene, "rscripts/", batch, ".Rout"),
          to = paste0(domene, "batchlog/", batch, " ", today, ".Rout"))
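
Note that file.copy does not stop with an error if the copy fails. It returns FALSE instead. The sketch below shows one way to check the result. It assumes a temporary directory with a dummy Rout file in place of the real outbreak directory, a fixed date so that the example is reproducible, and file.path instead of paste0.

```r
# Hypothetical setup: temporary directories stand in for the outbreak directory.
domene <- file.path(tempdir(), "outbreak_demo")
dir.create(file.path(domene, "rscripts"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(domene, "batchlog"), recursive = TRUE, showWarnings = FALSE)

batch <- "batch_outbreak"
# A fixed date is used here; the real script uses Sys.Date().
today <- format(as.Date("2022-03-01"), "%Y%m%d")

# A dummy Rout file to copy.
rout_file <- file.path(domene, "rscripts", paste0(batch, ".Rout"))
writeLines("Ferdig batch_outbreak", rout_file)

# Name of the archived copy, with the date before the file extension.
archive_file <- file.path(domene, "batchlog", paste0(batch, " ", today, ".Rout"))

# file.copy() returns TRUE on success and FALSE on failure.
copied <- file.copy(from = rout_file, to = archive_file, overwrite = TRUE)
if (!copied) {
  warning("Could not save the Rout file to the archive: ", archive_file)
}
```

The warning is only useful if it reaches someone, for example by including it in the email that reports the status.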

Check log for error and warning messages

The function gather_messages is used to gather error and warning messages from the log file.

The parameter remove_allowed accepts a vector of error and warning messages that should not be reacted on. Only part of each message needs to be written, and case is ignored (the check is performed using grep with ignore.case = TRUE). Default is remove_allowed = c("was built under R version"), which removes warning messages caused by packages having been built under a later R version than the one running the batch.

The parameter remove_after accepts a character string with text that marks the end of the Rout file. Any messages after this text will not be reacted upon. Default is remove_after = "Ferdig Batch". Only part of the text needs to be written, and case is ignored.
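
To make the roles of the two parameters concrete, the following base R sketch mimics the described behaviour on a few made-up log lines. It is not the implementation of gather_messages. The function sketch_gather_messages and the log lines are hypothetical, and the sketch simply treats every line containing "error" or "warning" as a message.

```r
# Hypothetical stand-in for illustration; NOT the NVIbatch implementation.
sketch_gather_messages <- function(lines,
                                   remove_allowed = c("was built under R version"),
                                   remove_after = "Ferdig Batch") {
  # Drop everything after the first line that contains the end-mark.
  end_mark <- grep(remove_after, lines, ignore.case = TRUE)
  if (length(end_mark) > 0) {
    lines <- lines[seq_len(end_mark[1])]
  }
  # Keep lines that look like error or warning messages.
  found <- grep("error|warning", lines, ignore.case = TRUE, value = TRUE)
  # Remove messages that match any of the allowed texts.
  for (allowed in remove_allowed) {
    found <- found[!grepl(allowed, found, ignore.case = TRUE)]
  }
  found
}

# Made-up log lines for the example.
log_lines <- c(
  "Warning: package 'dplyr' was built under R version 4.0.5",
  "Error in read.csv(\"missing.csv\") : cannot open the connection",
  "[1] \"Ferdig batch_outbreak\"",
  "Error: cannot write to the log directory"
)

found_messages <- sketch_gather_messages(log_lines)
print(found_messages)
```

Here only the read.csv error remains: the first line is removed by remove_allowed, and the last line comes after the end-mark. Because case is ignored, the end-mark "Ferdig batch_outbreak" printed by the main script matches the default "Ferdig Batch".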

# CHECK LOG FOR ERROR AND WARNING MESSAGES ----
messages <- gather_messages(filename = paste0(domene, "rscripts/", batch, ".Rout"),
                            remove_allowed = c("was built under R version"),
                            remove_after = "Ferdig Batch")

Send email with the results of the check

An email that reports the status of the check is sent to the specified email addresses. The email includes the error and warning messages (if any) in the body and the Rout file as an attachment.

The subject of the email starts with "Status" if no warning or error messages are found. If any warning or error messages are identified, the subject starts with "Feil i".

The email can be sent to one or more email addresses.

# BUILD EMAIL MESSAGE ----
# Path and name of Rout file
attachmentPathName <- paste0(domene, "rscripts/", batch, ".Rout")
# Name of attachment, usually the same as the file name
attachmentName <- paste0(batch, ".Rout")
# Key part for attachments, put the body and the mime_part in a list for msg
attachmentObject <- mime_part(x = attachmentPathName, name = attachmentName)

# Include error and warning messages (if any)
# The subject is customized depending on whether there are error and warning messages or not
if (length(messages) > 0) {
  subject <- paste("Feil i", batch)
  body <- list(c("Feilmeldinger", messages), attachmentObject)
} else {
  subject <- paste("Status for", batch)
  body <- list(attachmentObject)
}

# SEND EMAIL ----
sendmail(from = "<section@domain.no>",
         to = c("<responsible.person@domain.no>"),
         subject = subject,
         msg = body,
         control = list(smtpServer = "domain.no"))

The batch script will send an email when the task has been run at the server. If no email is received when expected, this may indicate that the server is down or that the user running the tasks has been logged off. If so, one needs to restart the server and/or log on with the user account running the script.



NorwegianVeterinaryInstitute/NVIbatch documentation built on Dec. 15, 2024, 3:15 p.m.