R/codebook.R

Defines functions codebook

Documented in codebook

#' Create a codebook for the oTree code
#' @description
#' Create a codebook of your oTree code by automatically scanning
#' your project folder and retrieving the information of the apps'
#' \code{Constants}, \code{Subsession}, \code{Group}
#' and \code{Player} variables.
#' @details
#' This code works only when dictionaries are not used (for
#' example, in the session configurations in \code{settings.py}).
#'
#' Caution 1: Multiline comments are ignored, meaning that all variables
#' commented out in this manner will nevertheless be included in the codebook.
#' In contrast, variables commented out with line comments will not
#' appear in the codebook.
#'
#' Caution 2: If there are commas in the value strings, they might be
#' used to split the text. Please manually insert a backslash symbol
#' in front of the commas to avoid that (i.e., escape them).
#' E.g. \code{"Yes, I will"} -> \code{"Yes\, I will"}.
#'
#' Caution 3: This code cannot interpret variables that were imported from other
#' files (for example CSV files) and that have special formatting
#' included (e.g., special string formatting in Python such
#' as \code{float(1.4)} to represent a float number).
#'
#' Caution 4: This code was developed and tested with basic oTree codes
#' and has not been verified for compatibility with oTree versions
#' later than 5.4.0.
#' If you experience issues with newer versions or more complex code structures,
#' please open an issue on GitHub.
#'
#' Caution 5: Custom exports and variables from
#' the \code{Participant} \code{Session} classes
#' are not part of the codebook. Also built-in variables
#' are not presented in the codebook.
#'
#' Further info: \code{None} values are presented
#' as \code{"None"} (i.e. as a string)
#' in the list and the file output.
#'
#' @param path Character string. Path of the oTree experiment.
#' @param fsource Character string. \code{"init"} if information should be taken
#' from the \code{init.py} files (newer oTree code with 5.x
#' format). \code{"models"}
#' (or \code{"model"}) if the information
#' should be taken from the \code{models.py} files
#' (older oTree code with 3.x format).
#' @param output Character string. \code{"list"} if the output should contain a
#' list of variables and their information. \code{"file"} if the output
#' should be a file such as a Word or PDF file.
#' \code{"both"} if the output should contain a file and a list.
#' @param output_dir Character string. The absolute path where
#' the function's output will be saved.
#' Only absolute paths are allowed for this parameter.
#' Relative paths can be specified in the \code{output_file} parameter.
#' @param output_file Character string.
#' The name of the output file generated by the function.
#' The file name can be provided with or without an extension.
#' Relative paths are also allowed in the file name.
#' @param output_format Character string.
#' Specifies the format of the file output.
#' This value is passed to the \code{output_format}
#' argument of \link[rmarkdown:render]{rmarkdown::render}.
#' Allowed options are: \code{"html_document"}, \code{"word_document"}, \code{
#' "odt_document"}, \code{"rtf_document"}, \code{"md_document"}, \code{
#' "latex_document"}, \code{"pdf_document"}, \code{"pdf_document_simple"},
#' or their short forms \code{"html"}, \code{"word"}, \code{"odt"}, \code{
#' "rtf"}, \code{"md"}, \code{"latex"}, \code{"pdf"}, \code{"pdf_simple"}.
#' Important: The \code{"pdf_document"} format uses \code{xelatex} for
#' PDF generation.
#' If your document does not require advanced LaTeX features,
#' it is recommended to use \code{"pdf_document_simple"}.
#' @param output_open Logical. \code{TRUE} if file output should
#' be opened after creation.
#' @param app_doc Logical. \code{TRUE} if app documentation should be
#' included in the output file.
#' @param app Character string or character vector.
#' Name of the included app(s).
#' Default is to use all apps.
#' Cannot be used simultaneously with \code{app_rm}.
#' @param app_rm Character string or character vector.
#' Name of the excluded app(s).
#' Default is to exclude no apps.
#' Cannot be used simultaneously with \code{app}.
#' @param doc_info Logical. \code{TRUE} if a message with information on all
#' variables without documentation should also be returned. \code{FALSE} if
#' this message should be suppressed.
#' @param sort Character vector. Vector that specifies the order of
#' the apps in the codebook.
#' @param settings_replace Character string or \code{NULL}.
#' Specifies how to handle references to settings variables.
#' Use \code{"global"} to replace references with the global settings variables
#' defined in \code{settings.py}.
#' Use \code{"user"} to replace references with the variables
#' provided in the \code{user_settings} argument.
#' Use \code{NULL} to leave references to settings variables unchanged.
#' Caution: This function does not use variables defined
#' in \code{SESSION_CONFIGS}.
#' If you vary settings variables in \code{SESSION_CONFIGS},
#' set \code{settings_replace} to \code{"user"} and manually replace
#' them using the \code{user_}\code{settings} argument.
#' @param user_settings List. List of variables in the \code{settings.py} file
#' that are used to replace setting variable references.
#' This is only used if \code{settings_replace = "user"} and should be used when
#' setting variables are defined within the \code{SESSION_CONFIGS}.
#' @param preamb Deprecated. `preamb = TRUE` is no
#' longer supported. Please remove preambles from your old codebooks.
#' @param encoding Character string. Encoding of the created Markdown file.
#' As in \link[knitr:knit]{knitr::knit}, this argument is
#' always assumed to be \code{UTF-8}
#' and ignored.
#' @param title Character string. Title of output file.
#' @param subtitle Character string. Subtitle of output file.
#' @param include_cons Logical.
#' \code{TRUE} if there should be a section for the \code{Constants} variables
#' in the codebook.
#' @param include_subs Logical.
#' \code{TRUE} if there should be a section for the \code{Subsession} variables
#' in the codebook.
#' @param params List.
#' List of variable name and value pairs to be passed to the RmD file.
#' Only relevant if argument output \code{"file"} or \code{"both"} if chosen.
#' @param date Character string or \code{NULL}.
#' Date that is passed to the Rmd file.
#' Either \code{"today"}, \code{NULL}, or a user defined date.
#' Only relevant if argument output \code{"file"} or \code{"both"} if chosen.
#' @param splitvarname Logical. \code{TRUE} if long variable names should be
#' split across multiple lines in the output file tables.
#' If \code{FALSE}, table columns should adjust to fit the longest
#' variable names.
#' @param sep_list Character string. Determines how sub-lists are displayed
#' in the file output. Use \code{"newline"} to separate sub-lists with
#' newline characters (`\\n`), or \code{"vector"} to display them as
#' strings in `c(...)` format.
#' @param initial Logical. \code{TRUE} if initial values should be included
#' in the output file. \code{FALSE} if they should not be included.
#' @import knitr
#' @import pander
#' @import rmarkdown
#' @import stringr
#' @import utils
#' @returns
#' The function returns two main types of outputs:
#'
#' (a) a list of variables along with their information
#'
#' (b) a file containing the codebook for the experiment
#'
#' If \code{doc_info} is \code{TRUE} it also returns a
#' message containing the names of
#' all variables that have no documentation.
#' @examplesIf rlang::is_installed("withr")
#' # The examples use a slightly modified version of the official oTree
#' # sample codes.
#'
#' # Make a codebook and resort the apps
#' combined_codebook <- codebook(
#'   path = system.file("extdata/ocode_new", package = "gmoTree"),
#'   output = "list",
#'   fsource = "init",
#'   doc_info = FALSE)
#'
#' # Show the structure of the codebook
#' str(combined_codebook, 1)
#' str(combined_codebook$bargaining$Player, 1)
#'
#' # Make a codebook with only the "bargaining" app
#' combined_codebook <- codebook(
#'   path = system.file("extdata/ocode_new", package = "gmoTree"),
#'   output = "list",
#'   fsource = "init",
#'   app = "bargaining",
#'   doc_info = FALSE)
#'
#' # Show the structure of the codebook
#' str(combined_codebook, 1)
#' str(combined_codebook$bargaining$Player, 1)
#'
#' # Make a codebook with all but the "bargaining" app
#' combined_codebook <- codebook(
#'   path = system.file("extdata/ocode_new", package = "gmoTree"),
#'   output = "list",
#'   fsource = "init",
#'   app_rm = "bargaining",
#'   doc_info = FALSE)
#'
#' # Show the structure of the codebook
#' str(combined_codebook, 1)
#' str(combined_codebook$bargaining$Player, 1)
#'
#' # Use oTree code in 3.x format
#' combined_codebook <- codebook(
#'   path = system.file("extdata/ocode_z", package = "gmoTree"),
#'   fsource = "model",
#'   output = "list",
#'   doc_info = FALSE)
#'
#' # Show the structure of the codebook
#' str(combined_codebook, 1)
#'
#' # Show information on missing documentation or complex code
#' combined_codebook <- codebook(
#'   path = system.file("extdata/ocode_new", package = "gmoTree"),
#'   fsource = "init",
#'   output = "list",
#'   app_rm = "bargaining",
#'   doc_info = TRUE)
#'
#' \dontrun{
#'
#' # Create a codebook PDF with authors' names and todays' date
#' codebook(
#'   path = system.file("extdata/ocode_z", package = "gmoTree"),
#'   fsource = "init",
#'   doc_info = FALSE,
#'   output = "file",
#'   output_format = "pdf_document",
#'   date = "today",
#'   title = "My Codebook",
#'   subtitle = "codebook created with gmoTree",
#'   params = list(author = c("Max Mustermann", "John Doe"))
#'   )
#'
#' # Create a codebook PDF and save it in a subfolder of the
#' # current folder:
#' # "C:/Users/username/folder/R_analyses/cb/cb.pdf"
#' getwd() # "C:/Users/username/folder/R_analyses"
#' dir.create("cb")
#' combined_codebook <- gmoTree::codebook(
#'   path = "C:/Users/username/folder/R_analyses/oTree",
#'   fsource = "models",
#'   output = "both",
#'   output_file = "cb/cb.pdf",
#'   output_format = "pdf_document")
#'
#' # You can also omit *.pdf after the file name
#' combined_codebook <- gmoTree::codebook(
#'   path = "C:/Users/username/folder/R_analyses/oTree",
#'   fsource = "models",
#'   output = "both",
#'   output_file = "cb/cb",
#'   output_format = "pdf_document")
#' }

#' @export
codebook <- function(path = ".",
                     fsource = "init",
                     output = "both",
                     output_dir = NULL,
                     output_file = "codebook",
                     output_format = "pdf_document_simple",
                     output_open = TRUE,
                     app_doc = TRUE,
                     app = NULL,
                     app_rm = NULL,
                     doc_info = TRUE,
                     sort = NULL,
                     settings_replace = "global",
                     user_settings = NULL,
                     include_cons = TRUE,
                     include_subs = FALSE,
                     preamb = FALSE,
                     encoding = "UTF-8",
                     title = "Codebook",
                     subtitle = "created with gmoTree",
                     params = NULL,
                     date = "today",
                     splitvarname = FALSE,
                     sep_list = "newline",
                     initial = TRUE) {

  # Stop and load
    # Source original code  ####
      # Define path
      if (!is.null(path)) {
        # Change Windows paths to paths that can be read by Ubuntu
        path <- gsub("\\\\", "/", path)
      } else {
        stop("Path must not be NULL!")
      }

      # Check if path exists
      if (!dir.exists(path)) {
        stop("The path ", path, " does not exist!",
             " You are currently in ",
             getwd())
      }

      # Check if fsource is valid
      if (length(fsource) > 1L) {
        stop("Please enter only one fsource!")
      }

      if (is.null(fsource) ||
          (fsource != "init" &&
           fsource != "model" &&
           fsource != "models")) {
        stop("fsource must be either \"init\", \"model\", or \"models\"!")
      }

      if (fsource == "model" || fsource == "models") {

        files <- list.files(path,
                            pattern = "models\\.py",
                            full.names = TRUE,
                            recursive = TRUE)

      } else if (fsource == "init") {
        files <- list.files(path,
                            pattern = "__init__\\.py",
                            full.names = TRUE,
                            recursive = TRUE)

        # Exclude files from the _builtin folders
        files <- files[grepl("(?<!\\_builtin\\/)__init__\\.py$",
                             files,
                             perl = TRUE)]

      }

      # Check files
      if (length(files) == 0L) {
        stop("No files to process. Ensure the \"fsource\" is correctly ",
             "specified (e.g. \"init\" instead of \"model\") ",
             "and the folder contains \"init\" or \"model\" files.")
      }

    # Output  ####
      # Check output
      if (!is.character(output) ||
          length(output) != 1L ||
          !(output %in% c("list", "both", "file"))) {
        stop("Output should be \"list\", \"both\", or \"file\"!")
      }

      if (!is.null(output) && output != "list") {

        # Check output format
        # Allowed output formats
        allowed_formats <- c(
          html = "html_document",
          pdf = "pdf_document",
          pdf_simple = "pdf_document_simple",
          word = "word_document",
          odt = "odt_document",
          rtf = "rtf_document",
          latex = "latex_document",
          md = "md_document"
        )

        if (is.null(output_format) ||
            length(output_format) != 1L ||
            !(tolower(output_format) %in% names(allowed_formats) ||
              tolower(output_format) %in% allowed_formats)) {
          stop("Output format should be one of the following: ",
               paste(names(allowed_formats), collapse = ", "), " or ",
               paste(allowed_formats, collapse = ", "), "!")
        }

        # Map shorthand to full name
        if (tolower(output_format) %in% names(allowed_formats)) {
          output_format <- allowed_formats[[tolower(output_format)]]
        }

        # If path in file names
        if (is.null(output_file)) {
          stop("Please enter a output_file name!")

        } else {

          # Change Windows paths to paths that can be read by Ubuntu
          output_file <- gsub("\\\\", "/", output_file)

          # If file name starts with /
          output_file <- gsub("^/", "", output_file)
        }

        if (!is.null(output_dir)) {
          output_dir <- gsub("\\\\", "/", output_dir)
        }

        # If dir is a relative path
        if (!is.null(output_dir) &&
            grepl("^\\.", x = output_dir)) {
          stop("Please don't use relative paths in output_dir!")
        }

        # If output_file contains an absolute path,
        # output_dir should not be used
        if (!is.null(output_file) &&
            grepl("^([A-Z]:|/)", output_file) &&
            !is.null(output_dir)) {

          if (startsWith(x = output_file,
                                prefix = output_dir)) {
            output_dir <- NULL

          } else if (substitute(output_dir) == "getwd()" &&
                     startsWith(x = output_file,
                                prefix = getwd())) {
            output_dir <- NULL

          } else {
            stop("When using an absolute path for ",
                 "\"output_file,\" \"output_dir\" should not be used.")
          }
        } else if (!is.null(output_file) &&
                   !grepl("^([A-Z]:|/)", output_file) &&
                   is.null(output_dir)) {
          output_dir <- getwd()
        }

        # If dir is not there
        if (!is.null(output_dir) && !dir.exists(output_dir)) {

          stop("The directory ",
               output_dir,
               " does not exist yet. ",
               "Please create it manually before running this function.")
        }

        # Change output file
        # Add file path
        if (!is.null(output_dir)) {
          output_file <- file.path(output_dir, output_file)
        }

        # Check pandoc
        pandoc.installed <- rmarkdown::pandoc_available()

        if (!pandoc.installed) {
          stop("Pandoc is not installed. ",
               "Please install it from ",
               "https://pandoc.org/getting-started.html ",
               "before proceeding.")
        }
      }

    # Other arguments  ####

    # Preamb deprecated
    if (!isFALSE(preamb)) {
      lifecycle::deprecate_warn(
        when = "1.4.1",
        what = "codebook(preamb)",
        details = " Please remove old preamble text from your old codebooks, as it contains errors."
      )
    }

    # Parameter
    if (!is.null(params) && !is.list(params)) {
      stop("params must be a list!")
    }

    # Settings replace
    if (
      !is.null(settings_replace) &&
      !settings_replace %in% c("global", "user")) {
      stop("settings_replace must be either \"global\", \"user\", or NULL!")
    }

    # Check if app(s) exist(s)
    if (!is.null(app)) {
      if (length(app) == 1L) {
        if (!(app %in% basename(dirname(files)))) {
          stop("App \"", app, "\"is not in oTree code!")
        }
      } else if (length(app) > 1L) {
        for (app_i in seq_along(app)) {

          if (!(app[app_i] %in% basename(dirname(files)))) {
            stop("At least one app, \"",
                 app[app_i],
                 "\" is not in oTree code!")
          }
        }
      }
    }

  # Create variables and environment  ####

    # Create a new environment and initialize
    env <- new.env(parent = emptyenv())
    env$settingspy <- TRUE   # Is there a settings.py file?
    env$settingslist <- character()  # Settings vars that cannot be replaced
    env$usettingslist <- character()  # User sett. vars that cannot be replaced
    env$equalvariables <- character() # Vars with unescaped equal signs?

    # Create vector of variables without documentation
    env$nodocs <- character()

    # Create vector of variables with complex codes
    env$complexcons <- character()

    # Create vector of warnings
    env$warnings <- character()

    # Specify non-variable names
    # (parts in settings.py not used in the codebook)
    nonvariables <- c("ROOMS", "SESSION_CONFIGS", "INSTALLED_APPS",
                      "SESSION_CONFIG_DEFAULTS", "with", "from")

  # Background functions  ####

    # Stop if  ####

    # Settings_replace
    if (length(user_settings) > 0L &&
        settings_replace != "user") {
      stop("settings_replace must be set to \"user\" ",
           "if \"user_settings\" are not empty!")
    }

    # Check if only app or app_rm is specified
    if (!is.null(app) && !is.null(app_rm)) {
      stop("Please specify only \"app\" or \"app_rm!\"")
    }

  # Helping functions  ####
    process_lists <- function(variablevalue,
                             folder_name,
                             current_class,
                             variable,
                             env) {
      skip <- FALSE
      returnlist <- list()

      # One level list (vector, sublist)  ####
      if (!grepl("^\\[\\s*\\[\\s*\\[", variablevalue) &&
          !grepl("^\\[\\s*\\[", variablevalue) &&
          startsWith(variablevalue, "["
          )) {

        skip <- TRUE

        # make [..] to list(...)  ####
        variablevalue <- gsub(pattern = "\\[",
                              replacement = "\\list(",
                              x = variablevalue)

        variablevalue <- gsub(pattern = "\\]",
                              replacement = "\\)",
                              x = variablevalue)

        # Transform string of vector to normal vector  ####
        variablevalue <- evaluatestring(variablevalue)

        returnlist <- variablevalue
      }

      # Three level list, sublists  ####
      if (!skip &&
          stringr::str_detect(string = unlist(variablevalue),
                              pattern = "^\\[\\s*\\[\\s*\\[")) {

        stop("This function does not support lists with more than two levels.",
             " Found in: $", folder_name, "$", current_class,
             "$", variable, ".")
      }

      # Two level lists  ####
      if (!skip &&
          grepl("^\\[\\s*\\[", variablevalue)) {

        # Replace first and last square brackets
        variablevalue <- sub(x = variablevalue,
                             pattern = "^ *\\[",
                             replacement = "")
        variablevalue <- sub(x = variablevalue,
                             pattern = "\\][^]] *$",
                             replacement = "")

        # Extract  each [ ... ] block
        if (stringr::str_detect(variablevalue, "\\[")) {
          variablevalue <- unlist(
            stringr::str_match_all(variablevalue,
                                   pattern = "\\s*\\[.*?\\]\\s*"))
        }

        # Replace first and last square brackets from these blocks
        variablevalue <- sub(x = variablevalue,
                             pattern = "^\\s*\\[ *",
                             replacement = "")

        variablevalue <- sub(x = variablevalue,
                             pattern = "\\s*\\][^]]*$",
                             replacement = "")

        for (variablevalue_i in seq_along(variablevalue)) {

          elem <- variablevalue[variablevalue_i]

          # Split the element into key and value
          parts <- stringr::str_split(elem, ",")[1L]

          # Parts must be in list format afterwards because of mixed types
          parts <- as.list(parts[[1L]])
          parts <- lapply(X = parts,
                          clean_string,
                          equal = FALSE,  # Important!!
                          quotes = TRUE,
                          current_class = current_class,
                          folder_name =
                            paste(folder_name,
                                  errorinfo = "called by parts"),
                          variable = variable)

          parts <- lapply(parts,
                          evaluatestring)

          returnlist[[variablevalue_i]] <- parts

        }
      }

      if (length(returnlist) == 1L) {
        returnlist <- unlist(returnlist)
      }

      return(returnlist)
    }

    # Get vars from Constants or settings.py
    const_sett_vars <- function(matches,
                                current_class,
                                filevars,
                                normalspace,
                                folder_name,
                                env = env) {
      # This is called by process_settings and process_files

      # Get variable names
      # Only those that are on the same indent are measured!
      pattern <- paste0("^\\s{",
                        normalspace,
                        "}[a-zA-Z_0-9]+ *(?=\\s*=)")

      # Vector of variable names
      all_cons_sett_vars <-
        unlist(regmatches(
          x = matches,   # Here still in vector!!!
          m = gregexpr(pattern = pattern,
                       text = matches,
                       perl = TRUE)))

      all_cons_sett_vars <- trimws(all_cons_sett_vars) # Strip spaces etc.

      # Put everything in one line
      matches <- collapse_and_clean_matches(matches)

      # Check if "with" occurs
      if (grepl(x = matches, pattern = "\\nwith")) {
        env$complexcons <-
          c(env$complexcons, paste0("> $", folder_name, "$",
                                   current_class, " (with)\n"))
      }

      # Check if "read_csv" occurs
      if (grepl(x = matches, pattern = "read_csv")) {
        env$complexcons <-
          c(env$complexcons, paste0("> $", folder_name, "$",
                                   current_class, " (read_csv)\n"))
      }

      # Get everything until the second variable is mentioned and modify
      for (cons_sett_i in seq_along(all_cons_sett_vars)) {

        if (all_cons_sett_vars[cons_sett_i] != "with" &&  # TODO? Unnecessary, there is no = sign
            all_cons_sett_vars[cons_sett_i] != "from") {

          # Make pattern
          if (cons_sett_i < length(all_cons_sett_vars)) {

            pattern <- paste0(
              "(?<=\n\\b", all_cons_sett_vars[cons_sett_i], "\\b)",
              "[\\s\\S]*?",
              "(?=",
              "\\n\\b", all_cons_sett_vars[cons_sett_i + 1L], "\\b *=|",
              "\\nwith",
              ")")

          } else if (cons_sett_i == length(all_cons_sett_vars)) {
            pattern <- paste0("(?<=\n",
                              all_cons_sett_vars[cons_sett_i],
                              ")", "[\\s\\S\\\\n]*")
          }

          # Create variable
          if (!(all_cons_sett_vars[cons_sett_i] %in% nonvariables)) {

              # Create variable value for file list
              varval <- unlist(regmatches(
                x = matches,
                m = gregexpr(pattern = pattern,
                             text = matches,
                             perl = TRUE)))

              varval <- clean_string(string = varval,
                                     folder_name = folder_name,
                                     equal = TRUE,
                                     n = TRUE,
                                     quotes = TRUE,
                                     space = TRUE,
                                     brackets = FALSE,
                                     sbrackets = FALSE,
                                     current_class = current_class,
                                     variable = variable)

              # Deal with lists  ####
              if (startsWith(varval, "[")) {
                varval <- process_lists(varval,
                                        folder_name,
                                        current_class,
                                        all_cons_sett_vars[cons_sett_i],
                                        env)
              }

              # Replace variable references within Constants/settings  ####
              # See if there are references to previous variables
              if (is.character(varval)) {

                for (j in seq(cons_sett_i)) {

                  if (j != cons_sett_i && any(grepl(pattern = paste0("\\b",
                                               all_cons_sett_vars[j],
                                               "\\b"),
                                          x = as.character(varval)))) {

                      # If not a list
                      if (length(
                        filevars[[current_class]][[all_cons_sett_vars[j]]]) ==
                        1L) {

                        varval <-
                          gsub(x = varval,
                               pattern = paste0("(?<!settings.)",
                                                all_cons_sett_vars[j]),
                               replacement = filevars[[current_class]][[
                                 all_cons_sett_vars[j]]],
                               perl = TRUE)

                      } else {
                        # Make all to characters
                        replacementlist <- lapply(
                          filevars[[current_class]][[
                            all_cons_sett_vars[j]]],
                          as.character)

                        # Make lists
                        for (i in seq_along(replacementlist)) {
                          replacementlist[i] <-
                              paste0("list(",
                                     paste(replacementlist[[i]],
                                           collapse = ", "),
                                     ")"
                            )
                        }

                        listvec <- "c("  # This should never happen
                        if (is.list(replacementlist)) {
                          listvec <- "list("
                        }

                        varval <-
                          gsub(x = varval,
                               pattern = paste0("(?<!settings.)",
                                                all_cons_sett_vars[j]),
                               replacement = paste0(listvec,
                                 paste(replacementlist,
                                   collapse = ", "),
                               ")"),
                               perl = TRUE)
                      }
                  }
                }
              }

              # If string containing a vector, make this a vector
              # E.g. "c(1,2,3+4)" to c(1,2,7)
              # But also strings to numbers: "2000" to 2000
              if (is.character(varval)) {
                try({
                  custom_env <- new.env(parent = baseenv())  # Create a new environment
                  custom_env$list <- c  # Temporarily assign list to c

                  tmp <- eval(parse(text = varval), envir = custom_env)
                  if (!is.null(tmp) && !is.function(tmp)) {
                    varval <- tmp
                  }
                }, silent = TRUE)
              }

              varval <- repair_list(varval)

              # Add variable to file list
              filevars[[current_class]][[all_cons_sett_vars[cons_sett_i]]] <-
                varval
          }
        }
      }

      # Return all settings or Constants variables
      return(filevars)
    }

    # Replace unmatched parentheses
    replace_unmatched_parentheses <- function(string,
                                              current_class,
                                              folder_name,
                                              variable,
                                              env) {

      open <- stringr::str_count(string, pattern = "\\(")
      close <- stringr::str_count(string, pattern = "\\)")
      opensq <- stringr::str_count(string, pattern = "\\[")
      closesq <- stringr::str_count(string, pattern = "\\]")

      # Round brackets
      if (open == 0L &&
          close == 1L) {
        string <- gsub(x = string, pattern = "\\)", replacement = "")
      } else if (open == 1L && close == 0L) {
        string <- gsub(x = string, pattern = "\\(", replacement = "")
      } else if (!(open == close)) {
        # e.g. if there are more than one opening bracket
        env$complexcons <-
          c(env$complexcons, paste0("> $", folder_name, "$",
                                    current_class, "$", variable,
                                    " (unmatched brackets)\n"))
      }

      # Square brackets
      if (opensq == 0L && closesq == 1L) {
        string <- gsub(x = string, pattern = "\\]", replacement = "")
      } else if (opensq == 1L && closesq == 0L) {
        string <- gsub(x = string, pattern = "\\[", replacement = "")
      } else if (opensq != closesq) {
        # e.g. if there are more than one opening bracket
        env$complexcons <-
          c(env$complexcons, paste0("> $", folder_name, "$",
                                    current_class, "$", variable,
                                    " (unmatched square brackets)\n"))

      } # Don't remove square brackets if they are first and last yet!

      return(string)
    }

    # Clean string
    clean_string <- function(string,
                             folder_name,  # For error info
                             current_class,
                             variable,
                             equal = TRUE,
                             n = TRUE,
                             space = TRUE,
                             quotes = TRUE,
                             brackets = TRUE,
                             sbrackets = TRUE,
                             lastcomma = TRUE
                             ) {

      # Remove unescaped equal signs ####
      # (those usually only happen at the start)
      if (equal) {
          # Remove equal
          string <- stringr::str_replace_all(string, "(?<!\\\\)=", "")

      }

      # Trim leading and trailing spaces  ####
      string <- trimws(string)

      # Quotes: remove documentation first  ####

      string <- removedocstrings(string)

      # Save real quotes first   ####
      string <- gsub(pattern = "\\\\\"",
                     replacement = "<<realquotedouble>>",
                     x = string,
                     perl = TRUE)

      string <- gsub(pattern = "\\\\\\'",  # One more because of '
                     replacement = "<<realquotesingle>>",
                     x = string,
                     perl = TRUE)

      # Remove line breaks  ####
      if (n) {
        # Line breaks breaking strings
        string <- gsub(pattern = paste0("\\\"",
                                        "\\s*",
                                        "\\n",  # removes \n
                                        "\\s*",
                                        "\\\""),
                       replacement = "",
                       x = string)

        string <- gsub(pattern = paste0("\\\"",
                                        "\\s*",
                                        "\\\\\\n",   # removes \\\n
                                        "\\s*",
                                        "\\\""),
                       replacement = "",
                       x = string)

        # Normal line breaks
        string <- gsub(pattern = "\\n",
                       replacement = " ",
                       x = string)

      }

      # Trim white space again  ####
      if (space) {
        string <- trimws(string)
      }

      # Remove last comma in a string  ####
      if (lastcomma) {
        string <- gsub(pattern = ",$",
                       replacement = "",
                       x = string)
      }

      # Brackets (ensure that brackets are processed last!)   ####
      if (brackets) {  #  && !is.na(string)
        string <- replace_unmatched_parentheses(string = string,
                                               current_class = current_class,
                                               folder_name = folder_name,
                                               variable = variable,
                                               env = env)

      }

      # Get real quotes back  ####
      string <- gsub(pattern = "<<realquotedouble>>",
                     replacement = "\"",
                     x = string)

      string <- gsub(pattern = "<<realquotesingle>>",
                     replacement = "\'",
                     x = string)

      # Return   ####
      return(string)
    }

    removefirstlastquote <- function(string) {

      if (is.character(string)) {
        # Escaped
        string <- sub("^\\\\\"(.*)\\\\\"$", "\\1", string)

        string <- sub("^\\\\\'(.*)\\\\\'$", "\\1", string)

        # Non-escaped
        string <- sub("^\\\"(.*)\\\"$", "\\1", string)
        string <- sub("^\\\'(.*)\\\'$", "\\1", string)

        string <- trimws(string)

        return(string)

      } else {
        return(string)

      }
    }

    evaluatestring <- function(string) {
      # Converts a string representation of a vector/list
      # into an actual vector/list and
      # evaluates any arithmetic expressions within the string

      try({
              # Remove spaces after ( and before )
              string <- gsub("\\s*c\\(\\s*", "c\\(", string)
              string <- gsub("\\s*\\)", "\\)", string)
              # Len should be read as length
              string <- gsub("^\\blen\\b\\(", "length\\(", string)

              # Create a custom environment where None is defined as "None"
              custom_env <- new.env(parent = baseenv())
              custom_env$None <- "None"

              # Evaluate the string in the custom environment
              tmp <- eval(parse(text = string), envir = custom_env)

              if (!is.function(tmp)) {
                string <- tmp
              }

      }, silent = TRUE)

      if (!is.null(string)) {
        return(string)
      }
    }

    removedocstrings <- function(string) {

      string <- gsub(pattern = "(?s)^'''(.*)'''$",
                     replacement = "\\1",  # Keep the content in the middle
                     x = string,
                     perl = TRUE)

      string <- gsub(pattern = '(?s)^"""\\n*(.*)\\n*"""$',
                     replacement = "\\1",  # Keep the content in the middle
                     x = string,
                     perl = TRUE)

      # Non-escaped double quotes +
      string <- gsub(pattern = '(?s)^"""(.*)"""$',
                     replacement = "\\1",  # Keep the content in the middle
                     x = string,
                     perl = TRUE)

      # Escaped double quotes +
      string <- gsub(pattern = '(?s)^\\\"\\\"\\\"(.*)\\\"\\\"\\\"$',
                     replacement = "\\1",  # Keep the content in the middle
                     x = string,
                     perl = TRUE)

      return(string)
    }

    delprint <- function(string) {
      if (is.character(string)) {
        string <-
          gsub(
            x = string,
            pattern = "print\\(.*\\)",
            replacement = "")
        string <- trimws(string)
      }
      return(string)
    }

    # Function to split each element at the last comma
    # = to split last part of variable 1 from variable 2 name
    split_at_last_comma <- function(part) {

      f_split_parts <- stringr::str_split(part,
                                          ",(?=[^,]*$)",
                                          n = 2L)[[1L]]

      if (length(f_split_parts) > 1L) {

        return(c(stringr::str_trim(f_split_parts[1L]),
                 stringr::str_trim(f_split_parts[2L])))
      } else {

        return(part)
      }
    }

    # Remove all line comments
    remove_line_comments <- function(file_content) {
      processed_lines <- character()

      for (line in file_content) {
        # Remove single-line comments
        line <- sub("#.*", "", line)

        # Add the processed line to the result
        processed_lines <- c(processed_lines, line)
        # Multiline comments are not processed
      }
      return(processed_lines)
    }

    # Get doc line numbers
    get_doc_lines <- function(file_content) {
      inside_doc <- FALSE
      for (line_nr in seq_along(file_content)) {
        if (startsWith(x = file_content[line_nr],
                       prefix = "doc")) {

          inside_doc <- TRUE
          start <- line_nr

        } else if (inside_doc &&
                   (startsWith(file_content[line_nr], "class") ||
                    startsWith(file_content[line_nr], "def"))) {

          end <- line_nr - 1L
          return(c(start, end))
        }

        if (line_nr == length(file_content)) {

          if (inside_doc) {
            # This should not happen because doc is always at
            # the beginning of a page! But its still there in case there is a
            # messy code

            end <- line_nr
            return(c(start, end))
          } else {
            return(NULL)
          }
        }
      }
    }

    # Get class line numbers
    get_class_lines <- function(file_content, class) {

      inside_class <- FALSE
      if (grepl(pattern = "Constants|C", x = class)) {

        class <- "Constants\\s*\\(|C\\s*\\("
      }

      for (line_nr in seq_along(file_content)) {
        if (stringr::str_detect(file_content[line_nr],
                                paste0("^class ", class))) {

          inside_class <- TRUE
          start <- line_nr

        } else if (inside_class &&
                   startsWith(x = file_content[line_nr],
                              prefix = "class")) {

          end <- line_nr - 1L
          return(c(start, end))
        }

        if (line_nr == length(file_content)) {
          end <- line_nr
          return(c(start, end))
        }
      }
    }

    # Clean matches
    collapse_and_clean_matches <- function(matches) {

      matches <- paste(matches, collapse = "\n") # Put all matches in one string

      # Remove all spaces at the beginning and after each \n
      matches <- gsub(pattern = "\n *",
                      replacement = "\n",
                      x = matches)

      # To make the lookbehind easier down there
      matches <- gsub(pattern = "^\\s*",
                      replacement = "\n",
                      x = matches)

      return(matches)
    }

    # Replace constants values references by actual values
    cons_replace <- function(string, filevars, folder_name, env = env) {
      # Replace Constants with the constants variable.

      pattern <- "(Constants\\.[^ ]+)|(C\\.[^ ]+)"

      # Find all references to constants
      consmatches <- stringr::str_extract_all(string, pattern)
      consmatches <- unlist(consmatches)

      if (!is.null(consmatches) && length(consmatches) > 0L) {

        for (fullvarpattern in consmatches) {

          if (!is.na(fullvarpattern)) {

            var <- sub(pattern = "(Constants\\.)|(C\\.)",
                       replacement = "",
                       x = fullvarpattern)

            myreplacement <- filevars[["Constants"]][[var]]

            if (!is.null(myreplacement)) {
              # First remove possible preceding +
              # (in Python, a + adds strings together)

              # Part before
              string <- sub(pattern = paste0("['\"]?",
                                              "\\s*",
                                              "\\+",
                                              "\\s*", fullvarpattern),
                            replacement = fullvarpattern,
                            x = string)

              # Part after
              string <- sub(pattern = paste0(fullvarpattern,
                                             "\\s*",
                                             "\\+",
                                             "\\s*",
                                             "['\"]?"),
                            replacement = fullvarpattern,
                            x = string)

              # Replace the value
              string <- sub(pattern = fullvarpattern,
                            replacement = myreplacement,
                            x = string)
            } else {
              env$warnings <-
                c(env$warnings,
                  paste0("Variable ", fullvarpattern,
                     " in folder ", folder_name,
                     " is not in Constants and cannot be replaced:"))
            }
          }
        }
      }

      return(string)
    }

    # Replace settings values references by actual values
    replace_settings_f <- function(mystring,
                                 folder_name, # app
                                 combined_codebook,
                                 user_settings,
                                 settings_replace,
                                 env,
                                 e_variable = NULL,
                                 e_key = NULL
                                 ) {

      mystring <- mystring[[1L]]

       # Do nothing if NULL, NA or empty
       if (is.null(mystring) ||
           (length(mystring) == 1L && is.na(mystring))) {

        env$warnings <-
          c(env$warnings,
            "There is an unusual variable in your data! Variable: ",
            e_variable, ".")

        return(mystring)

       } else if (length(mystring) == 1L && mystring == "") {

         return(mystring)
       }

      if (!is.character(mystring)) {
        return(mystring)
      }

      pattern <- "(?<!\\\")settings\\.[_a-zA-Z0-9]+"

      # Check for sublists not necessary here because
      # they are already sublists!

      # Check if the string refers to a settings variable
      settings_matches <-
        unlist(stringr::str_extract_all(mystring, pattern))

      if (!is.null(settings_matches) &&
          length(settings_matches) > 0L) {

        for (fullvarpattern in settings_matches) {

          myreplacement <- NULL

          if (!is.na(fullvarpattern)) {

            # Remove "settings." part of the variable name
            settings_var <- sub(pattern = "settings\\.",
                                replacement = "",
                                x = fullvarpattern)

            # Remove any brackets from the variable name
            settings_var <- gsub(pattern = "\\(",
                                 replacement = "",
                                 x = settings_var)

            settings_var <- gsub(pattern = "\\)",
                                 replacement = "",
                                 x = settings_var)

            if (!is.null(settings_replace) &&
                settings_replace == "global") {

              if (is.null(combined_codebook[["settings"]][[settings_var]])) {

                env$settingslist <- c(env$settingslist,
                                      paste0("> $", folder_name,
                                             "$", e_variable,
                                      ", reference \"settings.", settings_var,
                                      "\"\n"))
              } else {
                myreplacement <- combined_codebook[["settings"]][[settings_var]]
              }

            } else if (!is.null(settings_replace) &&
                       settings_replace == "user") {

              if (!is.null(user_settings) &&
                  settings_var %in% names(user_settings)) {

                myreplacement <- user_settings[[settings_var]]

              } else {
                env$usettingslist <- c(env$usettingslist,
                                       paste0("> $", folder_name,
                                              "$", e_variable,
                                              ", reference \"settings.",
                                              settings_var,
                                              "\"\n"))
              }
            }

            # Replace variable within the whole string
            if (!is.null(myreplacement)) {
              if (length(myreplacement) == 1L) {

                # Replace single value
                mystring <- sub(pattern = fullvarpattern,
                                replacement = myreplacement,
                                x = mystring)

              } else  {
                if (grepl(mystring,
                          pattern = paste0("^", fullvarpattern, "$"))) {

                    mystring <- myreplacement
                } else {
                  # Here exceptionally with c() because
                  # of future calculations with it
                  mystring <- sub(pattern = fullvarpattern,
                                 replacement = paste0("c(", paste(myreplacement,
                                                            collapse = ", "),
                                                      ")"),
                                  x = mystring)

                }

              }
            } else {
                # Do nothing! env$(u)settingslist was filled above
            }
          }
        }
      } else {
        # If there are no references to settings, return string

        return(mystring)
      }

      # If numeric, then evaluate
      if (length(mystring) == 1L) {

        # Here, we can also see sublists e.g. "c(1,2,3)"
        mystring <- evaluatestring(mystring)

      } else {

        # Evaluate single elements
        for (mystring_i in seq_along(mystring)) {

          val <- mystring[mystring_i]
          mystring[mystring_i] <- evaluatestring(val)
        }
      }

      # Might return an integer but that will be a string
      # as soon as it replaces the old variable!
      return(mystring)
    }

    # Repair lists
    repair_list <- function(x) {

      if (!is.list(x)) {
        if (length(x) == 1L) {

          return(x)
        } else {
          x <- as.list(x)

          return(x)
        }
      }

      recursive_flatten <- function(lst) {
        lapply(lst, function(element) {
          if (is.list(element) && length(element) == 1L) {
            return(element[[1L]])  # Extract the single element
          } else if (is.list(element)) {
            return(recursive_flatten(element))  # Recur for nested lists
          } else {
            return(element)
          }
        })
      }

      return(recursive_flatten(x))
    }

  # Functions to process a file  ####
  process_settingspy <- function(file_path, env) {

    file_path <- file.path(file_path, "settings.py")

    folder_name <- basename(dirname(file_path))
    file_content <- readLines(file_path, warn = FALSE)
    file_content <- remove_line_comments(file_content)
    settings <- list()

    # Get variables
    filevars <- const_sett_vars(matches = file_content,
                                current_class = "settings",
                                filevars = settings,
                                normalspace = 0L,
                                folder_name = folder_name,
                                env = env)

    return(filevars)
  }

  process_file <- function(file_path,
                           folder_name,
                           combined_codebook = combined_codebook,
                           env = env) {

    file_content <- readLines(file_path, warn = FALSE)

    # Sometimes, init.py only has 1 line in old oTree
    if (length(file_content) <= 2L) {
      stop("At least one of your init-files is empty. ",
           "Try using the argument \"fsource = \'model\'\".")
    }

    file_content <- remove_line_comments(file_content)
    doc_lines <- get_doc_lines(file = file_content)
    constants_lines <- get_class_lines(file = file_content, "Constants")

    group_lines <- get_class_lines(file = file_content, "Group")
    player_lines <- get_class_lines(file = file_content, "Player")
    subsession_lines <- get_class_lines(file = file_content, "Subsession")

    current_class <- ""
    filevars <- list()

    for (line_nr in seq_along(file_content)) {
      # The first time the class is mentioned
      # the class is set for the next lines

      # App documentation  ####
      if (!is.null(doc_lines) &&
          line_nr == doc_lines[[1L]]) {

        matches <- file_content[(line_nr):doc_lines[[2L]]]

        matches <- paste(matches, collapse = " ")

        matches <- gsub(x = matches,
                        pattern = "^doc",
                        replacement = "")

        matches <- clean_string(string = matches,
                                quotes = TRUE,
                                current_class = current_class,
                                folder_name = folder_name,
                                variable = NULL)

        matches <- removefirstlastquote(matches)

        filevars[["doc"]] <- matches
      }

      # Constants  ####
      if (line_nr == constants_lines[1L]) {

        current_class <- "Constants"
        matches <- file_content[(line_nr + 1L):constants_lines[2L]]

        # Count the spaces at the beginning of each line
        cons_normalspace <- gregexpr("^\\s+", matches[1L])

        cons_normalspace <- attr(cons_normalspace[[1L]],
                                 "match.length")

        # Get variables
        filevars <- const_sett_vars(matches = matches,
                                    current_class = current_class,
                                    filevars = filevars,
                                    normalspace = cons_normalspace,
                                    folder_name = folder_name,
                                    env = env)

        # Clean constants  ####
        for (cons_var_i in seq_along(filevars[["Constants"]])) {

          # If there is a second level
          if (length(filevars[["Constants"]][[cons_var_i]]) > 1L) {

            for (cons_l2 in seq_along(filevars[["Constants"]][[cons_var_i]])) {

              # Delete print commands
              filevars[["Constants"]][[cons_var_i]][[cons_l2]] <-
                delprint(filevars[["Constants"]][[cons_var_i]][[cons_l2]])

              # Replace settings references with the actual variables  ####
              if (length(filevars[["Constants"]][[cons_var_i]][[cons_l2]]) ==
                  1L) {
                filevars[["Constants"]][[cons_var_i]][[cons_l2]] <-
                  replace_settings_f(
                    mystring = filevars[["Constants"]][[cons_var_i]][[cons_l2]],
                    folder_name = folder_name,
                    combined_codebook = combined_codebook,
                    user_settings = user_settings,
                    settings_replace = settings_replace,
                    e_variable = paste0(
                      names(filevars[["Constants"]])[[cons_var_i]],
                      ", element: ",
                      cons_l2
                    ), env = env
                  )

                # Remove first and last quote  ####
                filevars[["Constants"]][[cons_var_i]][[cons_l2]] <-
                  removefirstlastquote(
                    filevars[["Constants"]][[cons_var_i]][[cons_l2]]
                )


              } else {

                filevars[["Constants"]][[cons_var_i]][[cons_l2]] <- sapply(
                  filevars[["Constants"]][[cons_var_i]][[cons_l2]],
                  replace_settings_f,
                  folder_name = folder_name,
                  combined_codebook = combined_codebook,
                  user_settings = user_settings,
                  settings_replace = settings_replace,
                  e_variable = paste0(
                    names(filevars[["Constants"]])[[cons_var_i]],
                    ", element: ",
                    cons_l2),
                  env = env,
                  simplify = FALSE
                )

                # Remove first and last quote
                filevars[["Constants"]][[cons_var_i]][[cons_l2]] <-
                  sapply(filevars[["Constants"]][[cons_var_i]][[cons_l2]],
                    removefirstlastquote,
                    simplify = FALSE
                )
              }


            }

          } else {

            # Delete print command
            filevars[["Constants"]][[cons_var_i]] <-
              delprint(filevars[["Constants"]][[cons_var_i]])

            # Replace all references to the settings with the actual variables
            if (is.character(filevars[["Constants"]][[cons_var_i]])) {
              repl <- replace_settings_f(
                mystring = filevars[["Constants"]][[cons_var_i]],
                folder_name = folder_name,
                combined_codebook = combined_codebook,
                user_settings = user_settings,
                settings_replace = settings_replace,
                e_variable = names(filevars[["Constants"]])[[cons_var_i]],
                env = env)

              filevars[["Constants"]][[cons_var_i]] <- repl

              # Remove first and last quote
              filevars[["Constants"]][[cons_var_i]] <- removefirstlastquote(
                filevars[["Constants"]][[cons_var_i]]
              )

            }
          }

          # Repair lists /vectors  ####
          filevars[["Constants"]][[cons_var_i]] <-
            repair_list(filevars[["Constants"]][[cons_var_i]])
        }
      }

      # Player, Group and Subsession  ####
      if (line_nr == player_lines[[1L]] ||
          line_nr == group_lines[[1L]] ||
          line_nr == subsession_lines[[1L]]
          ) {

        # Get all class text
        if (line_nr == player_lines[1L]) {
          matches <- file_content[(line_nr + 1L):player_lines[2L]]
          current_class <- "Player"

        } else if (line_nr == group_lines[1L]) {
          matches <- file_content[(line_nr + 1L):group_lines[2L]]
          current_class <- "Group"

        } else if (line_nr == subsession_lines[1L]) {
          matches <- file_content[(line_nr + 1L):subsession_lines[2L]]
          current_class <- "Subsession"
        }

        matches <- collapse_and_clean_matches(matches)

        # If there is no class info
        if (
          stringr::str_detect(trimws(matches[1L]),
                              "^pass$")) {
          filevars[[current_class]] <- "Pass"
          next
        }

        # Get variables  ####

        # Variable names  ####
        variables <- unlist(regmatches(
          x = matches,
          m = gregexpr(pattern = "\n[a-zA-Z_0-9]+ *(?= *= *models)",
                       text = matches,
                       perl = TRUE)))

        # Strip spaces etc.
        variables <- trimws(variables)

        # Variable values  ####
        for (variables_i in seq_along(variables)) {

          variable <- variables[variables_i]

          if (variables_i < length(variables)) {
            pattern <- paste0("(?<=\n", variables[variables_i], ")",
                              " *=[\\s\\S]*",
                              "(?=\n", variables[variables_i + 1L], " *=)")

          } else {
            # Last variable until the end
            pattern <- paste0("(?<=\n",
                              variables[variables_i],
                              ")", " *=[\\s\\S\\\\n]*"
            )
          }

          # Create variable in filevars
          filevars[[current_class]][[variable]] <- list()

          # Varmatches
          varmatches <- unlist(regmatches(
            x = matches,
            m = gregexpr(pattern = pattern,
                         text = matches,
                         perl = TRUE)))

          # Remove possible subsequent functions from matches  ####
          varmatches <- sub(
            pattern = "(\\ndef )[\\s\\S\\\\n]*",
            replacement = "",
            x = varmatches,
            perl = TRUE
          )

          # Remove possible subsequent if statements from matches  ####
          varmatches <- sub(
            pattern = "(\\nif )[\\s\\S\\\\n]*",
            replacement = "",
            x = varmatches,
            perl = TRUE
          )

          # Remove print from matches
          varmatches <- delprint(varmatches)

          # Get variable information  ####
          # Get field
          field <-
            stringr::str_extract(varmatches, "(?<=models\\.)[^(]+")

          # Remove field from matches
          varmatches <- sub(
            pattern = paste0(" *= *models\\.", field),
            replacement = "",
            x = varmatches,
            perl = TRUE
          ) # First bracket stays but this is okay and stripped later.

          # Remove last part of matches
          if (grepl(x = varmatches,
                    pattern = "\\)[\n ]*$",
                    perl = TRUE)) {

            # Remove last closing bracket
            varmatches <- sub(
              pattern = "\\,*[\n ]*\\)[\n ]*$",
              replacement = "",
              x = varmatches,
              perl = TRUE
            )

          }

          # If there are no arguments
          if (stringr::str_detect(string = varmatches,
                                  pattern = "[a-zA-Z][^\\n]",
                                  negate = TRUE)) {

            varmatches <- "noargs = TRUE"
          } else {
            varmatches <- paste("noargs = FALSE, ",
                                varmatches, sep = " ")
          }

          # Variable information   ####
          # First split its content at every = sign  ####

          # Check for unescaped equal signs in choice options
          # = within square brackets
          list_with_equals_pattern <- "\\[[^\\]]*[^\\\\]=[^\\]]*\\]"

          if (grepl(pattern = list_with_equals_pattern,
                    x = varmatches,
                    perl = TRUE)) {

            paste(variable)
            env$equalvariables <- c(env$equalvariables,
                                    paste0("\n> $", folder_name, "$",
                                           current_class, "$",
                                           variable))

            next
          }

          split_pattern <- "(?<!\\\\) *= *" # Only non-escaped equal signs
          parts <- str_split(stringr::str_trim(varmatches), split_pattern)[[1L]]

          # Sometimes, there is a comma at the end
          parts[length(parts)] <- sub(pattern = ",\\\n)$",
                                      replacement = "",
                                      x = parts[length(parts)])

          # Combine
          if (length(parts) == 2L) {
            parts <- c(parts[1L], parts[2L])
          } else if (length(parts) > 2L) {

            # Now the value of one variable is together with
            # the variable name of the next variable
            # Apply split_at_last_comma to each element
            # except the first and last

            split_parts <- unlist(lapply(parts[2L:(length(parts) - 1L)],
                                         split_at_last_comma))

            parts <- c(parts[1L],
                       split_parts,
                       parts[length(parts)])
          } else {
            stop("An unexpected error occurred. ",
                 "Please contact the maintainer with details.")
          }

          if (length(parts) %% 2L != 0L) {
            env$equalvariables <- c(env$equalvariables,
                                    paste0("\n> $", folder_name, "$",
                                           current_class, "$",
                                           variable))

            next

          } else {

          }

          # Make key value frame  ####
          # Create an empty list to store your kv_frame
          kv_frame <- data.frame(key = c(),
                                 value = c())

          # Iterate over the vector and fill the kv_frame
          for (j in seq(1L, length(parts), by = 2L)) {
            key <- parts[j]
            value <- parts[j + 1L]
            kv_frame <- rbind(kv_frame,
                              data.frame(key = key,
                                         value = value))
          }

          # Last strip
          kv_frame$key <- gsub(x = kv_frame$key,
                               "\\n",
                               "")

          # Clean key
          kv_frame$key <-
            sapply(kv_frame$key,
                   clean_string,
                   quotes = TRUE,
                   current_class = current_class,
                   folder_name = folder_name,
                   variable = variable)

          # Choices need to be specified  #####
          if ("choices" %in% kv_frame$key) {

            text <- kv_frame$value[kv_frame$key == "choices"]

            # Remove trailing and leading whitespace
            text <- trimws(text)

            # In case the kv_frame works with square brackets
            numbrackets <- length(unlist(gregexpr(pattern = "\\[",
                                               text = text)))

            if (numbrackets > 1L) {  # If key - value pairs
              # Replace first and last square brackets
              text <- sub(x = text,
                          pattern = "^\\[",
                          replacement = "")

              text <- sub(x = text,
                          pattern = "\\][^]]*$",
                          replacement = "")

              # Extract  each [ ... ] block
              text <- gsub(pattern = "\n",
                           replacement  = "",
                           x = text,
                           perl = TRUE)

              text <-
                unlist(stringr::str_match_all(text,
                                              pattern = "\\[.*?\\]"))

              # If choices, combine into a single data frame
              # (not dict because values can appear several times)
              choices <- data.frame(
                choices_key <- c(),
                choices_value <- c())

              for (elem in text) {

                # Split the element into key and value
                parts <- stringr::str_split(string = elem,
                                            pattern = ",",
                                            n = 2L)[[1L]]

                # Clean and assign key and value
                # Key
                choices_key <- clean_string(string = parts[1L],
                                            quotes = TRUE,
                                            current_class = current_class,
                                            folder_name = folder_name,
                                            variable = variable)
                choices_key <- removefirstlastquote(choices_key)

                # Value
                choices_value <- clean_string(string = parts[2L],
                                              quotes = TRUE,
                                              equal = TRUE,
                                              current_class = current_class,
                                              folder_name = folder_name,
                                              variable = variable)

                choices_value <- cons_replace(choices_value, filevars,
                                              folder_name, env = env)

                choices_value <- replace_settings_f(
                  mystring = choices_value,
                  folder_name = folder_name,
                  combined_codebook = combined_codebook,
                  user_settings = user_settings,
                  settings_replace = settings_replace,
                  e_variable = variable,
                  env = env)

                choices_value <- removefirstlastquote(choices_value)

                # Return key-value pair
                choices <- rbind(choices,
                                 data.frame(
                                   key = choices_key,
                                   value = choices_value))

              }
            } else if (numbrackets == 1L) {

              # If not key-value pairs. E.g. choices=[1, 2, 3]

              # Replace first and last square brackets
              text <- sub(x = text,
                          pattern = "^\\[",
                          replacement = "")

              text <- sub(x = text,
                          pattern = "\\][^]]*$",
                          replacement = "")

              # Combine into a single data frame
              # (not dict because values can appear several times)
              # Important: Check Caution 2!

              choices <-
                stringr::str_split(text, "(?<!\\\\),", n = Inf)[[1L]]

              # Make escaped commas normal again
              choices <- gsub(x = choices,
                              pattern = "\\\\,",
                              replacement = ",",
                              perl = TRUE)

              # Clean choices
              choices <- sapply(choices,
                                clean_string,
                                quotes = TRUE,
                                current_class = current_class,
                                folder_name = folder_name,
                                variable = variable)

              choices <- sapply(choices,
                                removefirstlastquote)

              choices <- cons_replace(choices, filevars,
                                      folder_name, env = env)

              choices <- sapply(
                choices,
                replace_settings_f,
                folder_name = folder_name,
                combined_codebook = combined_codebook,
                user_settings = user_settings,
                settings_replace = settings_replace,
                e_variable = variable,
                env = env)

              # Info: Here vector, because variable values are all the same type
              choices <- as.vector(choices)
            }

            # Remove it from kv_frame
            kv_frame <- kv_frame[kv_frame$key != "choices", ]
            text <- NULL
          }

          # Prettify variable information  ####
          kv_frame$value <- lapply(seq_along(kv_frame$value), function(i) {

            clean_string(
              kv_frame$value[[i]],
              folder_name = folder_name,
              quotes = TRUE,
              current_class = current_class,
              variable = variable
            )
          })

          # Replace constant variable references
          # with actual constant variables
          kv_frame$value <- cons_replace(kv_frame$value,
                                         filevars,
                                         folder_name,
                                         env = env)

          # Replace settings variable references with
          # actual settings variables
          for (k in seq_along(kv_frame$value)) {

            kv_frame$value[k] <-
              replace_settings_f(mystring = kv_frame$value[k],
                                 folder_name = folder_name,
                                 combined_codebook = combined_codebook,
                                 user_settings = user_settings,
                                 settings_replace = settings_replace,
                                 e_variable = variable,
                                 e_key = kv_frame$key[k],
                                 env = env)

            # Try to evaluate
            kv_frame$value[k] <- evaluatestring(kv_frame$value[k])

            # Last removal of leading and trailing "
            kv_frame$value[k] <- removefirstlastquote(kv_frame$value[k])
          }

          kv_frame$value <- lapply(kv_frame$value,
                                   removefirstlastquote)

          # Get everything (except choices and field) into the variable
          filevars[[current_class]][[variable]] <-
            stats::setNames(as.list(kv_frame$value), kv_frame$key)

          # Get choices again
          if (exists("choices") && length(choices) != 0L) {

            filevars[[current_class]][[variable]][["choices"]] <- choices
          }

          # Get field again
          filevars[[current_class]][[variable]][["field"]] <- field

          # Change noargs to logical
          filevars[[current_class]][[variable]][["noargs"]] <-
            as.logical(filevars[[current_class]][[variable]][["noargs"]])

          # If there is no documentation, add this to info
          if (!("doc" %in% names(filevars[[current_class]][[variable]])) &&
              !("label" %in% names(filevars[[current_class]][[variable]])) &&
              !("verbose_name" %in%
                names(filevars[[current_class]][[variable]]))) {

            env$nodocs <- c(env$nodocs,
                            paste0("$", folder_name, "$",
                                   current_class, "$", variable))
          }

          # Delete kv_frame
          kv_frame <- NULL
          choices <- NULL
        }
      }

    }
    return(filevars)
  }

  # Function to process a directory  ####
  process_directory <- function(path,
                                combined_codebook,
                                files = files,
                                settings_replace = settings_replace,
                                app = app,
                                app_rm = app_rm,
                                env = env) {

    # Files on highest level
    settingsfiles <- list.files(path,
                              pattern = "settings\\.py",
                              full.names = TRUE,
                              recursive = FALSE)

    if (length(settingsfiles) == 1L &&
        !is.null(settings_replace) &&
        settings_replace == "global") {

      combined_codebook <- process_settingspy(file_path = path)

    } else if (length(settingsfiles) == 0L) {
        env$settingspy <- FALSE
    }

    for (file_path in files) {

      folder_name <- basename(dirname(file_path))

      if ((is.null(app_rm) && is.null(app)) ||
          (!is.null(app_rm) && !(folder_name %in% app_rm)) ||
          (!is.null(app) && folder_name %in% app)) {

        combined_codebook[[folder_name]] <-
          process_file(file_path = file_path,
                       folder_name = folder_name,
                       combined_codebook = combined_codebook,
                       env = env)
      }
    }
    return(combined_codebook)
  }

  # Run process_directory ####
  combined_codebook <- list(user_settings = list())
  combined_codebook <- process_directory(path,
                                         combined_codebook,
                                         files = files,
                                         settings_replace = settings_replace,
                                         app = app,
                                         app_rm = app_rm,
                                         env = env)

  # Stop if there were problems
  if (length(env$equalvariables) > 0L) {
    stop("\nThe following variable(s) cannot be read properly by gmoTree. ",
         "\nPlease escape any equal signs in the values of the oTree code!",
         paste0(env$equalvariables, collapse = ""))
  }

  # Adjust settings  ####
  if ("settings" %in% names(combined_codebook)) {
    combined_codebook[["settings"]][nonvariables] <- NULL
  }

  # Sort apps in codebook  ####
  if (!is.null(sort)) {

    sort <- c("settings", sort)

    if (
      length(sort) == length(names(combined_codebook)) &&
      setequal(sort, names(combined_codebook))) {

      combined_codebook <- combined_codebook[sort]

    } else {
      if (length(sort[!(sort %in% names(combined_codebook))]) > 0L) {
        p1 <- paste0("\n\nSort elements not in apps are: ",
              paste(sort[!(sort %in% names(combined_codebook))],
                    collapse = ", "))
      } else {
        p1 <- ""
      }

      if (length(names(combined_codebook)[!(names(combined_codebook) %in%
                                            sort)]) > 0L) {
        p2 <-
          paste0("\n\nApps not in sort are: ",
                paste(names(combined_codebook)[!(names(combined_codebook) %in%
                                                   sort)],
                      collapse = ", "))
      } else {
        p2 <- ""
      }

      env$warnings <-
        c(env$warnings,
          paste0("Sort apps are not equal to all apps. Therefore, ",
              "sort is not applied. ", p1, p2))
    }
  }

  # Make output file  ####
    if (output == "file" || output == "both") {

      # If other files already have this name   ####
      nr_suffix <- 0L

      # Output extension as in output_format
      output_form_ext <- sub(pattern = "_.*$",
                             replacement = "",
                             x = output_format)

      output_form_ext[output_form_ext == "word"] <- "docx"
      output_form_ext[output_form_ext == "latex"] <- "tex"

      # Check if file extension is already in file name (strip if yes)
      output_file <- sub(pattern = paste0("\\.",
                                          output_form_ext,
                                          "$"),
                         replacement = "",
                         x = output_file)

      # Check for non-fitting file extensions
      if (!(tolower(tools::file_ext(output_file)) == "" ||
            tolower(tools::file_ext(output_file)) == tolower(output_form_ext)
      )) {
        stop("You are not allowed to use dots in your output_file names or ",
              "file extensions in the ",
             "output_file that do not correspond to the output format! ",
             "Your output_file extension is ",
             tools::file_ext(output_file),
             ". The extension according to your output_format should be ",
             output_form_ext, ".")
      }

      # Define dictionary that has to be checked
      checkdir <- dirname(output_file)

      # Check if there are files with the same name in the folder
      nr_doc_same <- sum(
        grepl(pattern = paste0("^", basename(output_file),
                               "[_\\d]*\\.", output_form_ext),
              x = list.files(checkdir),
              perl = TRUE))

      # If yes, add number to file
      if (nr_doc_same > 0L) {
        nr_suffix <- nr_doc_same + 1L
        output_file <- paste0(output_file, "_", nr_suffix)
      }

      # Make parameters  ####
      params2 <- list(
        app_doc = app_doc,
        include_cons = include_cons,
        include_subs = include_subs,
        title = title,
        date = date,
        subtitle = subtitle,
        encoding = encoding,
        combined_codebook = combined_codebook,
        splitvarname = splitvarname,
        sep_list = sep_list,
        initial = initial)

      if (!is.null(params)) {

        if (is.list(params)) {
          params <- utils::modifyList(params2, params)

        }
      } else {
        params <- params2
      }

      if (!is.null(params[["date"]]) && params[["date"]] == "today") {
        params[["date"]] <- format(Sys.time(), "%d %B %Y")
      }

      # Make output  ####

        # Specify output_format
        output_format2 <- output_format
        output_options <- NULL

        pdflist <- list(pdf = FALSE)
        latexengine <- list(latex_engine = NA)

        if (output_format2 == "pdf_document") {

          # Xelatex better for multilingual documents
          output_format2 <- rmarkdown::pdf_document(
            latex_engine = "xelatex",
            md_extensions = "-smart")

          pdflist <- list(pdf = TRUE)
          latexengine <- list(latex_engine = "xelatex")

          # Count longest variable value
          maxlen <- 0L

          for (folder in names(combined_codebook)) {

            if (folder != "settings" && folder != "user_settings") {

              for (class in names(combined_codebook[[folder]])) {

                thiscodebookclass <- combined_codebook[[folder]][[class]]

                if (class != "doc" &&
                    !is.null(thiscodebookclass)) {

                    for (variable in
                         names(thiscodebookclass)) {

                      if ((class == "Player" ||
                          class == "Group" ||
                          class == "Subsession") &&
                          "choices" %in%
                          names(thiscodebookclass[[variable]])) {

                        lenofval <-
                          length(
                            thiscodebookclass[[variable]][["choices"]])

                        maxlen <- pmax(lenofval, maxlen)

                      } else if (class == "Constants") {

                        lenofval <-
                          length(
                            thiscodebookclass[[variable]])

                        maxlen <- pmax(lenofval, maxlen)
                      }
                    }
                }
              }
            }
          }

          # Check for many variable values
          if (maxlen > 20L) {
            # 20 is tested on my computer. There might be better solutions!
            warning("One of your variables has many values ",
              "(no of values/sublists = ",
               maxlen,
               ") and may cause serious problems in the PDF output! ",
               "(Some PDF viewers such as NITRO might struggle with it.) ",
               "If you experience any problems, use \"output_format = ",
               "pdf_document_simple\", first knit to Latex, or open ",
               "and save again with a PDF reader that can handle ",
               "long table cells. ")
          }

        } else if (output_format2 == "pdf_document_simple") {
          output_format2 <- rmarkdown::pdf_document(
            md_extensions = "-smart")

          pdflist <- list(pdf = TRUE)
          latexengine <- list(latex_engine = "pdflatex")

        } else if (output_format2 == "html_document") {
          output_format2 <- rmarkdown::html_document(md_extensions = "-smart")
          pdflist <- list(pdf = FALSE)

        } else if (output_format2 == "latex_document") {
          output_format2 <- rmarkdown::latex_document(
            md_extensions = "-smart")

          pdflist <- list(pdf = TRUE)
          latexengine <- list(latex_engine = "")

        } else {
          pdflist <- list(pdf = FALSE)
          latexengine <- list(latex_engine = NA)
        }

        params <- utils::modifyList(pdflist, params)
        params <- utils::modifyList(latexengine, params)

        # Render file
        # Don't use output_dir here,
        # because that's already included in file name!

        created_file <- rmarkdown::render(
          input = system.file("rmd", "codebook.Rmd", package = "gmoTree"),
          output_format = output_format2,
          output_file = output_file,
          params = params,
          quiet = FALSE,
          output_options = output_options,
          clean  = TRUE # Encoding is ignored here! Always UTF-8
        )

        # Open
        created_file <- normalizePath(created_file)

        if (output_open) {
          utils::browseURL(created_file)
        }
        message("File saved in ", created_file)
  }

  # Message: Variables with no documentation info  ####
      if (!is.null(doc_info) &&
          !is.na(doc_info) &&
          doc_info &&
          !(length(env$nodocs) == 0L)) {

          message(
            "Variables without documentation, label, or verbose name:\n",
            paste0("> ", env$nodocs, collapse = "\n"))
      }

  # Last check if there is complex code in the variables and return vector ####
      # Function to recursively check for the string "float"
      # around variable values and return paths
      float_check_paths <- function(codebook, path = "") {

        # List to collect paths
        collected_paths <- list()

        # If the element is a list, recurse deeper
        if (is.list(codebook)) {
          for (name in names(codebook)) {
            # Recursively collect apps and variable names
            deeper_paths <-
              float_check_paths(codebook[[name]],
                                paste0(path, "$", name))
            collected_paths <- c(collected_paths, deeper_paths)
          }
        } else {

          # Add the current path to the list if "float" is found
          if (length(codebook) == 1L &&
              is.character(codebook) &&
              grepl("float(?!Field)",
                    codebook,
                    ignore.case = TRUE,
                    perl = TRUE)) {

            collected_paths <- c(collected_paths, path)
          }
        }

        return(collected_paths)
      }

      complex2 <- float_check_paths(codebook = combined_codebook)
      complex2 <- unlist(complex2)

      if (length(complex2) > 0L) {
      complex2 <- paste(">", complex2, "(float)\n")
      }

      env$complexcons <- c(env$complexcons, complex2)

      # Show warning if there is complex code in Constants,
      # Player, Group or settings
      if (length(env$complexcons) > 0L) {
        env$warnings <-
          c(env$warnings,
            paste0("Some variables or code parts contain code that ",
                "is too complex for this function. ",
                "Hence, this function might have overseen ",
                "important variables and references to them. ",
                "Found in:\n",
                paste(env$complexcons, collapse = "")))
      }

  # Return warnings  ####

    # Warning message regarding global settings variables
    if (length(env$settingslist) > 0L &&
        !is.null(settings_replace) &&
        settings_replace == "global") {

      if (env$settingspy) {
        env$warnings <-
          c(env$warnings,
            paste0("The following settings variable/s is/are ",
                "not in settings and ",
                "cannot be replaced:\n",
                paste0(env$settingslist, collapse = "")))
      } else {
        env$warnings <-
          c(env$warnings, paste0("There is no settings.py in your path! ",
                "The following settings variable/s is/are not in settings and ",
                "cannot be replaced:\n",
                paste0(env$settingslist, collapse = "")))
      }

    }

    # Warning message regarding user settings variables
    if (length(env$usettingslist) > 0L &&
               !is.null(settings_replace) &&
               settings_replace == "user") {

      env$warnings <-
        c(env$warnings,
          paste0("The following settings variable/s is/are ",
                 "not in user_settings and ",
                 "cannot be replaced:\n",
                 paste0(env$usettingslist, collapse = "")))

    }

    if (length(env$warnings) > 0L) {
      env$warnings <- paste(env$warnings, collapse = "\n\n")
      warning(env$warnings)
    }

  # Return list  ####

  if (output == "list" || output == "both") {
    return(combined_codebook)
  }
}

Try the gmoTree package in your browser

Any scripts or data that you put into this service are public.

gmoTree documentation built on April 3, 2025, 10:26 p.m.