gmcmapply: expand.grid-based multi-core multivariate apply (mcmapply)...

View source: R/gmcmapply.R

gmcmapply    R Documentation

expand.grid-based multi-core multivariate apply (mcmapply) wrapper for fast implementation of nested looping structures

Description

expand.grid-based multi-core multivariate apply (mcmapply) wrapper for fast implementation of nested looping structures

Usage

gmcmapply(mvars, FUN, SIMPLIFY = TRUE, mc.cores = 1, ...)

Arguments

mvars

a named list of variables to be combined and mapped over. These are equivalent to the layers of a nested for loop. If desired, mvars can also be passed as a customized expand-grid-like data.frame.
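As a rough illustration of what "combined and mapped over" means, the sketch below builds the combination grid directly with base R's expand.grid. The name custom_grid and the filtering rule are made up for illustration, and this is not necessarily how gmcmapply constructs or orders its grid internally.

```r
# The same named list used in the Examples section
mvars <- list(l1 = 1:3, l2 = 1:5, l3 = letters[1:3])

# One row per combination of l1, l2 and l3
grid <- expand.grid(mvars, KEEP.OUT.ATTRS = FALSE, stringsAsFactors = FALSE)
dim(grid)      # 45 rows (3 * 5 * 3) and 3 columns
head(grid, 3)

# A customized, expand.grid-like data.frame (hypothetical rule):
# keep only the combinations where l1 <= l2
custom_grid <- grid[grid$l1 <= grid$l2, ]
```

Each column of the grid plays the role of one loop variable, and each row is one iteration of the equivalent nested for loop.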

FUN

user-defined function to apply over the variables in mvars. N.B. gmcmapply uses the names of mvars as formal arguments to FUN, so most FUNs do not need any arguments specified explicitly. As long as the variable names used in the body of FUN match names(mvars), gmcmapply handles the translation of names(mvars) to arguments of FUN.
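One way such a name-to-argument translation could work is by rewriting the formals of FUN. The sketch below is an assumption about the general technique, not a copy of the package source; add_formals is a hypothetical helper that does not exist in the package.

```r
# A function whose body refers to l1, l2 and l3 without declaring them
myfunc <- function(...) {
  paste(l3, l1^2 + l2, sep = "_")
}

# Hypothetical helper (not part of the package): prepend one empty formal
# argument per name in arg_names to the existing formals of FUN.
add_formals <- function(FUN, arg_names) {
  new_args <- rep(alist(x = ), length(arg_names))  # empty (no-default) arguments
  names(new_args) <- arg_names
  formals(FUN) <- c(new_args, formals(FUN))
  FUN
}

f <- add_formals(myfunc, c("l1", "l2", "l3"))
names(formals(f))            # "l1" "l2" "l3" "..."
f(l1 = 2, l2 = 1, l3 = "a")  # "a_5"
```

After the rewrite, l1, l2 and l3 are ordinary arguments, so the body of myfunc resolves them from the call rather than from the calling environment.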

SIMPLIFY

defaults to TRUE, which appends function output to a tibble of the expand.grid-ed mvars. FALSE exports a list, where length(list) = nrow(expand.grid(mvars)). TODO: implement an abstracted function that converts the list to a named, nested list when SIMPLIFY = FALSE.
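The two output shapes can be sketched with base R alone. This sketch assumes FUN returns one value per combination and uses a plain data.frame in place of a tibble; the column name fun_out follows the Examples section, and everything else is illustrative rather than the package's actual code.

```r
mvars <- list(l1 = 1:3, l2 = 1:5, l3 = letters[1:3])
grid <- expand.grid(mvars, KEEP.OUT.ATTRS = FALSE, stringsAsFactors = FALSE)

# Explicit arguments are used here to keep the sketch self-contained
f <- function(l1, l2, l3) paste(l3, l1^2 + l2, sep = "_")

# SIMPLIFY = FALSE analogue: one list element per row of the grid
out_list <- mapply(f, grid$l1, grid$l2, grid$l3, SIMPLIFY = FALSE)
length(out_list) == nrow(grid)

# SIMPLIFY = TRUE analogue: append the results to the grid as a column
out_df <- grid
out_df$fun_out <- unlist(out_list)
head(out_df, 3)
```

Keeping the inputs next to the output in one table makes it easy to filter or group results by any of the loop variables.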

mc.cores

number of cores to use when running in parallel. Defaults to 1, in which case the call runs serially via mapply.
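For orientation, the underlying serial and parallel calls can be compared directly. The sketch below uses mapply from base R and mcmapply from the parallel package; n_cores and f are illustrative names. Because mcmapply relies on forking, which is unavailable on Windows, the sketch falls back to a single core there.

```r
library(parallel)

f <- function(l1, l2) l1^2 + l2
grid <- expand.grid(l1 = 1:3, l2 = 1:5, KEEP.OUT.ATTRS = FALSE)

# Forking is not supported on Windows, so use one core there
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

serial_out   <- mapply(f, grid$l1, grid$l2)
parallel_out <- mcmapply(f, grid$l1, grid$l2, mc.cores = n_cores)

# Both calls should produce the same values in the same order
all.equal(serial_out, parallel_out)
```

For a function this cheap the overhead of forking outweighs any gain; extra cores pay off when each call to FUN is expensive.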

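...

additional arguments passed on to FUN. An argument supplied here can override the default of the same name in FUN.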

Value

compiled returns of FUN

TODO

1. Edit the body of FUN to create logs for all child processes using sink().
2. For SIMPLIFY = FALSE, add the abstracted function from https://stackoverflow.com/questions/55264739/convert-nested-data-frame-to-a-hierarchical-list to create a hierarchical named list.
3. Abstract the expand.grid sorting into a function and allow return of the grid that is fed to mapply.
4. Ensure list return when SIMPLIFY = TRUE.
5. Figure out a flexible method for df passing (potentially leave empty and read from the global environment).

Author(s)

Nate Hall

Examples


## Not run: 
     # Example 1:
     # just make sure variables used in your function appear as the names of mvars
     myfunc <- function(...){
       return_me <- paste(l3, l1^2 + l2, sep = "_")
       return(return_me)
     }

     mvars <- list(l1 = 1:3,
                   l2 = 1:5,
                   l3 = letters[1:3])


     ### list output (mapply)
     outs_gmc <- gmcmapply(mvars, myfunc, SIMPLIFY = FALSE)

     ## N.B. This is equivalent to running:
     outs <- vector(mode = "list", length = nrow(expand.grid(mvars)))
     step <- 1
     for(l1 in 1:3){
       for(l2 in 1:5){
         for(l3 in letters[1:3]){
           outs[[step]] <- myfunc(l1,l2,l3)
           step <- step +1
         }
       }
     }

     # identical(outs_gmc,outs) #TRUE

     ### tibble output with returned value of FUN stored in fun_out. This is the tidy-preferred option.
     outs <- gmcmapply(mvars, myfunc, SIMPLIFY = TRUE)


     ### tibble output run on 3 cores.
     outs <- gmcmapply(mvars, myfunc, SIMPLIFY = TRUE, mc.cores = 3)

     # Example 2: pass non-default arguments to FUN.
     ## Because the apply functions don't accept full calls as inputs (calls are internal), the user can pass arguments to FUN through dots, which can override a default argument of FUN.

     ## update myfunc to have a default argument
     myfunc <- function(rep_letters = 3, ...){
       return_me <- paste(rep(l3, rep_letters), l1^2 + l2, sep = "_")
       return(return_me)
     }

     outs_default <- gmcmapply(mvars, myfunc)
     outs_not_default <- gmcmapply(mvars, myfunc, rep_letters = 1)


## End(Not run)


natehall329/nate_utils documentation built on Dec. 31, 2024, 3:25 p.m.