| ffexp | R Documentation |
A class for easily creating and evaluating full factorial experiments.
e1 <- ffexp$new(eval_func=, )
e1$run_all()
e1$plot_run_times()
e1$save_self()
eval_func The function called to evaluate each design point.
... Factors and their levels to be evaluated at.
save_output Should the output be saved?
parallel If TRUE, function evaluations are done in parallel.
parallel_cores Number of cores to be used in parallel.
If "detect", parallel::detectCores() is used to determine
number. "detect-1" may be used so that the computer isn't running
at full capacity, which can slow down other tasks.
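The "detect" and "detect-1" settings can be pictured with a small base R sketch. The helper name resolve_cores is hypothetical and is not part of ffexp; it only illustrates one way such a setting might be turned into a worker count using parallel::detectCores().

```r
# Hypothetical helper (not part of ffexp): turn a parallel_cores
# setting into a number of workers.
resolve_cores <- function(parallel_cores = "detect") {
  if (is.numeric(parallel_cores)) {
    return(as.integer(parallel_cores))
  }
  available <- parallel::detectCores()
  # detectCores() can return NA on some platforms; fall back to 1.
  if (is.na(available)) available <- 1L
  switch(parallel_cores,
         "detect"   = as.integer(available),
         "detect-1" = max(1L, as.integer(available) - 1L),
         stop("Unknown parallel_cores setting: ", parallel_cores))
}

resolve_cores(2)           # an explicit count is used as given
resolve_cores("detect-1")  # one fewer than detected, but at least 1
```

Clamping "detect-1" to a minimum of 1 is a design choice of this sketch so that a single-core machine still gets one worker.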
$new() Initialize an experiment. The preprocessing is done,
but no function evaluations are run.
$run_all() Run all factor combinations.
$run_one() Run a single factor combination.
$add_result_of_one() Adds the result of an evaluation to the data set.
Do not call it manually.
$plot_run_times() Plot the run times. Especially useful when
they have been run in parallel.
$save_self() Save ffexp R6 object.
$recover_parallel_temp_save() If you ran the experiment using
parallel with parallel_temp_save=TRUE and it crashes partway
through, call this to recover the runs that were completed.
Runs that were stopped mid-execution are not recoverable.
outrawdf Raw data frame of output.
outcleandf Clean output in a data frame.
rungrid Matrix specifying which inputs will be run for each experiment.
nvars Number of variables.
allvars All variables.
varlist Character vector of names of objects to pass to a parallel cluster.
arglist List of values for each argument.
number_runs Total number of runs.
completed_runs Logical vector of whether each run has been completed.
eval_func The function that is called for each experiment trial.
outlist A list of the output from each run.
save_output Logical of whether the output should be saved.
parallel Logical of whether experiment runs should be run in parallel. Running in parallel can give a large speedup.
parallel_cores How many cores to use when running in parallel. Can be an integer, 'detect' to detect how many cores are available, or 'detect-1' to use one fewer than that.
parallel_cluster The parallel cluster being used.
folder_path The path to the folder where output will be saved.
verbose How much should be printed when running. 0 is none, 2 is average.
extract_output_to_df A function to extract the raw output into a data frame. E.g., if the output is a list, but you want a single item to show up in the output data frame.
hashvalue A value used to make sure inputs match when reloading.
new() Create an 'ffexp' object.
ffexp$new(
  ...,
  eval_func,
  save_output = FALSE,
  parallel = FALSE,
  parallel_cores = "detect",
  folder_path,
  varlist = NULL,
  verbose = 2,
  extract_output_to_df = NULL
)
... Input arguments for the experiment.
eval_func The function to be run. It must take named arguments matching the names of ...
save_output Should output be saved to file?
parallel Should a parallel cluster be used?
parallel_cores When running in parallel, how many cores should be used. Strictly, this is the number of clusters created rather than the number of cores used. It can exceed the number of cores the computer has available, but doing so will hurt performance. Set to 'detect' to use all detected cores, or 'detect-1' to use one fewer.
folder_path Where the data and files should be stored. If not given, a folder in the current directory will be created.
varlist Character vector of names of objects that need to be passed to the parallel environment.
verbose How much should be printed when running. 0 is none, 2 is average.
extract_output_to_df A function to extract the raw output into a data frame. E.g., if the output is a list, but you want a single item to show up in the output data frame.
run_all() Run an experiment. The user can choose to run all rows or only specified ones, whether to run in parallel, and which files should be saved.
ffexp$run_all(
  to_run = NULL,
  random_n = NULL,
  redo = FALSE,
  run_order,
  save_output = self$save_output,
  parallel = self$parallel,
  parallel_cores = self$parallel_cores,
  parallel_temp_save = save_output,
  write_start_files = save_output,
  write_error_files = save_output,
  delete_parallel_temp_save_after = FALSE,
  varlist = self$varlist,
  verbose = self$verbose,
  outfile,
  warn_repeat = TRUE
)
to_run Which rows should be run? If NULL, then all that haven't been run yet.
random_n Randomly selects n trials among those not yet completed and runs them.
redo Should already completed rows be run again?
run_order In what order should the rows be run? Options: random, in_order, and reverse.
save_output Should the output be saved?
parallel Should it be run in parallel?
parallel_cores When running in parallel, how many cores should be used. Strictly, this is the number of clusters created rather than the number of cores used. It can exceed the number of cores the computer has available, but doing so will hurt performance. Set to 'detect' to use all detected cores, or 'detect-1' to use one fewer.
parallel_temp_save Should temp files be written when running in parallel? Prevents losing results if it crashes partway through.
write_start_files Should start files be written?
write_error_files Should error files be written for rows that fail?
delete_parallel_temp_save_after If using parallel temp save files, should they be deleted afterwards?
varlist A character vector of names of variables to be passed to the parallel cluster.
verbose How much should be printed when running. 0 is none, 2 is average.
outfile Where should the master output file be saved when running in parallel?
warn_repeat Should warnings be given when repeating already completed rows?
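The to_run and random_n arguments can be combined to run an experiment in stages. The following is a minimal usage sketch, assuming that ffexp is provided by the comparer package; the evaluation function is invented for illustration.

```r
library(comparer)  # assumption: ffexp is provided by the comparer package

# 2 x 2 full factorial; eval_func takes arguments named like the factors.
e1 <- ffexp$new(a = 1:2, b = c("A", "B"),
                eval_func = function(a, b) a * 10 + (b == "B"),
                verbose = 0)

e1$run_all(to_run = 1:2)   # run only rows 1 and 2 of the run grid
sum(e1$completed_runs)     # number of rows finished so far

e1$run_all(random_n = 1)   # one randomly chosen row not yet completed
e1$run_all()               # to_run = NULL: everything still outstanding
all(e1$completed_runs)
```

Because completed rows are tracked in completed_runs, the final call only has to evaluate whatever the earlier calls left over.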
run_for_time() Run the experiment for a given amount of time rather than for a specified number of trials. 'batch_size' trials are run between checks of the elapsed time; it only needs to be more than 1 when running in parallel. The current batch is always completed before stopping, so the run does not quit mid-batch when the time limit is reached and will go over the time limit given.
ffexp$run_for_time(
  sec,
  batch_size,
  show_time_in_bar = FALSE,
  save_output = self$save_output,
  parallel = self$parallel,
  parallel_cores = self$parallel_cores,
  parallel_temp_save = save_output,
  write_start_files = save_output,
  write_error_files = save_output,
  delete_parallel_temp_save_after = FALSE,
  varlist = self$varlist,
  verbose = self$verbose,
  warn_repeat = TRUE
)
sec Number of seconds to run for.
batch_size Number of trials to run between checking the time elapsed.
show_time_in_bar The progress bar can show either the number of runs completed or the time elapsed.
save_output Should the output be saved?
parallel Should it be run in parallel?
parallel_cores When running in parallel, how many cores should be used. Strictly, this is the number of clusters created rather than the number of cores used. It can exceed the number of cores the computer has available, but doing so will hurt performance. Set to 'detect' to use all detected cores, or 'detect-1' to use one fewer.
parallel_temp_save Should temp files be written when running in parallel? Prevents losing results if it crashes partway through.
write_start_files Should start files be written?
write_error_files Should error files be written for rows that fail?
delete_parallel_temp_save_after If using parallel temp save files, should they be deleted afterwards?
varlist A character vector of names of variables to be passed to the parallel cluster.
verbose How much should be printed when running. 0 is none, 2 is average.
warn_repeat Should warnings be given when repeating already completed rows?
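The batch-then-check behaviour described for run_for_time() can be sketched in base R. This is an illustration under stated assumptions, not ffexp's actual code: the function run_batches_for_time and its arguments are hypothetical.

```r
# Hypothetical sketch (not ffexp's code): run trials in batches and
# check the clock only between batches.
run_batches_for_time <- function(trials, sec, batch_size, run_trial) {
  start <- Sys.time()
  done <- 0L
  while (done < length(trials) &&
         as.numeric(difftime(Sys.time(), start, units = "secs")) < sec) {
    batch <- trials[seq(done + 1L, min(done + batch_size, length(trials)))]
    for (tr in batch) run_trial(tr)  # the whole batch always finishes
    done <- done + length(batch)
  }
  done
}

# Each trial sleeps 0.05 s. The loop can only stop at a batch boundary,
# so the total elapsed time overshoots the 0.12 s limit.
n_done <- run_batches_for_time(1:10, sec = 0.12, batch_size = 2,
                               run_trial = function(i) Sys.sleep(0.05))
n_done
```

A larger batch size means fewer clock checks but a larger possible overshoot, which is the trade-off the description above points to.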
run_superbatch() Run batches. Allows for better progress visualization and saving when running in parallel.
ffexp$run_superbatch(
  nsb,
  redo = FALSE,
  run_order,
  save_output = self$save_output,
  parallel = self$parallel,
  parallel_cores = self$parallel_cores,
  parallel_temp_save = save_output,
  write_start_files = save_output,
  write_error_files = save_output,
  delete_parallel_temp_save_after = FALSE,
  varlist = self$varlist,
  verbose = self$verbose,
  warn_repeat = TRUE
)
nsb Number of super batches.
redo Should already completed rows be run again?
run_order In what order should the rows be run? Options: random, in_order, and reverse.
save_output Should the output be saved?
parallel Should it be run in parallel?
parallel_cores When running in parallel, how many cores should be used. Strictly, this is the number of clusters created rather than the number of cores used. It can exceed the number of cores the computer has available, but doing so will hurt performance. Set to 'detect' to use all detected cores, or 'detect-1' to use one fewer.
parallel_temp_save Should temp files be written when running in parallel? Prevents losing results if it crashes partway through.
write_start_files Should start files be written?
write_error_files Should error files be written for rows that fail?
delete_parallel_temp_save_after If using parallel temp save files, should they be deleted afterwards?
varlist A character vector of names of variables to be passed to the parallel cluster.
verbose How much should be printed when running. 0 is none, 2 is average.
warn_repeat Should warnings be given when repeating already completed rows?
outfile Where should the master output file be saved when running in parallel?
run_one() Run a single row of the experiment. You can specify which one to run. Generally this should not be called by users; use 'run_all' instead.
ffexp$run_one(
  irow = NULL,
  save_output = self$save_output,
  write_start_files = save_output,
  write_error_files = save_output,
  warn_repeat = TRUE,
  is_parallel = FALSE,
  return_list_result_of_one = FALSE,
  verbose = self$verbose,
  force_this_as_output
)
irow Which row should be run?
save_output Should the output be saved?
write_start_files Should a file be written when starting the experiment?
write_error_files Should a file be written if there is an error?
warn_repeat Should a warning be given if repeating a row?
is_parallel Is this being run in parallel?
return_list_result_of_one Should the list of the result of this one be returned?
verbose How much should be printed when running. 0 is none, 2 is average.
force_this_as_output Value to use instead of evaluating the function.
add_result_of_one() Add the result of a single experiment to the object. This shouldn't be used by users.
ffexp$add_result_of_one(
  output,
  systime,
  irow,
  row_grid,
  row_df,
  start_time,
  end_time,
  save_output,
  hashvalue
)
output The output of the experiment.
systime The time it took to run.
irow The row of inputs used.
row_grid The corresponding row in the run grid.
row_df The corresponding row data frame.
start_time The start time of the experiment.
end_time The end time of the experiment.
save_output Should the output be saved?
hashvalue Not used.
plot_run_times() Plot the run times of each trial.
ffexp$plot_run_times()
plot_pairs() Plot pairs of inputs and outputs. Helps see correlations and distributions.
ffexp$plot_pairs()
plot() Calling 'plot' on an 'ffexp' object calls 'plot_pairs()'.
ffexp$plot()
calculate_effects() Calculate the effects of each variable as if this was an experiment using a linear model.
ffexp$calculate_effects()
calculate_effects2() Calculate the effects of each variable as if this was an experiment using a linear model.
ffexp$calculate_effects2()
save_self() Save this R6 object.
ffexp$save_self(verbose = self$verbose)
verbose How much should be printed when running. 0 is none, 2 is average.
create_save_folder_if_nonexistent() Create the save folder if it doesn't already exist.
ffexp$create_save_folder_if_nonexistent()
rename_save_folder() Rename the save folder.
ffexp$rename_save_folder(new_folder_path, new_folder_name)
new_folder_path New path for the save folder.
new_folder_name If you want the new save folder to be in the current directory, you can use this instead of 'new_folder_path' and just give the folder name.
delete_save_folder_if_empty() Delete the save folder if it is empty. Used to prevent leaving behind empty folders.
ffexp$delete_save_folder_if_empty(verbose = self$verbose)
verbose How much should be printed when running. 0 is none, 2 is average.
recover_parallel_temp_save() Running this loads the information saved to files if 'parallel_temp_save=TRUE' was used when running. Useful when running long jobs in parallel so that you don't lose all results if it crashes before finishing.
ffexp$recover_parallel_temp_save(delete_after = FALSE, only_reload_new = FALSE)
delete_after Should the temp files be deleted after they are recovered? If TRUE, make sure you save the ffexp object after running this function so you don't lose the data.
only_reload_new Will only reload output from runs that don't show as completed yet. Can make it much faster if there are many saved files, but most have already been loaded to this object.
rungrid2() Display the input rows of the experiment. rungrid just gives integers; this gives the actual values.
ffexp$rungrid2(rows = 1:nrow(self$rungrid))
rows Which rows to display the inputs for? On big experiments, specifying the rows can be much faster.
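The difference between an integer grid and a grid of actual values can be illustrated in base R. This sketch is not ffexp's internal code; the names arglist, index_grid and value_grid are used here only for illustration, and the row order is that of expand.grid, which may differ from the package's.

```r
# Illustration only (not ffexp's internal code).
arglist <- list(a = 1:2, b = c("A", "B"))

# Index grid: one row per factor combination, entries are level numbers.
index_grid <- as.matrix(expand.grid(lapply(arglist, seq_along)))

# Value grid: replace each level number by the level itself.
value_grid <- as.data.frame(
  Map(function(levels, idx) levels[idx],
      arglist, as.data.frame(index_grid)),
  stringsAsFactors = FALSE
)

index_grid  # integers only, in the spirit of rungrid
value_grid  # actual factor values, in the spirit of rungrid2()
```

Storing only integers keeps the grid compact, while the lookup into the levels is done on demand for the rows of interest.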
add_variable() Add a variable to the experiment. You must specify the value of the variable for all existing rows, and then also the values of the variable which haven't been run yet.
ffexp$add_variable(name, existing_value, new_values, suppressMessage = FALSE)
name Name of the variable being added.
existing_value The value of the new variable for all existing rows.
new_values The values of the new variable which have not been run. This should not include 'existing_value', the value of the new variable at the existing rows.
suppressMessage Should the message be suppressed? The message tells the user a new variable was added and it is being returned in a new object. Default FALSE.
add_level() Add a level to one of the arguments. This returns a new object. The existing object is not changed.
ffexp$add_level(arg_name, new_values, suppressMessage = FALSE)
arg_name Which existing argument is a level being added to?
new_values The value of the new levels to be added to 'arg_name'.
suppressMessage Should the message be suppressed? The message tells the user a new level was added and it is being returned in a new object. Default FALSE.
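Since add_level() returns a new object rather than modifying the existing one, the result has to be assigned. A minimal usage sketch follows, assuming that ffexp is provided by the comparer package; the evaluation function is invented for illustration.

```r
library(comparer)  # assumption: ffexp is provided by the comparer package

e1 <- ffexp$new(a = 1:2, b = c("A", "B"),
                eval_func = function(a, b) a * 10 + (b == "B"),
                verbose = 0)
e1$run_all()

# add_level() returns a new object; e1 itself is left unchanged.
e2 <- e1$add_level("a", 3, suppressMessage = TRUE)

nrow(e1$rungrid)  # rows in the original experiment
nrow(e2$rungrid)  # rows after adding a third level of a

e2$run_all()      # runs whatever is not yet marked complete in e2
```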
remove_results() Remove results of completed trials. They will be rerun the next time $run_all() is called.
ffexp$remove_results(to_remove)
to_remove Indexes of trials to remove.
print() Printing the object shows some summary information.
ffexp$print()
set_parallel_cores() Set the number of parallel cores to be used when running in parallel. Needed in case the user sets "detect".
ffexp$set_parallel_cores(parallel_cores)
parallel_cores When running in parallel, how many cores should be used. Strictly, this is the number of clusters created rather than the number of cores used. It can exceed the number of cores the computer has available, but doing so will hurt performance. Set to 'detect' to use all detected cores, or 'detect-1' to use one fewer.
stop_cluster() Stop the parallel cluster.
ffexp$stop_cluster()
finalize() Clean up after the object is deleted.
ffexp$finalize()
clone() The objects of this class are cloneable with this method.
ffexp$clone(deep = FALSE)
deep Whether to make a deep clone.
# Two factors, both with two levels.
# The evaluation function simply returns the combination.
cc <- ffexp$new(a=1:2,b=c("A","B"),
eval_func=function(...) {c(...)})
# View the factor settings it will run (each row).
cc$rungrid
# Evaluate all four settings
cc$run_all()
cc <- ffexp$new(a=1:3,b=2, cd=data.frame(c=3:4,d=5:6),
eval_func=function(...) {list(...)})