Workhorse for simulation studies
Description
Generates data according to all provided constellations in dataGrid and applies all provided constellations in procGrid to them.
Usage
Arguments
dataGrid 
a 
procGrid 
similar as 
replications 
number of replications for the simulation 
discardGeneratedData 
if 
progress 
if 
summary.fun 
univariate functions to summarize the results (numeric or logical) over
the replications, e.g. mean, sd. Alternatively, 
ncpus 
a cluster of 
cluster 
a cluster generated by the 
clusterSeed 
if the simulation is run in parallel,
the combined multiple-recursive generator from L'Ecuyer (1999)
is used to generate random numbers. Thus 
clusterLibraries 
a character vector specifying the packages that should be loaded by the workers. 
clusterGlobalObjects 
a character vector specifying the names of R objects in the global environment that should be exported to the global environment of every worker. 
fallback 
must be missing or a character string specifying a file. Each time
the data generation function changes, the results obtained so far
are saved in the file specified by 
envir 
must be provided if the functions specified
in 
... 
only needed to alert the user if some deprecated arguments were used. 
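The clusterSeed description above refers to L'Ecuyer's combined multiple-recursive generator. The following is a minimal base R sketch of how independent, reproducible streams of that generator behave. It uses only the parallel package that ships with R; how evalGrids itself passes clusterSeed to the workers is not shown here, and the helper draw() and the seed value 12345 are hypothetical names and values chosen for illustration.

```r
library(parallel)

# Switch to L'Ecuyer's combined multiple-recursive generator.
RNGkind("L'Ecuyer-CMRG")
set.seed(12345)                      # illustrative seed, not a package default

stream1 <- .Random.seed              # state of the first stream (integer vector of length 7)
stream2 <- nextRNGStream(stream1)    # state of the next, independent stream

# Hypothetical helper: draw n uniforms starting from a given stream state.
draw <- function(seed, n = 3) {
  assign(".Random.seed", seed, envir = globalenv())
  runif(n)
}

a1 <- draw(stream1)   # first stream
a2 <- draw(stream1)   # same stream again: identical numbers
b  <- draw(stream2)   # second stream: different numbers
```

Restarting from the same stream state reproduces the same draws, while each worker in a parallel run can be given its own stream so that results do not depend on scheduling.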
Value
The returned object is a list of class evalGrid, where the fourth element is a list of lists named simulation. simulation[[i]][[r]] contains:
data 
the data set that was generated by the

results 
a list containing 
Note
If cluster is provided by the user, the function evalGrids will NOT stop the cluster. This has to be done by the user. Conducting parallel simulations by specifying ncpus will internally create a cluster and stop it after the simulation is done.
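A minimal sketch of the user-managed cluster life cycle described in this note, using only the parallel package from base R. The call to parSapply() is a stand-in for the simulation; in a real study the cluster object would instead be handed to evalGrids through its cluster argument, as indicated in the comment. The worker count of 2 and the seed 12345 are arbitrary illustrative choices.

```r
library(parallel)

cl <- makeCluster(2)                       # cluster created by the user

res <- tryCatch({
  clusterSetRNGStream(cl, iseed = 12345)   # L'Ecuyer streams on the workers
  # In a real study the cluster would be passed on here, e.g.
  #   evalGrids(dataGrid, procGrid, cluster = cl)
  # Stand-in workload: generate n uniforms on a worker and return their count.
  parSapply(cl, 1:3, function(n) length(runif(n)))
}, finally = stopCluster(cl))              # the user is responsible for stopping it
```

Wrapping the work in tryCatch(..., finally = ...) ensures the workers are shut down even if the simulation fails partway through.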
Author(s)
Marsel Scheer
See Also
as.data.frame.evalGrid
Examples
rng = function(data, ...) {
ret = range(data)
names(ret) = c("min", "max")
ret
}
# call runif(n=1), runif(n=2), runif(n=3)
# and range on the three "datasets"
# generated by runif(n=1), runif(n=2), runif(n=3)
eg = evalGrids(
expandGrid(fun="runif", n=1:3),
expandGrid(proc="rng"),
replications=10
)
eg
# summarizing the results in a data.frame
as.data.frame(eg)
# we now generate data for a regression
# and fit different regression models
# note that we use SD and not sd (the
# reason for this is the cast() call below)
regData = function(n, SD){
data.frame(
x=seq(0,1,length=n),
y=rnorm(n, sd=SD))
}
eg = evalGrids(
expandGrid(fun="regData", n=20, SD=1:2),
expandGrid(proc="lm", formula=c("y~x", "y~I(x^2)")),
replications=2)
# cannot be converted to a data.frame, because
# an object of class "lm" cannot be converted to
# a data.frame
try(as.data.frame(eg))
# for the data.frame we just extract the r.squared
# from the fitted model
as.data.frame(eg, convert.result.fun=function(fit) c(rsq=summary(fit)$r.squared))
# for the data.frame we just extract the coefficients
# from the fitted model
df = as.data.frame(eg, convert.result.fun=coef)
# since we have done 2 replications we can calculate
# some summary statistics
library("reshape")
df$replication=NULL
mdf = melt(df, id=1:7, na.rm=TRUE)
cast(mdf, ... ~ ., c(mean, length, sd))
# note: if the data.frame contained a column
# named "sd" instead of "SD", the cast would generate
# an error
names(df)[5] = "sd"
mdf = melt(df, id=1:7, na.rm=TRUE)
try(cast(mdf, ... ~ ., c(mean, length, sd)))
# extracting the summary of the fitted.model
as.data.frame(eg, convert.result.fun=function(x) {
ret = coef(summary(x))
data.frame(valueName = rownames(ret), ret, check.names=FALSE)
})
# we now compare two methods for
# calculating quantiles
# the functions and parameters
# that generate the data
N = c(10, 50, 100)
library("plyr")
dg = rbind.fill(
expandGrid(fun="rbeta", n=N, shape1=4, shape2=4),
expandGrid(fun="rnorm", n=N))
# definition of the two quantile methods
emp.q = function(data, probs) c(quantile(data, probs=probs))
nor.q = function(data, probs) {
ret = qnorm(probs, mean=mean(data), sd=sd(data))
names(ret) = names(quantile(1, probs=probs))
ret
}
# the functions and parameters that are
# applied to the generated data
pg = rbind.fill(expandGrid(proc=c("emp.q", "nor.q"), probs=c(0.01, 0.025, 0.05)))
# generate data and apply quantile methods
set.seed(1234)
eg = evalGrids(dg, pg, replications=50, progress=TRUE)
# convert the results to a data.frame
df = as.data.frame(eg)
df$replication=NULL
mdf = melt(df, id=1:8, na.rm=TRUE)
# calculate, print and plot summary statistics
require("ggplot2")
print(a <- arrange(cast(mdf, ... ~ ., c(mean, sd)), n))
ggplot(a, aes(x=fun, y=mean, color=proc)) + geom_point(size=I(3)) + facet_grid(probs ~ n)
