RunSimulation: Run Simulation


Description

Generates data and runs the full simulation.

Usage

RunSimulation(Sim = "RL", ntrain, ntest, p, m, contamination = "Var",
  Vartype = "Id", DGP = 2, ntrees, ndsize, ntreestune, parvec, cvreps,
  cvfolds, tol)
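
A minimal sketch of a call, assuming the RFLOWESS package is installed (for example from the AndrewjSage/RFLOWESS repository) and exports RunSimulation. Every argument value below is an illustrative assumption chosen to keep the run small, not a setting recommended by the package:

```r
# Illustrative argument values only; none of them come from the package docs.
sim_args <- list(
  Sim = "RL",               # Roy and Larocque (2012) design
  ntrain = 100,             # number of training cases
  ntest = 100,              # number of test cases
  p = 0.1,                  # proportion of outliers
  m = 0.5,                  # signal-to-noise parameter (assumed value)
  contamination = "Var",    # variance contamination
  Vartype = "Id",           # only relevant for Sim == "LM"
  DGP = 2,                  # non-tree data generating process
  ntrees = 100,             # number of trees
  ndsize = 5,               # nodesize
  ntreestune = 50,          # trees used while tuning alpha
  parvec = c(1, 2, 5, 10),  # candidate alpha values (assumed grid)
  cvreps = 1,               # cross-validation repetitions
  cvfolds = 5,              # cross-validation folds
  tol = 1e-4                # tolerance (assumed value)
)

# Run only when the package is available, so the sketch is safe to source.
if (requireNamespace("RFLOWESS", quietly = TRUE)) {
  res <- do.call(RFLOWESS::RunSimulation, sim_args)
  str(res, max.level = 1)   # inspect the top-level structure of the result
}
```

Collecting the arguments in a list keeps the many required parameters in one place and makes it easy to vary a single setting between runs.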

Arguments

Sim

Which simulation to run: either "RL" for Roy and Larocque (2012), or "LM" for Li and Martin (2017)

ntrain

number of training cases

ntest

number of test cases

p

proportion of outliers

m

signal-to-noise parameter, used when Sim == "RL"

contamination

Use either variance ("Var") or mean ("Mean") contamination. Only relevant for Sim == "RL".

Vartype

Use an identity ("Id") or Toeplitz ("Toeplitz") correlation matrix. Only relevant for Sim == "LM".

DGP

If Sim == "RL", which data generating process should be used: either 1 for tree-like, or 2 for non-tree

ntrees

number of trees

ndsize

nodesize

ntreestune

number of trees to use for tuning alpha

parvec

vector of candidate values for tuning parameter alpha

cvreps

number of repetitions to perform in cross-validation

cvfolds

number of folds to use in cross-validation

tol

maximal change between iterations for LOWESSRF weights in cross-validation

DATA

object from generate_RLdata or generate_LMdata

Value

Returns a list of 4 items:

1. Datasets (TRAIN, TEST, and Outlier Indicator)
2. Matrix of 16 columns giving different predictions; the last column is the true Y
3. Number of iterations
4. Output from TuneMultifoldCV (itself a list of 8 items)
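
As a rough illustration of working with the second item, the sketch below scores each prediction column against the true Y in the last column. The matrix here is a simulated stand-in (hypothetical data, not package output), and the element names of the returned list are not documented, so with a real result the matrix would presumably be reached by position, for example `res[[2]]`:

```r
set.seed(1)

# Hypothetical stand-in for the 16-column prediction matrix:
# 15 columns of predictions followed by the true Y.
n_test <- 50
true_y <- rnorm(n_test)
preds <- sapply(1:15, function(j) true_y + rnorm(n_test, sd = 0.1 * j))
pred_matrix <- cbind(preds, true_y)

# Mean squared error of each prediction column against the last column.
n_col <- ncol(pred_matrix)
y <- pred_matrix[, n_col]
mse <- colMeans((pred_matrix[, -n_col, drop = FALSE] - y)^2)

# Index of the prediction column with the smallest error.
best <- which.min(mse)
print(round(mse, 3))
```

Mean squared error is only one possible summary; with contaminated test data a robust measure such as the median absolute error may be a more suitable choice.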


AndrewjSage/RFLOWESS documentation built on May 26, 2019, 6:38 a.m.