cubist_asPerl_idioms: Export Tidy Rules of Cubist Object to Perl Idioms

Description Usage Arguments Value Author(s) See Also Examples

Description

This function exports one Perl module source code file containing low-level idioms of the language so that the Cubist model can be reused or idiom implemented without having the Cubist package itself. The function, thus, permits dependency free archival of a Cubist model based on the basic functionality of the language.

For idiom implementation, it is important that the concept of “neighbors” as a feature and argument to the prediction of Cubist not be used. The pursuit here, for a given row of the input matrix, is to isolate by name each applicable rule and then trigger a prediction for each rule and then pool the estimate by some method.

Usage

1
2
3
cubist_asPerl_idioms(cubist_object, cubist_tag="C0", nut_digits=3,
                     parentpmdir="IDIOMS", path=".", prefix="",
                     tidyrule.return=FALSE, sample_str="", ...)

Arguments

cubist_object

Either a Cubist object (model) from the Cubist package on which the function
tidyrules::tidyRules() will be called or the results from the user already having called tidyRules() on the Cubist object;

cubist_tag

A string thought of a being a small fragment of arbitrary text chosen by a user that identifies a particular Cubist model. This tag is critically important and useful for subroutines on export and the Perl module nomenclature will have this tag in their names;

nut_digits

The number of digits to pass to the round function as the nuts of each branch are written. This helps keep file sizes down and the written code more readable than letting full floating point result;

parentpmdir

The top-level directory name that the output file would need to be migrated into. For example, the default would represent a path of IDIOMS::cubist*.pm of the Perl module where the * is the cubist_tag;

path

The directory path to which the various Perl files are to be written;

prefix

A string that is prefixed to the output file names of this function. This option is useful if the cubist_tag is a number to avoid having file names starting with a number;

tidyrule.return

A logical triggering the tidyrules::tidyRules() data frame;

sample_str

An arbitrary string of sample-size summary information about the Cubist model that the user wants to give over to the idiom implementation by creating a variable $SAMPLE_STR equal to sample_str;

...

Additional arguments to pass (currently not used);

Value

This function is used for its side effects written to the file system, but can be used to return without modification the results of the tidyrules::tidyRules() function.

Author(s)

W.H. Asquith

See Also

Xrow_to_Perl, cubist_asR_idioms

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
## Not run: 
# We set the simulation sample size to 1,000; the number of committees is 3.
nsim <- 1000; committees <- 3

# The high-level design is to have the ability to tag a given Cubist object.
# The tag will be used as output files and subroutines and variables therein
# are written.
cubist_tag <- 0 # say our "zeroth order" Cubist object
set.seed(9) # arbitrary here, but setting the seed for reproducibility
cubist_tag <- paste0("C",cubist_tag) # create the tag "C0" to be used

# simulate some X and Y data
X <- sort(runif(nsim, min=0, max=2))
Y <- 0.34*sin(2*pi*X) - .74*cos(2*pi*X) -
     0.14*sin(4*pi*X) + .19*cos(4*pi*X) + rnorm(nsim, sd=0.2)
X <- data.frame(X=X, pi=pi) # the design that follows both by Cubist
# and by the idiom functions in this package need at least two columns
# even if the second column containing pi is never used.

# We can foreshadow some type of cross-validation, but here just use all.
phi <- 1
inSchool <- sample(1:nrow(X), floor(phi*nrow(X)))
Xs <- X[ inSchool,]; Xt <- X[-inSchool,]
Ys <- Y[ inSchool ]; Yt <- Y[-inSchool ]

# make the Cubist model
cubist_tree <- Cubist::cubist(x=Xs, y=Ys, committees=committees)

# construct a string as described above, here a colon-delimited,
# equal-sign keyed sequence of content that we want preserved in
# the exported Cubist idioms for access again when we build 
# infrastructure to use the idioms, not otherwise in this demo
txt <- paste0("committees=",committees,":sample_size=",nsim)

tmpath <- tempdir() # temporary directory
message("temporary path = ",tmpath)

# now write a text file to that directory that has the X values
# in this example. This file can be re-read by
# inst/aux_perl/demoLOOP_C0.pl X.txt 
# to get predictions
write.table(data.frame(X=seq(0,2, by=0.001)),
            file=paste0(tmpath,"/","X.txt"),
            sep="\t", row.names=FALSE, quote=FALSE)

cubist_asPerl_idioms(cubist_tree, cubist_tag=cubist_tag,
                     path=tmpath, sample_str=txt)
                  
files <- list.files(path=tmpath, pattern=paste0(".+",cubist_tag,".pm"))
print(files)
# [1] "cubistC0.pm"
# so one source file of idioms for the C0 tagged Cubist were
# made, and the cubist_utils.R is tag-independent and has a couple
# of accessor functions to work with the our big picture 
## End(Not run)

wasquith-usgs/Cubist.Idioms documentation built on Jan. 1, 2021, 12:45 p.m.