e32-mixedTypeEngine: The "MixedTypeEngine" Class


Description

A MixedTypeEngine combines a ClinicalEngine (which defines the combinatorics of hits and the block hyperparameters that determine cluster identities and behavior), a stored ClinicalNoiseModel, and the cutpoints produced by makeDataTypes for generating mixed-type data. The resulting object can be used to regenerate downstream datasets that share the same parameters.

Usage

MixedTypeEngine(ce, noise, cutpoints)
## S4 method for signature 'MixedTypeEngine'
rand(object, n, keepall = FALSE, ...)
## S4 method for signature 'MixedTypeEngine'
summary(object, ...)

Arguments

ce

Object of class ClinicalEngine (or a list; see Details).

noise

Object of class NoiseModel, preferably constructed using the function ClinicalNoiseModel. Alternatively, a list; see Details.

cutpoints

a list with the properties of the cutpoints element produced by the function makeDataTypes. Alternatively, a list; see Details.

object

object of class MixedTypeEngine

n

a non-negative integer; the number of patients to simulate.

keepall

a logical value; if TRUE, rand also returns the intermediate raw and noisy data matrices (see Methods).

...

additional arguments for generic functions.
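
The role of stored cutpoints can be illustrated with a minimal base-R sketch. This is not Umpire's implementation: the tertile breaks and the low/mid/high labels are hypothetical choices made only for illustration. The point is that breaks estimated once from an initial dataset can be reapplied to any later dataset, so that both are discretized identically.

```r
set.seed(1)
first  <- rnorm(100)   # stands in for one feature of the initial data set
second <- rnorm(100)   # stands in for the same feature in a later data set

# Hypothetical cutpoints: tertile breaks estimated from the first sample,
# with open ends so that any later value falls into some bin.
breaks <- c(-Inf, quantile(first, c(1/3, 2/3)), Inf)
labels <- c("low", "mid", "high")

# Reuse the same stored breaks for both samples.
binned1 <- cut(first,  breaks = breaks, labels = labels)
binned2 <- cut(second, breaks = breaks, labels = labels)

table(binned1)
table(binned2)
```

Because the breaks are stored rather than re-estimated, the categories in the second sample mean the same thing as in the first, which is the property a MixedTypeEngine relies on when it regenerates data.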

Details

A MixedTypeEngine stores the set of parameters used to generate a simulated clinical dataset, so that related datasets can be generated downstream from the same parameters. Building a MixedTypeEngine requires many parameters. You can supply these parameters in multiple steps:

  1. Construct a ClinicalEngine.

  2. Construct a ClinicalNoiseModel.

  3. Use rand to generate a "raw" data set from the ClinicalEngine.

  4. Use blur to add noise to the raw data.

  5. Feed the noisy data into makeDataTypes to generate a mixed-type dataset, with cut points.

  6. Pass the ClinicalEngine, ClinicalNoiseModel, and cutpoints into the MixedTypeEngine constructor.

The alternative method is to pass the parameters for Steps 1, 2, and 5 directly into the MixedTypeEngine constructor, as lists, and it will carry out Steps 3-5 automatically. Note, however, that instead of passing a dataset to be used by the makeDataTypes function, you set the value of N to the desired number of patients used during construction. Also, if you use the explicit steps, you can save the intermediate data sets that are generated. If you simply pass all of the parameters to the constructor, those intermediate data sets are discarded, and you must generate a new data set using rand.

Objects from the Class

Objects can be created by a direct call to new, though using the constructor MixedTypeEngine is preferred.

Methods

rand(object, n, keepall, ...)

Generates an nrow(Engine)*n matrix representing the clinical features of n patients, following the underlying distribution, noise, and data discretization pattern captured in the MixedTypeEngine object. If keepall == TRUE, it returns a list containing a data frame named clinical and three data matrices called raw, noisy, and binned. If keepall == FALSE, then only the clinical and binned components are returned.
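
The effect of keepall can be checked with a short sketch. It assumes the Umpire package is installed and reuses only the calls that appear in the Examples section of this page. The object names full and brief are illustrative, and the component names are those given in the description above.

```r
library(Umpire)

set.seed(194718)
# Build the engine from explicit pieces, as in the Examples section.
ce    <- ClinicalEngine(20, 4, TRUE)
dset  <- rand(ce, 300)
cnm   <- ClinicalNoiseModel(nrow(ce@localenv$eng))
noisy <- blur(cnm, dset$data)
dt    <- makeDataTypes(noisy, 1/3, 1/3, 1/3, 0.3)
mte   <- MixedTypeEngine(ce, noise = cnm, cutpoints = dt$cutpoints)

# keepall = TRUE also keeps the intermediate raw and noisy matrices.
full  <- rand(mte, 20, keepall = TRUE)
# The default (keepall = FALSE) returns only clinical and binned.
brief <- rand(mte, 20)

names(full)
names(brief)
```

Keeping the intermediate matrices is mainly useful for inspecting how noise and binning changed the raw simulated values. For routine use, the default output is smaller.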

summary(object, ...)

Prints a summary of the object.

Author(s)

Kevin R. Coombes krc@silicovore.com, Caitlin E. Coombes caitlin.coombes@osumc.edu

See Also

Engine CancerModel CancerEngine ClinicalNoiseModel makeDataTypes

Examples

## Generate a Clinical Engine of continuous data 
## with clusters generated from variation on the base CancerEngine
ce <- ClinicalEngine(20, 4, TRUE)
summary(ce)

## Generate an initial data set
set.seed(194718)
dset <- rand(ce, 300)
class(dset)
names(dset)
summary(dset$clinical)
dim(dset$data)

## Add noise before binning mixed type data
cnm <- ClinicalNoiseModel(nrow(ce@localenv$eng)) # default
noisy <- blur(cnm, dset$data)

## Set the data mixture, binning the noisy data
dt <- makeDataTypes(noisy, 1/3, 1/3, 1/3, 0.3)
## Store the cutpoints
cp <- dt$cutpoints

## Use the pieces from above to create an MTE.
mte <- MixedTypeEngine(ce,
           noise = cnm,
           cutpoints = cp)

## Use the MTE rand method to generate
## multiple data sets with the same parameters
R <- rand(mte, 20)
summary(R)

S <- rand(mte, 20)
summary(S)

Umpire documentation built on Nov. 11, 2020, 1:08 a.m.