simulate_data
Simulate Multi-omic Data. Data will be structured with common samples (N) across multiple omics datasets (D), each with P features. Other parameters, such as signal-to-noise and response family, can be modified below.
simulate_data(
N = 100,
D = 3,
P = 100,
c1 = 3,
c2 = 1,
sparsity = 0.2,
method = "factor",
num.factors = 5,
family = "gaussian",
factors.influencing.X = 2,
factors.influencing.Y = 2,
ordinal.centers = c(-1.2, -1, 0, 1, 1.2),
multi.centers.x = c(-1, -1, 1, 1)/sqrt(2),
multi.centers.y = c(-1, 1, -1, 1)/sqrt(2),
N.test = 1000,
seed = 123
)
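As an illustration of overriding the defaults, the sketch below builds an argument list for a multinomial response and calls simulate_data() only if the function is already available in the session. The argument names come from the usage signature above; the values (N = 200, D = 2, P = 50, seed = 42) are arbitrary, and the structure of the returned object is not described in this section, so it is only inspected with str().

```r
# Default multinomial class centers, copied from the usage signature.
multi.centers.x <- c(-1, -1, 1, 1) / sqrt(2)
multi.centers.y <- c(-1, 1, -1, 1) / sqrt(2)

# One row per class: four points, each at distance 1 from the origin.
centers <- cbind(x = multi.centers.x, y = multi.centers.y)
sqrt(rowSums(centers^2))

# Illustrative values; every argument name is taken from the signature.
sim_args <- list(
  N = 200, D = 2, P = 50,
  family = "multinomial",
  multi.centers.x = multi.centers.x,
  multi.centers.y = multi.centers.y,
  seed = 42
)

# Call only if the package that provides simulate_data() is attached.
if (exists("simulate_data", mode = "function")) {
  sim <- do.call(simulate_data, sim_args)
  str(sim, max.level = 1)  # return structure is not documented here
}
```

Both center vectors must have the same length, since each position pairs an x and a y coordinate for one class.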
Argument |Description
------------- |----------------
N | Number of samples (N). Defaults to 100.
D | Number of datasets (D). Defaults to 3.
P | Number of features (P) per dataset (D). The total number of features will be P * D. Defaults to 100.
c1 | Primary level of signal provided. c1 = 1 gives a lower signal-to-noise ratio, whereas a higher c1 (c1 >= 3) gives a higher one. Defaults to 3.
c2 | A second signal parameter that controls the spread of the signal around the true means. c2 = 1 (default) is normal spread. Increase it to reduce the spread, and decrease it to spread points further. Only used when family = "ordinal" or family = "multinomial"; ignored if family = "gaussian".
sparsity | How much sparsity to implement in X. Can be between 0 and 1. Defaults to 0.2.
method | How to simulate the data. method = "factor" (default) will simulate data from num.factors true factors. method = "random" will simulate X randomly (with correlation depending upon c) and Y directly from X.
num.factors | How many factors to simulate in U. Defaults to 5. Note that method = "random" ignores this parameter.
family | What type of response to simulate for Y. Options are "gaussian" (default), "ordinal", and "multinomial". Note that "ordinal" and "multinomial" require additional parameters below.
factors.influencing.X | How many factors should influence X. Defaults to 2. Note that method = "random" ignores this parameter.
factors.influencing.Y | How many factors should influence Y. Defaults to 2. Note that method = "random" ignores this parameter.
ordinal.centers | Centers for the signal when family = "ordinal". Defaults to c(-1.2, -1, 0, 1, 1.2). Must be a vector of length C, where C is the number of ordinal classes. If family = "multinomial", use multi.centers.x and multi.centers.y instead. Ignored for family = "gaussian".
multi.centers.x | X-axis centers for the signal when family = "multinomial". Defaults to c(-1, -1, 1, 1)/sqrt(2). Must be a vector of length C, where C is the number of multinomial classes. If family = "ordinal", use ordinal.centers instead. Ignored for family = "gaussian".
multi.centers.y | Y-axis centers for the signal when family = "multinomial". Defaults to c(-1, 1, -1, 1)/sqrt(2). Must be a vector of length C, where C is the number of multinomial classes. If family = "ordinal", use ordinal.centers instead. Ignored for family = "gaussian".
N.test | Number of samples for the test dataset to be returned. Defaults to 1000.
seed | Seed to set for consistent results. Defaults to 123.
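To make the roles of N, D, P, num.factors, factors.influencing.X, factors.influencing.Y, and sparsity more concrete, here is a minimal base-R sketch of one way a factor-based simulation could be set up. It is not the package's implementation: the variable names, the way the signal level multiplies the factors, the reading of sparsity as the fraction of loadings set to zero, and the unit-variance Gaussian noise are all assumptions made for illustration.

```r
set.seed(123)

N <- 100; D <- 3; P <- 100   # samples, datasets, features per dataset
num_factors <- 5             # plays the role of num.factors
k_x <- 2                     # plays the role of factors.influencing.X
k_y <- 2                     # plays the role of factors.influencing.Y
sparsity <- 0.2              # assumed: fraction of loadings set to zero
signal <- 3                  # stands in for c1; the real scaling is an assumption

# Latent factors shared by all datasets: one row per sample.
U <- matrix(rnorm(N * num_factors), nrow = N, ncol = num_factors)

# One N x P matrix per dataset, driven by the first k_x factors.
X <- lapply(seq_len(D), function(d) {
  W <- matrix(rnorm(k_x * P), nrow = k_x, ncol = P)              # loadings
  keep <- matrix(rbinom(k_x * P, 1, 1 - sparsity), nrow = k_x)   # 0/1 sparsity mask
  signal * U[, seq_len(k_x), drop = FALSE] %*% (W * keep) +
    matrix(rnorm(N * P), nrow = N, ncol = P)                     # additive noise
})

# Gaussian response driven by the first k_y factors.
beta <- rnorm(k_y)
Y <- as.vector(signal * U[, seq_len(k_y), drop = FALSE] %*% beta + rnorm(N))

sapply(X, dim)   # one column per dataset, each holding c(N, P)
length(Y)        # one response value per sample
```

Because every dataset is generated from the same U, the D matrices share their rows (samples), which matches the "common samples across multiple omics datasets" structure described at the top of this page.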