View source: R/simulate_tables.R
simulate_tables | R Documentation |
Generate random contingency tables representing various functional, non-functional, dependent, or independent patterns, without specifying a parametric model for the patterns.
simulate_tables(
n = 100, nrow = 3, ncol = 3,
type = c("functional", "many.to.one",
"discontinuous", "independent",
"dependent.non.functional"),
n.tables = 1,
row.marginal = NULL,
col.marginal = NULL,
noise = 0.0, noise.model = c("house", "candle"),
margin = 0
)
n |
a positive integer specifying the sample size to be distributed in each table. For |
nrow |
a positive integer specifying the number of rows in each table. The value must be no less than 2. For |
ncol |
a positive integer specifying the number of columns in output table. |
type |
a character string to specify the type of pattern underlying the table. The options are |
n.tables |
a positive integer value specifying the number of tables to be generated. |
row.marginal |
a non-negative numeric vector of length |
col.marginal |
a non-negative numeric vector of length |
noise |
a numeric value between 0 and 1 specifying the noise level to be added to a table using function |
noise.model |
a character string indicating the noise model of either |
margin |
a numeric value of either 0, 1 or 2. Default is 0.
0: noise is applied along both rows and columns.
1: noise is applied along each row.
2: noise is applied along each column.
See |
This function generates five types of table representing different interaction patterns between row and column discrete random variables X and Y. Three of the five types are non-constant functional patterns (Y is a non-constant function of X):
type="functional": Y is a function of X, but X may or may not be a function of Y.
type="many.to.one": Y is a many-to-one function of X, but X is not a function of Y.
type="discontinuous": Y is a function of X, where the function value at each X must differ from the values at its neighbors. X may or may not be a function of Y. A discontinuous function forms a contrast with those that are close to constant functions.
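The three functional types can be told apart by looking at which cells of a table are non-zero. The following is a minimal base R sketch, not code from the package: the helper names are hypothetical, rows are taken as X and columns as Y, and the "discontinuous" check is only one reading of "differs from its neighbors" (adjacent rows map to different columns).

```r
# Hypothetical helpers for illustration; they are not part of FunChisq.
# Rows index X, columns index Y.

# Y is a function of X if every row has at most one non-zero column.
is_function_of_rows <- function(tbl) {
  all(rowSums(tbl != 0) <= 1)
}

# X is a function of Y if every column has at most one non-zero row.
is_function_of_cols <- function(tbl) {
  all(colSums(tbl != 0) <= 1)
}

# Many-to-one: Y = f(X) holds, but at least two rows share a column,
# so X is not a function of Y.
is_many_to_one <- function(tbl) {
  is_function_of_rows(tbl) && !is_function_of_cols(tbl)
}

# Assumed reading of "discontinuous": every row is populated and
# adjacent rows map to different columns.
is_discontinuous <- function(tbl) {
  if (!is_function_of_rows(tbl) || any(rowSums(tbl) == 0)) return(FALSE)
  f <- apply(tbl, 1, which.max)  # column index of the single non-zero cell
  all(diff(f) != 0)
}

many_to_one <- matrix(c(10, 0, 0,
                         8, 0, 0,
                         0, 5, 0), nrow = 3, byrow = TRUE)
one_to_one <- diag(c(10, 8, 5))

is_many_to_one(many_to_one)    # TRUE: rows 1 and 2 share column 1
is_many_to_one(one_to_one)     # FALSE: X is also a function of Y
is_discontinuous(one_to_one)   # TRUE: f = 1, 2, 3
is_discontinuous(many_to_one)  # FALSE: rows 1 and 2 both map to column 1
```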
The fourth type, "dependent.non.functional", represents non-functional patterns where X and Y are statistically dependent but not functions of each other. The samples are distributed according to row.marginal probabilities.
The fifth type, "independent", represents patterns where X and Y are statistically independent, that is, their joint probability mass function is the product of their marginal probability mass functions.
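The product rule for the independent type can be illustrated in base R. This is a sketch under stated assumptions rather than the package's procedure: the marginals are example values, the seed is there only for reproducibility, and the counts are drawn with a single multinomial sample from the product distribution.

```r
set.seed(1)  # only to make this sketch reproducible

n <- 100
row.marginal <- c(0.4, 0.3, 0.1, 0.2)       # example P(X = i)
col.marginal <- c(0.1, 0.2, 0.4, 0.2, 0.1)  # example P(Y = j)

# Under independence, P(X = i, Y = j) = P(X = i) * P(Y = j).
joint <- outer(row.marginal, col.marginal)

# Draw n samples from the joint distribution. as.vector() and matrix()
# both use column-major order, so cells keep their positions.
counts <- rmultinom(1, size = n, prob = as.vector(joint))
tbl <- matrix(counts, nrow = length(row.marginal),
              ncol = length(col.marginal))

tbl
```

The row and column sums of joint reproduce the two marginals exactly, while the sampled table matches them only approximately because of sampling variation.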
For all functional tables (type="functional", type="many.to.one", type="discontinuous"), the samples are distributed using either the given row or column marginal probabilities. Theoretically, it is not always possible to enforce both marginals in a functional pattern. If both marginals are provided, one will be randomly selected to generate a table; about half of the time each requested marginal is used. If neither is provided, either a uniform row marginal or a uniform column marginal will be randomly selected to generate a table; half of the time a table will have a uniform row marginal and the other half a uniform column marginal.
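To see why both marginals cannot always be enforced in a functional pattern, consider the following base R sketch. It is an illustration under assumptions, not the package's algorithm: the mapping f is a hypothetical random assignment of rows to columns, and the row counts are drawn from the given row marginal. Once f and the row counts are fixed, the column totals are fully determined.

```r
set.seed(2)  # only to make this sketch reproducible

n <- 100
n_col <- 5
row.marginal <- c(0.3, 0.2, 0.3, 0.2)
n_row <- length(row.marginal)

# Hypothetical mapping f: each row (X value) is assigned one column
# (Y value). Resample until f is non-constant.
repeat {
  f <- sample(n_col, n_row, replace = TRUE)
  if (length(unique(f)) > 1) break
}

# Distribute the n samples over rows according to the row marginal.
row_counts <- as.vector(rmultinom(1, size = n, prob = row.marginal))

# Place each row's count in the single column chosen by f.
tbl <- matrix(0, nrow = n_row, ncol = n_col)
tbl[cbind(seq_len(n_row), f)] <- row_counts

tbl
colSums(tbl) / n  # column marginal implied by f and the row counts
```

The implied column marginal generally differs from any column marginal a user might request, which is consistent with only one of the two being used per table.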
Random noise can be optionally applied to the tables using either the house or the candle noise model. See add.noise for details.
Sharma et al. (2017) provide full mathematical and statistical details of the simulation strategies for the above table types, except the "discontinuous" type, which was introduced after the publication.
A list containing the following components:
pattern.list |
a list of tables containing binary patterns in 0's and 1's. Each table is created by setting all non-zero entries in the corresponding sampled contingency table from |
sample.list |
a list of tables satisfying both the mathematical and statistical requirements. These tables are noise free. |
noise.list |
a list of tables after applying noise to the corresponding tables in |
pvalue.list |
a list of p-values reporting the statistical significance of the generated tables for the required type. When the pattern type specifies a functional relationship, the p-values are computed by the functional chi-square test (Zhang and Song, 2013); otherwise, Pearson's chi-square test of independence is used to calculate the p-value. |
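The binary patterns described for pattern.list are obtained by setting every non-zero entry of a table to 1. A minimal base R sketch of that operation, using a made-up table of counts for illustration:

```r
# A made-up table of counts (rows are X, columns are Y).
sampled <- matrix(c(12, 0, 0,
                     0, 7, 0,
                     0, 9, 0), nrow = 3, byrow = TRUE)

# Set every non-zero entry to 1 and keep zeros as 0.
pattern <- (sampled != 0) * 1

pattern
```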
Ruby Sharma, Sajal Kumar, Hua Zhong, and Joe Song
add.noise for details of the noise model.
# In all examples, x is the row variable and y is the column
# variable of a table.
# Example 1. Simulating a noisy function where y=f(x),
# x may or may not be g(y) with given row.marginal.
tbls <- simulate_tables(n=100, nrow=4, ncol=5, type="functional",
noise=0.2, n.tables = 1,
row.marginal = c(0.3,0.2,0.3,0.2))
par(mfrow=c(2,2))
plot_table(tbls$pattern.list[[1]], main="Ex 1. Functional pattern")
plot_table(tbls$sample.list[[1]], main="Ex 1. Sampled pattern (noise free)")
plot_table(tbls$noise.list[[1]], main="Ex 1. Sampled pattern with 0.2 noise")
plot.new()
# Example 2. Simulating a noisy functional pattern where
# y=f(x), x may or may not be g(y) with given row.marginal.
tbls <- simulate_tables(n=100, nrow=4, ncol=5, type="functional",
noise=0.5, n.tables = 1,
row.marginal = c(0.3,0.2,0.3,0.2))
par(mfrow=c(2,2))
plot_table(tbls$pattern.list[[1]], main="Ex 2. Functional pattern", col="seagreen2")
plot_table(tbls$sample.list[[1]], main="Ex 2. Sampled pattern (noise free)", col="seagreen2")
plot_table(tbls$noise.list[[1]], main="Ex 2. Sampled pattern with 0.5 noise", col="seagreen2")
plot.new()
# Example 3. Simulating a noisy many.to.one function where
# y=f(x), x!=f(y) with given row.marginal.
tbls <- simulate_tables(n=100, nrow=4, ncol=5, type="many.to.one",
noise=0.2, n.tables = 1,
row.marginal = c(0.4,0.3,0.1,0.2))
par(mfrow=c(2,2))
plot_table(tbls$pattern.list[[1]], main="Ex 3. Many-to-one pattern", col="limegreen")
plot_table(tbls$sample.list[[1]], main="Ex 3. Sampled pattern (noise free)", col="limegreen")
plot_table(tbls$noise.list[[1]], main="Ex 3. Sampled pattern with 0.2 noise", col="limegreen")
plot.new()
# Example 4. Simulating noisy discontinuous
# pattern where y=f(x), x may or may not be g(y) with given row.marginal.
tbls <- simulate_tables(n=100, nrow=4, ncol=5,
type="discontinuous", noise=0.2,
n.tables = 1, row.marginal = c(0.2,0.4,0.2,0.2))
par(mfrow=c(2,2))
plot_table(tbls$pattern.list[[1]], main="Ex 4. Discontinuous pattern", col="springgreen3")
plot_table(tbls$sample.list[[1]], main="Ex 4. Sampled pattern (noise free)", col="springgreen3")
plot_table(tbls$noise.list[[1]], main="Ex 4. Sampled pattern with 0.2 noise", col="springgreen3")
plot.new()
# Example 5. Simulating noisy dependent.non.functional
# pattern where y!=f(x) and x and y are statistically
# dependent.
tbls <- simulate_tables(n=100, nrow=4, ncol=5,
type="dependent.non.functional", noise=0.3,
n.tables = 1, row.marginal = c(0.2,0.4,0.2,0.2))
par(mfrow=c(2,2))
plot_table(tbls$pattern.list[[1]], main="Ex 5. Dependent.non.functional pattern",
col="sienna2", highlight="none")
plot_table(tbls$sample.list[[1]], main="Ex 5. Sampled pattern (noise free)",
col="sienna2", highlight="none")
plot_table(tbls$noise.list[[1]], main="Ex 5. Sampled pattern with 0.3 noise",
col="sienna2", highlight="none")
plot.new()
# Example 6. Simulating a pattern where x and y are
# statistically independent.
tbls <- simulate_tables(n=100, nrow=4, ncol=5, type="independent",
noise=0.3, n.tables = 1,
row.marginal = c(0.4,0.3,0.1,0.2),
col.marginal = c(0.1,0.2,0.4,0.2,0.1))
par(mfrow=c(2,2))
plot_table(tbls$pattern.list[[1]], main="Ex 6. Independent pattern",
col="cornflowerblue", highlight="none")
plot_table(tbls$sample.list[[1]], main="Ex 6. Sampled pattern (noise free)",
col="cornflowerblue", highlight="none")
plot_table(tbls$noise.list[[1]], main="Ex 6. Sampled pattern with 0.3 noise",
col="cornflowerblue", highlight="none")
plot.new()
# Example 7. Simulating a noisy function where y=f(x),
# x may or may not be g(y), with given column marginal
tbls <- simulate_tables(n=100, nrow=4, ncol=5, type="functional",
noise=0.2, n.tables = 1,
col.marginal = c(0.2,0.1,0.4,0.2,0.1))
par(mfrow=c(2,2))
plot_table(tbls$pattern.list[[1]], main="Ex 7. Functional pattern")
plot_table(tbls$sample.list[[1]], main="Ex 7. Sampled pattern (noise free)")
plot_table(tbls$noise.list[[1]], main="Ex 7. Sampled pattern with 0.2 noise")
plot.new()
# Example 8. Simulating a noisy many.to.one function where
# y=f(x), x!=f(y) with given column marginal.
tbls <- simulate_tables(n=100, nrow=4, ncol=4, type="many.to.one",
noise=0.2, n.tables = 1,
col.marginal = c(0.4,0.3,0.1,0.2))
par(mfrow=c(2,2))
plot_table(tbls$pattern.list[[1]], main="Ex 8. Many-to-one pattern", col="limegreen")
plot_table(tbls$sample.list[[1]], main="Ex 8. Sampled pattern (noise free)", col="limegreen")
plot_table(tbls$noise.list[[1]], main="Ex 8. Sampled pattern with 0.2 noise", col="limegreen")
plot.new()
# Example 9. Simulating noisy discontinuous
# pattern where y=f(x), x may or may not be g(y) with given column marginal
tbls <- simulate_tables(n=100, nrow=4, ncol=4,
type="discontinuous", noise=0.2,
n.tables = 1, col.marginal = c(0.1,0.4,0.2,0.3))
par(mfrow=c(2,2))
plot_table(tbls$pattern.list[[1]], main="Ex 9. Discontinuous pattern", col="springgreen3")
plot_table(tbls$sample.list[[1]], main="Ex 9. Sampled pattern (noise free)", col="springgreen3")
plot_table(tbls$noise.list[[1]], main="Ex 9. Sampled pattern with 0.2 noise", col="springgreen3")
plot.new()