simplica | R Documentation |
Implements the SIMPLICA algorithm, which uses a genetic algorithm to identify Simplivariate Components in data matrices. These components are related to clusters and biclusters, but are defined here in terms of specific structural patterns (constant, additive, multiplicative, or user-defined).
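To make the pattern names concrete, the sketch below builds small blocks that follow the constant, additive, and multiplicative structures as they are usually defined in the biclustering literature. It uses base R only. The variable names and numbers are illustrative assumptions, and the exact definitions SIMPLICA applies are those in defaultPatternFunctions(), which may differ in detail.

```r
# Illustrative row and column effects for a 4 x 3 block (assumed values)
row_eff <- c(1, 2, 3, 4)
col_eff <- c(10, 20, 30)

# Constant pattern: every cell shares a single value
constant <- matrix(5, nrow = 4, ncol = 3)

# Additive pattern: cell (i, j) = row effect i + column effect j
additive <- outer(row_eff, col_eff, FUN = "+")

# Multiplicative pattern: cell (i, j) = row effect i * column effect j
multiplicative <- outer(row_eff, col_eff, FUN = "*")

# Additive block: the difference between two rows is the same in every column
row_diff <- additive[2, ] - additive[1, ]

# Multiplicative block: the ratio between two rows is the same in every column
row_ratio <- multiplicative[2, ] / multiplicative[1, ]

print(row_diff)
print(row_ratio)
```

The constant row differences and constant row ratios are what distinguish the two structures from an arbitrary block of values.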
simplica(
  df,
  maxIter = 2000,
  popSize = 300,
  pCrossover = 0.6,
  pMutation = 0.03,
  zeroFraction = 0.9,
  elitism = 100,
  numSimComp = 5,
  verbose = FALSE,
  mySeeds = 1:5,
  interval = 100,
  penalty = c(constant = 0, additive = 1, multiplicative = 0),
  patternFunctions = defaultPatternFunctions(),
  doSimplicaCV = TRUE,
  cvControl = NULL
)
df |
A numeric data matrix to analyze |
maxIter |
Maximum number of generations for the genetic algorithm (default: 2000) |
popSize |
Population size for the genetic algorithm (default: 300) |
pCrossover |
Crossover probability for genetic algorithm (default: 0.6) |
pMutation |
Mutation probability for genetic algorithm (default: 0.03) |
zeroFraction |
Fraction of population initialized with zeros (default: 0.9) |
elitism |
Number of best individuals preserved between generations (default: 100) |
numSimComp |
Number of Simplivariate Components simultaneously optimized (default: 5) |
verbose |
Logical, whether to print SIMPLICA progress information (default: FALSE) |
mySeeds |
Vector of random seeds for replicate runs (default: 1:5) |
interval |
Interval for monitoring GA progress (default: 100) |
penalty |
Named vector of penalty values for each pattern type (default: c(constant = 0, additive = 1, multiplicative = 0)) |
patternFunctions |
List of pattern functions used for fitness evaluation (default: defaultPatternFunctions()) |
doSimplicaCV |
Logical, run cross-validated relabeling with simplicaCV() after GA (default: TRUE) |
cvControl |
Optional list to tune simplicaCV; fields passed to simplicaCV via do.call. Defaults if omitted:
|
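The cvControl argument is described as a list whose fields are passed on via do.call. The sketch below shows that general mechanism in base R. Everything in it is a hypothetical stand-in: cv_stub(), the field names nFolds and nRepeats, and the default values are invented for illustration and are not the real simplicaCV() interface, and merging with modifyList() is an assumption about how defaults might be combined with user input.

```r
# Hypothetical stand-in for a cross-validation routine (NOT the real simplicaCV())
cv_stub <- function(fit, nFolds = 5, nRepeats = 10) {
  list(fit = fit, nFolds = nFolds, nRepeats = nRepeats)
}

# Hypothetical defaults and a user-supplied control list
cv_defaults <- list(nFolds = 5, nRepeats = 10)
cvControl   <- list(nRepeats = 3)

# User-supplied fields override the defaults; omitted fields keep their default
cv_args <- modifyList(cv_defaults, cvControl)

# do.call() passes each list entry to the function as a named argument
res <- do.call(cv_stub, c(list(fit = "placeholder fit"), cv_args))
str(res)
```

With this kind of design, a caller only needs to list the fields they want to change, and leaving cvControl = NULL falls back to the defaults.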
A list with:
best: simplica object (includes original GA result; if doSimplicaCV=TRUE, also componentPatternsUpdated and componentAudit)
raw: list of "ga" objects (one per seed, from the GA package)
Hageman, J. A., Wehrens, R., & Buydens, L. M. C. (2008). "Simplivariate Models: Ideas and First Examples." PLoS ONE, 3(9), e3259. doi:10.1371/journal.pone.0003259
Madeira, S. C., & Oliveira, A. L. (2004). "Biclustering Algorithms for Biological Data Analysis: A Survey." IEEE/ACM Transactions on Computational Biology and Bioinformatics, 1(1), 24–45. doi:10.1109/TCBB.2004.2
data("simplicaToy")
# Minimal run with reduced settings, just to demonstrate function usage;
# for a real analysis, run with the default GA parameters
fit <- simplica(df = simplicaToy$data,
                maxIter = 200,
                popSize = 50,
                mySeeds = 1,
                elitism = 1,
                verbose = TRUE)
plotComponentResult(df = simplicaToy$data,
                    string = fit$best$string,
                    componentPatterns = fit$best$componentPatternsUpdated,
                    componentScores = fit$best$componentScores,
                    showAxisLabels = FALSE,
                    title = "SIMPLICA on simplicaToy",
                    scoreCutoff = 25000)