Description Usage Arguments Details Value Author(s) References See Also Examples
Main function to search and display A- and D- optimal designs for single- or two-channel genetical genomics experiments. Simulated annealing or Metropolis Hastings used to find the best design.
1 2 3 4 5 6 | designGG ( genotype, nSlides, nTuple, nEnvFactors, nLevels,
Level=NULL, bTwoColorArray=TRUE, initial=NULL, weight=1,
region=NULL, optimality="A", method="SA", nIterations=3000,
n.search=2, endTemp=1e-10, startTemp=1, maxTempStep=0.9,
plotScores=TRUE, directory=NULL, fileName=NULL,
envFactorNames=NULL, writingProcess=TRUE )
|
genotype |
genotype data: a nMarker-by-nRILs matrix with two allels being 0 and 1 (or A and B) or three allels being 0, 0.5 and 1 (or, A, H, and B), where 0.5 (or H) represents heterozygous allele. |
nSlides |
total number of slides available for the experiment. |
nTuple |
average number of RILs (or strains) to be assigned onto each condition. |
nEnvFactors |
number of environmental factors, an integer bewteen 1 and 3.
When |
nLevels |
number of levels for each factor, a vector with each
component being an integer. The length of it should equal
|
Level |
a list which specifies the levels for each factor in the
experiment. There are in total |
bTwoColorArray |
binary variable indicating experiment type: |
initial |
the starting design matrix for the algorithm. If specified, this should
be a list with 2 matrices: |
weight |
a vector with length of |
region |
genome region of biological interest. Default = |
optimality |
type of optimality, i.e. "A" (A-optimality) or "D" (D-optimality). A-optimality minimizes $Trace((X'X)^-1)$, which corresponds to minimum average variance of the parameter estimates. D-optimality minimizes $det(X'X)^-1$, which corresponds to minimum generalized variance of the parameter estimates. |
method |
method for searching for an optimal design. "SA" uses simulated annealing. "MH" uses Metropolis Hasting. Default = "SA". |
nIterations |
number of iterations of the simulated annealing method. Default = 3000. |
n.search |
number of times for simulated annealing optimaization with different initial design, default = 2. Here it is suggested to be between 1 and 5. It should not to be too large because of the reaching computational burden. |
endTemp |
ending temperature of simulated annealing process. An important optimization parameter. Default = $1e^-10$. |
startTemp |
starting temperature of simulated annealing process. Default = 1. |
maxTempStep |
maximum temperature decreasing step for simulated annealing process.
The parameter ensures that the multiplicative cooling factor is not
smaller than that. If |
plotScores |
If |
directory |
It tells where the resulting optimal design tables are to be stored.
If |
fileName |
the final optimal design table(s) in |
envFactorNames |
a vector with names for all environmental factor(s). For example, for the
experiment with two environmental factors of temperature and drug treatment:
|
writingProcess |
If TRUE, it prints how much computation work has been finished in a
file called |
Given the genetic information of samples available for the experiment
(genotype) and the information about experimental settings (nEnvFactors
,
nSlides
,nLevels
etc.), the algorithm searches for an A-optimal or D-optimal
(see optimality
) using simulated annealing (see method
). A plot of
the scores at each iterations can also be given using the plotAllScores
function.
It also contains a number of the arguments:
region
is used to specify the
genome region that are of major interest to experimenters.
weight
is used to define
the weight of genetic and environmental factors, and interaction terms. Prior
knowledge about expected effect sizes of interesting factors can also be
incorporated as weight
parameters for the algorithm. The weight is
inversely proportional to the expected effect size of the corresponding parameter.
Example parameter settings:
Suppose to design an experiment with two environmental factors (F1, F2) and
there are two diffferent levels for each environment. The levels are 16
and 24 for F1, and 5 and 10 for F2. Thus the following command can be used:
nEnvFactors <- 2
nLevels <- c ( 2, 2 )
levels <- list ( c(16, 24), c(5, 10) )
The length of parameter weight
is dependent on the number of environmental
factors:
When nEnvFactor
= 0,
weight
is 1 as there is only one parameter of interest (genotype).
When nEnvFactor
= 1,
weight
= c( $w_Q$, $w_F1$, $w_QF1$ )
When nEnvFactor
= 2,
weight
= c( $w_Q$, $w_F1$, $w_F2$, $w_QF1$, $w_QF2$, $w_F1F2$, $w_QF1F2$)
When nEnvFactor
= 3,
weight
= c( $w_Q$, $w_F1$, $w_F2$, $w_F2$,
$w_QF1$, $w_QF2$, $w_QF3$, $w_F1F2$, $w_F1F3$, $w_F2F3$,
$w_QF1F2$, $w_QF1F3$, $w_QF2F3$, $w_QF1F2F3$ )
Here $w_Q$ represents the weight for genotype effect, $w_F1$ represent the
weight for F1 effect and $w_QF1$ represent the weight for interaction between
genotype and F1 effect, etc.
It should be noted that the simulated annealing algorithm might find a
locally and not globally optimal design. Running the optimization process
multiple times is recommended. When nSearch
> 1, the simulated annealing
optimization will be run nSearch times, each run starts with a different
initial design and will provide a (near-)optimal design. If the optimization
problem is simple, all runs will converge to the same (optimal) design.
Otherwise, the best one among all near-optimal designs will be selected as
the output of the function. One can run the algorithm multiple times with
nSearch
= 1 to review a few (near-)optimal designs.
An array design table (arrayDesign.csv) and a condition design table ( conditionDesign.csv) are generated.
Yang Li <yang.li@rug.nl>, Gonzalo Vera <gonzalo.vera.rodriguez@gmail.com>
Rainer Breitling <r.breitling@rug.nl>, Ritsert Jansen <r.c.jansen@rug.nl>
Y. Li, M. Swertz, G. Vera, J. Fu, R. Breitling, and R.C. Jansen. designGG:
An R-package and Web tool for the optimal design of genetical genomics
experiments. BMC Bioinformatics 10:188(2009)
http://gbic.biol.rug.nl/designGG
Y. Li, R. Breitling and R.C. Jansen. Generalizing genetical
genomics: the added value from environmental perturbation, Trends Genet
(2008) 24:518-524.
E. Wit and J. McClure. Statistics for Microarrays: Design, Analysis
and Inference. (2004) Chichester: Wiley.
initialDesign
, designScore
,
updateDesign
, acceptanceProbability
,
experimentDesignTable
, plotAllScores
,
exampleArrayDesignTable
,exampleConditionDesignTable
,
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 | library(designGG)
#load genotype data
data(genotype)
#Example: single-channel experiment with 2 environmental factors,
#each with 2 levels, and there will be four samples per condition(nTuple=4).
optimalDesign <- designGG ( genotype, nSlides=NULL, nTuple=4, nEnvFactors=2,
nLevels=c(2,2),Level=list(c(16,24),c(5,10)), bTwoColorArray=FALSE,
initial=NULL, weight=1, region=seq(1,20), optimality="A",
method="SA", nIterations=100, n.search=2, endTemp=1e-10,
startTemp=1, maxTempStep=0.9, plotScores=TRUE,
directory=NULL, fileName=NULL, envFactorNames=NULL,
writingProcess=FALSE )
#Example 2: dual-channel experiment with 2 environmental factors,
#each with 2 levels. There are 50 slides available.
optimalDesign <- designGG ( genotype, nSlides=50, nTuple=NULL, nEnvFactors=2,
nLevels=c(2,2),Level=list(c(16,24),c(5,10)), bTwoColorArray=TRUE,
initial=NULL, weight=1, region=seq(1,20), optimality="A",
method="SA", nIterations=100, n.search=2, endTemp=1e-10,
startTemp=1, maxTempStep=0.9, plotScores=TRUE,
directory=NULL, fileName=NULL, envFactorNames=NULL,
writingProcess=FALSE )
#result
optimalDesign$arrayDesign
optimalDesign$conditionDesign
plotAllScores(optimalDesign$plot.obj)
#Use the following commands to see example output tables:
data(exampleArrayDesignTable)
exampleArrayDesignTable
data(exampleConditionDesignTable)
exampleConditionDesignTable
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.