Estimate marginal posterior of the MGSA problem with an MCMC sampling algorithm.
mgsa(o, sets, population = NULL, p = seq(min(0.1, 1/length(sets)), min(0.3,
20/length(sets)), length.out = 10), ...)
## S4 method for signature 'integer,list'
mgsa(o, sets, population = NULL, p = seq(1, min(20,
floor(length(sets)/3)), length.out = 10)/length(sets), ...)
## S4 method for signature 'numeric,list'
mgsa(o, sets, population = NULL, p = seq(1, min(20,
floor(length(sets)/3)), length.out = 10)/length(sets), ...)
## S4 method for signature 'character,list'
mgsa(o, sets, population = NULL, p = seq(1,
min(20, floor(length(sets)/3)), length.out = 10)/length(sets), ...)
## S4 method for signature 'logical,list'
mgsa(o, sets, population = NULL, p = seq(min(0.1,
1/length(sets)), min(0.3, 20/length(sets)), length.out = 10), ...)
## S4 method for signature 'character,MgsaSets'
mgsa(o, sets, population = NULL,
p = seq(min(0.1, 1/length(sets)), min(0.3, 20/length(sets)), length.out =
10), ...)

o 
The observations. It can be a character, integer, numeric or logical vector (see the details on item types).
sets 
The sets. It can be an MgsaSets object or a list of character or integer vectors.
population 
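The total population. Optional. A character or integer vector consistent with the item type of o and sets; if NULL, the union of all sets is used.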
p 
Grid of values for the parameter p. Values represent probabilities of term activity and therefore must be in [0,1]. 
... 
Optional arguments that are passed to the methods. Supported parameters are

The function can handle items (such as genes) encoded as character or integer. For convenience, numeric items can also be provided, but these values should essentially be integers. The type of items in the observations o, the sets and the optional population should be consistent.

In the case of character items, o and population should be of type character, and sets can either be an MgsaSets or a list of character vectors.

In the case of integer items, o should be of type integer, numeric (but essentially with integer values), or logical, and entries in sets as well as the population should be integer. When o is logical, it is first coerced to integer with a call to which().

Observations outside the population are not taken into account. If population is NULL, it is defined as the union of all sets.
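
The coercion and default-population rules above can be sketched in base R. This is only an illustration of the stated behaviour under simple assumptions, not the package's internal code; all variable names are made up for the example.

```r
# Illustrative base-R sketch; it does not call mgsa itself.
sets <- list(set1 = 1:3, set2 = 2:4)

# A logical observation vector is turned into integer item indices with which().
o_logical <- c(TRUE, TRUE, FALSE, FALSE)
o <- which(o_logical)

# With population = NULL, the population is taken as the union of all sets.
population <- sort(unique(unlist(sets, use.names = FALSE)))

# Observations outside the population are not taken into account.
o_used <- intersect(o, population)
```

Here the logical vector marks items 1 and 2 as observed, which mirrors the second example of this page.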
The default grid of values for p is such that between 1 and 20 sets are active in expectation. Independently of the total number of sets, the lower limit is capped at 0.1 and the upper limit at 0.3, to make sure that complex solutions are penalized.
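
As a rough check of this rule, the default expression shown for the generic in the usage can be evaluated for a given number of sets. The helper name `default_p_grid` and the set counts 100 and 10 are assumptions chosen for illustration; the helper is not part of the package.

```r
# Hypothetical helper reproducing the default expression for p from the usage,
# with n_sets standing in for length(sets).
default_p_grid <- function(n_sets) {
  seq(min(0.1, 1 / n_sets), min(0.3, 20 / n_sets), length.out = 10)
}

# With 100 sets the grid runs from 1/100 to 20/100,
# i.e. between 1 and 20 active sets in expectation.
p_100 <- default_p_grid(100)
expected_active_100 <- p_100 * 100

# With only 10 sets the caps apply: the grid runs from 0.1 to 0.3.
p_10 <- default_p_grid(10)
```

Multiplying each grid value by the number of sets gives the expected number of active sets under that value of p.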
Marginal posteriors of activity of each set are estimated using an MCMC sampler as described in Bauer et al., 2010. Because convergence of an MCMC sampler is difficult to assess, it is recommended to run it several times (using restarts). If variations between runs are too large (see MgsaResults), the number of steps (steps) of each MCMC run should be increased.
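
A call following this recommendation might look like the sketch below. The argument names `restarts` and `steps` are the ones named in the text; the values are arbitrary choices for illustration, and the call is guarded so that it only runs when the mgsa package is installed.

```r
# Arbitrary illustrative settings, not recommended defaults.
mcmc_args <- list(restarts = 5, steps = 1e5)

if (requireNamespace("mgsa", quietly = TRUE)) {
  library(mgsa)
  # Same toy data as in the first example of this page.
  obs  <- c("A", "B")
  sets <- list(set1 = LETTERS[1:3], set2 = LETTERS[2:4])
  # Run several independent MCMC chains; if the estimates vary too much
  # between restarts, rerun with a larger value of steps.
  fit <- mgsa(obs, sets,
              restarts = mcmc_args$restarts,
              steps = mcmc_args$steps)
}
```

The variation between restarts can then be inspected on the returned object (see MgsaResults).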
An MgsaMcmcResults object.
Bauer S., Gagneur J. and Robinson P. GOing Bayesian: model-based gene set analysis of genome-scale data. Nucleic Acids Research (2010) http://nar.oxfordjournals.org/content/38/11/3523.full
## observing items A and B, with sets {A,B,C} and {B,C,D}
mgsa(c("A", "B"), list(set1 = LETTERS[1:3], set2 = LETTERS[2:4]))
## same case with integer representation of the items and logical observation
mgsa(c(TRUE,TRUE,FALSE,FALSE), list(set1 = 1:3, set2 = 2:4))
## a small example with gene ontology sets and plot
data(example)
fit = mgsa(example_o, example_go)
## Not run:
plot(fit)
## End(Not run)
