EBAM Analysis for Categorical Data
Description
Generates the required statistics for an Empirical Bayes Analysis of Microarrays (EBAM) of categorical data such as SNP data.
Should not be called directly, but via ebam(..., method = chisq.ebam).
This function replaces cat.ebam.
Usage
Arguments
data 
a matrix, data frame, or list. If a matrix or data frame, then each row
must correspond to a variable (e.g., a SNP), and each column to a sample (i.e. an observation).
If the number of observations is huge it is better to specify 
cl 
a numeric vector of length 
approx 
should the null distribution be approximated by a chi-square distribution?
Currently only available if 
B 
the number of permutations used in the estimation of the null distribution, and hence in the computation of the expected z-values.
n.split 
number of chunks into which the variables are split in the computation
of the values of the test statistic. Currently, only available if 
check.for.NN 
if 
lev 
numeric or character vector specifying the codings of the levels of the
variables/SNPs. Can only be specified if 
B.more 
a numeric value. If the number of all possible permutations is smaller
than or equal to (1+ 
B.max 
a numeric value. If the number of all possible permutations is smaller
than or equal to 
n.subset 
a numeric value indicating in how many subsets the 
fast 
if 
n.interval 
the number of intervals used in the logistic regression with
repeated observations for estimating the ratio f0/f
(if 
df.ratio 
integer specifying the degrees of freedom of the natural cubic
spline used in the logistic regression with repeated observations. Ignored
if 
df.dens 
integer specifying the degrees of freedom of the natural cubic
spline used in the Poisson regression to estimate the density of the observed
z-values. Ignored if 
knots.mode 
if 
type.nclass 
character string specifying the procedure used to compute the
number of cells of the histogram. Ignored if 
rand 
numeric value. If specified, i.e. not 
Details
For each variable, Pearson's chi-square statistic is computed to test whether the distribution of the variable differs between several groups. Since only one null distribution is estimated for all variables, as proposed in the original EBAM application of Efron et al. (2001), all variables must have the same number of levels/categories.
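The per-variable statistic described above can be sketched in base R. This is only an illustration on toy data, not the package's internal implementation; `pearson_stat` is a hypothetical helper name, and the cross-check uses `chisq.test` from the stats package.

```r
# Toy data: 5 variables (rows) coded 1, 2, 3 and 40 observations (columns).
set.seed(1)
mat <- matrix(sample(3, 200, TRUE), nrow = 5)
cl <- rep(1:2, each = 20)

# Hypothetical helper (not part of siggenes): Pearson's chi-square
# statistic for one variable, computed from the group-by-level table.
pearson_stat <- function(x, cl) {
  obs <- table(cl, x)
  # Expected counts under independence of group and level.
  expd <- outer(rowSums(obs), colSums(obs)) / sum(obs)
  sum((obs - expd)^2 / expd)
}

# One statistic per variable (row).
stats <- apply(mat, 1, pearson_stat, cl = cl)

# Cross-check against chisq.test; warnings about small expected
# counts are suppressed because only the statistic is compared.
ref <- apply(mat, 1, function(x) {
  unname(suppressWarnings(chisq.test(table(cl, x), correct = FALSE))$statistic)
})
stopifnot(isTRUE(all.equal(stats, ref)))
```

Because every variable here has three levels and there are two groups, all statistics share the same reference distribution, which is what allows a single null distribution to be used for all variables.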
Value
A list containing statistics required by ebam.
Warning
This procedure will only work correctly if all SNPs/variables have the same number of levels/categories.
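One way to verify this requirement before running the analysis is to count the distinct values in each row. The following is a minimal base R sketch; `same_n_levels` is a hypothetical helper, not a function provided by the package, and it assumes missing values are coded as NA.

```r
# Hypothetical helper (not part of siggenes): TRUE if every row of
# `mat` shows the same number of distinct non-missing values.
same_n_levels <- function(mat) {
  n_lev <- apply(mat, 1, function(x) length(unique(x[!is.na(x)])))
  length(unique(n_lev)) == 1
}

# Two variables with three levels each.
mat_ok <- matrix(c(1, 2, 3, 1, 2, 3,
                   3, 2, 1, 1, 2, 3), nrow = 2, byrow = TRUE)

# Adding a variable that shows only two levels violates the requirement.
mat_bad <- rbind(mat_ok, c(1, 1, 2, 2, 1, 2))

same_n_levels(mat_ok)   # TRUE
same_n_levels(mat_bad)  # FALSE
```

Note that this checks only the levels observed in the data; a variable whose rarest level happens not to occur in the sample would also fail the check.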
Author(s)
Holger Schwender, holger.schw@gmx.de
References
Efron, B., Tibshirani, R., Storey, J.D., and Tusher, V. (2001). Empirical Bayes Analysis of a Microarray Experiment. JASA, 96, 1151-1160.
Schwender, H. and Ickstadt, K. (2008). Empirical Bayes Analysis of Single Nucleotide Polymorphisms. BMC Bioinformatics, 9, 144.
Schwender, H., Krause, A., and Ickstadt, K. (2003). Comparison of the Empirical Bayes and the Significance Analysis of Microarrays. Technical Report, SFB 475, University of Dortmund, Germany.
See Also
EBAM-class, ebam, chisq.stat
Examples
## Not run:
# Generate a random 1000 x 40 matrix consisting of the values
# 1, 2, and 3, and representing 1000 variables and 40 observations.
mat <- matrix(sample(3, 40000, TRUE), 1000)
# Assume that the first 20 observations are cases, and the
# remaining 20 are controls.
cl <- rep(1:2, each = 20)
# Then an EBAM analysis for categorical data can be done by
out <- ebam(mat, cl, method = chisq.ebam, approx = TRUE)
out
# approx is set to TRUE to approximate the null distribution
# by the chi-square distribution (usually, for such a small
# number of observations this might not be a good idea
# as the assumptions behind this approximation might not
# be fulfilled).
# The same results can also be obtained by employing
# contingency tables, i.e. by specifying data as a list.
# For this, we need to generate the tables summarizing
# groupwise how many observations show which level at
# which variable. These tables can be obtained by
library(scrime)
cases <- rowTables(mat[, cl == 1])
controls <- rowTables(mat[, cl == 2])
ltabs <- list(cases, controls)
# And the same EBAM analysis as above can then be
# performed by
out2 <- ebam(ltabs, method = chisq.ebam, approx = TRUE)
out2
## End(Not run)
