Generates the required statistics for a Significance Analysis of Microarrays of categorical data such as SNP data.
Should not be called directly, but via sam(..., method = chisq.stat).
Replaces cat.stat.
data |
a matrix, data frame, or list. If a matrix or data frame, then each row
must correspond to a variable (e.g., a SNP), and each column to a sample (i.e. an observation).
If the number of observations is huge it is better to specify |
cl |
a numeric vector of length |
approx |
should the null distribution be approximated by a ChiSquare-distribution?
Currently only available if |
B |
the number of permutations used in the estimation of the null distribution, and hence, in the computation of the expected d-values. |
n.split |
number of chunks into which the variables are split in the computation
of the values of the test statistic. Currently, only available if |
check.for.NN |
if |
lev |
numeric or character vector specifying the codings of the levels of the
variables/SNPs. Can only be specified if |
B.more |
a numeric value. If the number of all possible permutations is smaller
than or equal to (1+ |
B.max |
a numeric value. If the number of all possible permutations is smaller
than or equal to |
n.subset |
a numeric value indicating how many permutations are considered simultaneously when computing the expected d-values. |
rand |
numeric value. If specified, i.e. not |
For each SNP (or, more generally, each categorical variable), Pearson's Chi-Square statistic is computed to test whether the distribution of the SNP differs between several groups. Since only one null distribution is estimated for all SNPs, as proposed in the original SAM procedure of Tusher et al. (2001), all SNPs must have the same number of levels/categories.
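The statistic described above can be illustrated with a minimal base R sketch. The helper pearson_chisq is hypothetical and not part of siggenes; it only shows how Pearson's Chi-Square statistic is formed for one row of the data matrix, and it assumes that every level in lev occurs at least once in each row (otherwise an expected count of zero would yield NaN).

```r
# Hypothetical helper (not part of siggenes): Pearson's chi-square
# statistic for one categorical variable x across the groups in cl.
pearson_chisq <- function(x, cl, lev = sort(unique(x))) {
  # Observed counts: one row per group, one column per level.
  obs <- table(factor(cl), factor(x, levels = lev))
  # Expected counts under independence of group and level.
  expd <- outer(rowSums(obs), colSums(obs)) / sum(obs)
  sum((obs - expd)^2 / expd)
}

set.seed(1)
mat <- matrix(sample(3, 400, TRUE), 10)   # 10 variables, 40 observations
cl <- rep(1:2, each = 20)                 # 20 cases, 20 controls

# One statistic per row (variable).
stats <- apply(mat, 1, pearson_chisq, cl = cl, lev = 1:3)

# Cross-check the first variable against base R's chisq.test
# (warnings about small expected counts are suppressed).
ref <- suppressWarnings(
  chisq.test(table(cl, mat[1, ]), correct = FALSE)$statistic
)
all.equal(unname(ref), stats[1])
```

Because every row here has the same three levels, all ten statistics share the same reference distribution, which is what allows a single null distribution to be used for all variables.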
A list containing statistics required by sam.
This procedure will only work correctly if all SNPs/variables have the same number of levels/categories. Therefore, execution stops if the number of levels differs between the variables.
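Before running the analysis, it may help to find the variables that would trigger this stop. The following is a rough sketch in base R; n_levels is a hypothetical helper, not a siggenes function, and it counts observed values only, so the check performed inside the package may differ.

```r
# Hypothetical pre-check (not part of siggenes): count the distinct
# non-missing values observed in each row of the data matrix.
n_levels <- function(mat) {
  apply(mat, 1, function(x) length(unique(x[!is.na(x)])))
}

mat <- rbind(c(1, 2, 3, 1, 2, 3),
             c(1, 2, 1, 2, 1, 2))   # second row shows only two levels
nl <- n_levels(mat)
nl

# Report the rows whose number of observed levels deviates from the maximum.
if (length(unique(nl)) > 1)
  message("Rows with fewer levels than the others: ",
          paste(which(nl != max(nl)), collapse = ", "))
```

Rows flagged this way could be removed or analyzed separately so that the remaining variables all share one number of categories.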
Holger Schwender, holger.schw@gmx.de
Schwender, H. (2005). Modifying Microarray Analysis Methods for Categorical Data – SAM and PAM for SNPs. In Weihs, C. and Gaul, W. (eds.), Classification – The Ubiquitous Challenge. Springer, Heidelberg, 370-377.
Tusher, V.G., Tibshirani, R., and Chu, G. (2001). Significance analysis of microarrays applied to the ionizing radiation response. PNAS, 98, 5116-5121.
SAM-class, sam, chisq.ebam, trend.stat
## Not run:
# sam and chisq.stat are provided by the siggenes package.
library(siggenes)
# Generate a random 1000 x 40 matrix consisting of the values
# 1, 2, and 3, and representing 1000 variables and 40 observations.
mat <- matrix(sample(3, 40000, TRUE), 1000)
# Assume that the first 20 observations are cases, and the
# remaining 20 are controls.
cl <- rep(1:2, each=20)
# Then a SAM analysis for categorical data can be done by
out <- sam(mat, cl, method=chisq.stat, approx=TRUE)
out
# approx is set to TRUE to approximate the null distribution
# by the ChiSquare-distribution (usually, for such a small
# number of observations this might not be a good idea
# as the assumptions behind this approximation might not
# be fulfilled).
# The same results can also be obtained by employing
# contingency tables, i.e. by specifying data as a list.
# For this, we need to generate the tables summarizing
# groupwise how many observations show which level at
# which variable. These tables can be obtained by
library(scrime)
cases <- rowTables(mat[, cl==1])
controls <- rowTables(mat[, cl==2])
ltabs <- list(cases, controls)
# And the same SAM analysis as above can then be
# performed by
out2 <- sam(ltabs, method=chisq.stat, approx=TRUE)
out2
## End(Not run)