# Significance Analysis of Microarrays

### Description

Performs a Significance Analysis of Microarrays (SAM). One- and two-class analyses can be performed using either a modified t-statistic or a (standardized) Wilcoxon rank statistic, and a multiclass analysis using a modified F-statistic. Moreover, this function provides a SAM procedure for categorical data such as SNP data, and allows a user-written score function to be employed.

### Usage


### Arguments

`data`
a matrix, a data frame, or an ExpressionSet object. Each row of … Can also be a list (if …

`cl`
a vector of length … In the one-class case, … In the two-class unpaired case, … In the two-class paired case, … In the multiclass case and if … For examples of how …

`method`
a character string or a name specifying the method/function that should be used
in the computation of the expression scores … If … If … For an analysis of categorical data such as SNP data, …
If the variables are ordinal and a trend test should be applied
(e.g., in the two-class case, the Cochran-Armitage trend test), … It is also possible to use
a user-written function to compute the expression scores.
For details, see …

`control`
further optional arguments for controlling the SAM analysis. For
these arguments, see …

`gene.names`
a character vector of length …

`...`
further arguments of the specific SAM methods. If …

### Details

`sam` provides SAM procedures for several types of analysis (one- and two-class analyses
with either a modified t-statistic or a Wilcoxon rank statistic, a multiclass analysis
with a modified F-statistic, and an analysis of categorical data). It is, however, also
possible to write your own function for another type of analysis. The required arguments
of this function must be `data` and `cl`. This function can also have other
arguments. The output of this function must be a list containing the following objects:

- `d`: a numeric vector consisting of the expression scores of the genes.
- `d.bar`: a numeric vector of the same length as `na.exclude(d)` specifying the expected expression scores under the null hypothesis.
- `p.value`: a numeric vector of the same length as `d` containing the raw, unadjusted p-values of the genes.
- `vec.false`: a numeric vector of the same length as `d` consisting of the one-sided numbers of falsely called genes, i.e., if *d* > 0, the number of genes expected to be larger than *d* under the null hypothesis, and if *d* < 0, the number of genes expected to be smaller than *d* under the null hypothesis.
- `s`: a numeric vector of the same length as `d` containing the standard deviations of the genes. If no standard deviation can be calculated, set `s = numeric(0)`.
- `s0`: a numeric value specifying the fudge factor. If no fudge factor is calculated, set `s0 = numeric(0)`.
- `mat.samp`: a matrix with B rows and `ncol(data)` columns, where B is the number of permutations, containing the permutations used in the computation of the permuted d-values. If such a matrix is not computed, set `mat.samp = matrix(numeric(0))`.
- `msg`: a character string or vector containing information about, e.g., which type of analysis has been performed. `msg` is printed when the function `print` or `summary`, respectively, is called. If no such message should be printed, set `msg = ""`.
- `fold`: a numeric vector of the same length as `d` consisting of the fold changes of the genes. If no fold change has been computed, set `fold = numeric(0)`.

If this function is, e.g., called `foo`, it can be used by setting `method = foo`
in `sam`. More detailed information and an example will be contained in the siggenes
manual.
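To make the required output list concrete, here is a minimal sketch of such a user-written score function. It is not taken from the siggenes manual. It assumes a two-class unpaired design with class labels coded 0 and 1, no missing values, and no constant rows. The Welch t-score, the extra argument `B`, and the way `d.bar`, `vec.false`, and `p.value` are derived from label permutations are illustrative choices, and whether `sam` accepts this exact output has not been checked here.

```r
# Hypothetical score function following the interface described above.
# Assumptions (not from the siggenes documentation): cl is coded 0/1,
# data has no NAs, and every row has non-zero variance in both classes.
foo <- function(data, cl, B = 100) {
  data <- as.matrix(data)
  stopifnot(length(cl) == ncol(data), all(cl %in% c(0, 1)))

  # Row-wise Welch t-score and its denominator for one labelling
  welch <- function(labels) {
    x <- data[, labels == 1, drop = FALSE]
    y <- data[, labels == 0, drop = FALSE]
    se <- sqrt(apply(x, 1, var) / ncol(x) + apply(y, 1, var) / ncol(y))
    list(d = (rowMeans(x) - rowMeans(y)) / se, s = se)
  }

  obs <- welch(cl)
  d <- obs$d

  # B permutations of the class labels, one permutation per row
  mat.samp <- t(replicate(B, sample(cl)))
  # Permuted scores: one column per permutation
  d.perm <- matrix(apply(mat.samp, 1, function(labels) welch(labels)$d),
                   nrow = length(d))

  # Expected ordered scores under the null: mean of the sorted permuted scores
  d.bar <- rowMeans(apply(d.perm, 2, sort))

  # One-sided numbers of falsely called genes, averaged over permutations
  vec.false <- vapply(d, function(di) {
    if (di > 0) sum(d.perm >= di) / B else sum(d.perm <= di) / B
  }, numeric(1))

  # Raw two-sided p-values, pooling permuted scores over all genes
  p.value <- vapply(d, function(di) mean(abs(d.perm) >= abs(di)), numeric(1))

  list(d = d, d.bar = d.bar, p.value = p.value, vec.false = vec.false,
       s = obs$s, s0 = numeric(0), mat.samp = mat.samp,
       msg = "SAM analysis with a user-written Welch t-score\n",
       fold = numeric(0))
}

# Try the function on simulated data (200 genes, 5 + 5 samples)
set.seed(123)
toy <- matrix(rnorm(200 * 10), nrow = 200)
toy.cl <- rep(c(0, 1), each = 5)
out <- foo(toy, toy.cl, B = 50)
str(out, max.level = 1)

# With siggenes loaded, the function would then be passed on as described:
# sam(toy, toy.cl, method = foo)
```

Components that are not computed (`s0` and `fold` here) are set to `numeric(0)`, following the conventions listed above.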

### Value

An object of class SAM.

### Author(s)

Holger Schwender, holger.schw@gmx.de

### References

Schwender, H., Krause, A., and Ickstadt, K. (2006). Identifying Interesting Genes with siggenes.
*R News*, 6(5), 45-50.

Schwender, H. (2004). Modifying Microarray Analysis Methods for
Categorical Data – SAM and PAM for SNPs. To appear in: *Proceedings
of the 28th Annual Conference of the GfKl*.

Tusher, V.G., Tibshirani, R., and Chu, G. (2001). Significance analysis of microarrays
applied to the ionizing radiation response. *PNAS*, 98, 5116-5121.

### See Also

`SAM-class`, `d.stat`, `wilc.stat`, `chisq.stat`, `samControl`

### Examples

```
## Not run:
# Load the package multtest and the data of Golub et al. (1999)
# contained in multtest.
library(multtest)
data(golub)

# golub.cl contains the class labels.
golub.cl

# Perform a SAM analysis for the two-class unpaired case assuming
# unequal variances.
sam.out <- sam(golub, golub.cl, B=100, rand=123)
sam.out

# Obtain the Delta plots for the default set of Deltas
plot(sam.out)

# Generate the Delta plots for Delta = 0.2, 0.4, 0.6, ..., 2
plot(sam.out, seq(0.2, 2, 0.2))

# Obtain the SAM plot for Delta = 2
plot(sam.out, 2)

# Get information about the genes called significant using
# Delta = 3.
sam.sum3 <- summary(sam.out, 3, entrez=FALSE)

# Obtain the rows of golub containing the genes called
# differentially expressed
sam.sum3@row.sig.genes

# and their names
golub.gnames[sam.sum3@row.sig.genes, 3]

# The matrix containing the d-values, q-values etc. of the
# differentially expressed genes can be obtained by
sam.sum3@mat.sig

# Perform a SAM analysis using Wilcoxon rank sums
sam(golub, golub.cl, method="wilc.stat", rand=123)

# Now consider only the first ten columns of the Golub et al. (1999)
# data set. For now, let's assume the first five columns were
# before-treatment measurements and the next five columns were
# after-treatment measurements, where columns 1 and 6, columns 2
# and 7, ..., each form a pair. In this case, the class labels
# would be
new.cl <- c(-(1:5), 1:5)
new.cl

# and the corresponding SAM analysis for the two-class paired
# case would be
sam(golub[,1:10], new.cl, B=100, rand=123)

# Another way of specifying the class labels for the above paired
# analysis is a two-column matrix: class label first, pair index second
mat.cl <- matrix(c(rep(c(-1, 1), each=5), rep(1:5, 2)), nrow=10)
mat.cl

# and the above SAM analysis can also be done by
sam(golub[,1:10], mat.cl, B=100, rand=123)
## End(Not run)
```