# fuzzy.ebam: EBAM and SAM for Fuzzy Genotype Calls

In siggenes: Multiple testing using SAM and Efron's empirical Bayes approaches

## Description

Computes the statistics required for an Empirical Bayes Analysis of Microarrays (EBAM; Efron et al., 2001) or a Significance Analysis of Microarrays (SAM; Tusher et al., 2001), respectively, based either on the score statistic proposed by Louis et al. (2010) for fuzzy genotype calls or on approximate Bayes factors (Wakefield, 2007) determined from this score statistic.

Should not be called directly, but via `ebam(..., method = fuzzy.ebam)` or `sam(..., method = fuzzy.stat)`, respectively.
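
For orientation on the approximate Bayes factors mentioned in the Description: in its original form, the ABF of Wakefield (2007) compares the null hypothesis with the alternative using only an effect estimate $\hat\theta$, its variance $V$, and a normal prior $\theta \sim N(0, W)$ under the alternative, where $W$ plays the role of the prior variance argument `W`. The symbols below are illustrative notation, not names used by siggenes, and whether siggenes works with this ratio or its reciprocal is not stated on this page; see `abf` for the actual implementation.

$$
z = \frac{\hat\theta}{\sqrt{V}}, \qquad
\mathrm{ABF} = \sqrt{\frac{V + W}{V}} \, \exp\!\left( -\frac{z^2}{2} \cdot \frac{W}{V + W} \right)
$$

In this orientation, values below 1 favor the alternative, i.e. an association.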

## Usage

```r
fuzzy.ebam(data, cl, type = c("asymptotic", "permutation", "abf"), W = NULL,
    logbase = exp(1), addOne = TRUE, df.ratio = NULL, n.interval = NULL,
    df.dens = 5, knots.mode = TRUE, type.nclass = c("FD", "wand", "scott"),
    fast = FALSE, B = 100, B.more = 0.1, B.max = 30000, n.subset = 10, rand = NA)

fuzzy.stat(data, cl, type = c("asymptotic", "permutation", "abf"), W = NULL,
    logbase = exp(1), addOne = TRUE, B = 100, B.more = 0.1, B.max = 30000,
    n.subset = 10, rand = NA)
```

## Arguments

| Argument | Description |
| --- | --- |
| `data` | a matrix containing fuzzy genotype calls. Such a matrix can, e.g., be generated by the function `getMatFuzzy` based on the confidences for the three possible genotypes computed by preprocessing algorithms such as CRLMM. |
| `cl` | a vector of zeros and ones specifying which columns of `data` contain the fuzzy genotype calls for the cases (`1`) and which for the controls (`0`). Thus, the length of `cl` must be equal to the number of columns of `data`. |
| `type` | a character string specifying how the analysis should be performed. If `"asymptotic"`, the trend statistic of Louis et al. (2010) is used directly, and EBAM or SAM is performed assuming that under the null hypothesis this test statistic follows an asymptotic standard normal distribution. If `"permutation"`, a permutation procedure is employed to estimate the null distribution of this test statistic. If `"abf"`, Approximate Bayes Factors (ABFs) proposed by Wakefield (2007) are determined from the trend statistic, and EBAM or SAM is performed on these ABFs or on transformations of these ABFs (see in particular `logbase` and `addOne`). In this last case, a permutation procedure is again used in EBAM and SAM to, e.g., compute posterior probabilities of association. |
| `W` | the prior variance. Must be either a positive value or a vector of length `nrow(data)` consisting of positive values. Ignored if `type = "asymptotic"` or `type = "permutation"`. For details, see `abf`. |
| `logbase` | a numeric value larger than 1. If `type = "abf"`, the ABFs are not used directly in the analysis, but a log-transformation (with base `logbase`) of the ABFs. If the ABFs should not be transformed, `logbase` can be set to `NA`. Ignored if `type = "asymptotic"` or `type = "permutation"`. |
| `addOne` | should 1 be added to the ABF before it is log-transformed? If `TRUE`, `log(ABF + 1, base = logbase)` is used as test score in EBAM or SAM. If `FALSE`, `log(ABF, base = logbase)` is used. Only taken into account when `type = "abf"` and `logbase` is not `NA`. |
| `df.ratio` | integer specifying the degrees of freedom of the natural cubic spline used in the logistic regression with repeated observations for estimating the ratio f0/f. Ignored if `type = "asymptotic"`. If not specified, `df.ratio` is set to `3` if `type = "abf"`, and to `5` if `type = "permutation"`. |
| `n.interval` | the number of intervals used in the logistic regression with repeated observations (if `type = "permutation"` or `type = "abf"`), or in the Poisson regression used to estimate the density of the observed z-values (if `type = "asymptotic"`). If `NULL`, `n.interval` is estimated by the method specified by `type.nclass`, where at least 139 intervals are considered if `type = "permutation"` or `type = "abf"`. |
| `df.dens` | integer specifying the degrees of freedom of the natural cubic spline used in the Poisson regression to estimate the density of the observed z-values in an application of `ebam` with `type = "asymptotic"`. Otherwise ignored. |
| `knots.mode` | logical specifying whether the `df.dens` - 1 knots are centered around the mode rather than the median of the density when fitting the Poisson regression model to estimate the density of the observed z-values in an application of `ebam` with `type = "asymptotic"` (for details on this density estimation, see `denspr`). Ignored if `type = "permutation"` or `type = "abf"`. |
| `type.nclass` | character string specifying the procedure used to compute the number of cells of the histogram. Ignored if `type = "permutation"`, `type = "abf"`, or `n.interval` is specified. Can be either `"FD"` (default), `"wand"`, or `"scott"`. For details, see `denspr`. |
| `fast` | if `FALSE`, the exact number of permuted test scores that are more extreme than a particular observed test score is computed for each of the variables/SNPs. If `TRUE`, a crude estimate of this number is used. |
| `B` | the number of permutations used in the estimation of the null distribution, and hence in the computation of the expected z-values. Ignored if `type = "asymptotic"`. |
| `B.more` | a numeric value. If the number of all possible permutations is smaller than or equal to (1 + `B.more`) * `B`, full permutation will be done. Otherwise, `B` permutations are used. |
| `B.max` | a numeric value. If the number of all possible permutations is smaller than or equal to `B.max`, `B` randomly selected permutations will be used in the computation of the null distribution. Otherwise, `B` random draws of the group labels are used. |
| `n.subset` | a numeric value indicating into how many subsets the `B` permutations are divided when computing the permuted z-values. Please note that the meaning of `n.subset` differs between the SAM and the EBAM functions. |
| `rand` | numeric value. If specified, i.e. not `NA`, the random number generator will be set into a reproducible state. |
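
Two of the argument rules above are simple enough to sketch in base R: the `logbase`/`addOne` transformation of the ABFs and the `B.more` rule for switching to full permutation. The helpers `transform_abf` and `use_full_permutation` are hypothetical names written for this illustration; they are not part of siggenes and do not reproduce its internals. The ABF values are made up, and counting the possible permutations with `choose()` is an assumption about a simple case/control design.

```r
# Hypothetical helper (not part of siggenes): applies the documented
# logbase/addOne rule to a vector of approximate Bayes factors.
transform_abf <- function(abf, logbase = exp(1), addOne = TRUE) {
  stopifnot(is.numeric(abf), all(abf > 0))
  # logbase = NA means the ABFs are used untransformed.
  if (is.na(logbase)) return(abf)
  stopifnot(logbase > 1)
  if (addOne) log(abf + 1, base = logbase) else log(abf, base = logbase)
}

# Hypothetical helper: the documented B.more rule. `n_possible` is the
# number of all possible permutations and must be supplied by the caller.
use_full_permutation <- function(n_possible, B = 100, B.more = 0.1) {
  n_possible <= (1 + B.more) * B
}

abf_values <- c(0.5, 1, 3, 7)  # made-up ABFs, for illustration only

transform_abf(abf_values, logbase = 2, addOne = TRUE)   # log2(ABF + 1)
transform_abf(abf_values, logbase = 2, addOne = FALSE)  # log2(ABF)
transform_abf(abf_values, logbase = NA)                 # untransformed

# Assumption: with 3 cases among 8 samples there are choose(8, 3) = 56
# distinct case/control labelings, which is below (1 + 0.1) * 100.
use_full_permutation(choose(8, 3), B = 100, B.more = 0.1)

# With 10 cases among 20 samples the count is far larger, so B
# permutations would be used instead of full permutation.
use_full_permutation(choose(20, 10), B = 100, B.more = 0.1)
```

Adding 1 before taking the logarithm keeps the transformed score non-negative for every positive ABF, whereas `addOne = FALSE` yields negative scores for ABFs below 1.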

## Value

A list containing statistics required by `ebam` or `sam`.

## Author(s)

Holger Schwender

## References

Efron, B., Tibshirani, R., Storey, J.D., and Tusher, V. (2001). Empirical Bayes Analysis of a Microarray Experiment. JASA, 96, 1151-1160.

Louis, T.A., Carvalho, B.S., Fallin, M.D., Irizarry, R.A., Li, Q., and Ruczinski, I. (2010). Association Tests that Accommodate Genotyping Errors. In Bernardo, J.M., Bayarri, M.J., Berger, J.O., Dawid, A.P., Heckerman, D., Smith, A.F.M., and West, M. (eds.), Bayesian Statistics 9, 393-420. Oxford University Press, Oxford, UK. With Discussion.

Tusher, V.G., Tibshirani, R., and Chu, G. (2001). Significance Analysis of Microarrays Applied to the Ionizing Radiation Response. PNAS, 98, 5116-5121.

Wakefield, J. (2007). A Bayesian Measure of Probability of False Discovery in Genetic Epidemiology Studies. AJHG, 81, 208-227.

## See Also

`ebam`, `sam`, `EBAM-class`, `SAM-class`

siggenes documentation built on May 31, 2017, 2:35 p.m.