EBAM and SAM for Fuzzy Genotype Calls

Description

Computes the required statistics for an Empirical Bayes Analysis of Microarrays (EBAM; Efron et al., 2001) or a Significance Analysis of Microarrays (SAM; Tusher et al., 2001), respectively, based either on the score statistic proposed by Louis et al. (2010) for fuzzy genotype calls or on approximate Bayes factors (Wakefield, 2007) determined from this score statistic.

These functions should not be called directly, but via ebam(..., method = fuzzy.ebam) or sam(..., method = fuzzy.stat), respectively.

Usage

fuzzy.ebam(data, cl, type = c("asymptotic", "permutation", "abf"), W = NULL, 
    logbase = exp(1), addOne = TRUE, df.ratio = NULL, n.interval = NULL, 
    df.dens = 5, knots.mode = TRUE, type.nclass = c("FD", "wand", "scott"), 
    fast = FALSE, B = 100, B.more = 0.1, B.max = 30000, n.subset = 10, rand = NA)
    
fuzzy.stat(data, cl, type = c("asymptotic", "permutation", "abf"), W = NULL, 
    logbase = exp(1), addOne = TRUE, B = 100, B.more = 0.1, B.max = 30000, 
    n.subset = 10, rand = NA)

Arguments

data

a matrix containing fuzzy genotype calls. Such a matrix can, e.g., be generated by the function getMatFuzzy based on the confidences for the three possible genotypes computed by preprocessing algorithms such as CRLMM.

cl

a vector of zeros and ones specifying which of the columns of data contain the fuzzy genotype calls for the cases (1) and which contain those for the controls (0). Thus, the length of cl must be equal to the number of columns of data.
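
As a minimal sketch of this layout, the class labels can be built and checked against the columns of the data matrix as shown here. The object name fuzzy_mat and its values are purely illustrative: they are arbitrary numbers standing in for fuzzy genotype calls, not output of getMatFuzzy.

```r
# Hypothetical toy data: 4 SNPs (rows) x 6 samples (columns).
# The entries are arbitrary placeholders, not real fuzzy genotype calls.
set.seed(1)
fuzzy_mat <- matrix(runif(24), nrow = 4, ncol = 6)

# Assumption for this sketch: the first three columns are controls (0),
# the last three columns are cases (1).
cl <- rep(c(0, 1), each = 3)

# cl needs one entry per column of the data matrix
# and may contain only zeros and ones.
stopifnot(length(cl) == ncol(fuzzy_mat), all(cl %in% c(0, 1)))
```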

type

a character string specifying how the analysis should be performed. If "asymptotic", the trend statistic of Louis et al. (2010) is used directly, and EBAM or SAM are performed assuming that under the null hypothesis this test statistic asymptotically follows a standard normal distribution. If "permutation", a permutation procedure is employed to estimate the null distribution of this test statistic. If "abf", Approximate Bayes Factors (ABFs) proposed by Wakefield (2007) are determined from the trend statistic, and EBAM or SAM are performed on these ABFs or on transformations of these ABFs (see in particular logbase and addOne). In the latter case, a permutation procedure is again used in EBAM and SAM to, e.g., compute posterior probabilities of association.
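
To make the standard normal reference used with type = "asymptotic" concrete, the following sketch computes two-sided tail probabilities for a few made-up test statistics. The vector z is hypothetical, and this is only an illustration of the normal null distribution, not the computation performed inside ebam or sam.

```r
# Hypothetical observed test statistics for five SNPs.
z <- c(-2.5, -0.3, 0, 1.2, 3.1)

# Two-sided tail probability under a standard normal null distribution.
p_asym <- 2 * pnorm(-abs(z))
```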

W

the prior variance. Must be either a positive value or a vector of length nrow(data) consisting of positive values. Ignored if type = "asymptotic" or type = "permutation". For details, see abf.

logbase

a numeric value larger than 1. If type = "abf", then the ABFs are not directly used in the analysis, but a log-transformation (with base logbase) of the ABFs. If the ABFs should not be transformed, logbase can be set to NA. Ignored if type = "asymptotic" or type = "permutation".

addOne

should 1 be added to the ABF before it is log-transformed? If TRUE, log(ABF + 1, base=logbase) is used as test score in EBAM or SAM. If FALSE, log(ABF, base = logbase) is considered. Only taken into account when type = "abf" and logbase is not NA.
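
The interplay of logbase and addOne described above can be sketched as follows. The helper abf_score and the ABF values are hypothetical and only mirror the rules stated in this documentation; they are not part of the package.

```r
# Hypothetical helper mirroring the documented transformation rules.
abf_score <- function(abf, logbase = exp(1), addOne = TRUE) {
  # logbase = NA means the ABFs are used untransformed.
  if (is.na(logbase)) {
    return(abf)
  }
  if (addOne) {
    log(abf + 1, base = logbase)
  } else {
    log(abf, base = logbase)
  }
}

abf <- c(0.05, 1, 20)                        # made-up ABF values
score_add_one <- abf_score(abf)              # log(ABF + 1)
score_plain <- abf_score(abf, addOne = FALSE)  # log(ABF)
score_raw <- abf_score(abf, logbase = NA)    # untransformed ABFs
```

With addOne = TRUE the score is positive for every positive ABF, whereas log(ABF) is negative for ABFs below 1.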

df.ratio

integer specifying the degrees of freedom of the natural cubic spline used in the logistic regression with repeated observations for estimating the ratio f0/f. Ignored if type = "asymptotic". If not specified, df.ratio is set to 3 if type = "abf", and to 5 if type = "permutation".

n.interval

the number of intervals used in the logistic regression with repeated observations (if type = "permutation" or type = "abf"), or in the Poisson regression used to estimate the density of the observed z-values (if type = "asymptotic"). If NULL, n.interval is estimated by the method specified by type.nclass, where at least 139 intervals are considered if type = "permutation" or type = "abf".

df.dens

integer specifying the degrees of freedom of the natural cubic spline used in the Poisson regression to estimate the density of the observed z-values in an application of ebam with type = "asymptotic". Otherwise, ignored.

knots.mode

logical specifying whether the df.dens - 1 knots are centered around the mode and not the median of the density when fitting the Poisson regression model to estimate the density of the observed z-values in an application of ebam with type = "asymptotic" (for details on this density estimation, see denspr). Ignored if type = "permutation" or type = "abf".

type.nclass

character string specifying the procedure used to compute the number of cells of the histogram. Ignored if type = "permutation", type = "abf", or n.interval is specified. Can be either "FD" (default), "wand", or "scott". For details, see denspr.
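
For orientation, base R ships rules with matching names in grDevices: nclass.FD (Freedman-Diaconis) and nclass.scott (Scott). The sketch below applies them to simulated values. That denspr relies on exactly these functions is an assumption made here, and the "wand" rule is left out of the sketch.

```r
# Simulated stand-in for observed z-values (hypothetical data).
set.seed(42)
z <- rnorm(500)

# Number of histogram cells suggested by two base R rules.
n_fd <- grDevices::nclass.FD(z)
n_scott <- grDevices::nclass.scott(z)
```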

fast

if FALSE the exact number of permuted test scores that are more extreme than a particular observed test score is computed for each of the variables/SNPs. If TRUE, a crude estimate of this number is used.

B

the number of permutations used in the estimation of the null distribution, and hence, in the computation of the expected z-values. Ignored if type = "asymptotic".

B.more

a numeric value. If the number of all possible permutations is smaller than or equal to (1+B.more)*B, full permutation will be done. Otherwise, B permutations are used.

B.max

a numeric value. If the number of all possible permutations is smaller than or equal to B.max, B randomly selected permutations will be used in the computation of the null distribution. Otherwise, B random draws of the group labels are used.
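
Read together, B, B.more, and B.max describe a three-way decision, which can be sketched as below. The helper choose_perm_scheme is hypothetical, and counting all possible permutations as choose(n, number of cases) is an assumption of this sketch rather than something stated in this documentation.

```r
# Hypothetical helper mirroring the documented decision rules.
choose_perm_scheme <- function(cl, B = 100, B.more = 0.1, B.max = 30000) {
  # Assumption: the number of distinct case/control assignments
  # is the number of ways to pick the cases among all samples.
  n_all <- choose(length(cl), sum(cl == 1))
  if (n_all <= (1 + B.more) * B) {
    "full permutation"
  } else if (n_all <= B.max) {
    "B randomly selected permutations"
  } else {
    "B random draws of the group labels"
  }
}

scheme_small <- choose_perm_scheme(rep(0:1, each = 4))   # choose(8, 4) = 70
scheme_medium <- choose_perm_scheme(rep(0:1, each = 7))  # choose(14, 7) = 3432
scheme_large <- choose_perm_scheme(rep(0:1, each = 10))  # choose(20, 10) = 184756
```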

n.subset

a numeric value indicating into how many subsets the B permutations are divided when computing the permuted z-values. Please note that the meaning of n.subset differs between the SAM and the EBAM functions.

rand

numeric value. If specified, i.e. not NA, the random number generator will be set into a reproducible state.
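
The effect of rand can be illustrated with a small sketch, assuming the reproducible state is obtained by seeding the generator with set.seed. The helper draw_labels is hypothetical and not part of the package.

```r
# Hypothetical helper: permute the group labels, optionally after
# seeding the random number generator (assumed to happen via set.seed).
draw_labels <- function(cl, rand = NA) {
  if (!is.na(rand)) {
    set.seed(rand)
  }
  sample(cl)
}

cl <- rep(0:1, each = 5)
perm_a <- draw_labels(cl, rand = 123)
perm_b <- draw_labels(cl, rand = 123)  # same seed, same permutation
```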

Value

A list containing statistics required by ebam or sam.

Author(s)

Holger Schwender, holger.schw@gmx.de

References

Efron, B., Tibshirani, R., Storey, J.D., and Tusher, V. (2001). Empirical Bayes Analysis of a Microarray Experiment, JASA, 96, 1151-1160.

Louis, T.A., Carvalho, B.S., Fallin, M.D., Irizarry, R.A., Li, Q., and Ruczinski, I. (2010). Association Tests that Accommodate Genotyping Errors. In Bernardo, J.M., Bayarri, M.J., Berger, J.O., Dawid, A.P., Heckerman, D., Smith, A.F.M., and West, M. (eds.), Bayesian Statistics 9, 393-420. Oxford University Press, Oxford, UK. With Discussion.

Tusher, V.G., Tibshirani, R., and Chu, G. (2001). Significance Analysis of Microarrays Applied to the Ionizing Radiation Response. PNAS, 98, 5116-5121.

Wakefield, J. (2007). A Bayesian Measure of Probability of False Discovery in Genetic Epidemiology Studies. AJHG, 81, 208-227.

See Also

ebam, sam, EBAM-class, SAM-class
