colTDTsam: SAM and EBAM for Trio Data

Description

Performs a Significance Analysis of Microarrays (SAM; Tusher et al., 2001) or an Empirical Bayes Analysis of Microarrays (EBAM; Efron et al., 2001) based on the genotypic transmission/disequilibrium test (gTDT) statistic.

Usage

colTDTsam(mat.snp, model = c("additive", "dominant", "recessive", "max"), 
   approx = NULL, B = 1000, size = 10, chunk = 100, rand = NA)
   
colTDTebam(mat.snp, model = c("additive", "dominant", "recessive", "max"), 
   approx = NULL, B = 1000, size = 10, chunk = 100, 
   n.interval = NULL, df.ratio = 3, df.dens = 3, knots.mode = TRUE, 
   type.nclass = c("wand", "FD", "scott"), fast = FALSE, rand = NA)

Arguments

mat.snp

a matrix in genotype format, i.e. a numeric matrix in which each column is a vector of length 3 * t representing a SNP genotyped in t trios. Each of the t blocks of three consecutive rows in mat.snp must consist of the genotypes of father, mother, and offspring (in this order), where the genotypes must be coded by 0, 1, and 2. Missing values are allowed and need to be coded by NA. Such a matrix can be generated from a data frame in ped format by, e.g., employing ped2geno.
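The layout described above can be sketched in base R as follows. The number of trios, the SNP names, and the simulated genotypes are illustrative assumptions and are not taken from the package; mat.toy is a hypothetical name.

```r
# Toy matrix in genotype format: 3 * t rows, one column per SNP.
# All values are simulated for illustration only.
set.seed(1)
n.trios <- 4
n.snps <- 3
n <- n.trios * n.snps

# Parental genotypes coded as 0, 1, 2 (number of minor alleles).
father <- matrix(sample(0:2, n, replace = TRUE), n.trios, n.snps)
mother <- matrix(sample(0:2, n, replace = TRUE), n.trios, n.snps)

# Each parent transmits a minor allele with probability genotype / 2,
# so the offspring genotypes are consistent with Mendelian inheritance.
offspring <- matrix(rbinom(n, 1, father / 2) + rbinom(n, 1, mother / 2),
                    n.trios, n.snps)

# Interleave the rows: father, mother, offspring for each trio.
mat.toy <- matrix(NA_integer_, 3 * n.trios, n.snps,
                  dimnames = list(NULL, paste0("SNP", seq_len(n.snps))))
mat.toy[seq(1, by = 3, length.out = n.trios), ] <- father
mat.toy[seq(2, by = 3, length.out = n.trios), ] <- mother
mat.toy[seq(3, by = 3, length.out = n.trios), ] <- offspring

# Missing genotypes are coded by NA.
mat.toy[2, 1] <- NA
```

A matrix of this shape has 12 rows (three per trio) and could be passed as mat.snp, although four trios are far too few for a meaningful analysis.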

model

genetic mode of inheritance that should be considered. Either "additive" (default), "dominant", "recessive", or "max". If model = "max", the maximum over the gTDT statistics for testing an additive, a dominant, and a recessive model is used as gTDT statistic. Abbreviations are allowed. Thus, e.g., model = "dom" will fit a dominant model, and model = "r" a recessive model.
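The abbreviation behaviour described above matches what base R's match.arg provides. The following is a minimal sketch under the assumption that partial matching works this way; resolve.model is a hypothetical helper and not part of the package.

```r
# Hypothetical helper illustrating partial matching of the model argument.
models <- c("additive", "dominant", "recessive", "max")

resolve.model <- function(model = models) {
  # match.arg returns the first choice if the argument is left at its default
  # and otherwise expands a unique abbreviation to the full name.
  match.arg(model, models)
}

resolve.model()        # "additive"
resolve.model("dom")   # "dominant"
resolve.model("r")     # "recessive"
```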

approx

logical specifying whether the null distribution should be approximated by a chi-square distribution with one degree of freedom. If approx = FALSE, the null distribution is estimated based on a permutation method. If not specified, i.e. NULL, approx is set to TRUE when an additive, dominant, or recessive mode of inheritance is considered, and to FALSE when model = "max". If model = "max", it is not allowed to set approx = TRUE.
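The default rule for approx can be written down as a small sketch. resolve.approx is a hypothetical helper that only mirrors the description above, and the statistic 3.84 is a made-up value; neither is taken from the package.

```r
# Hypothetical helper mirroring the documented default for approx.
resolve.approx <- function(model, approx = NULL) {
  if (is.null(approx))
    return(model != "max")
  if (approx && model == "max")
    stop("approx = TRUE is not allowed if model = \"max\".")
  approx
}

resolve.approx("additive")   # TRUE
resolve.approx("max")        # FALSE

# With approx = TRUE, a statistic is compared with a chi-square
# distribution with one degree of freedom.
stat <- 3.84                                   # made-up gTDT statistic
p.value <- pchisq(stat, df = 1, lower.tail = FALSE)
p.value                                        # close to 0.05
```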

B

number of permutations used in the estimation of the null distribution, and thus, the computation of the null statistics. Ignored if approx = TRUE.

size

number of SNPs considered simultaneously when computing the gTDT statistics.

chunk

number of permutations considered simultaneously in the permutation procedure.
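One way to picture the interplay of B and chunk is to split the B permutations into batches of at most chunk permutations each. This is only a sketch of the batching idea; split.perms is a hypothetical helper, and the actual implementation in the package may differ.

```r
# Hypothetical helper: sizes of the batches when B permutations are
# processed chunk permutations at a time.
split.perms <- function(B, chunk) {
  n.full <- B %/% chunk
  rest <- B %% chunk
  c(rep(chunk, n.full), if (rest > 0) rest)
}

split.perms(1000, 100)   # ten batches of 100 permutations
split.perms(250, 100)    # batches of 100, 100, and 50 permutations
```

Smaller batches need less memory at a time, whereas larger batches reduce the number of iterations.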

n.interval

the number of intervals used in the logistic regression with repeated observations for estimating the ratio of the null density to the density of the observed gTDT values in an EBAM analysis (if approx = FALSE), or in the Poisson regression used to estimate the density of the observed gTDT values (if approx = TRUE). For details, see Efron et al., 2001, or Schwender and Ickstadt, 2008, respectively. If NULL, n.interval is determined by the maximum of 139 (see Efron et al., 2001) and the number of intervals estimated by the method specified by type.nclass.

df.ratio

integer specifying the degrees of freedom of the natural cubic spline used in the logistic regression with repeated observations for estimating the ratio of the null density to the density of the observed gTDT values in an EBAM analysis. Only used when approx is set to FALSE.

df.dens

integer specifying the degrees of freedom of the natural cubic spline used in the Poisson regression to estimate the density of the observed gTDT values in an EBAM analysis. Only used when approx is set to TRUE.

knots.mode

logical specifying whether the df.dens - 1 knots of the natural cubic spline are centered around the mode and not the median of the density when fitting the Poisson regression model to estimate the density of the observed gTDT values in an EBAM analysis. Only used when approx is set to TRUE. For details on this density estimation, see denspr.

type.nclass

character string specifying the procedure used to estimate the number of intervals of the histogram used in the logistic regression with repeated observations or the Poisson regression, respectively (see n.interval). Can be either "wand" (default), "FD", or "scott". Ignored if n.interval is specified. For details, see denspr.
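The rule stated for n.interval = NULL can be illustrated with the histogram rules that ship with R. In this sketch, z is a simulated stand-in for the observed gTDT values, and the "wand" option is left out because it needs functionality beyond the base and grDevices packages; how the package computes these numbers internally is not shown here.

```r
# Simulated stand-in for observed gTDT values (illustration only).
set.seed(42)
z <- rchisq(5000, df = 1)

# Number of histogram classes suggested by two of the three rules.
n.fd <- grDevices::nclass.FD(z)        # Freedman-Diaconis rule
n.scott <- grDevices::nclass.scott(z)  # Scott's rule

# Documented default: the maximum of 139 and the estimated number.
n.interval.fd <- max(139, n.fd)
n.interval.scott <- max(139, n.scott)
```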

fast

logical specifying whether a crude estimate for the number of permuted test scores larger than the respective observed gTDT value should be used. If FALSE, the exact number of permuted test scores larger than the respective observed gTDT value is computed.

rand

numeric value. If specified, i.e. not NA, the random number generator will be set into a reproducible state.

Value

colTDTsam returns an object of class SAM, and colTDTebam an object of class EBAM. All features implemented in the R package siggenes for a SAM or an EBAM analysis can therefore also be used in the analysis of case-parent trio data performed by colTDTsam or colTDTebam. For details, see sam and ebam.

Author(s)

Holger Schwender, holger.schwender@udo.edu

References

Efron, B., Tibshirani, R., Storey, J.D., and Tusher, V. (2001). Empirical Bayes Analysis of a Microarray Experiment. Journal of the American Statistical Association, 96, 1151-1160.

Schwender, H. and Ickstadt, K. (2008). Empirical Bayes Analysis of Single Nucleotide Polymorphisms. BMC Bioinformatics, 9, 144.

Schwender, H., Taub, M.A., Beaty, T.H., Marazita, M.L., and Ruczinski, I. (2011). Rapid Testing of SNPs and Gene-Environment Interactions in Case-Parent Trio Data Based on Exact Analytic Parameter Estimation. Biometrics, 68, 766-773.

Tusher, V.G., Tibshirani, R., and Chu, G. (2001). Significance Analysis of Microarrays Applied to the Ionizing Radiation Response. Proceedings of the National Academy of Sciences of the United States of America, 98, 5116-5121.

See Also

colTDT, colTDTmaxStat, sam, ebam, SAM-class, EBAM-class

Examples

# Load the package and the simulated data, which include
# the genotype matrix mat.test used below.
library(trio)
data(trio.data)

# Perform a Significance Analysis of Microarrays (SAM).
sam.out <- colTDTsam(mat.test)

# By default an additive mode of inheritance is considered.
# If another mode, e.g., the dominant mode, should be 
# considered, then this can be done by
samDom.out <- colTDTsam(mat.test, model="dominant")

# Analogously, an Empirical Bayes Analysis of Microarrays based
# on the genotypic TDT can be performed by
ebam.out <- colTDTebam(mat.test)

trio documentation built on Nov. 8, 2020, 7:41 p.m.