RGWAS.epistasis: Check epistatic effects by kernel-based GWAS (genome-wide...
In KosukeHamazaki/RAINBOW: Genome-Wide Association Study with SNP-Set Methods


Check epistatic effects by kernel-based GWAS (genome-wide association studies)

RGWAS.epistasis(
  pheno,
  geno,
  ZETA = NULL,
  covariate = NULL,
  covariate.factor = NULL,
  structure.matrix = NULL,
  n.PC = 0,
  min.MAF = 0.02,
  n.core = 1,
  test.method = "LR",
  dominance.eff = TRUE,
  haplotype = TRUE,
  num.hap = NULL,
  window.size.half = 5,
  window.slide = 1,
  chi0.mixture = 0.5,
  optimizer = "nlminb",
  gene.set = NULL,
  plot.epi.3d = TRUE,
  plot.epi.2d = TRUE,
  main.epi.3d = NULL,
  main.epi.2d = NULL,
  saveName = NULL,
  verbose = TRUE,
  verbose2 = FALSE,
  count = TRUE,
  time = TRUE
)

`pheno`	Data frame where the first column is the line name (gid). The remaining columns contain the phenotypes to test.
`geno`	Data frame with the marker names in the first column. The second and third columns contain the chromosome and map position. Columns 4 and higher contain the marker scores for each line, coded as -1, 0, 1 = aa, Aa, AA.
`ZETA`	A list of random effects, where each element holds a covariance (relationship) matrix (K: m \times m) and its design matrix (Z: n \times m). The elements of each inner list must be named "Z" and "K". You can use more than one kernel matrix, for example ZETA = list(A = list(Z = Z.A, K = K.A), D = list(Z = Z.D, K = K.D)). Z.A, Z.D: design matrices (n \times m) for the random effects; in many cases you can use the identity matrix. K.A, K.D: kernels that express relationships between lines; for example, K.A is the additive relationship matrix for the covariance between lines, and K.D is the dominance relationship matrix.
`covariate`	An n \times 1 vector or an n \times p_1 matrix. You can supply continuous values, such as other traits or genotype scores for specific markers. This argument is treated as a fixed effect.
`covariate.factor`	An n \times p_2 data frame. Each column should be a factor vector. RGWAS converts this argument into a model matrix, which is included in the model as fixed effects.
`structure.matrix`	A structure matrix calculated by structure analysis, for use when there is population structure. Do not use this argument together with n.PC > 0.
`n.PC`	Number of principal components to include as fixed effects. Default is 0 (equals K model).
`min.MAF`	Specifies the minimum minor allele frequency (MAF). If a marker has a MAF less than min.MAF, it is assigned a zero score.
`n.core`	Setting n.core > 1 will enable parallel execution on a machine with multiple cores (use only at UNIX command line).
`test.method`	RGWAS supports two methods to test the effect of each SNP-set. "LR": likelihood-ratio test, relatively slow but accurate (default). "score": score test, much faster than LR, but it sometimes overestimates -log10(p).
`dominance.eff`	If this argument is TRUE, the dominance effect is included in the model, and additive x dominance and dominance x dominance are also tested as epistatic effects. When you use inbred lines, set this argument to FALSE.
`haplotype`	If the number of lines in your data is large (roughly > 100), you should set haplotype = TRUE. When haplotype = TRUE, a haplotype-based kernel is used for calculating -log10(p), so the dimension of the Gram matrix is smaller. The result does not change, but the calculation is faster.
`num.hap`	When haplotype = TRUE, you can set the number of haplotypes you expect. Similar arrays are then treated as the same haplotype, and the kernel (K.SNP) is built with dimension num.hap x num.hap. When num.hap = NULL (default), num.hap is set to the maximum number that reflects the differences between lines.
`window.size.half`	This argument determines how many SNPs around the tested SNP are used to calculate K.SNP. More precisely, the number of SNPs is 2 * window.size.half + 1.
`window.slide`	This argument determines how often markers are tested. If window.slide = 1, every marker is tested. If you want to perform SNP-set tests by bins, set window.slide = 2 * window.size.half + 1.
`chi0.mixture`	RAINBOW assumes that the deviance follows a x chisq(df = 0) + (1 - a) x chisq(df = r), where r is the degrees of freedom. The argument chi0.mixture is a (0 <= a < 1); the default is 0.5.
`optimizer`	The function used in the optimization process. We offer "optim", "optimx", and "nlminb" functions.
`gene.set`	If you have gene (or haplotype block) information, you can use it to perform kernel-based GWAS. Assign the gene information to gene.set as a "data.frame" whose dimension is (the number of genes) x 2. The first column should contain the gene name, and the second column should contain the names of the markers, which must correspond to the marker names in the "geno" argument.
`plot.epi.3d`	If TRUE, draw 3d plot
`plot.epi.2d`	If TRUE, draw 2d plot
`main.epi.3d`	The title of 3d plot. If this argument is NULL, trait name is set as the title.
`main.epi.2d`	The title of 2d plot. If this argument is NULL, trait name is set as the title.
`saveName`	When drawing any plot, you can save it in png format. Set saveName to the name under which the plots should be saved. When saveName = NULL, the plot is not saved.
`verbose`	If this argument is TRUE, messages for the current steps will be shown.
`verbose2`	If this argument is TRUE, welcome message will be shown.
`count`	When count is TRUE, the progress of RGWAS is displayed as a percentage.
`time`	When time is TRUE, the time taken to perform RGWAS is reported.
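
The mixture assumed by chi0.mixture can be turned into a p-value by hand. The following is a minimal sketch in base R, not RAINBOW's internal code: the function name mixture.pvalue is hypothetical, and the deviance and degrees of freedom are assumed to be already available from some likelihood-ratio or score test.

```r
# Hypothetical helper (not part of RAINBOW): p-value of a deviance under
# a x chisq(df = 0) + (1 - a) x chisq(df = r), with a = chi0.mixture.
mixture.pvalue <- function(deviance, df, chi0.mixture = 0.5) {
  stopifnot(chi0.mixture >= 0, chi0.mixture < 1, df > 0)
  # chisq(df = 0) is a point mass at zero, so for a positive deviance only
  # the chisq(df = r) component contributes to the upper tail.
  p <- (1 - chi0.mixture) * pchisq(deviance, df = df, lower.tail = FALSE)
  # A deviance of zero (or below, from numerical noise) gives p = 1.
  p[deviance <= 0] <- 1
  p
}

# Example with made-up deviances and one degree of freedom
p.values <- mixture.pvalue(c(0, 3.84, 10), df = 1)
minus.log10.p <- -log10(p.values)
```

With chi0.mixture = 0.5 and a positive deviance, the p-value is half of the plain chi-square tail probability; setting chi0.mixture = 0 recovers the ordinary chi-square test.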

$map

Map information for SNPs that are tested for epistatic effects.

$scores

$scores: The matrix containing -log10(p) calculated by the test for epistatic effects.
$x, $y: The positions of SNPs detected by regular GWAS. These vectors are used when drawing plots. Each corresponds to the replication of the rows and columns of scores.
$z: A vector of $scores. This vector is also used when drawing plots.
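
The relation between a score matrix and flattened x, y and z vectors can be sketched in base R. The toy matrix and positions below are made up, and the assignment of x to rows and y to columns is an assumption for illustration; check the returned object to see which orientation RGWAS.epistasis actually uses.

```r
# Made-up positions and -log10(p) values for three SNP-sets
pos <- c(10, 20, 30)
scores <- matrix(c(0.1, 1.2, 0.4,
                   1.2, 0.2, 3.5,
                   0.4, 3.5, 0.3),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(pos, pos))

# Flatten the matrix column by column
z <- as.vector(scores)

# Replicate the positions so that (x[k], y[k]) locates z[k]
# (assumption: x follows the rows, y follows the columns)
x <- rep(pos, times = ncol(scores))
y <- rep(pos, each = nrow(scores))

# Locate the strongest pair (the first one, if there are ties)
top <- which.max(z)
top.pair <- c(x = x[top], y = y[top], score = z[top])
```

Because the toy matrix is symmetric, the pair with swapped positions carries the same score; which.max simply reports the first one it meets.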

Storey, J.D. and Tibshirani, R. (2003) Statistical significance for genomewide studies. Proc Natl Acad Sci. 100(16): 9440-9445.

Yu, J. et al. (2006) A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat Genet. 38(2): 203-208.

Kang, H.M. et al. (2008) Efficient Control of Population Structure in Model Organism Association Mapping. Genetics. 178(3): 1709-1723.

Endelman, J.B. (2011) Ridge Regression and Other Kernels for Genomic Selection with R Package rrBLUP. Plant Genome J. 4(3): 250.

Endelman, J.B. and Jannink, J.L. (2012) Shrinkage Estimation of the Realized Relationship Matrix. G3 Genes, Genomes, Genet. 2(11): 1405-1413.

Su, G. et al. (2012) Estimating Additive and Non-Additive Genetic Variances and Predicting Genetic Merits Using Genome-Wide Dense Single Nucleotide Polymorphism Markers. PLoS One. 7(9): 1-7.

Zhou, X. and Stephens, M. (2012) Genome-wide efficient mixed-model analysis for association studies. Nat Genet. 44(7): 821-824.

Listgarten, J. et al. (2013) A powerful and efficient set test for genetic markers that handles confounders. Bioinformatics. 29(12): 1526-1533.

Lippert, C. et al. (2014) Greater power and computational efficiency for kernel-based association testing of sets of genetic variants. Bioinformatics. 30(22): 3206-3214.

Jiang, Y. and Reif, J.C. (2015) Modeling epistasis in genomic selection. Genetics. 201(2): 759-768.

  ### Import RAINBOW
  require(RAINBOW)

  ### Load example datasets
  data("Rice_Zhao_etal")
  Rice_geno_score <- Rice_Zhao_etal$genoScore
  Rice_geno_map <- Rice_Zhao_etal$genoMap
  Rice_pheno <- Rice_Zhao_etal$pheno

  ### View each dataset
  See(Rice_geno_score)
  See(Rice_geno_map)
  See(Rice_pheno)

  ### Select one trait for example
  trait.name <- "Flowering.time.at.Arkansas"
  y <- as.matrix(Rice_pheno[, trait.name, drop = FALSE])

  ### Remove SNPs whose MAF <= 0.05
  x.0 <- t(Rice_geno_score)
  MAF.cut.res <- MAF.cut(x.0 = x.0, map.0 = Rice_geno_map)
  x <- MAF.cut.res$x
  map <- MAF.cut.res$map


  ### Estimate genomic relationship matrix (GRM)
  K.A <- calcGRM(genoMat = x) 


  ### Modify data
  modify.data.res <- modify.data(pheno.mat = y, geno.mat = x, map = map,
                                 return.ZETA = TRUE, return.GWAS.format = TRUE)
  pheno.GWAS <- modify.data.res$pheno.GWAS
  geno.GWAS <- modify.data.res$geno.GWAS
  ZETA <- modify.data.res$ZETA


  ### View each data for RAINBOW
  See(pheno.GWAS)
  See(geno.GWAS)
  str(ZETA)


  ### Check epistatic effects (by regarding 11 SNPs as one SNP-set)
  epistasis.res <- RGWAS.epistasis(pheno = pheno.GWAS, geno = geno.GWAS, ZETA = ZETA,
                                   n.PC = 4, test.method = "score", gene.set = NULL,
                                   window.size.half = 5, window.slide = 11)

  See(epistasis.res$scores$scores)
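
The choice window.size.half = 5 with window.slide = 11 in the call above makes each SNP-set a separate bin of 11 markers. The arithmetic can be sketched in base R as follows. The marker count is made up, and the placement of window centres (starting at the first marker with a full window on its left) is an assumption for illustration; RAINBOW's handling of chromosome ends and window placement may differ.

```r
# Made-up number of markers on one chromosome
n.markers <- 100
window.size.half <- 5
window.slide <- 11

# SNPs per SNP-set: the tested SNP plus window.size.half on each side
set.size <- 2 * window.size.half + 1

# Assumed placement of window centres (illustration only)
centres <- seq(from = window.size.half + 1,
               to = n.markers - window.size.half,
               by = window.slide)

# First and last marker index covered by each window
starts <- centres - window.size.half
ends <- centres + window.size.half

# With window.slide equal to set.size, consecutive windows do not overlap
overlapping <- any(starts[-1] <= ends[-length(ends)])
```

Setting window.slide to a value smaller than set.size in this sketch makes consecutive windows share markers, which corresponds to the sliding-window case.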