RAINBOWR: Reliable Association INference By Optimizing Weights with R
In RAINBOWR: Genome-Wide Association Study with SNP-Set Methods

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.dpi=96
)

NOTE!!!!

The paper for `RAINBOWR` has been published in PLOS Computational Biology (https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007663). If you use this `RAINBOWR` in your paper, please cite `RAINBOWR` as follows:

Hamazaki, K. and Iwata, H. (2020) RAINBOW: Haplotype-based genome-wide association study using a novel SNP-set method. PLOS Computational Biology, 16(2): e1007663.

The stable version for `RAINBOWR` package is now available at the CRAN (Comprehensive R Archive Network).

Please check the change in RAINBOWR with the version update from NEWS.md.

In this repository, the R package RAINBOWR is available. Here, we describe how to install and how to use RAINBOWR.

What is `RAINBOWR`

RAINBOWR(Reliable Association INference By Optimizing Weights with R) is a package to perform several types of GWAS as follows.

Single-SNP GWAS by RGWAS.normal function
SNP-set (, haplotype-block, or gene-set) GWAS by RGWAS.multisnp function (which tests multiple SNPs at the same time)
Check epistatic (SNP-set x SNP-set interaction) effects by RGWAS.epistasis (very slow and less reliable)
Single-SNP GWAS including tests for interaction with genetic background by RGWAS.normal function
SNP-set (, haplotype-block, or gene-set) GWAS including tests for interaction with genetic background (or epistatic effects with polygenes) by RGWAS.multisnp function (which tests multiple SNPs at the same time)

RAINBOWR also offers some functions to solve the linear mixed effects model.

Solve multi-kernel linear mixed effects model with EM3.general function (using gaston, MM4LMM, or RAINBOWR packages; fast for gaston and MM4LMM)
Solve one-kernel linear mixed effects model with EMM.cpp function
Solve multi-kernel linear mixed effects model with EM3.cpp function (for the general kernel, not so fast)
Solve multi-kernel linear mixed effects model with EM3.linker.cpp function (for the linear kernel, fast)

By utilizing these functions, you can estimate the genomic heritability and perform genomic prediction (GP).

Finally, RAINBOWR offers other useful functions.

qq and manhattan function to draw Q-Q plot and Manhattan plot
modify.data function to match phenotype and marker genotype data
CalcThreshold function to calculate thresholds for GWAS results
See function to see a brief view of data (like head function, but more useful)
genetrait function to generate pseudo phenotypic values from marker genotype
SS_GWAS function to summarize GWAS results (only for simulation study)
estPhylo and estNetwork functions to estimate phylogenetic tree or haplotype network and haplotype effects with non-linear kernels for haplotype blocks of interest.
convertBlockList function to convert haplotype block list estimated by PLINK to the format which can be inputted as a gene.set argument in RGWAS.multisnp, RGWAS.multisnp.interaction, and RGWAS.epistasis functions.

Installation

The stable version of RAINBOWR is now available at the CRAN (Comprehensive R Archive Network). The latest version of RAINBOWR is also available at the KosukeHamazaki/RAINBOWR repository in the GitHub, so please run the following code in the R console.

#### Stable version of RAINBOWR ####
install.packages("RAINBOWR")  


#### Latest version of RAINBOWR ####
### If you have not installed yet, ...
install.packages("devtools")  

### Install RAINBOWR from GitHub
devtools::install_github("KosukeHamazaki/RAINBOWR")

If you get some errors via installation, please check if the following packages are correctly installed. (We removed a dependency on rgl package!)

Rcpp # install Rtools for Windows user
plotly # Suggests
Matrix
cluster
MASS
pbmcapply
optimx
methods
ape
stringr
pegas
ggplot2 # Suggests
ggtree # Suggests, install from Bioconducter with BiocManager::install("ggtree")
scatterpie # Suggests
phylobase # Suggests
haplotypes # Suggests
rrBLUP
expm
here # Suggests
htmlwidgets
Rfast
adegenet # Suggests
gaston
MM4LMM
furrr # Suggests
future # Suggests
progressr # Suggests
foreach # Suggests, but stongly recommend the installation for Windows user to parallel
doParallel # Suggests
data.table # Suggests

In RAINBOWR, since part of the code is written in Rcpp (C++ in R), please check if you can use C++ in R. For Windows users, you should install Rtools.

If you have some questions about installation, please contact us by e-mail (hamazaki@ut-biomet.org).

Usage

First, import RAINBOWR package and load example datasets. These example datasets consist of marker genotype (scored with {-1, 0, 1}, 1,536 SNP chip (Zhao et al., 2010; PLoS One 5(5): e10780)), map with physical position, and phenotypic data (Zhao et al., 2011; Nature Communications 2:467). Both datasets can be downloaded from Rice Diversity homepage (http://www.ricediversity.org/data/). Also, the dataset includes a list of haplotype blocks from the version 0.1.30. This list was estimated by the PLINK 1.9 (Taliun et al., 2014; BMC Bioinformatics, 15).

``` {r, include=TRUE}

Import RAINBOWR

require(RAINBOWR)

Load example datasets

data("Rice_Zhao_etal") Rice_geno_score <- Rice_Zhao_etal$genoScore Rice_geno_map <- Rice_Zhao_etal$genoMap Rice_pheno <- Rice_Zhao_etal$pheno Rice_haplo_block <- Rice_Zhao_etal$haploBlock

View each dataset

See(Rice_geno_score) See(Rice_geno_map) See(Rice_pheno) See(Rice_haplo_block)

You can check the original data format by `See` function.
Then, select one trait (here, `Flowering.time.at.Arkansas`) for example.

``` {r, include=TRUE}
### Select one trait for example
trait.name <- "Flowering.time.at.Arkansas"
y <- Rice_pheno[, trait.name, drop = FALSE]

For GWAS, first you can remove SNPs whose MAF <= 0.05 by MAF.cut function.

``` {r, include=TRUE}

Remove SNPs whose MAF <= 0.05

x.0 <- t(Rice_geno_score) MAF.cut.res <- MAF.cut(x.0 = x.0, map.0 = Rice_geno_map) x <- MAF.cut.res$x map <- MAF.cut.res$map

Next, we estimate additive genomic relationship matrix (GRM) by using `calcGRM` function.

``` {r, include=TRUE}
### Estimate genomic relationship matrix (GRM) 
K.A <- calcGRM(genoMat = x)

Next, we modify these data into the GWAS format of RAINBOWR by modify.data function.

``` {r, include=TRUE}

Modify data

modify.data.res <- modify.data(pheno.mat = y, geno.mat = x, map = map, return.ZETA = TRUE, return.GWAS.format = TRUE) pheno.GWAS <- modify.data.res$pheno.GWAS geno.GWAS <- modify.data.res$geno.GWAS ZETA <- modify.data.res$ZETA

View each data for RAINBOWR

See(pheno.GWAS) See(geno.GWAS) str(ZETA)

`ZETA` is a list of genomic relationship matrix (GRM) and its design matrix.

Finally, we can perform `GWAS` using these data.
First, we perform single-SNP GWAS by `RGWAS.normal` function as follows.

``` {r, include=TRUE}
### Perform single-SNP GWAS
normal.res <- RGWAS.normal(pheno = pheno.GWAS, geno = geno.GWAS,
                           plot.qq = FALSE, plot.Manhattan = FALSE,
                           ZETA = ZETA, n.PC = 4, P3D = TRUE, 
                           skip.check = TRUE, count = FALSE)
See(normal.res$D)  ### Column 4 contains -log10(p) values for markers

``` {r, echo=FALSE} qq(normal.res$D[, 4]) manhattan(normal.res$D)

Automatically draw Q-Q plot and Manhattan if you set plot.qq = TRUE and plot.Manhattan = TRUE.

Next, we perform SNP-set GWAS by `RGWAS.multisnp` function.

``` {r, include=TRUE, message=FALSE}
### Perform SNP-set GWAS (by regarding 11 SNPs as one SNP-set, first 300 SNPs)
SNP_set.res <- RGWAS.multisnp(pheno = pheno.GWAS, geno = geno.GWAS[1:300, ], ZETA = ZETA,
                              plot.qq = FALSE, plot.Manhattan = FALSE, count = FALSE,
                              n.PC = 4, test.method = "LR", kernel.method = "linear",
                              gene.set = NULL, skip.check = TRUE, 
                              test.effect = "additive", window.size.half = 5, window.slide = 11)

See(SNP_set.res$D)  ### Column 4 contains -log10(p) values for markers

``` {r, echo=FALSE} qq(SNP_set.res$D[, 4]) manhattan(SNP_set.res$D)

Automatically draw Q-Q plot and Manhattan if you set plot.qq = TRUE and plot.Manhattan = TRUE.

You can perform SNP-set GWAS with sliding window by setting `window.slide = 1`.
You can perform SNP-set GWAS with sliding window by setting `window.slide = 1`.
And you can also perform gene-set (or haplotype-block based) GWAS by assigning the following data set to `gene.set` argument. (You can check the example also by `See(Rice_haplo_block)` in R.)

ex.)

|  gene (or haplotype block)   |  marker | 
| :-----: | :------:| 
| haploblock_1    | id1005261 | 
| haploblock_1    | id1005263 | 
| haploblock_2    | id1009557 | 
| haploblock_2    | id1009616 | 
| haploblock_3    | id1020154 | 
| ...    | ... | 



``` {r, include=TRUE, message=FALSE}
### Perform haplotype-block based GWAS (by using hapltype blocks estimated by PLINK, first 400 SNPs)
haplo_block.res <- RGWAS.multisnp(pheno = pheno.GWAS, geno = geno.GWAS[1:400, ], ZETA = ZETA,
                              plot.qq = FALSE, plot.Manhattan = FALSE, count = FALSE,
                              n.PC = 4, test.method = "LR", kernel.method = "linear", 
                              gene.set = Rice_haplo_block, skip.check = TRUE, 
                              test.effect = "additive")

See(haplo_block.res$D)  ### Column 4 contains -log10(p) values for markers

``` {r, echo=FALSE} qq(haplo_block.res$D[, 4]) manhattan(haplo_block.res$D)

Automatically draw Q-Q plot and Manhattan if you set plot.qq = TRUE and plot.Manhattan = TRUE.

```

There is no significant block for this dataset because the number of markers and blocks is too small for this dataset. However, when whole-genome sequencing data is available, the impact of using SNP-set/gene-set/haplotype-block methods becomes larger and we strongly recommend you use these methods. Please see Hamazaki and Iwata (2020, PLOS Comp Biol) for more details of the features of these methods.

Help

If you have some help before performing GWAS with RAINBOWR, please see the help for each function by ?function_name.

References

Kennedy, B.W., Quinton, M. and van Arendonk, J.A. (1992) Estimation of effects of single genes on quantitative traits. J Anim Sci. 70(7): 2000-2012.

Storey, J.D. and Tibshirani, R. (2003) Statistical significance for genomewide studies. Proc Natl Acad Sci. 100(16): 9440-9445.

Yu, J. et al. (2006) A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat Genet. 38(2): 203-208.

Kang, H.M. et al. (2008) Efficient Control of Population Structure in Model Organism Association Mapping. Genetics. 178(3): 1709-1723.

Kang, H.M. et al. (2010) Variance component model to account for sample structure in genome-wide association studies. Nat Genet. 42(4): 348-354.

Zhang, Z. et al. (2010) Mixed linear model approach adapted for genome-wide association studies. Nat Genet. 42(4): 355-360.

Endelman, J.B. (2011) Ridge Regression and Other Kernels for Genomic Selection with R Package rrBLUP. Plant Genome J. 4(3): 250.

Endelman, J.B. and Jannink, J.L. (2012) Shrinkage Estimation of the Realized Relationship Matrix. G3 Genes, Genomes, Genet. 2(11): 1405-1413.

Su, G. et al. (2012) Estimating Additive and Non-Additive Genetic Variances and Predicting Genetic Merits Using Genome-Wide Dense Single Nucleotide Polymorphism Markers. PLoS One. 7(9): 1-7.

Zhou, X. and Stephens, M. (2012) Genome-wide efficient mixed-model analysis for association studies. Nat Genet. 44(7): 821-824.

Listgarten, J. et al. (2013) A powerful and efficient set test for genetic markers that handles confounders. Bioinformatics. 29(12): 1526-1533.

Lippert, C. et al. (2014) Greater power and computational efficiency for kernel-based association testing of sets of genetic variants. Bioinformatics. 30(22): 3206-3214.

Jiang, Y. and Reif, J.C. (2015) Modeling epistasis in genomic selection. Genetics. 201(2): 759-768.

Hamazaki, K. and Iwata, H. (2020) RAINBOW: Haplotype-based genome-wide association study using a novel SNP-set method. PLOS Computational Biology, 16(2): e1007663.

Any scripts or data that you put into this service are public.

RAINBOWR documentation built on June 8, 2025, 1:54 p.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

RAINBOWR
Genome-Wide Association Study with SNP-Set Methods

RAINBOWR: Reliable Association INference By Optimizing Weights with R
In RAINBOWR: Genome-Wide Association Study with SNP-Set Methods

NOTE!!!!

The paper for `RAINBOWR` has been published in PLOS Computational Biology (https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007663). If you use this `RAINBOWR` in your paper, please cite `RAINBOWR` as follows:

The stable version for `RAINBOWR` package is now available at the CRAN (Comprehensive R Archive Network).

Please check the change in RAINBOWR with the version update from NEWS.md.

What is `RAINBOWR`

Installation

Usage

Import RAINBOWR

Load example datasets

View each dataset

Remove SNPs whose MAF <= 0.05

Modify data

View each data for RAINBOWR

Automatically draw Q-Q plot and Manhattan if you set plot.qq = TRUE and plot.Manhattan = TRUE.

Automatically draw Q-Q plot and Manhattan if you set plot.qq = TRUE and plot.Manhattan = TRUE.

Automatically draw Q-Q plot and Manhattan if you set plot.qq = TRUE and plot.Manhattan = TRUE.

Help

References

Try the RAINBOWR package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

RAINBOWR Genome-Wide Association Study with SNP-Set Methods

RAINBOWR: Reliable Association INference By Optimizing Weights with R In RAINBOWR: Genome-Wide Association Study with SNP-Set Methods

NOTE!!!!

The paper for RAINBOWR has been published in PLOS Computational Biology (https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007663). If you use this RAINBOWR in your paper, please cite RAINBOWR as follows:

The stable version for RAINBOWR package is now available at the CRAN (Comprehensive R Archive Network).

Please check the change in RAINBOWR with the version update from NEWS.md.

What is RAINBOWR

Installation

Usage

Import RAINBOWR

Load example datasets

View each dataset

Remove SNPs whose MAF <= 0.05

Modify data

View each data for RAINBOWR

Automatically draw Q-Q plot and Manhattan if you set plot.qq = TRUE and plot.Manhattan = TRUE.

Automatically draw Q-Q plot and Manhattan if you set plot.qq = TRUE and plot.Manhattan = TRUE.

Automatically draw Q-Q plot and Manhattan if you set plot.qq = TRUE and plot.Manhattan = TRUE.

Help

References

Try the RAINBOWR package in your browser

R Package Documentation

Browse R Packages

We want your feedback!

RAINBOWR
Genome-Wide Association Study with SNP-Set Methods

RAINBOWR: Reliable Association INference By Optimizing Weights with R
In RAINBOWR: Genome-Wide Association Study with SNP-Set Methods

The paper for `RAINBOWR` has been published in PLOS Computational Biology (https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007663). If you use this `RAINBOWR` in your paper, please cite `RAINBOWR` as follows:

The stable version for `RAINBOWR` package is now available at the CRAN (Comprehensive R Archive Network).

What is `RAINBOWR`