pvdiv_standard_gwas: Juenger lab standard GWAS function.


View source: R/pvdiv_standard_gwas.R

Description

This function is a wrapper around the standard GWAS procedures in the Juenger lab. First, a singular value decomposition (SVD) of the SNPs provides principal components (PCs) for population structure correction; the 'best' number of PCs is the one that brings lambda_GC, the Genomic Control coefficient, closest to 1. (See the lambdagc parameter to set this yourself.) Next, genome-wide association is conducted. The GWAS output can be saved, along with Manhattan plots, QQ-plots, and annotation information for the top SNPs for each phenotype.
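The PC selection rule described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: lambda_gc() is a hypothetical helper that uses the common definition of lambda_GC (median observed 1-df chi-square statistic divided by its expected median), and the p-values are simulated stand-ins for GWAS results obtained with 0 to 3 PCs.

```r
# Hypothetical helper, not part of switchgrassGWAS: genomic control
# coefficient from a vector of p-values (1-df chi-square tests assumed).
lambda_gc <- function(pvals) {
  chisq <- qchisq(pvals, df = 1, lower.tail = FALSE)
  median(chisq, na.rm = TRUE) / qchisq(0.5, df = 1)
}

set.seed(42)
null_stats <- rchisq(1e5, df = 1)  # test statistics with no inflation

# Simulated inflation factors standing in for GWAS runs with 0-3 PCs.
# These numbers are invented for illustration only.
inflation <- c("0" = 1.40, "1" = 1.15, "2" = 1.01, "3" = 0.90)
lambdas <- sapply(inflation, function(f) {
  lambda_gc(pchisq(f * null_stats, df = 1, lower.tail = FALSE))
})

# Pick the number of PCs whose lambda_GC is closest to 1.
best_npcs <- as.integer(names(lambdas)[which.min(abs(lambdas - 1))])
print(round(lambdas, 2))
print(best_npcs)
```

Values of lambda_GC well above 1 suggest uncorrected population structure, while values well below 1 suggest overcorrection, which is why the rule targets 1 rather than the minimum.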

Usage

pvdiv_standard_gwas(
  snp,
  df = switchgrassGWAS::pvdiv_phenotypes,
  type = c("linear", "logistic"),
  ncores = nb_cores(),
  outputdir = ".",
  covar = NULL,
  lambdagc = TRUE,
  savegwas = FALSE,
  savetype = c("rds", "fbm", "both"),
  suffix = "",
  saveplots = TRUE,
  saveannos = FALSE,
  txdb = NULL,
  minphe = 200,
  ...
)

Arguments

snp

A "bigSNP" object; load with bigsnpr::snp_attach(). Here, genomic information for Panicum virgatum. SNP data is available at doi:10.18738/T8/ET9UAU

df

Data frame of phenotypes where the first column is PLANT_ID.

type

Character string. Type of univariate regression to run for GWAS. Options are "linear" or "logistic".

ncores

Number of cores to use. Default is nb_cores().

outputdir

String or file.path() to the output directory. Default is the working directory.

covar

Optional covariance matrix to include in the regression. You can generate these using pvdiv_autoSVD().

lambdagc

Logical or data frame. Should lambda_GC be used to find the best population structure correction? Default is TRUE. Alternatively, you can provide a data frame containing a "NumPCs" column plus one column per phenotype, named for the phenotype and holding its lambda_GC values. This data frame is saved to the output directory by pvdiv_standard_gwas and is saved or generated by pvdiv_lambda_GC.

savegwas

Logical. Should the GWAS output be saved to the working directory? These files are typically quite large. Default is FALSE.

savetype

Character string. Type of GWAS save file. Options are 'rds', which saves individual rds files for each GWAS; 'fbm', which saves one filebacked big matrix (using the bigsnpr package); or 'both', which saves both file types. These files are typically quite large.

suffix

Optional character vector to give saved files a unique search string/name.

saveplots

Logical. Should Manhattan and QQ-plots be generated and saved to the working directory? Default is TRUE.

saveannos

Logical. Should annotation tables for top SNPs be generated and saved to the working directory? Default is FALSE. Can take additional arguments; requires a txdb.sqlite object used in AnnotationDbi.

txdb

A txdb object such as 'Pvirgatum_516_v5.1.gene.txdb.sqlite'. Load this into your environment with AnnotationDbi::loadDb.

minphe

Integer. Minimum number of phenotyped individuals required to conduct a GWAS on a phenotype. Default is 200. Use lower values with caution.

...

Other arguments to pvdiv_lambda_GC or pvdiv_table_topsnps.
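As a rough illustration of the df, minphe, and lambdagc arguments, the base R sketch below builds toy inputs in the shapes described above. All values and phenotype names are invented, the layout of the lambdagc data frame (one row per candidate number of PCs) is an assumption based on the argument description, and the sketch does not show how the function itself treats phenotypes below the minphe threshold.

```r
# Toy phenotype table: first column PLANT_ID, then one column per phenotype.
# Plant IDs, phenotype names, and values are all hypothetical.
df <- data.frame(
  PLANT_ID  = paste0("P", 1:6),
  height    = c(1.2, NA, 1.5, 1.1, NA, 1.4),
  flowering = c(150, 160, NA, 155, 158, 149)
)

# Count phenotyped (non-missing) individuals per phenotype and compare
# to a threshold, mirroring the idea behind minphe.
n_phenotyped <- colSums(!is.na(df[, -1, drop = FALSE]))
minphe <- 5  # tiny threshold for this toy table; the function default is 200
keep <- names(n_phenotyped)[n_phenotyped >= minphe]
print(keep)

# Assumed shape of a user-supplied lambdagc data frame: a "NumPCs" column
# plus one lambda_GC column per phenotype (illustrative values).
lambdagc_df <- data.frame(
  NumPCs    = 0:3,
  height    = c(1.30, 1.08, 0.99, 0.93),
  flowering = c(1.21, 1.02, 0.95, 0.90)
)

# For each phenotype, the number of PCs whose lambda_GC is closest to 1.
best_pcs <- sapply(lambdagc_df[-1], function(l) {
  lambdagc_df$NumPCs[which.min(abs(l - 1))]
})
print(best_pcs)
```

Checking phenotype counts before a run is a cheap way to see which phenotypes meet the threshold.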

Value

A big_SVD object.

Examples

## Not run: 
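# Here we specify that we do want to generate and save the GWAS dataframes,
# the Manhattan and QQ-plots, and the annotation tables.
# This assumes `svd` (e.g. from pvdiv_autoSVD()) and `txdb` (loaded with
# AnnotationDbi::loadDb()) already exist in your environment.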
pvdiv_standard_gwas(snp, df = pvdiv_phenotypes, type = "linear", covar = svd,
    ncores = nb_cores(), lambdagc = TRUE, savegwas = TRUE, saveplots = TRUE,
    saveannos = TRUE, txdb = txdb)

## End(Not run)
# In this example, we run GWAS on all the phenotypes in pvdiv_phenotypes
# using an example SNP set of ~1800 SNPs.
snpfile <- system.file("extdata", "example_bigsnp.rds", package = "switchgrassGWAS")
library(bigsnpr)
snp <- snp_attach(snpfile)
pvdiv_standard_gwas(snp, df = pvdiv_phenotypes, type = "linear", savegwas = FALSE,
    saveplots = FALSE, ncores = 1)

Alice-MacQueen/switchgrassGWAS documentation built on Jan. 23, 2022, 7:55 p.m.