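This tutorial will guide you through performing GWAS with SLOPE. The analysis consists of three simple steps.
You need to provide paths to three files: a .fam file with the phenotype, a .map file with SNP mapping information, and a .raw file with the SNP data: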
library(geneSLOPE)
famFile <- system.file("extdata", "plinkPhenotypeExample.fam", package = "geneSLOPE")
mapFile <- system.file("extdata", "plinkMapExample.map", package = "geneSLOPE")
snpsFile <- system.file("extdata", "plinkDataExample.raw", package = "geneSLOPE")
phenotype <- read_phenotype(filename = famFile)
Once you have the phenotype, you can move on to reading the SNP data. Depending on the data size, reading SNPs may take a long time. Because the data are very large, SNPs are filtered by their marginal test p-value: all SNPs whose p-values are larger than the threshold $pValMax$ are discarded. For details on how to choose $pValMax$, see "How changing parameters affects my analysis?".
screening.result <- screen_snps(snpsFile, mapFile, phenotype, pValMax = 0.05, chunkSize = 1e2, verbose=FALSE)
Setting verbose = FALSE suppresses the progress bar. The default value is TRUE.
You can inspect the result of reading and screening the dataset:
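To make the screening idea concrete, here is a minimal base R sketch of marginal p-value filtering on simulated data. It is an illustration only and is not taken from geneSLOPE's internals: the 0/1/2 genotype coding, the simulated phenotype, and the helper name marginal_pvalue are all assumptions made for this example.

```r
# Illustrative sketch only: simulated data, not geneSLOPE's implementation.
set.seed(1)
n <- 200   # number of individuals
p <- 50    # number of SNPs

# Genotypes coded as 0/1/2 minor-allele counts (assumed coding)
X <- matrix(rbinom(n * p, size = 2, prob = 0.3), nrow = n, ncol = p)

# Phenotype driven by the first two SNPs plus noise
y <- 1.5 * X[, 1] - 1.2 * X[, 2] + rnorm(n)

# Marginal test: simple linear regression of y on one SNP at a time
# (marginal_pvalue is a hypothetical helper defined here)
marginal_pvalue <- function(snp, y) {
  fit <- summary(lm(y ~ snp))
  fit$coefficients["snp", "Pr(>|t|)"]
}
pvals <- apply(X, 2, marginal_pvalue, y = y)

# Keep only SNPs whose p-value does not exceed the threshold
pValMax <- 0.05
kept <- which(pvals <= pValMax)
```

A smaller pValMax keeps fewer SNPs and speeds up later steps, at the risk of discarding weak but real signals.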
summary(screening.result)
Once the data have been read successfully, you can move on to the second step of the analysis.
The next step is clumping, in which highly correlated SNPs are clustered together. For details, see "How this procedure works exactly?". The parameter $rho$ controls the number and size of clumps. For details, see "How changing parameters affects my analysis?".
clumping.result <- clump_snps(screening.result, rho = 0.3, verbose = FALSE)
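The idea of grouping correlated SNPs can be sketched with a simple greedy scheme: take the most significant unassigned SNP as a representative, put every remaining SNP whose absolute Pearson correlation with it is at least rho into the same clump, and repeat. The code below is a sketch under that assumption, using simulated continuous stand-ins for genotype columns; the function greedy_clump is hypothetical and is not claimed to match what clump_snps does internally.

```r
# Illustrative sketch only: a greedy correlation-based clumping scheme
# on simulated data. greedy_clump is a hypothetical helper.
set.seed(2)
n <- 200

# Two pairs of strongly correlated columns plus one independent column
z <- matrix(rnorm(n * 3), nrow = n)
X <- cbind(z[, 1] + 0.3 * rnorm(n), z[, 1] + 0.3 * rnorm(n),
           z[, 2] + 0.3 * rnorm(n), z[, 2] + 0.3 * rnorm(n),
           z[, 3] + 0.3 * rnorm(n))
y <- 2 * z[, 1] + rnorm(n)

# Marginal p-values from a correlation test of each column with y
pvals <- apply(X, 2, function(snp) cor.test(snp, y)$p.value)

greedy_clump <- function(X, pvals, rho) {
  clump <- integer(ncol(X))        # 0 means "not yet assigned"
  remaining <- order(pvals)        # most significant first
  k <- 0
  while (length(remaining) > 0) {
    k <- k + 1
    rep_idx <- remaining[1]        # representative of the new clump
    r <- abs(cor(X[, rep_idx], X[, remaining, drop = FALSE]))
    members <- remaining[as.vector(r) >= rho]
    clump[members] <- k
    remaining <- setdiff(remaining, members)
  }
  clump
}

clumps <- greedy_clump(X, pvals, rho = 0.3)
```

In this scheme a lower rho merges more SNPs into each clump, giving fewer and larger clumps, while a higher rho gives more and smaller ones.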
What is the result of the clumping procedure?
summary(clumping.result)
We can also plot the results:
plot(clumping.result)
If we are interested in a specific chromosome, we can zoom in on it:
plot(clumping.result, chromosomeNumber = 1)
It is possible to interactively identify the number of the clump that contains a SNP of interest. First plot the whole genome, then run the function identify_clump and click on the SNP of interest.
plot(clumping.result)
identify_clump(clumping.result)
Knowing the clump number, one can zoom into it:
plot(clumping.result, clumpNumber = 1)
The last step of the analysis is running SLOPE:
slope.result <- select_snps(clumping.result, fdr=0.1)
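To give a feel for how the fdr level enters SLOPE, the sketch below computes the basic Benjamini-Hochberg-style sequence of penalties described in the SLOPE literature, lambda_i = qnorm(1 - i * fdr / (2 * p)). This is an illustration only: the helper bh_lambda is hypothetical, and the sequence geneSLOPE actually uses may differ (for example, a corrected variant).

```r
# Illustrative sketch only: BH-style lambda sequence from the SLOPE
# literature. bh_lambda is a hypothetical helper, not a geneSLOPE function.
bh_lambda <- function(p, fdr) {
  stopifnot(p >= 1, fdr > 0, fdr < 1)
  i <- seq_len(p)
  qnorm(1 - i * fdr / (2 * p))   # decreasing sequence of penalties
}

# Penalties for 10 clump representatives at target FDR 0.1
lambda <- bh_lambda(p = 10, fdr = 0.1)
```

The largest penalty applies to the largest coefficient in absolute value. Lowering fdr raises every penalty, so fewer SNPs are selected.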
As before, one can plot and summarize the results:
summary(slope.result)
plot(slope.result)
As with the result of clumping, it is possible to interactively identify the number of the clump that contains a specific SNP selected by SLOPE. First plot the whole genome, then run the function identify_clump and click on the SNP of interest.
plot(slope.result)
identify_clump(slope.result)
Once the clump is identified, one can zoom into it:
plot(slope.result, clumpNumber = 1)
It is easy to get information about the selected SNPs. To get the indices of the columns in the original SNP matrix that they correspond to, use:
slope.result$selectedSnpsNumbers
If a .map file was given, one can get more information about the SNPs:
slope.result$X_info[slope.result$selectedSnpsNumbers,]
For information about the SNPs that are part of a specific clump, use:
summary(slope.result, clumpNumber = 1)
Three numerical parameters influence the result: $pValMax$, $rho$ and $fdr$.
Input: $rho \in (0, 1)$;