knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
In this vignette, we are going to detect associations between genetic information and one or more quantitative traits.
We will use simulated data with a simple relationship between genotype and phenotype. To see how the same analysis is done with messier data, see the 'test_assoc_qt' vignette.
Set the RNG seed:
set.seed(11)
First, load plinkr:
library(plinkr)
This vignette will build whether or not PLINK is installed:
if (is_plink_installed()) {
  message("PLINK is installed")
} else {
  message(
    "PLINK is not installed", "",
    "Tip: run 'plinkr::install_plinks()' to do so"
  )
}
To do an association analysis (for either a single trait or multiple traits), we need some data:
- a .map table, or mapping table, which contains the locations of the single-nucleotide polymorphisms (SNPs)
- a .ped table, or pedigree table, which contains the pedigree of the individuals and their SNP values

Here, we create some simple data, of the kind used in testing or, as in this case, in a simple demonstration:
assoc_qt_data <- create_demo_assoc_qt_data( n_individuals = 16 )
knitr::kable(assoc_qt_data$data$map_table)
The .ped table contains the pedigree of each individual, followed by their SNP calls.
Show only the pedigree:
knitr::kable(assoc_qt_data$data$ped_table[, 1:6])
Note that the pedigree table has a column called case_control_code, which will be completely ignored.
Show only the SNP calls:
knitr::kable(assoc_qt_data$data$ped_table[, -(1:6)])
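To make the layout of the two tables concrete, here is a minimal base-R sketch of a mapping table and a pedigree table in the PLINK text format. The column names and values are hypothetical and chosen for readability; they are not necessarily the names plinkr uses in map_table and ped_table.

```r
# Hypothetical miniature mapping table: one row per SNP.
# A PLINK .map file has four columns: chromosome, SNP identifier,
# genetic distance and base-pair position.
map_sketch <- data.frame(
  chromosome = c(1, 1),
  snp_id = c("snp_1", "snp_2"),
  genetic_distance = c(0, 0),
  base_pair_position = c(1000, 2000)
)

# Hypothetical miniature pedigree table: one row per individual.
# The first six columns describe the individual; after that,
# each SNP takes two columns, one per allele.
ped_sketch <- data.frame(
  family_id = c("fam_1", "fam_2"),
  individual_id = c("ind_1", "ind_2"),
  father_id = c("0", "0"),      # "0" means unknown
  mother_id = c("0", "0"),      # "0" means unknown
  sex_code = c(1, 2),           # 1 = male, 2 = female
  phenotype_code = c(-9, -9),   # -9 means missing
  snp_1a = c("A", "A"),
  snp_1b = c("A", "C"),
  snp_2a = c("G", "T"),
  snp_2b = c("T", "T")
)

# The number of allele columns is twice the number of SNPs
n_snp_columns <- ncol(ped_sketch) - 6
print(n_snp_columns == 2 * nrow(map_sketch))
```

This is why the vignette splits the pedigree table at column 6: columns 1 to 6 hold the pedigree, and everything after them holds the SNP calls.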
The case_control_code column is ignored because it can hold only one phenotype. Instead, we use a table of one or more phenotypic values:
knitr::kable(assoc_qt_data$phenotype_data$phe_table)
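As a rough illustration of how a separate phenotype table relates to the pedigree, the base-R sketch below links the two by family ID and individual ID, which is how a PLINK phenotype file identifies individuals. The trait names and values are made up, and the column names in plinkr's phe_table may differ.

```r
# Hypothetical identifiers, as they would appear in the pedigree table
ids <- data.frame(
  FID = c("fam_1", "fam_2", "fam_3"),
  IID = c("ind_1", "ind_2", "ind_3")
)

# Hypothetical phenotype table: the two identifier columns,
# followed by one column per quantitative trait
phe_sketch <- data.frame(
  FID = c("fam_1", "fam_2", "fam_3"),
  IID = c("ind_1", "ind_2", "ind_3"),
  trait_a = c(10.0, 10.5, 11.0),
  trait_b = c(3.2, 1.7, 2.4)
)

# Each individual is matched to its trait values by FID and IID
merged <- merge(ids, phe_sketch, by = c("FID", "IID"))
print(merged)
```

Because traits are extra columns, adding another quantitative trait only means adding another column to this table.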
The phenotypes are named after their relationship to the genotype:
With the mapping, pedigree and phenotype tables, we can detect the association between genotype and the two traits:
if (is_plink_installed()) {
  assoc_qt_result <- assoc_qt(assoc_qt_data = assoc_qt_data)
  knitr::kable(assoc_qt_result$qassoc_table)
}
The columns of this table are:
- trait_name: name of the quantitative trait
- CHR: chromosome number
- SNP: SNP identifier
- BP: physical position (base pair)
- NMISS: number of non-missing genotypes
- BETA: regression coefficient
- SE: standard error
- R2: regression r-squared
- T: Wald test (based on t-distribution)
- P: Wald test asymptotic p-value

In the second row, the association between snp_2 and the additive phenotype is detected correctly: the R2 (the R-squared value, i.e. the proportion of the variation in the phenotype that can be explained by the genotype) takes its highest possible value of 1.0, which means the phenotype can be perfectly explained by the genotype.
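To see why a purely additive phenotype gives an R-squared of 1.0, here is a small base-R sketch that does not need PLINK. It assumes, for illustration only, that the phenotype is an exact linear function of the number of minor alleles; the intercept, slope and genotype pattern are made up and are not taken from create_demo_assoc_qt_data.

```r
# Hypothetical genotypes: number of minor alleles (0, 1 or 2) per individual
minor_allele_count <- rep(c(0, 1, 2), length.out = 16)

# Hypothetical additive phenotype: an exact linear function of the count
additive_phenotype <- 10 + 0.5 * minor_allele_count

# For a regression with one predictor, R-squared equals the
# squared Pearson correlation between predictor and response
r_squared <- cor(minor_allele_count, additive_phenotype)^2

# Equal to 1 up to floating-point error
print(r_squared)
```

Any phenotype that is an exact straight-line function of the allele count behaves this way, whatever the intercept and slope.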
assoc_qt_data$phenotype_data$phe_table$additive <-
  assoc_qt_data$phenotype_data$phe_table$additive +
  runif(n = 16, min = -0.001, max = 0.001)
if (is_plink_installed()) {
  assoc_qt_result <- assoc_qt(assoc_qt_data = assoc_qt_data)
  knitr::kable(assoc_qt_result$qassoc_table)
}
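The chunk above adds a small amount of uniform noise to the additive phenotype and reruns the analysis. The base-R sketch below mimics this without PLINK, under the assumption that the reported statistics correspond to an ordinary linear regression of the phenotype on the minor allele count. The genotype pattern, intercept and slope are hypothetical; only the noise range matches the chunk above.

```r
set.seed(11)

# Hypothetical genotypes: number of minor alleles (0, 1 or 2) per individual
minor_allele_count <- rep(c(0, 1, 2), length.out = 16)

# Hypothetical additive phenotype with the same small uniform noise added
noisy_phenotype <- 10 + 0.5 * minor_allele_count +
  runif(n = 16, min = -0.001, max = 0.001)

# Ordinary least-squares regression of phenotype on allele count
fit <- lm(noisy_phenotype ~ minor_allele_count)
fit_summary <- summary(fit)
slope_row <- coef(fit_summary)["minor_allele_count", ]

# Collect the statistics under the column names used in the result table
qassoc_like <- c(
  BETA = unname(slope_row["Estimate"]),
  SE = unname(slope_row["Std. Error"]),
  R2 = fit_summary$r.squared,
  T = unname(slope_row["t value"]),
  P = unname(slope_row["Pr(>|t|)"])
)
print(qassoc_like)
```

With noise this small, the R-squared drops just below 1 and the slope stays close to the value used to simulate the phenotype.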
clear_plinkr_cache()
check_empty_plinkr_folder()