knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

Introduction

In this vignette, we are going to detect associations between genetic information and one or more quantitative traits.

We use simulated data with a simple relationship between genotype and phenotype. To see how the same analysis is done with messier data, see the 'test_assoc_qt' vignette.

Setup

Set the RNG seed:

set.seed(11)

First load plinkr:

library(plinkr)

This vignette builds whether or not PLINK is installed:

if (is_plink_installed()) {
  message("PLINK is installed")
} else {
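  # message() pastes its arguments together without a separator,
  # so the line break is added explicitly
  message(
    "PLINK is not installed\n",
    "Tip: run 'plinkr::install_plinks()' to do so"
  )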
}

Detecting associations for quantitative traits

To test for an association, whether for a single trait or for multiple traits, we need some data. Here we create a simple data set, of the kind used in testing or, as in this case, in a simple demonstration:

assoc_qt_data <- create_demo_assoc_qt_data(
  n_individuals = 16
)

The mapping table

knitr::kable(assoc_qt_data$data$map_table)

The pedigree table

The pedigree table corresponds to a PLINK .ped file; it is shown here in two parts.

Show only the pedigree:

knitr::kable(assoc_qt_data$data$ped_table[, 1:6])

Note that the pedigree table has a column called case_control_code, which will be completely ignored.

Show only the SNP calls:

knitr::kable(assoc_qt_data$data$ped_table[, -(1:6)])
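
The column split used above follows the PLINK .ped layout, in which the first six columns describe the individual and every later pair of columns holds the two alleles of one SNP. The following is a minimal base-R sketch of that layout on a hand-made table. The table itself and all column names other than case_control_code are illustrative assumptions and may differ from the names plinkr uses:

```r
# Hand-made stand-in for a pedigree table (hypothetical column names)
toy_ped_table <- data.frame(
  family_id = c("F1", "F2", "F3"),
  individual_id = c("I1", "I2", "I3"),
  father_id = c("0", "0", "0"),
  mother_id = c("0", "0", "0"),
  sex_code = c(1, 2, 1),
  case_control_code = c(1, 1, 1),
  snp_1a = c("A", "A", "C"),
  snp_1b = c("A", "C", "C"),
  stringsAsFactors = FALSE
)

# Columns 1 to 6: who the individual is
pedigree_part <- toy_ped_table[, 1:6]

# Remaining columns: two allele columns per SNP
snp_part <- toy_ped_table[, -(1:6)]

# The number of SNPs follows from the number of allele columns
n_snps <- ncol(snp_part) / 2
```

Dropping columns with a negative index, as in `-(1:6)`, keeps the split correct however many SNPs the table holds.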

The phenotype table

The case_control_code column of the pedigree table is ignored, because it can hold only one phenotype per individual. Instead, we use a table of one or more phenotypic values:

knitr::kable(assoc_qt_data$phenotype_data$phe_table)

The phenotypes are named after their relationship to the genotype.

Detecting the association

With the mapping, pedigree and phenotype tables, we can detect the association between the genotype and the two traits:

if (is_plink_installed()) {
  assoc_qt_result <- assoc_qt(
    assoc_qt_data = assoc_qt_data
  )
  knitr::kable(assoc_qt_result$qassoc_table)
}

In the second row, the association between snp_2 and the additive phenotype is detected correctly: R2 (the R-squared value, i.e. the proportion of the variation in the phenotype that is explained by the genotype) takes its highest possible value of 1.0, which means that the genotype explains the phenotype perfectly.
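
To see why a purely additive phenotype gives an R2 of 1, the same quantity can be sketched in base R, without PLINK. This is an illustration under assumptions, not the way plinkr or PLINK computes it internally: genotypes are coded as the number of minor alleles (0, 1 or 2), and the names `minor_allele_count` and `phenotype`, as well as the offset of 10 and the effect size of 0.5, are made up for this example. For a regression on a single predictor, R2 equals the squared Pearson correlation, which is what the sketch uses:

```r
# Hypothetical genotypes: minor-allele counts for 16 individuals
minor_allele_count <- rep(c(0, 1, 2, 1), times = 4)

# A purely additive phenotype: each minor allele adds a fixed amount
phenotype <- 10 + 0.5 * minor_allele_count

# With one predictor, R-squared is the squared Pearson correlation
r_squared <- cor(minor_allele_count, phenotype)^2
print(r_squared)
```

Because the phenotype is an exact linear function of the allele count, the result equals 1 up to floating-point rounding.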

Add a bit of noise to the data

assoc_qt_data$phenotype_data$phe_table$additive <-
  assoc_qt_data$phenotype_data$phe_table$additive +
  runif(n = 16, min = -0.001, max = 0.001)
if (is_plink_installed()) {
  assoc_qt_result <- assoc_qt(
    assoc_qt_data = assoc_qt_data
  )
  knitr::kable(assoc_qt_result$qassoc_table)
}
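
The effect of such noise on R2 can be sketched in base R as well, again without PLINK. As before, this is an illustration under assumptions: the allele-count coding, the variable names, the offset and the effect size are made up for this example, and only the noise range is taken from the code above:

```r
set.seed(11)

# Hypothetical genotypes: minor-allele counts for 16 individuals
minor_allele_count <- rep(c(0, 1, 2, 1), times = 4)

# A purely additive phenotype, then the same small uniform noise as above
phenotype <- 10 + 0.5 * minor_allele_count
noisy_phenotype <- phenotype + runif(n = 16, min = -0.001, max = 0.001)

# With one predictor, R-squared is the squared Pearson correlation
r_squared_noisy <- cor(minor_allele_count, noisy_phenotype)^2
print(r_squared_noisy)
```

The noise is tiny compared with the effect of one allele, so R2 drops only slightly below 1: the genotype still explains nearly all of the variation in the phenotype.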

Cleanup

clear_plinkr_cache()
check_empty_plinkr_folder()


richelbilderbeek/plinkr documentation built on March 25, 2024, 3:18 p.m.