knitr::opts_chunk$set(collapse = TRUE)

library("dplyr")
library("kableExtra")
library("knitr")
library("midasHLA")

Introduction

MiDAS is a comprehensive R package for immunogenetic data manipulation and statistical analysis. MiDAS recodes input data in the form of HLA alleles and KIR types into biologically meaningful variables, allowing HLA amino acid fine mapping, analyses of HLA evolutionary divergence as well as validated HLA-KIR interactions. Further, it allows comprehensive statistical association analysis workflows with phenotypes of diverse measurement scales. MiDAS thus closes the gap between the inference of immunogenetic variation and its efficient utilization to make relevant discoveries related to T cell, Natural Killer cell, and disease biology.

Quick start

Reading input data

MiDAS includes functions to read HLA and KIR typing data, checking for the correct format and adherence to the right nomenclature. HLA calls can be reduced to a desired resolution when importing it, e.g. from 6-digit to 4-digit. Phenotypic observations can be read e.g. using the read.table function (make sure to include stringsAsFactors = FALSE argument).

pheno_file <- system.file("extdata", "MiDAS_tut_pheno.txt", package = "midasHLA")
pheno <- read.table(pheno_file, header = TRUE, stringsAsFactors = FALSE)
pheno %>%
  head(10) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"))
# HLA calls can be loaded using the readHlaCalls function with the desired resolution
hla_calls_file <- system.file("extdata", "MiDAS_tut_HLA.txt", package = "midasHLA")
hla_calls <- readHlaCalls(hla_calls_file, resolution = 4)
hla_calls %>%
  head(10) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed")) %>%
  scroll_box(width = "100%", height = "200px")
# KIR calls (currently presence/absence calls, no allele-level resolution) can be loaded using the readKirCalls function
kir_calls_file <- system.file("extdata", "MiDAS_tut_KIR.txt", package = "midasHLA")
kir_calls <- readKirCalls(kir_calls_file)
kir_calls %>%
  head(10) %>%
  kable() %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed")) %>%
  scroll_box(width = "100%", height = "200px")

For convenience, we will here use test data shipped with the package.

HLA <- reduceHlaCalls(MiDAS_tut_HLA, resolution = 4)
KIR <- MiDAS_tut_KIR
pheno <- MiDAS_tut_pheno

Creating midas objects

MiDAS provides the prepareMiDAS function to combine genetic and phenotypic data for subsequent analysis. In this step, it is also possible to infer amino acid level information from HLA calls, encode HLA-KIR interactions, or calculate HLA evolutionary divergence (experiment argument).

midas <- prepareMiDAS(
  hla_calls = HLA,
  kir_calls = KIR,
  colData = pheno,
  experiment = c("hla_alleles", "hla_aa")
  )

MiDAS object

A MiDAS object contains our input data as well as defined transformations. This is an extension of a MultiAssayExperiment and can be handled as such. MiDAS provides some functions to interact with MiDAS objects (check ?MiDAS-class for a full list).

For example, we can explore the HLA alleles frequencies, compare them with published frequencies from the allelefrequencies.net database, and filter out low frequency alleles.

freq <- getFrequencies(
  object = midas,
  carrier_frequency = FALSE,
  experiment = "hla_alleles",
  compare = TRUE
)
kable(freq) %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed")) %>%
  scroll_box(width = "100%", height = "200px")
midas <-
  filterByFrequency(
    object = midas,
    experiment = "hla_alleles",
    lower_frequency_cutoff = 0.01
    )

Association analysis

Model definition

Before the actual analysis can be run, we still need to define a statistical model we would like to use. We can use most of the statistical models available in R, such as lm, glm, coxme etc. (technically, they need to have a tidy method available). MiDAS then provides a wrapper function that will evaluate our model for each HLA allele in our data.

Here we will use a very simple formula disease ~ term, term being a placeholder for each tested HLA allele. This notation is necessary to also allow interaction tests (e.g. disease ~ lab_value:term) and also works with other appropriate operations.

midas has to be passed as a data argument to our statistical function, here glm.

# Logistic regression
object <- glm(disease ~ term, data = midas, family = binomial(link = "logit"))

Running analysis

To run our analysis, we will use the runMiDAS function. This function offers multiple analysis scenarios which can be tuned using conditional and omnibus arguments. It can also pre-filter input data based on frequency (check ?runMiDAS to learn more).

results <-
  runMiDAS(
    object = object,
    experiment = "hla_alleles",
    inheritance_model = "dominant",
    conditional = FALSE,
    omnibus = FALSE,
    lower_frequency_cutoff = 0.05
    )

kableResults(results)


Genentech/midasHLA documentation built on Feb. 12, 2024, 9:38 a.m.