knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
DNA methylation can be thought of as a type of "mark" on DNA that can
affect gene expression. Most methylation marks are erased soon after
conception, but in rare cases methylation is known to be effectively
inherited by offspring from their parents. The heritEWAS package provides
functions to identify these heritable DNA methylation marks using the
method of Joo et al. (2018). This approach looks for Mendelian
patterns of inheritance within families. It is based on the relationship
structure of each family and on methylation data for a subset of the family
members, and it does not require genotype data. The method can handle large,
multi-generational families and is optimised for methylation data at
thousands or millions of methylation sites, such as the data generated by
epigenome-wide association studies (EWAS) of related individuals.
You can install the released version of heritEWAS from CRAN with:
install.packages("heritEWAS")
library(heritEWAS)
To use this package, you will need two sets of data (described in more detail
in the help page for the function genotype_combinations):
A data frame containing the pedigree data, i.e. the relationship structure
of the families. Each row of the data frame corresponds to a person, and the
columns give each person's individual identifier (indiv), the identifiers of
his or her mother (mother) and father (father), a family identifier (family)
and a binary flag (typed) which is 1 for people who have methylation data
available. No family should contain a pedigree loop, such as one caused by
inbreeding.
A matrix of the M-values, with rows corresponding to methylation sites
(i.e. CpG probes) and columns corresponding to people. The column names should
match the individual identifiers of the people in the pedigree data with
typed = 1.
The pedigree data should look something like the following
(extra variables such as aff and age can be included, but they will be
ignored):
head(ped)
unique(ped$family)
And the M-values matrix should look something like:
# Colnames are the individual IDs of the pedigree data
M_values[1:5, 1:5]
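Before running the analysis, it can help to confirm that the two inputs agree with each other. The following is a minimal base-R sketch of such a check, not a function provided by heritEWAS. The objects toy_ped and toy_M are made-up examples, and coding the unknown parents of founders as NA is an assumption made here; the help page for genotype_combinations describes the coding the package actually expects.

```r
# Toy pedigree: one nuclear family; all values are made up for illustration.
# Assumption: the unknown parents of founders are coded as NA in this sketch.
toy_ped <- data.frame(
  family = rep("fam1", 4),
  indiv  = c("p1", "p2", "p3", "p4"),
  mother = c(NA, NA, "p1", "p1"),
  father = c(NA, NA, "p2", "p2"),
  typed  = c(0, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Toy M-values: 3 sites (rows) by 3 typed people (columns).
set.seed(1)
toy_M <- matrix(
  rnorm(9), nrow = 3,
  dimnames = list(c("site1", "site2", "site3"), c("p2", "p3", "p4"))
)

# Identifiers of the people flagged as having methylation data.
typed_ids <- toy_ped$indiv[toy_ped$typed == 1]
required_cols <- c("family", "indiv", "mother", "father", "typed")

# Each entry is TRUE when the corresponding requirement is met.
checks <- c(
  has_required_columns = all(required_cols %in% names(toy_ped)),
  typed_is_binary      = all(toy_ped$typed %in% c(0, 1)),
  columns_match_typed  = setequal(colnames(toy_M), typed_ids)
)
print(checks)
```

With real data, the same checks would be applied to ped and M_values in place of the toy objects.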
The main goal of the heritEWAS package is to calculate a statistic $\Delta l$
for each methylation site. This statistic measures the strength of evidence
that the site's M-values follow a Mendelian pattern of inheritance within the
families, with larger values of $\Delta l$ corresponding to more heritable
methylation sites. The statistic is the difference between the maximised
log-likelihoods of two statistical models, and it can be interpreted as
a difference in the Bayesian information criteria of the two models; see
Joo et al. (2018).
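The link to the Bayesian information criterion can be written out using notation introduced here rather than taken from the paper. Let $\hat l_1$ and $\hat l_0$ be the maximised log-likelihoods of the Mendelian model and the comparison model, $k_1$ and $k_0$ their numbers of free parameters, and $n$ the number of observations. Assuming that $\Delta l$ is the Mendelian model's value minus the comparison model's (consistent with larger values indicating more heritable sites), and using the standard definition $\mathrm{BIC} = k \ln n - 2 \hat l$:

```latex
\Delta l = \hat l_1 - \hat l_0, \qquad
\mathrm{BIC}_0 - \mathrm{BIC}_1 = 2\,\Delta l - (k_1 - k_0)\ln n
```

If the two models have the same number of free parameters, the BIC difference is simply $2\,\Delta l$. The exact models being compared are defined in Joo et al. (2018).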
The most time-consuming part of the calculation of $\Delta l$ is
the same for all methylation sites, so the heritEWAS package calculates
this part once, stores the output, then re-uses it for each methylation site.
This part of the calculation is performed by the function
genotype_combinations():
typed_genos <- genotype_combinations(ped)
The results are stored in a named list of data frames, with one data frame
per family. Each data frame gives the probability of each possible combination
of genotypes for those family members with methylation data
(i.e. those with typed = 1). The possible genotypes for each person
are 0 and 1, corresponding to non-carriers and carriers (respectively) of
a hypothetical genetic variant that controls methylation at a given
methylation site under one of the two statistical models used to define
$\Delta l$.
Impossible genotype combinations (those with a probability of 0) are excluded
from the output of genotype_combinations().
str(typed_genos)
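To make the idea of a table of genotype combinations concrete, here is a toy base-R enumeration for a single trio (mother, father, child), all assumed to be typed. It illustrates the concept only and is not the algorithm used by genotype_combinations(). The carrier frequency, the transmission probabilities (every carrier treated as heterozygous, no new mutations) and all object names are assumptions made for this sketch.

```r
# Toy illustration only: NOT the package's algorithm or its parameter values.
carrier_freq <- 0.01  # assumed probability that a founder is a carrier

# Assumed transmission rule: P(child is a carrier | parents' carrier status),
# treating every carrier as heterozygous and ignoring new mutations.
p_child_carrier <- function(mother, father) {
  n_carrier_parents <- mother + father
  ifelse(n_carrier_parents == 0, 0,
         ifelse(n_carrier_parents == 1, 0.5, 0.75))
}

# Prior probability of a founder's genotype (0 = non-carrier, 1 = carrier).
p_founder <- function(g) ifelse(g == 1, carrier_freq, 1 - carrier_freq)

# All 2^3 combinations of genotypes for the three family members.
combos <- expand.grid(mother = 0:1, father = 0:1, child = 0:1)

# Joint probability = P(mother) * P(father) * P(child | mother, father).
p_carrier <- p_child_carrier(combos$mother, combos$father)
combos$prob <- p_founder(combos$mother) * p_founder(combos$father) *
  ifelse(combos$child == 1, p_carrier, 1 - p_carrier)

# Drop impossible combinations (probability 0), e.g. a carrier child
# of two non-carrier parents.
toy_genos <- combos[combos$prob > 0, ]
print(toy_genos)
```

Under these assumptions, only one of the eight combinations is impossible, and the probabilities of the remaining rows still sum to 1.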
Given the genotype probabilities, we can use the package's main function
ML_estimates() to compute $\Delta l$ for each site. The output is a
data frame with rows corresponding to methylation sites and columns giving
details about certain fitted models (see the help page of the function
ML_estimates for more details). In particular, the column delta.l
gives the statistic $\Delta l$, which measures how heritable each methylation
site is.
MLEs <- ML_estimates(typed_genos, M_values, ncores = 2)
head(MLEs)
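Since larger values of $\Delta l$ indicate more heritable sites, a natural next step is to sort the sites by the delta.l column. The sketch below does this with base R on a mock data frame. The site names and values are made up, and the use of row names to identify sites is an assumption made here; only the delta.l column comes from the description above.

```r
# Mock output for illustration only: the site names and values are made up,
# and only the delta.l column described above is assumed to exist.
toy_MLEs <- data.frame(
  delta.l = c(0.4, 12.7, 3.1, 8.9),
  row.names = c("siteA", "siteB", "siteC", "siteD")
)

# Sort sites from largest to smallest delta.l (most to least heritable).
ranked <- toy_MLEs[order(toy_MLEs$delta.l, decreasing = TRUE), , drop = FALSE]

# Names of the two top-ranked sites.
top_sites <- head(rownames(ranked), 2)
print(ranked)
```

With the real output, the same ordering call would be applied to MLEs in place of toy_MLEs.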
Joo JE, Dowty JG, Milne RL, Wong EM, Dugue PA, English D, Hopper JL, Goldgar DE, Giles GG, Southey MC, kConFab. Heritable DNA methylation marks associated with susceptibility to breast cancer. Nat Commun. 2018 Feb 28;9(1):867. https://doi.org/10.1038/s41467-018-03058-6