ML_estimates: Compute maximum likelihood estimates and Δl


View source: R/ML_estimates.R

Description

For each methylation site, this function computes maximum likelihood estimates under the Mendelian, mixture and null models, together with a measure of heritability called Δl (higher values correspond to more highly heritable methylation sites). The models are described briefly below and fully in (Joo et al., 2018).

Usage

ML_estimates(typed_genos, M_values, sort = TRUE, na_omit = TRUE, ncores = 1)

Arguments

typed_genos

A named list, usually generated by genotype_combinations. Each element of the list is a data frame specifying all possible joint genotypes of selected family members within each family, and the joint probability of each genotype combination.

M_values

A matrix of M-values, with rows corresponding to the methylation sites and columns corresponding to people. The column names should correspond to the column names appearing in typed_genos.
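As a minimal sketch of the expected layout, the toy example below builds a small matrix with sites in rows and people in columns. It assumes the methylation data start out as beta values (proportions between 0 and 1) and applies the usual base-2 logit transform to obtain M-values. The site names, person names and the helper beta_to_M are made up for illustration and are not part of heritEWAS.

```r
# Toy beta values (proportion methylated): 3 sites x 4 people.
# Site and person names are made up for illustration.
beta <- matrix(
  c(0.10, 0.50, 0.90, 0.20,
    0.80, 0.75, 0.60, 0.95,
    0.50, 0.50, 0.25, 0.40),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("cg_site1", "cg_site2", "cg_site3"),
                  c("person1", "person2", "person3", "person4"))
)

# Hypothetical helper: base-2 logit transform from beta values to M-values
beta_to_M <- function(beta) log2(beta / (1 - beta))

M_toy <- beta_to_M(beta)
dim(M_toy)        # sites in rows, people in columns
colnames(M_toy)   # person identifiers, to be matched against typed_genos
```

A beta value of 0.5 maps to an M-value of 0, and values above or below 0.5 map to positive or negative M-values respectively.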

sort

Re-order the methylation sites by decreasing value of delta.l if TRUE (the default), or leave the sites in their original order if FALSE.

na_omit

Remove any methylation sites with missing values (NA) of delta.l if TRUE (the default), or return the results for all sites if FALSE. Usually, missing values of delta.l are due to well-known singularities in the likelihood of the Gaussian mixture model.
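The effect of sort and na_omit on the returned data frame can be pictured with a toy table. This is only a sketch of the described behaviour in base R, not the package's own code. The site column is a made-up label, and only the column name delta.l is taken from this page.

```r
# Toy results table; 'site' is a made-up label column
res <- data.frame(
  site    = c("s1", "s2", "s3", "s4"),
  delta.l = c(0.4, NA, 2.1, 1.3),
  stringsAsFactors = FALSE
)

# Equivalent of na_omit = TRUE: drop sites whose delta.l is missing
res_no_na <- res[!is.na(res$delta.l), ]

# Equivalent of sort = TRUE: order sites by decreasing delta.l
res_sorted <- res_no_na[order(res_no_na$delta.l, decreasing = TRUE), ]
res_sorted$site
```

With both steps applied, the site with the missing value (s2) is dropped and the remaining sites appear in the order s3, s4, s1.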

ncores

The number of cores to be used, with ncores = 1 (the default) corresponding to non-parallel computing. When ncores > 1, the parallel package is used to parallelize the calculation, by dividing the methylation sites between the cores.
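The idea of dividing the methylation sites between cores can be sketched with the parallel package as follows. This is an illustration under stated assumptions rather than the package's internal code: site_stat is a hypothetical stand-in for the per-site likelihood fit, the toy matrix is random, and the choice of splitIndices with mclapply is one possible approach among several.

```r
library(parallel)

# Toy matrix of M-values: 6 sites (rows) x 5 people (columns)
set.seed(1)
M_toy <- matrix(rnorm(30), nrow = 6,
                dimnames = list(paste0("site", 1:6), paste0("person", 1:5)))

# Hypothetical per-site statistic standing in for the real likelihood fit
site_stat <- function(m) var(m)

# Forking is unavailable on Windows, so fall back to one worker there
n_workers <- if (.Platform$OS.type == "windows") 1L else 2L

# Divide the site (row) indices into one chunk per worker
chunks <- splitIndices(nrow(M_toy), n_workers)

# Each worker processes its own chunk of sites
parts <- mclapply(
  chunks,
  function(idx) apply(M_toy[idx, , drop = FALSE], 1, site_stat),
  mc.cores = n_workers
)

# Recombine the chunks and compare with the non-parallel calculation
par_result    <- unlist(parts)
serial_result <- apply(M_toy, 1, site_stat)
all.equal(par_result, serial_result)
```

Because each site is processed independently, the parallel and non-parallel calculations give the same per-site results; only the running time differs.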

Value

A data frame with 15 columns. In the column names, the suffixes .mendel and .mix refer to the Mendelian and mixture models of (Joo et al., 2018). Briefly, the mixture model is the standard Gaussian mixture model with two groups (group 0 and group 1), so group memberships are independent and the M-values of each group are normally distributed. The Mendelian model is the same except that group memberships are dependent within families, and are modelled as the carrier status of a rare, autosomal genetic variant.

In the column names, the prefixes mu and sd refer to the maximum likelihood estimates of the mean and standard deviation of each group's normal distribution, and the suffix ll refers to each model's maximised log-likelihood (i.e., the log-likelihood function evaluated at the maximum likelihood estimates). The suffix .null refers to the null model that is nested inside both the Mendelian and mixture models, in which the means and standard deviations for the two groups are equal (i.e., mu0 = mu1 and sd0 = sd1).

The column delta.l gives the difference between ll.mendel and ll.mix, and is the measure of heritability (Δl) that was introduced in (Joo et al., 2018).
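To make the relationship between the mixture and null models concrete, the sketch below evaluates both log-likelihoods on toy data. It is not the package's implementation: the functions loglik_mix and loglik_null, the mixing probability p1 and the simulated M-values are assumptions made for illustration, and no maximisation is performed.

```r
# Log-likelihood of a two-group Gaussian mixture with independent memberships.
# p1 is an assumed probability of belonging to group 1.
loglik_mix <- function(x, p1, mu0, sd0, mu1, sd1) {
  sum(log((1 - p1) * dnorm(x, mu0, sd0) + p1 * dnorm(x, mu1, sd1)))
}

# Log-likelihood of the null model: a single normal distribution
loglik_null <- function(x, mu, sd) {
  sum(dnorm(x, mu, sd, log = TRUE))
}

set.seed(42)
x <- rnorm(20, mean = 1, sd = 2)   # toy M-values for one site

# Nesting: with mu0 = mu1 and sd0 = sd1 the mixture collapses to the
# null model, whatever the value of p1
ll_null  <- loglik_null(x, mu = 1, sd = 2)
ll_equal <- loglik_mix(x, p1 = 0.3, mu0 = 1, sd0 = 2, mu1 = 1, sd1 = 2)
all.equal(ll_null, ll_equal)

# Singularity: centring group 1 on a single observation and shrinking sd1
# makes the mixture log-likelihood grow without bound
ll_small   <- loglik_mix(x, p1 = 0.3, mu0 = 1, sd0 = 2, mu1 = x[1], sd1 = 1e-4)
ll_smaller <- loglik_mix(x, p1 = 0.3, mu0 = 1, sd0 = 2, mu1 = x[1], sd1 = 1e-8)
ll_smaller > ll_small
```

The second comparison illustrates the kind of singularity that can leave delta.l missing for some sites. Given the maximised values for a site, delta.l is then simply ll.mendel minus ll.mix.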

References

Joo JE, Dowty JG, Milne RL, Wong EM, Dugué PA, English D, Hopper JL, Goldgar DE, Giles GG, Southey MC, kConFab. Heritable DNA methylation marks associated with susceptibility to breast cancer. Nat Commun. 2018 Feb 28;9(1):867. https://doi.org/10.1038/s41467-018-03058-6

Examples

# Example data
str(ped)
str(M_values)

# Calculate genotype probabilities
typed_genos <- genotype_combinations(ped)
str(typed_genos)


# Compute Delta l
MLEs <- ML_estimates(typed_genos, M_values, ncores = 4)
str(MLEs)

heritEWAS documentation built on July 1, 2020, 6:02 p.m.