learnFamilyBasedUDGs: Learning of Gaussian Undirected PGMs from Family Data


View source: R/learnFamilyBasedPGMs.R

Description

A Gaussian undirected PGM and its decomposition into genetic and environmental components are learned from observational family data. An edge is assigned between every pair of variables whose partial correlation (in the respective component), given all the other variables, is significantly different from zero.
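The edge rule can be illustrated with a minimal sketch for independent observations. It is not the family-based test this function uses: the data are made up, and the Fisher z-test below is a standard stand-in chosen only to show how an adjacency matrix follows from partial correlations.

```r
set.seed(1)
n <- 500

# Toy data (made up): chain x1 - x2 - x3, with x4 unrelated to the others
x1 <- rnorm(n)
x2 <- 0.8 * x1 + rnorm(n)
x3 <- 0.8 * x2 + rnorm(n)
x4 <- rnorm(n)
X <- cbind(x1, x2, x3, x4)
p <- ncol(X)

# Partial correlation of each pair given all other variables,
# obtained from the scaled inverse covariance (precision) matrix
Omega <- solve(cov(X))
pcor <- -cov2cor(Omega)
diag(pcor) <- 0  # the diagonal is not tested

# Fisher z-test with p - 2 conditioning variables
# (ASSUMPTION: independent subjects, so not valid for family data)
z <- sqrt(n - (p - 2) - 3) * atanh(pcor)
pval <- 2 * pnorm(-abs(z))

# Edge whenever the partial correlation differs significantly from zero
alpha <- 0.05
adjM <- (pval < alpha) * 1
diag(adjM) <- 0
adjM
```

In family data the observations within a pedigree are correlated, which is why the function relies on tests derived from polygenic mixed models instead.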

The zero partial correlation tests are derived in the work by \insertCite{ribeiro2019family;textual}{FamilyBasedPGMs}. These tests are based on univariate polygenic linear mixed models \insertCite{almasy1998multipoint}{FamilyBasedPGMs}, with two components of variance: the polygenic or family-specific random effect, which models the phenotypic variability across the families, and the environmental or subject-specific error, which models phenotypic variability after removing the familial aggregation effect.
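As a rough illustration of the two variance components, the sketch below builds the covariance implied by a polygenic model for one nuclear family, assuming the standard form in which the polygenic covariance is twice the kinship matrix times the genetic variance. The variance values and the names `Phi`, `sigma2_g`, and `sigma2_e` are hypothetical and are not taken from the package.

```r
# Kinship matrix for one nuclear family with unrelated parents:
# 0.5 with oneself, 0.25 parent-child and between full siblings, 0 between spouses
ids <- c("father", "mother", "child1", "child2")
Phi <- matrix(0.25, 4, 4, dimnames = list(ids, ids))
diag(Phi) <- 0.5
Phi["father", "mother"] <- Phi["mother", "father"] <- 0

# Made-up variance components for a single phenotype
sigma2_g <- 0.6  # polygenic (family-specific) variance
sigma2_e <- 0.4  # environmental (subject-specific) variance

# Polygenic part is shared through relatedness; environmental part is independent
Sigma_g <- 2 * Phi * sigma2_g
Sigma_e <- diag(4) * sigma2_e
Sigma <- Sigma_g + Sigma_e

# Proportion of phenotypic variance attributed to the polygenic effect
h2 <- sigma2_g / (sigma2_g + sigma2_e)

Sigma
h2
```

Under these assumptions, relatives covary only through the polygenic term, while the environmental term contributes to the diagonal alone.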

Usage

learnFamilyBasedUDGs(phen.df, covs.df, pedigrees, sampled, fileID,
  dirToSave, alpha = 0.05, correction = NULL, max_cores = NULL,
  minK = 10, maxFC = 0.01, orthogonal = TRUE, useGPU = FALSE,
  debug = TRUE, logFile = NULL)

Arguments

phen.df

A data.frame with phenotype variables of only sampled subjects. Column names must be properly set with the names of the phenotypes.

covs.df

A data.frame with covariates of only sampled subjects. Column names must be properly set with the names of the covariates.

pedigrees

A data.frame with the columns famid, id, dadid, momid, and sex for all sampled and non-sampled subjects.

sampled

A logical vector in which element i indicates whether individual i was sampled or not.

fileID

A character string to be used as prefix in the filenames of R objects with the partial correlation results. Note that covariates are not identified in these files.

dirToSave

Path to the folder in which the output objects are saved.

alpha

The significance level to be used in the partial correlation tests.

correction

A character string indicating the correction method to be used in the p.adjust function. The options are: "holm", "hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", and "none".

max_cores

An integer value indicating the maximum number of CPU cores to be used for parallel execution.

minK

A scalar indicating the minimum dimension allowed in the dimensionality reduction for confounding correction.

maxFC

A scalar between 0 and 1 indicating the maximum fraction of confounding allowed.

orthogonal

A logical value indicating whether the transformation matrix used in the confounding correction is orthogonal or not.

useGPU

A logical value indicating whether GPU cores can be used for parallel execution.

debug

A logical value indicating whether debug messages should be shown.

logFile

Optional file path and name to save progress and error messages. If not provided and debug is TRUE, a default file is created in the dirToSave folder.
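To make the input shapes concrete, here is a small sketch with one hypothetical family. The sex coding (1 = male, 2 = female), the use of NA for the parents of founders, and the alignment of `sampled` with the rows of `pedigrees` are assumptions for illustration, not confirmed by this page. The second part shows how the methods accepted by `correction` change a made-up vector of p-values via `stats::p.adjust`.

```r
# Hypothetical family "F1": two founders and three children,
# one of whom is in the pedigree but was not phenotyped
pedigrees <- data.frame(
  famid = "F1",
  id    = 1:5,
  dadid = c(NA, NA, 1, 1, 1),  # ASSUMPTION: NA marks founders
  momid = c(NA, NA, 2, 2, 2),
  sex   = c(1, 2, 1, 2, 2)     # ASSUMPTION: 1 = male, 2 = female
)

# One logical per individual (ASSUMPTION: same order as the pedigree rows)
sampled <- c(TRUE, TRUE, TRUE, TRUE, FALSE)

# Phenotypes only for sampled subjects, with named columns (made-up values)
phen.df <- data.frame(
  trait1 = c(1.2, 0.7, 1.0, 0.9),
  trait2 = c(5.1, 4.8, 5.3, 4.9)
)

# Basic consistency checks before calling the learner
stopifnot(length(sampled) == nrow(pedigrees))
stopifnot(nrow(phen.df) == sum(sampled))

# Effect of a multiple-testing correction on made-up p-values
alpha <- 0.05
pvals <- c(0.001, 0.012, 0.02, 0.045, 0.2)
n_raw  <- sum(pvals < alpha)
n_bonf <- sum(p.adjust(pvals, method = "bonferroni") < alpha)
n_bh   <- sum(p.adjust(pvals, method = "BH") < alpha)
c(none = n_raw, bonferroni = n_bonf, BH = n_bh)
```

A stricter correction keeps fewer edges at the same alpha, so the choice of method directly affects the sparsity of the learned graphs.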

Value

Returns a list with the following elements:

pcor

A list with the total (pcor_t), genetic (pcor_g), and environmental (pcor_e) partial correlation matrices.

adjM

A list with the total (t), genetic (g), and environmental (e) partial adjacency matrices.

udg

A list with the total (t), genetic (g), and environmental (e) igraph objects representing the respective undirected graphs.
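A returned adjacency matrix can be inspected with base R alone. The sketch below uses a made-up symmetric 0/1 matrix named `adjM` as a stand-in for an element such as the genetic adjacency matrix; the variable names A to D are hypothetical.

```r
# Made-up symmetric adjacency matrix over four phenotypes
vars <- c("A", "B", "C", "D")
adjM <- matrix(0, 4, 4, dimnames = list(vars, vars))
adjM["A", "B"] <- adjM["B", "A"] <- 1
adjM["B", "C"] <- adjM["C", "B"] <- 1

# Each undirected edge appears twice in the matrix,
# so only the upper triangle is scanned
idx <- which(adjM == 1 & upper.tri(adjM), arr.ind = TRUE)
edges <- data.frame(
  from = rownames(adjM)[idx[, "row"]],
  to   = colnames(adjM)[idx[, "col"]]
)

# Number of neighbours of each variable
degree <- rowSums(adjM)

edges
degree
```

The same steps applied to the genetic and environmental matrices make it easy to compare which edges appear in each component.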

References

\insertAllCited{}

Examples

data(scen3) # available simulated datasets are scen1, scen2, scen3, and scen4

scenario <- 3 # data was simulated according to scenario 3

fam.nf <- scen3$fam.nf
pedigrees <- scen3$pedigrees
phen.df <- scen3$phen.df[[1]] # accessing the first replicate
covs.df <- NULL # no covariates were used in the simulation process.

N <- sum(fam.nf) # total number of individuals
sampled <- rep(TRUE, N) # in simulated data, all individuals were sampled.

fileID <- paste0("scen", scenario)
dirToSave <- paste0("./objects-UDG-", fileID, "/")
dir.create(dirToSave, showWarnings=FALSE)
alpha <- 0.05

udgs.out <- learnFamilyBasedUDGs(phen.df, covs.df, pedigrees, sampled,
                                 fileID, dirToSave, alpha, correction=NULL,
                                 max_cores=NULL, minK=10, maxFC = 0.05,
                                 orthogonal=TRUE, useGPU=FALSE, debug=TRUE)

# the adjacency matrix of the learned undirected genetic PGM
udgs.out$adjM$g

# the estimates, p-values, and effective sizes of the genetic partial correlations
udgs.out$pCor$pCor_g

# plotting the learned undirected genetic PGM as an `igraph` object:
plot(udgs.out$udg$g, vertex.size=30, vertex.color="lightblue")

# the adjacency matrix of the learned undirected environmental PGM
udgs.out$adjM$e

# the estimates, p-values, and effective sizes of the environmental partial correlations
udgs.out$pCor$pCor_e

# plotting the learned undirected environmental PGM as an `igraph` object:
plot(udgs.out$udg$e, vertex.size=30, vertex.color="lightblue")

adele/FamilyBasedPGMs documentation built on Feb. 16, 2021, 8:29 a.m.