Nothing
#' @name gl.grm.network
#' @title Represents a genomic relationship matrix (GRM) as a network
#' @description
#' This script takes a G matrix generated by \code{\link{gl.grm}} and represents
#' the relationship among the specimens as a network diagram. In order to use
#' this script, a decision is required on a threshold for relatedness to be
#' represented as link in the network, and on the layout used to create the
#' diagram.
#'
#' @param G A genomic relationship matrix (GRM) generated by
#' \code{\link{gl.grm}} [required].
#' @param x A genlight object from which the G matrix was generated [required].
#' @param method One of 'fr', 'kk', 'gh' or 'mds' [default 'fr'].
#' @param node.size Size of the symbols for the network nodes [default 8].
#' @param node.label TRUE to display node labels [default TRUE].
#' @param node.label.size Size of the node labels [default 3].
#' @param node.label.color Color of the text of the node labels
#' [default 'black'].
#' @param node.shape Optionally provide a vector of nPop shapes
#' (run gl.select.shapes() for shape options) [default NULL].
#' @param link.color Colors for links [default gl.select.colors].
#' @param link.size Size of the links [default 2].
#' @param relatedness_factor Factor of relatedness [default 0.125].
#' @param title Title for the plot
#' [default 'Network based on genomic relationship matrix'].
#' @param palette_discrete A discrete set of colors
#' with as many colors as there are populations in the dataset
#' [default NULL].
#' @param plot.dir Directory to save the plot RDS files [default as specified
#' by the global working directory or tempdir()]
#' @param plot.file Name for the RDS binary file to save (base name only, exclude extension) [default NULL]
#' @param verbose Verbosity: 0, silent or fatal errors; 1, begin and end; 2,
#' progress log ; 3, progress and results summary; 5, full report
#' [default 2 or as specified using gl.set.verbosity].
#' @details
#' The gl.grm.network function creates a network diagram that represents
#' genetic relationships among individuals in a dataset using a Genomic
#' Relationship Matrix (GRM). The GRM is generated by the gl.grm function,
#' which utilizes the A.mat function from the rrBLUP package. This method
#' follows the approach developed by Endelman and Jannink (2012).
#'
#' The GRM quantifies the additive genetic relationships between individuals
#' based on genome-wide SNP data. It provides an estimate of the actual genetic
#' similarity — known as realized relatedness— between individuals by measuring
#' how much of their genome they share identical by descent (IBD).
#'
#' Two alleles are Identical by State (IBS) if they are the same in state,
#' regardless of whether they come from a common ancestor. Two alleles are
#' Identical by Descent (IBD) if they are inherited from a common ancestor.
#' While IBS does not necessarily imply IBD, using high-density SNP data
#' improves the estimation of IBD probabilities from IBS measures.
#'
#' The off-Diagonal elements of the GRM represent twice the kinship coefficient
#' between pairs of individuals. The kinship coefficient is the probability
#' that a randomly selected allele from each individual at the same locus is
#' IBD. Diagonal elements represent one plus twice the inbreeding coefficient
#' of each individual. The inbreeding coefficient is the probability that both
#' alleles at a random locus within an individual are IBD.
#'
#' Choosing meaningful thresholds to represent relationships between
#' individuals can be challenging because kinship and inbreeding coefficients
#' re relative measures. To standardize the GRM and facilitate interpretation,
#' the function adjusts the matrix through the following steps:
#'
#' 1. Centering Inbreeding Coefficients: Subtract 1 from the mean of the
#' diagonal elements to calculate the average inbreeding coefficient. This
#' centers the inbreeding coefficients around zero, providing a reference point
#' relative to the population's average inbreeding level.
#'
#' 2. Calculating Kinship Coefficients: Divide the off-diagonal elements by 2
#' to obtain the kinship coefficients. This conversion reflects the probability
#' of sharing alleles IBD between pairs of individuals.
#'
#' 3. Centering Kinship Coefficients: Subtract the adjusted mean inbreeding
#' coefficient (from step 1) from each kinship coefficient (from step 2). This
#' centers the kinship coefficients relative to the population average,
#' allowing for meaningful comparisons.
#'
#' This adjustment method aligns with the approach used by Goudet et al. (2018),
#' enabling the relationships to be interpreted in the context of the overall
#' genetic relatedness within the population.
#'
#' Below is a table modified from Speed & Balding (2015) showing kinship values,
#' and their confidence intervals (CI), for different relationships that could
#' be used to guide the choosing of the relatedness threshold in the function.
#'
#' |Relationship|Kinship|95% CI|
#' |Identical twins/clones/same individual | 0.5 | - |
#'
#' |Sibling/Parent-Offspring | 0.25 | (0.204, 0.296)|
#'
#' |Half-sibling | 0.125 | (0.092, 0.158)|
#'
#' |First cousin | 0.062 | (0.038, 0.089)|
#'
#' |Half-cousin | 0.031 | (0.012, 0.055)|
#'
#' |Second cousin | 0.016 | (0.004, 0.031)|
#'
#' |Half-second cousin | 0.008 | (0.001, 0.020)|
#'
#' |Third cousin | 0.004 | (0.000, 0.012)|
#'
#' |Unrelated | 0 | - |
#'
#' Four layout options are implemented in this function:
#' \itemize{
#' \item 'fr' Fruchterman-Reingold layout \link[igraph]{layout_with_fr}
#' (package igraph)
#' \item 'kk' Kamada-Kawai layout \link[igraph]{layout_with_kk} (package igraph)
#' \item 'gh' Graphopt layout \link[igraph]{layout_with_graphopt}
#' (package igraph)
#' \item 'mds' Multidimensional scaling layout \link[igraph]{layout_with_mds}
#' (package igraph)
#' }
#'
#' @return A network plot showing relatedness between individuals
#' @author Custodian: Arthur Georges -- Post to
#' \url{https://groups.google.com/d/forum/dartr}
#' @examples
#' if (requireNamespace("igraph", quietly = TRUE) & requireNamespace("rrBLUP",
#' quietly = TRUE
#' ) & requireNamespace("fields", quietly = TRUE)) {
#' t1 <- possums.gl
#' # filtering on call rate
#' t1 <- gl.filter.callrate(t1)
#' t1 <- gl.subsample.loc(t1, n = 100)
#' # relatedness matrix
#' res <- gl.grm(t1, plotheatmap = FALSE)
#' # relatedness network
#' res2 <- gl.grm.network(res, t1, relatedness_factor = 0.125)
#' }
#' @references
#' \itemize{
#' \item Endelman, J. B. , Jannink, J.-L. (2012). Shrinkage estimation of the
#' realized relationship matrix. G3: Genes, Genomics, Genetics 2, 1405.
#' \item Goudet, J., Kay, T., & Weir, B. S. (2018). How to estimate kinship.
#' Molecular Ecology, 27(20), 4121-4135.
#' \item Speed, D., & Balding, D. J. (2015). Relatedness in the post-genomic era:
#' is it still useful?. Nature Reviews Genetics, 16(1), 33-44.
#' }
#' @seealso \code{\link{gl.grm}}
#' @family inbreeding functions
#' @export
gl.grm.network <- function(G,
x,
method = "fr",
node.size = 8,
node.label = TRUE,
node.label.size = 2,
node.label.color = "black",
node.shape = NULL,
link.color = NULL,
link.size = 2,
relatedness_factor = 0.125,
title = "Network based on a genomic relationship matrix",
palette_discrete = gl.select.colors(x,
library = "brewer",
palette = "PuOr",
ncolors = nPop(x),
verbose = 0),
plot.dir = NULL,
plot.file = NULL,
verbose = NULL) {
# SET VERBOSITY
verbose <- gl.check.verbosity(verbose)
# SET WORKING DIRECTORY
plot.dir <- gl.check.wd(plot.dir, verbose = 0)
# FLAG SCRIPT START
funname <- match.call()[[1]]
utils.flag.start(
func = funname,
build = "Jody",
verbose = verbose
)
# CHECK DATATYPE
datatype <- utils.check.datatype(x, verbose = verbose)
# FUNCTION SPECIFIC ERROR CHECKING Set a population if none is specified
# (such as if the genlight object has been generated manually)
if (is.null(pop(x)) |
is.na(length(pop(x))) | length(pop(x)) <= 0) {
if (verbose >= 2) {
cat(
important(
" Population assignments not detected, individuals assigned
to a single population labelled 'pop1'\n"
)
)
}
pop(x) <- array("pop1", dim = nInd(x))
pop(x) <- as.factor(pop(x))
}
# check if package is installed
pkg <- "igraph"
if (!(requireNamespace(pkg, quietly = TRUE))) {
cat(error(
"Package",
pkg,
" needed for this function to work. Please install it.\n"
))
return(-1)
}
if (!(method == "fr" ||
method == "kk" ||
method == "gh" || method == "mds")) {
if(verbose>0) cat(warn(
"Warning: Layout method must be one of fr, or kk, gh or mds, set to fr\n"
))
method <- "fr"
}
# DO THE JOB
G2 <- G
G2[upper.tri(G2, diag = TRUE)] <- NA
links <- as.data.frame(as.table(G2))
links <- links[which(!is.na(links$Freq)), ]
colnames(links) <- c("from", "to", "weight")
# using the average inbreeding coefficient (1-f) of the diagonal elements as
# the reference value
MS <- mean(diag(G) - 1)
# the result of the GRM is the summation of the IBD of each allele .
links$kinship <- (links$weight / 2) - MS
links_tmp <- links[, c(1, 2, 4)]
links_tmp <- rbind(links_tmp, cbind(
from = indNames(x),
to = indNames(x), kinship = 0
))
links_matrix <- as.matrix(reshape2::acast(links_tmp,
from ~ to,
value.var = "kinship"
))
links_matrix <- apply(links_matrix, 2, as.numeric)
rownames(links_matrix) <- colnames(links_matrix)
links_plot_tmp <- links_tmp[links_tmp$kinship > relatedness_factor, ]
links_plot_2 <- links_plot_tmp[, c("from", "kinship")]
colnames(links_plot_2) <- c("label.node", "kinship")
links_plot_3 <- links_plot_tmp[, c("to", "kinship")]
colnames(links_plot_3) <- c("label.node", "kinship")
links_plot <- rbind(links_plot_2, links_plot_3)
nodes <- data.frame(cbind(x$ind.names, as.character(pop(x))))
colnames(nodes) <- c("name", "pop")
network <- igraph::graph_from_data_frame(
d = links,
vertices = nodes,
directed = FALSE
)
q <- relatedness_factor
network.FS <-
igraph::delete_edges(network, igraph::E(network)[links$kinship < q])
if (method == "fr") {
layout.name <- "Fruchterman-Reingold layout"
plotcord <-
data.frame(igraph::layout_with_fr(network.FS))
}
if (method == "kk") {
layout.name <- "Kamada-Kawai layout"
plotcord <-
data.frame(igraph::layout_with_kk(network.FS))
}
if (method == "gh") {
layout.name <- "Graphopt layout"
plotcord <-
data.frame(igraph::layout_with_graphopt(network.FS))
}
if (method == "mds") {
layout.name <- "Multidimensional scaling layout"
plotcord <-
data.frame(igraph::layout_with_mds(network.FS))
}
# get edges, which are pairs of node IDs
edgelist <- igraph::get.edgelist(network.FS, names = F)
# convert to a four column edge data frame with source and destination
# coordinates
edges <-
data.frame(plotcord[edgelist[, 1], ], plotcord[edgelist[, 2], ])
# using kinship for the size of the edges
edges$size <- links[links$kinship > q, "kinship"]
X1 <- X2 <- Y1 <- Y2 <- label.node <- NA
colnames(edges) <- c("X1", "Y1", "X2", "Y2", "size")
# node labels
plotcord$label.node <- igraph::V(network.FS)$name
# adding populations
pop_df <-
as.data.frame(cbind(indNames(x), as.character(pop(x))))
colnames(pop_df) <- c("label.node", "pop")
plotcord <- merge(plotcord, pop_df, by = "label.node")
plotcord$pop <- as.factor(plotcord$pop)
plotcord <- merge(plotcord, links_plot, by = "label.node", all.x = TRUE)
plotcord[is.na(plotcord$kinship), "kinship"] <- 0
plotcord$kinship <- as.numeric(plotcord$kinship)
plotcord$kinship <- scales::rescale(plotcord$kinship, to = c(0.1, 1))
colors_pops <- palette_discrete
if (is.null(link.color)) {
link.color <- gl.select.colors(library = "baseR", palette = "rainbow", ncolors = 10, verbose = 0)
}
names(colors_pops) <- as.character(levels(x$pop))
size <- NULL
p1 <-
ggplot() +
geom_segment(
data = edges,
aes(x = X1, y = Y1, xend = X2, yend = Y2, color = size),
size = link.size) +
scale_colour_gradientn(name = "Relatedness",
colours = link.color) +
coord_fixed(ratio = 1) +
theme_void() +
ggtitle(paste(title, "\n[", layout.name, "]")) +
theme(
legend.position = "bottom",
plot.title = element_text(hjust = 0.5, face = "bold", size = 14)
)
if (is.null(node.shape)) {
p1 <- p1 +
geom_point(
data = plotcord, aes(x = X1,
y = X2,
fill = pop),
pch = 21,
size = node.size,
alpha = plotcord$kinship
) +
scale_fill_manual(name = "Populations",
values = colors_pops)
} else {
p1 <- p1 +
geom_point(
data = plotcord, aes(x = X1,
y = X2,
fill = pop,
shape = pop),
size = node.size,
alpha = plotcord$kinship
) +
scale_fill_manual(name = "Populations",
values = colors_pops) +
scale_shape_manual(name = "Populations",
values = node.shape)
}
if (node.label == T) {
p1 <-
p1 + geom_text(
data = plotcord,
aes(
x = X1,
y = X2,
label = label.node
),
size = node.label.size,
show.legend = FALSE,
color = node.label.color
)
}
# PRINTING OUTPUTS
print(p1)
# Optionally save the plot ---------------------
if (!is.null(plot.file)) {
tmp <- utils.plot.save(p1,
dir = plot.dir,
file = plot.file,
verbose = verbose
)
}
# FLAG SCRIPT END
if (verbose >= 1) {
cat(report("Completed:", funname, "\n"))
}
# RETURN
return(invisible(list(p1, links_matrix)))
}
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.