ringPlotPAM: Make a ring plot to show presence-absence of genotypes and...

View source: R/func__visualisation__ringPlotPAM.R

ringPlotPAM    R Documentation

Make a ring plot to show presence-absence of genotypes and allelic co-occurrence

Description

This function requires the ggtree package. RGBA (RGB plus alpha channel) colours are used by default.

It is important to note that random effects are always unobservable in linear mixed models (LMMs); however, we can deduce the posterior distribution of the coefficients of random effects. Therefore, the significance calls from our Bayesian chi-square test do not refer to the direction of the correlation between a principal component (PC) and the response variable. As a result, some clades that lack the response allele (Y) may be shaded in the picture when the corresponding PCs contribute negatively to the observed responses. This is not an error, because the analysis of structural random effects only concerns whether a projection vector (along a PC) contributes to the presence-absence of the response allele.

The rings are filled outwards from the tree. For example, if the genotype of interest is defined as c("a1", "a2", "a3"), the innermost ring represents the presence/absence of allele a1 and the outermost ring represents the third allele, a3.
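The ring order described above can be sketched as follows. This is only an illustration with placeholder allele names, and it assumes genotype.cluster = FALSE, because clustering may reorder the inner rings.

```r
# Hypothetical genotype of interest; the allele names are placeholders.
genotype <- c("a1", "a2", "a3")

# Ring 1 is the innermost ring (closest to the tree); the last ring is the outermost.
ring.order <- data.frame(ring = seq_along(genotype),
                         allele = genotype,
                         stringsAsFactors = FALSE)
print(ring.order)
```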

Usage

ringPlotPAM(
  pam,
  genotypes,
  tree,
  y = NULL,
  y.pat = NULL,
  struc.eff = NULL,
  clade.cor = NULL,
  clade.sizes = NULL,
  struc.pmax = 0.05,
  struc.nmax = 10,
  genotype.cluster = TRUE,
  genotype.dist = "binary",
  cluster.method = "single",
  x.colours = "grey50",
  co.colours = "red",
  null.colour = "grey90",
  y.colour = "grey10",
  highlight.tips = NULL,
  highlight.tip.colour = "red",
  highlight.tip.shape = 16,
  highlight.tip.size = 1,
  highlight.tip.alpha = 0.75,
  clade.colours = rainbow(10),
  output = "ringPlot.png",
  res = 72,
  width = 1600,
  height = 1600,
  unit = "px",
  htmap.width = 0.5,
  offset = -0.001,
  branch.width = 0.25,
  font.size = 2,
  print.colnames = TRUE,
  show.legend = FALSE
)

Arguments

pam

A presence/absence matrix (PAM) of genotypes (either alleles or genes)

genotypes

A named or un-named list of character vectors. Each vector specifies genotypes (e.g. alleles) whose co-occurrence will be plotted.

tree

A phylo object (cf. the ape package) for a bifurcating tree. It can be a phylogenetic tree for samples or a neighbour-joining tree based on projections of samples.

y

(optional) Name for a single allele or gene that is considered as the response variable. Branches in the tree will be coloured for it when there are sample projections significantly correlated with it.

y.pat

(optional) Pattern ID of the y allele or gene. It can be retrieved from the data frame "alle.pat" in the element "alleles" in the output of findPhysLink. This argument must be provided when a user wants to colour branches by significant structural effects.

struc.eff

(optional) The data frame "eff" in the output list of the function testForStruEff.

clade.cor

(optional) A data frame for the correlation between clades and sample projections. The output data frame of the function corCladeProj is an expected input.

clade.sizes

(optional) A named integer vector storing the size of each clade. When this parameter is set, the function copies the sizes into its output, which makes the result more informative: a user may find it easier to see which shaded clade is most correlated with which projection vector along a principal component.

struc.pmax

(optional) Maximum p-value at which a structural effect is called significant. Default: 0.05.

struc.nmax

(optional) Maximal number of significant structural effects to be plotted. Default: 10. It cannot exceed 25.

genotype.cluster

(optional) A logical value determining whether to perform hierarchical clustering of genotypes (columns of the PAM). Not applicable when the PAM has fewer than three columns. Note that, with clustering, the order of the inner rings may differ from the order specified in the genotypes list.

genotype.dist

(optional) A string specifying which distance metric is used for the clustering. Default: "binary".

cluster.method

(optional) A string specifying which clustering method is used. Default: "single".

x.colours

(optional) A single colour or a named (by allele names) colour vector for all explanatory alleles. It must not be white.

co.colours

(optional) A single or multiple colours for co-occurrence data. They must not contain white.

null.colour

(optional) A single baseline colour for absence of every allele. Default: grey90. Users may choose "white" as an alternative when co-occurrence events are relatively common.

y.colour

(optional) A single colour for the y genotype variable.

highlight.tips

(optional) A character vector of names of tips to be highlighted with coloured circles.

highlight.tip.colour

(optional) Either a single colour (an unnamed character vector of length one) or several colours (a character vector named by tip labels) for highlighted tips.

highlight.tip.shape

(optional) An integer specifying the shape of highlighted tips, which follows the standard pch argument for R plots. Default: 16.

highlight.tip.size

(optional) An integer specifying the size of highlighted tips. Default: 1.

highlight.tip.alpha

(optional) A numeric specifying the alpha of the tip symbol. Default: 0.75.

clade.colours

(optional) A vector of colours for clades (at most 10) that are most correlated with projections that significantly contribute to the response variable y.

output

(optional) Path and name for the output PNG file.

res

(optional) Resolution of the output figure. Default: 72 ppi.

width

(optional) Width of the output image.

height

(optional) Height of the output image.

unit

(optional) The unit of the width and height of the output image. Valid values: "mm" and "px" (default).

htmap.width

(optional) Passed directly to the width parameter of the gheatmap function in ggtree.

offset

(optional) Passed directly to the offset parameter of the gheatmap function in ggtree.

branch.width

(optional) Width of branches in the tree.

font.size

(optional) Size of column names printed on the heat map.

print.colnames

(optional) A logical parameter specifying whether to print column names on the heat map.

show.legend

(optional) A logical parameter specifying whether to display the legend for components that are significantly correlated with the response allele y. This is a legend for the heat map. Default: FALSE.
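To make the pam, genotypes, genotype.dist and cluster.method arguments more concrete, here is a minimal sketch in base R. It is not taken from the function's source code: the sample and allele names are hypothetical, the row-per-sample layout of the PAM is an assumption, and the co-occurrence rule (every listed allele present in a sample) is an illustrative reading of the argument descriptions.

```r
# Toy presence/absence matrix (PAM): rows are samples (assumed layout) and
# columns are alleles. All names are hypothetical.
pam <- matrix(c(1, 1, 0,
                1, 0, 0,
                1, 1, 1,
                0, 0, 1),
              nrow = 4, byrow = TRUE,
              dimnames = list(paste0("s", 1:4), c("a1", "a2", "a3")))

# Each list element names the alleles whose co-occurrence is of interest.
genotypes <- list(c1 = c("a1", "a2"), c2 = c("a1", "a2", "a3"))

# Illustrative rule: a sample shows co-occurrence when it carries every listed allele.
co <- sapply(genotypes, function(g) rowSums(pam[, g, drop = FALSE]) == length(g))

# Hierarchical clustering of the PAM columns with the default settings
# (genotype.dist = "binary", cluster.method = "single"); needs three or more columns.
alleles <- unique(unlist(genotypes))
hc <- hclust(dist(t(pam[, alleles]), method = "binary"), method = "single")
clustered.order <- hc$labels[hc$order]
```

Here co is a logical matrix with one row per sample and one column per genotype, and clustered.order lists the alleles in dendrogram order, which is why the inner rings may not follow the order given in genotypes.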

Author(s)

Yu Wan (wanyuac@126.com)

Examples

# Example 1
# Assumes that assoc is the output list of findPhysLink and tr is a phylo object.
ringPlotPAM(pam = assoc[["alleles"]][["A"]],
            genotypes = list(c1 = c("SulI_1616", "DfrA12_1089"),
                             c2 = c("SulI_1616", "DfrA12_1089", "AadA2_1605.1158")),
            tree = tr, x.colours = "grey50",
            co.colours = c("blue", "red"))

# Example 2
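y <- "CmlA5_1538"
# Look up the pattern ID of the response allele
alle.pat <- assoc[["alleles"]][["alle.pat"]]
y.pat <- alle.pat$pattern[which(alle.pat$allele == y)]
# Inspect the significant structural effects for this pattern
View(subset(assoc[["struc"]][["eff"]], y_pat == y.pat & p_adj <= 0.05))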

rp <- ringPlotPAM(pam = assoc[["alleles"]][["A"]],
                 genotypes = list(c1 = c("Arr2_274", "CmlA5_1538"),
                                  c2 = c("CTX-M-15_150", "VEB-1_1435")),
                 y = y, y.pat = y.pat, tree = assoc[["struc"]][["C"]][["tr"]],
                 struc.eff = assoc[["struc"]][["eff"]],
                 clade.cor = assoc[["struc"]][["cor"]],
                 clade.sizes = assoc[["struc"]][["clades"]][["size"]],
                 struc.pmax = 0.05, struc.nmax = 20,
                 genotype.dist = "binary", cluster.method = "single",
                 x.colours = "grey50", null.colour = "gray90",
                 co.colours = c("#d73027", "#fc8d59"), y.colour = "grey10",
                 output = "ringPlot_arr2cmlA5Cluster_2017080203.png",
                 branch.width = 0.25, htmap.width = 0.5, width = 190, height = 190, unit = "mm",
                 res = 300, print.colnames = FALSE)

View(rp[["top"]])
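Two steps in Example 2 can be tried without real data. The sketch below is an assumption-laden illustration rather than the function's actual implementation: the eff data frame is made up (only the column names y_pat and p_adj come from the example above, and pc is hypothetical), ranking by p_adj before applying struc.nmax is a guess, and mm2px is a hypothetical helper that applies the usual relation pixels = mm / 25.4 * ppi.

```r
# Made-up stand-in for assoc[["struc"]][["eff"]].
eff <- data.frame(y_pat = c(3, 3, 3, 5),
                  pc = c("PC1", "PC2", "PC7", "PC1"),
                  p_adj = c(0.001, 0.20, 0.03, 0.01),
                  stringsAsFactors = FALSE)
y.pat <- 3
struc.pmax <- 0.05
struc.nmax <- 10

# Keep significant effects for the response pattern, smallest p-values first
# (assumed ranking), and cap the number of rows at struc.nmax.
sig <- subset(eff, y_pat == y.pat & p_adj <= struc.pmax)
sig <- head(sig[order(sig$p_adj), ], struc.nmax)
print(sig)

# Hypothetical helper: convert a size in millimetres to pixels at a given
# resolution (pixels per inch); one inch is 25.4 mm.
mm2px <- function(mm, res) round(mm / 25.4 * res)
mm2px(190, 300)  # width = 190 mm at res = 300 ppi gives 2244 pixels
```

This gives a rough idea of the pixel dimensions to expect when unit = "mm" is combined with a high res value.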


wanyuac/GeneMates documentation built on Aug. 12, 2022, 7:37 a.m.