View source: R/plotting_functions.R
plot_manhattan | R Documentation |
Creates a ggplot-based manhattan plot, where chromosomes/scaffolds/etc are concatenated along the x-axis. Can optionally highlight requested SNPs or those that pass an arbitrary significance threshold and facet plots by defined sample-specific variables such as population.
plot_manhattan(
x,
plot_var,
window = FALSE,
facets = NULL,
chr = "chr",
bp = "position",
snp = NULL,
color_var = NULL,
vlines = FALSE,
vline_width = 0.25,
median_line = FALSE,
chr.subfacet = NULL,
sample.subfacet = NULL,
significant = NULL,
suggestive = NULL,
highlight = "significant",
highlight_style = "label",
sig_below = FALSE,
log.p = FALSE,
abs = FALSE,
viridis.option = "plasma",
viridis.hue = c(0.2, 0.5),
t.sizes = c(16, 12, 10),
colors = c("black", "slategray3"),
rug_data = NULL,
rug_style = "point",
rug_label = NULL,
rug_alpha = 0.3,
rug_thickness = ifelse(rug_style == "point", 0.03, 6),
lambda_gc_correction = FALSE,
chr_order = NULL,
abbreviate_labels = FALSE,
simplify_output = FALSE
)
x |
snpRdata or data.frame object containing the data to be plotted. |
plot_var |
character. A character string naming the statistic to be plotted. For snpRdata, these names correspond to any previously calculated statistics. |
window |
logical, default FALSE. If TRUE, sliding window averages will instead be plotted. These averages must have first been calculated with calc_smoothed_averages. Ignored if x is a data.frame. |
facets |
character or NULL, default NULL. Facets by which to break
plots, as described in Facets_in_snpR. |
chr |
character, default "chr". Column in either snp metadata or x (for snpRdata or data.frame objects, respectively) which defines the "chromosome" by which SNP positions will be concatenated along the x-axis. If window = TRUE and x is a snpRdata object, this will be ignored in favor of the SNP-specific facet provided to the facets argument. |
bp |
character, default "position". Column in either snp metadata or x (for snpRdata or data.frame objects, respectively) which defines the position in bp of each SNP. |
snp |
character, default NULL. Column in either snp metadata or x (for snpRdata or data.frame objects, respectively) containing snpIDs to use for highlighting. Ignored if no highlighting is requested. |
color_var |
character, default NULL. If provided, a column by which
to color each point. If used, chromosomes will not be colored, and the
|
vlines |
character (color) or FALSE, default FALSE. If a color, vertical
separator lines will be drawn between each chromosome. Widths controlled
by vline_width. |
vline_width |
numeric, default 0.25. Width of chromosome separator lines.
Ignored if vlines is FALSE. |
median_line |
character (color) or FALSE, default FALSE. If a color, a
horizontal line will be plotted at the |
chr.subfacet |
character, default NULL. Specific chromosomes to plot. See examples. |
sample.subfacet |
character, default NULL. Specific sample-specific levels of the provided facet to plot. If x is a data.frame, this can refer to levels of a column titled "subfacet". See examples. |
significant |
numeric, default NULL. Value at which a line will be drawn designating significant SNPs. If highlight = "significant", SNPs above this level will also be labeled. |
suggestive |
numeric, default NULL. Value at which a line will be drawn designating suggestive SNPs. If highlight = "suggestive", SNPs above this level will also be labeled. |
highlight |
character, numeric, or FALSE, default "significant". Controls SNP highlighting. If either "significant" or "suggestive", SNPs above those respective values will be highlighted. If a numeric vector, SNPs corresponding to vector entries will be highlighted. See details. |
highlight_style |
character, default "label". Highlighting options:
|
sig_below |
logical, default FALSE. If TRUE, treats values lower than the significance threshold as significant. |
log.p |
logical, default FALSE. If TRUE, plot variables and thresholds will be transformed to -log. |
abs |
logical, default FALSE. If TRUE, converts the plot variable to its absolute value. |
viridis.option |
character, default "plasma". Viridis color scale option
to use for significance lines and SNP labels. See
|
viridis.hue |
numeric, default c(0.2, 0.5). Two values between 0 and 1 listing the hues at which to start and stop on the viridis palette defined by the viridis.option argument. Lower numbers are darker. |
t.sizes |
numeric, default c(16, 12, 10). Text sizes, given as c(strip.title, axis, axis.ticks). |
colors |
character, default c("black", "slategray3"). Colors to alternate across chromosomes. |
rug_data |
data.frame or tbl, default NULL. Data to plot as a rug below
the manhattan plot containing columns named to match the |
rug_style |
character, default "point". Options for the style of the
rug, ignored if
|
rug_label |
character, default NULL. Names of additional labeling
columns in |
rug_alpha |
numeric between 0 and 1, default 0.3. Alpha (transparency)
applied to a ribbon-style rug. Ignored if |
rug_thickness |
numeric, default .03 for point style and 6 for ribbon
style. The height of the rug lines (if |
lambda_gc_correction |
Correct for inflated significance due to
population and/or family structure using the |
chr_order |
character, default NULL. If provided, an ordered vector of chromosome/scaffold/etc names by which to sort output. |
abbreviate_labels |
numeric or FALSE, default FALSE. If a numeric value,
x-axis chromosome names will be abbreviated using
|
simplify_output |
If TRUE, only the ggplot object will be returned. This is optimal, since the data is already contained in that object, but is not the default in order to stay backwards compatible with old code. |
Unlike most snpR functions, this function works with either a snpRdata object
or a data.frame. For snpRdata objects, snp-specific or sliding window
statistics can be plotted. In both cases, the facets argument can be used to
define facets to plot, as described in Facets_in_snpR. For
typical stats, the name of the snp meta-data column containing
chromosome/scaffold information must be supplied to the "chr" argument. For
windowed stats, chr is instead inferred from the snp-specific facet used to
create the smoothed windows. In both cases, the requested facets must exactly
match those used to calculate statistics! If x is a data.frame, the "chr"
argument must also be given, and the "facets" argument will be ignored.
A column defining the position of the SNP within the chromosome must be provided, and is "position" by default.
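To picture what concatenating chromosomes along the x-axis means, here is a minimal base R sketch. The toy data, the `cum_position` column name, and the use of the largest observed position as a stand-in for chromosome length are all assumptions made for illustration; this is not taken from the snpR source.

```r
# Hypothetical toy data using the default column names "chr" and "position".
snps <- data.frame(
  chr = c("groupIV", "groupIV", "groupIX", "groupIX"),
  position = c(100, 900, 50, 400)
)

# Approximate each chromosome's length by its largest observed position
# (an assumption for this sketch).
chr_len <- tapply(snps$position, snps$chr, max)

# Offset for each chromosome: summed length of all chromosomes before it.
offsets <- c(0, cumsum(chr_len)[-length(chr_len)])
names(offsets) <- names(chr_len)

# Concatenated x coordinate: within-chromosome position plus the offset.
snps$cum_position <- snps$position +
  unname(offsets[as.character(snps$chr)])
```

Each chromosome then occupies its own contiguous stretch of the x-axis, which is why both a chromosome column and a position column are needed.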
Specific chromosome and sample levels can also be requested using the chr.subfacet and sample.subfacet arguments, respectively. See examples. For data.frames, sample.subfacet levels must refer to a column in x titled "subfacet".
Specific snps can be highlighted and annotated. If a significance level is requested, SNPs above this level will be highlighted by default. SNPs above the suggestive line can also be highlighted by providing "suggestive" to the highlight argument. Alternatively, individual SNPs can be highlighted by providing a numeric vector. For snpR data, this will correspond to the SNP's row in the snpRdata object. For data.frames, it will correspond to a ".snp.id" column if it exists, and the row number if not. The label for highlighted SNPs will be either chr_bp by default or given in the column named by the "snp" argument.
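The interaction between the threshold, log.p, sig_below, and the default chr_bp labels can be illustrated with a small base R sketch. The toy data are hypothetical, and both the log base (10) and the strict comparisons are assumptions; this mimics the behavior described above rather than reproducing the function's internals.

```r
# Hypothetical data.frame input with a p-value column named "p".
dat <- data.frame(
  chr = c("groupIV", "groupIV", "groupIX"),
  position = c(100, 900, 50),
  p = c(0.5, 0.00001, 0.02)
)
significant <- 0.0001
log.p <- TRUE
sig_below <- FALSE

# With log.p = TRUE, the statistic and the threshold are both transformed.
# Base 10 is assumed here.
stat <- if (log.p) -log10(dat$p) else dat$p
thresh <- if (log.p) -log10(significant) else significant

# sig_below flips the direction of the comparison (strictness is assumed).
is_sig <- if (sig_below) stat < thresh else stat > thresh

# Default-style labels: chromosome and position joined by an underscore.
labels <- paste(dat$chr, dat$position, sep = "_")[is_sig]
```

Because the threshold is transformed along with the statistic, significant and suggestive are supplied on the original scale (for example 0.0001) even when log.p = TRUE.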
A list containing
plot: A ggplot manhattan plot.
data: Raw plot data.
If simplify_output is TRUE, only the ggplot object is returned.
William Hemstrom
Price, A., Zaitlen, N., Reich, D. et al. New approaches to population stratification in genome-wide association studies. Nat Rev Genet 11, 459–463 (2010). https://doi.org/10.1038/nrg2813
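For orientation, the textbook genomic-control calculation discussed in the reference above can be sketched in base R. The p-values are hypothetical, and whether lambda_gc_correction performs exactly these steps is an assumption, not something this page confirms.

```r
# Hypothetical p-values from 1-df association tests.
p <- c(0.001, 0.01, 0.04, 0.2, 0.5, 0.8)

# Convert p-values back to 1-df chi-square statistics.
chisq <- qchisq(p, df = 1, lower.tail = FALSE)

# Inflation factor: observed median over the expected median under the null.
lambda <- median(chisq) / qchisq(0.5, df = 1)

# Rescale the statistics by lambda and recompute p-values.
chisq_adj <- chisq / lambda
p_adj <- pchisq(chisq_adj, df = 1, lower.tail = FALSE)
```

In practice the rescaling is usually applied only when lambda exceeds 1, since values at or below 1 indicate no inflation to correct.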
# add a dummy phenotype and run an association test.
x <- stickSNPs[pop = c("ASP", "SMR"), chr = c("groupIX", "groupIV")]
sample.meta(x)$phenotype <- sample(c("A", "B"), nsamps(x), TRUE)
x <- calc_association(x, response = "phenotype", method = "armitage")
plot_manhattan(x, "p_armitage_phenotype", chr = "chr",
log.p = TRUE)$plot
# other types of stats:
# make some data
x <- calc_basic_snp_stats(x, "pop.chr", sigma = 200, step = 50)
# plot pi, breaking apart by population, keeping only the groupIX
# and the ASP population, with
# significant and suggestive lines plotted and SNPs
# with pi below the significance level labeled.
plot_manhattan(x, "pi", facets = "pop",
chr = "chr", chr.subfacet = "groupIX",
sample.subfacet = "ASP",
significant = 0.05, suggestive = 0.15, sig_below = TRUE)$plot
# plot FST for the ASP/SMR comparison across all chromosomes,
# labeling the first 10 SNPs in x (by row) with their ID
# Note that since this is the only comparison, we don't actually need to
# specify it.
plot_manhattan(x, "fst", facets = "pop.chr",
sample.subfacet = "ASP~SMR", highlight = 1:10,
chr = "chr", snp = ".snp.id")$plot
# plot sliding-window FST between ASP and SMR
plot_manhattan(x, "fst", window = TRUE, facets = c("pop.chr"),
chr = "chr", sample.subfacet = "ASP~SMR",
significant = .29, suggestive = .2)$plot
# plot using a data.frame,
# using log-transformed p-values
## grab data
y <- get.snpR.stats(x, "pop", stats = "hwe")$single
## plot
plot_manhattan(y, "pHWE", facets = "subfacet", chr = "chr",
significant = 0.0001, suggestive = 0.001,
log.p = TRUE, highlight = FALSE)$plot
# plot with a rug
rug_data <- data.frame(chr = c("groupIX", "groupIV"), start = c(0, 1000000),
end = c(5000000, 6000000), gene = c("A", "B"))
# point style, midpoints plotted
plot_manhattan(x, "p_armitage_phenotype", chr = "chr",
log.p = TRUE, rug_data = rug_data)
# ribbon style
plot_manhattan(x, "p_armitage_phenotype", chr = "chr",
log.p = TRUE, rug_data = rug_data, rug_style = "ribbon")
# with plotly to mouse over information
## Not run:
plotly::ggplotly(plot_manhattan(x, "p_armitage_phenotype", chr = "chr",
log.p = TRUE, rug_data = rug_data,
rug_style = "ribbon",
rug_label = "gene")$plot)
## End(Not run)