An implementation of Circos plots for epidemiologists
Full documentation is available on the EpiCircos website.
If you use the package, please use this citation. Please also cite Circlize and, if using the legend functions, ComplexHeatmap; both citations can be found here.
Install directly from GitHub using the following code:
# Install devtools
install.packages("devtools")
library(devtools)
# Install EpiCircos directly from GitHub
devtools::install_github("mattlee821/EpiCircos")
library(EpiCircos)
You may be unable to install the package because of an issue installing ComplexHeatmap. An example error:
Skipping 1 packages not available: ComplexHeatmap
Installing 12 packages: circlize, ComplexHeatmap, digest, dplyr, ellipsis, GlobalOptions, pillar, Rcpp, rlang, shape, tibble, vctrs
Error: (converted from warning) package ‘ComplexHeatmap’ is not available (for R version 3.5.3)
To fix this, install ComplexHeatmap first and then EpiCircos. Install ComplexHeatmap as follows:
# Install devtools
install.packages("devtools")
library(devtools)
# Install ComplexHeatmap directly from Bioconductor
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("ComplexHeatmap")
# Install EpiCircos directly from GitHub
devtools::install_github("mattlee821/EpiCircos")
library(EpiCircos)
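Before installing, it can help to see which prerequisites are already present. The following is a minimal sketch using only base R; the package names are taken from the steps above, and the variable names are illustrative rather than part of EpiCircos.

```r
# Packages named in the installation steps above
needed <- c("devtools", "BiocManager", "ComplexHeatmap", "circlize")

# requireNamespace() returns TRUE if a package can be loaded, without attaching it
installed <- vapply(needed, requireNamespace, logical(1), quietly = TRUE)

# Names of the packages that still need installing (may be empty)
missing_pkgs <- needed[!installed]
print(missing_pkgs)

# The example error mentions the R version, so it is worth checking yours too
print(getRversion())
```

If ComplexHeatmap appears in the printed list, install it from Bioconductor first, as described above.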
Epidemiologists using large, complex data have a limited choice of visualisation tools. Circos plots provide an informative way of examining large, complex data, but are traditionally used in genomics and are not easily adapted for epidemiology. In genomics, Circos plots provide an efficient way of visually inspecting and comparing data and results.
EpiCircos simplifies the Circlize package for use with epidemiological data. Data can be displayed in a number of ways.
Circos plots can be created by calling circos_plot(). You can plot up to three tracks using the track_number argument. The plot is limited to three tracks for readability.
Example data is stored in the package; you can load it into your environment with data <- EpiCircos_data. The data can be used directly in the circos_plot() function by assigning it to track1_data = EpiCircos_data. This is simulated data based on a Mendelian randomization analysis of body mass index on 123 metabolites. It has 123 rows (outcomes) and 10 columns (variables), as shown in the output below. The variables include betas, standard errors and p-values. For more information, see ?EpiCircos::EpiCircos_data. NOTE: The example data shows the ideal format for your own data frame for use with EpiCircos.
head(EpiCircos::EpiCircos_data)
label outcome_group outcome_subgroup effect_estimate standard_error
1 IBJCX2116O A Section label 1 -0.036797778 -0.012265926
2 VVUHQ0448G B Section label 2 -0.009396427 -0.003132142
3 XAXHR5573J C Section label 3 -0.128358126 -0.042786042
4 AOATO7677O D Section label 4 -0.122093654 -0.040697885
5 OMMTE5780R E Section label 5 0.095617488 0.031872496
6 VFSCU5692N F Section label 6 0.017775970 0.005925323
Pvalue lower_confidence_interval upper_confidence_interval bars
1 0.017387080 -0.012756563 -0.06083899 9.547940
2 0.006434143 -0.003257428 -0.01553543 8.661527
3 0.043749939 -0.044497484 -0.21221877 7.245723
4 0.056193359 -0.042325800 -0.20186151 9.627461
5 0.065877771 0.033147396 0.15808758 7.943525
6 0.094817010 0.006162336 0.02938960 7.968774
lines
1 77.39050
2 112.82122
3 89.58869
4 67.29705
5 90.07912
6 88.89458
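To prepare your own results in the same layout, a data frame along the following lines may work. This is a sketch under stated assumptions: the column names are copied from the output above, the values are simulated, and the p-values and confidence intervals are derived with a Wald approximation, which may not match how your own analysis produced them.

```r
set.seed(1)  # reproducible simulated values
n <- 6       # number of outcomes (rows)

# Core columns, named as in the example data
my_data <- data.frame(
  label            = paste0("metabolite_", seq_len(n)),
  outcome_group    = rep(c("A", "B", "C"), each = 2),
  outcome_subgroup = rep(paste("Section label", 1:3), each = 2),
  effect_estimate  = rnorm(n, mean = 0, sd = 0.1),
  standard_error   = runif(n, min = 0.01, max = 0.05),
  stringsAsFactors = FALSE
)

# Two-sided p-value from a Wald z-statistic (an assumption about your analysis)
z <- my_data$effect_estimate / my_data$standard_error
my_data$Pvalue <- 2 * pnorm(-abs(z))

# 95% confidence interval: estimate +/- qnorm(0.975) (about 1.96) standard errors
half_width <- qnorm(0.975) * my_data$standard_error
my_data$lower_confidence_interval <- my_data$effect_estimate - half_width
my_data$upper_confidence_interval <- my_data$effect_estimate + half_width

str(my_data)
```

Keeping the columns in this order means the column numbers passed to circos_plot() can be read straight off the data frame.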
The simplest Circos plot to make is with 1 track.
circos_plot(track_number = 1, # how many tracks do you want to plot
            track1_data = EpiCircos::EpiCircos_data, # the data frame for your first track
            track1_type = "points", # how you want to plot your first track
            label_column = 1, # the column of your labels
            section_column = 2, # the column of your sections
            estimate_column = 4, # the column of your estimate (beta, OR etc.)
            pvalue_column = 6, # the column of your p-value
            pvalue_adjustment = 0.05, # your p-value adjustment (multiple testing threshold)
            lower_ci = 7, # the column of your lower confidence interval
            upper_ci = 8) # the column of your upper confidence interval
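Because the column arguments are positions, it is easy to point one at the wrong column. A minimal sketch of looking the positions up by name is shown here, using the column names printed by head() above; with your own data you would replace cols with names(your_data). The Bonferroni value at the end is only one possible choice for pvalue_adjustment and is an assumption, not a package default.

```r
# Column names as printed by head(EpiCircos::EpiCircos_data)
cols <- c("label", "outcome_group", "outcome_subgroup", "effect_estimate",
          "standard_error", "Pvalue", "lower_confidence_interval",
          "upper_confidence_interval", "bars", "lines")

# match() returns the position of each name in cols, or NA if it is absent
idx <- match(c("label", "outcome_group", "effect_estimate", "Pvalue",
               "lower_confidence_interval", "upper_confidence_interval"),
             cols)
names(idx) <- c("label_column", "section_column", "estimate_column",
                "pvalue_column", "lower_ci", "upper_ci")

# Fail early if any expected column is missing
stopifnot(!anyNA(idx))
print(idx)

# One possible multiple testing threshold: Bonferroni across 123 outcomes
n_tests <- 123
bonferroni <- 0.05 / n_tests
print(bonferroni)
```

The values in idx can then be passed to the matching arguments, for example pvalue_column = idx[["pvalue_column"]].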
You can have multiple tracks, each with a different style. Track styles can be: "points", "lines", "bar", "histogram".
circos_plot(track_number = 3,
            track1_data = EpiCircos::EpiCircos_data,
            track2_data = EpiCircos::EpiCircos_data,
            track3_data = EpiCircos::EpiCircos_data,
            track1_type = "points",
            track2_type = "lines",
            track3_type = "bar",
            label_column = 1,
            section_column = 2,
            estimate_column = 4,
            pvalue_column = 6,
            pvalue_adjustment = 0.05,
            lower_ci = 7,
            upper_ci = 8,
            lines_column = 10,
            lines_type = "o",
            bar_column = 9,
            histogram_column = 4,
            histogram_binsize = 0.01,
            histogram_densityplot = FALSE)
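When setting up several tracks, a quick check of the chosen styles before plotting can catch typos early. The helper below is hypothetical and not part of EpiCircos; it only encodes the four styles and the three-track limit described above.

```r
# Track styles listed in this README
allowed_types <- c("points", "lines", "bar", "histogram")

# Hypothetical helper: stop with a message if the track setup looks wrong
check_tracks <- function(types) {
  if (length(types) < 1 || length(types) > 3) {
    stop("Use between one and three tracks")
  }
  bad <- setdiff(types, allowed_types)
  if (length(bad) > 0) {
    stop("Unknown track type: ", paste(bad, collapse = ", "))
  }
  invisible(TRUE)
}

# Passes silently for the three-track example
check_tracks(c("points", "lines", "bar"))
```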
The legend function is taken from ComplexHeatmap. It will place a legend at the bottom of the plot. The legend will be populated with: points coloured for each track and a label for each track, a point for the p-value label, and section headers.
circos_plot(track_number = 3,
            track1_data = EpiCircos::EpiCircos_data,
            track2_data = EpiCircos::EpiCircos_data,
            track3_data = EpiCircos::EpiCircos_data,
            track1_type = "points",
            track2_type = "lines",
            track3_type = "bar",
            label_column = 1,
            section_column = 2,
            estimate_column = 4,
            pvalue_column = 6,
            pvalue_adjustment = 0.05,
            lower_ci = 7,
            upper_ci = 8,
            lines_column = 10,
            lines_type = "o",
            bar_column = 9,
            legend = TRUE,
            track1_label = "Track 1",
            track2_label = "Track 2",
            track3_label = "Track 3",
            pvalue_label = "<= 0.05",
            circle_size = 25)
For best results, save your plot as PDF or SVG. Both can be converted to other image formats. The following code can be used to save as PDF; for SVG, use the svg() device in place of pdf(). Adjust the width and height arguments to get the correct sizing for your plot and then adjust the pointsize argument. The following values work for most plots:
pdf("my_plot.pdf",
width = 30, height = 30, pointsize = 35)
circos_plot(...)
dev.off()
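The same pattern can be wrapped in a small helper that picks the device from the file extension and always closes it, even if the plotting call fails. This is a sketch using base R graphics devices; save_vector_plot() is a hypothetical name, and the plot(1:10) call is only a stand-in for your own circos_plot(...) call. The svg() device needs an R build with cairo support.

```r
# Hypothetical helper: open a PDF or SVG device, draw, then close the device
save_vector_plot <- function(file, draw, width = 30, height = 30, pointsize = 35) {
  ext <- tolower(tools::file_ext(file))
  if (ext == "pdf") {
    grDevices::pdf(file, width = width, height = height, pointsize = pointsize)
  } else if (ext == "svg") {
    grDevices::svg(file, width = width, height = height, pointsize = pointsize)
  } else {
    stop("Use a .pdf or .svg file name")
  }
  # Registered only after a device has opened, so it closes that device
  on.exit(grDevices::dev.off(), add = TRUE)
  draw()
  invisible(file)
}

# Stand-in drawing; replace the function body with your circos_plot(...) call
out <- save_vector_plot(file.path(tempdir(), "my_plot.pdf"),
                        function() plot(1:10))
```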
If you only use the RStudio plots panel, you will not see the plot as it will appear when saved. Similarly, saving as anything other than PDF will not give a good visualisation.
[Figure: plot saved as a PNG file]
sessionInfo()
## R version 3.6.2 (2019-12-12)
## Platform: x86_64-apple-darwin15.6.0 (64-bit)
## Running under: macOS Mojave 10.14.6
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib
##
## locale:
## [1] en_GB.UTF-8/en_GB.UTF-8/en_GB.UTF-8/C/en_GB.UTF-8/en_GB.UTF-8
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## loaded via a namespace (and not attached):
## [1] compiler_3.6.2 magrittr_1.5 tools_3.6.2 htmltools_0.4.0
## [5] yaml_2.2.0 Rcpp_1.0.3 stringi_1.4.3 rmarkdown_2.0
## [9] knitr_1.26 stringr_1.4.0 xfun_0.11 digest_0.6.23
## [13] rlang_0.4.2 evaluate_0.14