dotsViolin. Dot Plots Mimicking Violin Plots

Modifies dot plots to have different sizes of dots mimicking violin plots and identifies modes or peaks for them (Rosenblatt, 1956; Parzen, 1962).
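The peak idea behind the kernel density references above can be sketched in base R. This is a minimal illustration under stated assumptions, not the code used inside dotsViolin: the helper name `find_density_peaks` and the toy data are hypothetical, and only `stats::density()` is relied upon.

```r
# Illustrative only: locate peaks (local maxima) of a kernel density estimate.
# find_density_peaks() is a hypothetical helper, not a dotsViolin function.
find_density_peaks <- function(x, bw = "nrd0", n = 512) {
  d <- stats::density(x, bw = bw, n = n)
  # A grid point is a peak where the slope turns from rising to falling.
  slope_sign <- sign(diff(d$y))
  peak_idx <- which(diff(slope_sign) < 0) + 1
  data.frame(x = d$x[peak_idx], density = d$y[peak_idx])
}

# Toy bimodal sample with components centred at 0 and 5.
set.seed(1)
toy <- c(rnorm(200, mean = 0, sd = 0.5), rnorm(200, mean = 5, sd = 0.5))
peaks <- find_density_peaks(toy)
peaks[order(-peaks$density), ]  # highest peaks first
```

The number of peaks found depends strongly on the bandwidth `bw`: a smaller value tends to report more, noisier peaks.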

dotsViolin, an R package (R Core Team, 2023), uses gridExtra (Auguie, 2017), gtools (Bolker et al., 2022), tidyr (Wickham et al., 2023c), stringr (Wickham, 2022), dplyr (Wickham et al., 2023b), ggplot2 (Wickham et al., 2023a), lazyeval (Wickham, 2019), magrittr (Bache and Wickham, 2022), rlang (Henry and Wickham, 2023), scales (Wickham and Seidel, 2022), and tidyselect (Henry and Wickham, 2022).

Documentation was written with the R packages roxygen2 (Wickham et al., 2022), knitr (Xie, 2023), and rmarkdown (Allaire et al., 2023).

A related academic presentation is available (Roa-Ovalle, 2019).

Installation

devtools::install_gitlab(repo = "ferroao/dotsViolin")

Citation

To cite package ‘dotsViolin’ in publications use:

Roa-Ovalle F, Telles M (2023). dotsViolin: Integrated tables in dot and violin R ggplots. R package version 0.0.1, https://gitlab.com/ferroao/dotsViolin.

To write the citation to a file:

sink("dotsViolin.bib")
toBibtex(citation("dotsViolin"))
sink()
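An alternative that avoids redirecting console output is to pass the BibTeX text straight to `writeLines()`. The sketch below is an assumption-laden illustration: it cites R itself (`citation()` with no argument) and writes to a temporary file, so that it runs without dotsViolin installed. Replace the argument with `"dotsViolin"` and the path with your own.

```r
# Illustrative only: write a BibTeX entry without using sink().
# citation() with no argument cites R itself; the file name is a placeholder.
bib_file <- file.path(tempdir(), "example.bib")
writeLines(utils::toBibtex(utils::citation()), bib_file)
readLines(bib_file, n = 1)  # peek at the first line of the written entry
```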

Authors

Fernando Roa, Mariana PC Telles

Plot window

Define your plotting window size with something like par(pin=c(10,6)), or with svg(), png(), etc.
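As a minimal sketch of the file-device route, the snippet below opens a 10 x 6 inch device, draws, and closes it. It uses `pdf()` because that device is available in every R build; the same open, draw, `dev.off()` pattern applies to `png()` or `svg()`, whose size arguments use different units. The file name and the placeholder `plot()` call are assumptions standing in for your own output path and plotting call.

```r
# Illustrative only: fix the plot size by drawing into a file device.
out_file <- file.path(tempdir(), "example_plot.pdf")  # placeholder path
grDevices::pdf(out_file, width = 10, height = 6)      # size in inches
plot(1:10, main = "placeholder plot")                 # stand-in for the real plot
grDevices::dev.off()                                  # flush and close the file
file.exists(out_file)
```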

In VSCode, you could use settings like these:

{
  "r.plot.useHttpgd": false,
  "r.plot.devArgs": {
    "width": 800,
    "height": 600
  }
}

Examples

1 Discrete Data:

library(dotsViolin)

fabaceae_mode_counts <- get_modes_counts(fabaceae_clade_n_df, "clade", "parsed_n")
fabaceae_mode_counts

| clade | m1 | m2 | m3 | count |
|:----------------------|:----|:-----|:----|:------|
| Caesalpinieae | 12 | NA | NA | 29 |
| Cassieae | 14 | 8 | 12 | 64 |
| Cercidoideae | 14 | 7 | NA | 33 |
| Detarioideae | 12 | 8,17 | NA | 50 |
| Dialioideae | 14 | NA | NA | 6 |
| Dimorphandra and rel. | 14 | 13 | NA | 16 |
| Mimosoids | 13 | 26 | 14 | 221 |
| outgroup | 8 | 12 | 11 | 145 |
| Papilionoideae | 8 | 11 | 7 | 1410 |
| Umtiza and rel. | 14 | NA | NA | 7 |
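To make the shape of such a table concrete, here is a rough base R sketch that ranks the most frequent values of a discrete variable within each group. It is not the implementation of `get_modes_counts()`: the helper `top_values_by_group()` and the toy data are hypothetical, and unlike the table above it does not merge tied values into one cell.

```r
# Illustrative only: most frequent values per group, padded with NA.
# top_values_by_group() is a hypothetical helper, not a dotsViolin function.
top_values_by_group <- function(df, group_col, value_col, k = 3) {
  groups <- split(df[[value_col]], df[[group_col]])
  rows <- lapply(names(groups), function(g) {
    freq <- sort(table(groups[[g]]), decreasing = TRUE)
    top <- names(freq)[seq_len(min(k, length(freq)))]
    length(top) <- k  # pad with NA when fewer than k distinct values exist
    data.frame(
      group = g,
      t(setNames(top, paste0("m", seq_len(k)))),
      count = length(groups[[g]])
    )
  })
  do.call(rbind, rows)
}

# Invented toy data: group A has three distinct values, group B only one.
toy <- data.frame(
  clade = c(rep("A", 6), rep("B", 4)),
  n = c(12, 12, 12, 8, 8, 7, 14, 14, 14, 14)
)
res <- top_values_by_group(toy, "clade", "n")
res
```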

library(dotsViolin)

fabaceae_clade_n_df_count <- make_legend_with_stats(fabaceae_mode_counts, "label_count", 1, TRUE)
fabaceae_clade_n_df$label_count <- fabaceae_clade_n_df_count$label_count[match(
  fabaceae_clade_n_df$clade,
  fabaceae_clade_n_df_count$clade
)]
desiredorder1 <- unique(fabaceae_clade_n_df$clade)
head(fabaceae_clade_n_df)
                        tip.label          clade parsed_n
1     KX374504_Abarema_centiflora      Mimosoids       13
2   KX213142_Adenodolichos_bussei Papilionoideae       11
3      KX792912_Almaleea_cambagei Papilionoideae        8
4 KP109982_Amphithalea_cymbifolia Papilionoideae        9
5 KP230727_Argyrolobium_tuberosum Papilionoideae       13
6        GU220019_Ateleia_arsenii Papilionoideae       14
                              label_count
1 Mimosoids             13   26 14  (221)
2 Papilionoideae         8   11  7 (1410)
3 Papilionoideae         8   11  7 (1410)
4 Papilionoideae         8   11  7 (1410)
5 Papilionoideae         8   11  7 (1410)
6 Papilionoideae         8   11  7 (1410)
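The `match()` step used above, which copies one label per clade onto every row of the larger data frame, can be seen in isolation with toy data. All names and values in this sketch are invented for illustration.

```r
# Illustrative only: copy a per-group label onto every row of a longer table.
per_group <- data.frame(
  clade = c("A", "B"),
  label = c("A  12  8 (6)", "B  14 (4)")
)
rows <- data.frame(clade = c("B", "A", "A", "C"), value = c(14, 12, 8, 9))

# match() gives, for each row, the position of its clade in per_group
# (NA when the clade is absent); that position then indexes the label column.
idx <- match(rows$clade, per_group$clade)
rows$label <- per_group$label[idx]
rows
```

Rows whose group has no entry in the lookup table receive `NA`, so it is worth checking for missing labels before plotting.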
par(mar = c(0, 0, 0, 0), omi = rep(0, 4))

dots_and_violin(
  fabaceae_clade_n_df, "clade", "label_count", "parsed_n", 2,
  30, "Chromosome haploid number", desiredorder1, 1, .85, 4,
  "ownwork",
  violin = FALSE
)

par(mar = c(0, 0, 0, 0), omi = rep(0, 4))

dots_and_violin(
  fabaceae_clade_n_df, "clade", "label_count", "parsed_n", 2,
  30, "Chromosome haploid number", desiredorder1, 1, .85, 4,
  dots = FALSE
)

par(mar = c(0, 0, 0, 0), omi = rep(0, 4))

dots_and_violin(
  fabaceae_clade_n_df, "clade", "label_count", "parsed_n", 2,
  30, "Chromosome haploid number", desiredorder1, 1, .85, 4
)

2 Continuous Data:

library(dotsViolin)

fabaceae_Cx_peak_counts_per_clade_df <- get_peaks_counts_continuous(
  fabaceae_clade_1Cx_df,
  "clade", "Cx", 2, 0.25, 1, 2
)
fabaceae_Cx_peak_counts_per_clade_df

| | clade | m1 | m2 | counts |
|:--------------------|:--------------------|:---------------|:----------|-------:|
| Caesalpinieae | Caesalpinieae | 0.85,1.80 | | 2 |
| Cassieae | Cassieae | 0.69 | 0.52,0.56 | 6 |
| Cercidoideae | Cercidoideae | 0.60 | | 5 |
| COM clade | COM clade | 0.35,0.50,0.83 | | 3 |
| Detarioideae | Detarioideae | 2.21 | 0.84,2.01 | 4 |
| Dimorphandra & rel. | Dimorphandra & rel. | 0.73,0.79 | | 2 |
| Malvids | Malvids | 0.40 | 0.63 | 8 |
| Mimosoids | Mimosoids | 0.70 | 0.43 | 42 |
| outgroups | outgroups | 0.48 | 1.38,2.76 | 9 |
| Papilionoideae | Papilionoideae | 0.59 | | 212 |
| Polygala amara | Polygala amara | 0.42 | | 1 |
| Umtiza & rel. | Umtiza & rel. | 0.65,1.05 | | 2 |
| Vitis vinifera | Vitis vinifera | 0.43 | | 1 |

library(dotsViolin)

namecol <- "labelcountcustom"
fabaceae_clade_1Cx_modes_count_df <- make_legend_with_stats(
  fabaceae_Cx_peak_counts_per_clade_df,
  namecol, 1, TRUE
)
fabaceae_clade_1Cx_df$labelcountcustom <-
  fabaceae_clade_1Cx_modes_count_df$labelcountcustom[match(
    fabaceae_clade_1Cx_df$clade,
    fabaceae_clade_1Cx_modes_count_df$clade
  )]
desiredorder <- unique(fabaceae_clade_1Cx_df$clade)
head(fabaceae_clade_1Cx_df)
                              name     clade     Cx      genus ownwork
6      'Silene_latifolia_JF715055' outgroups 2.7000     Silene      no
7  'Fagopyrum_esculentum_NC010776' outgroups 1.4350  Fagopyrum      no
11    'Helianthus_annuus_NC007977' outgroups 2.4250 Helianthus      no
12        'Daucus_carota_NC008325' outgroups 2.8375     Daucus      no
14        'Olea_europaea_NC013707' outgroups 1.9500       Olea      no
18       'Coffea_arabica_NC008535' outgroups 0.6000     Coffea      no
                                      labelcountcustom
6  outgroups                     0.48 1.38,2.76    (9)
7  outgroups                     0.48 1.38,2.76    (9)
11 outgroups                     0.48 1.38,2.76    (9)
12 outgroups                     0.48 1.38,2.76    (9)
14 outgroups                     0.48 1.38,2.76    (9)
18 outgroups                     0.48 1.38,2.76    (9)
par(mar = c(0, 0, 0, 0), omi = rep(0, 4))

dots_and_violin(
  fabaceae_clade_1Cx_df, "clade", "labelcountcustom", "Cx", 3,
  3, "Genome Size", desiredorder, 0.03, 0.25, 2,
  "ownwork"
)

par(mar = c(0, 0, 0, 0), omi = rep(0, 4))

dots_and_violin(
  fabaceae_clade_1Cx_df, "clade", "labelcountcustom", "Cx", 3,
  3, "Genome Size", desiredorder, 0.03, 0.25, 2,
  dots = FALSE
)

par(mar = c(0, 0, 0, 0), omi = rep(0, 4))

dots_and_violin(
  fabaceae_clade_1Cx_df, "clade", "labelcountcustom", "Cx", 3,
  3, "Genome Size", desiredorder, 0.03, 0.25, 2,
  "ownwork",
  violin = FALSE
)

References

R-packages

Allaire J, Xie Y, Dervieux C, McPherson J, Luraschi J, Ushey K, Atkins A, Wickham H, Cheng J, Chang W, Iannone R. 2023. *rmarkdown: Dynamic documents for R*. R package version 2.24.
Auguie B. 2017. *gridExtra: Miscellaneous functions for "grid" graphics*. R package version 2.3.
Bache SM, Wickham H. 2022. *magrittr: A forward-pipe operator for R*. R package version 2.0.3.
Bolker B, Warnes GR, Lumley T. 2022. *gtools: Various R programming tools*. R package version 3.9.4.
Henry L, Wickham H. 2022. *tidyselect: Select from a set of strings*. R package version 1.2.0.
Henry L, Wickham H. 2023. *rlang: Functions for base types and core R and tidyverse features*. R package version 1.1.1.
R Core Team. 2023. *R: A language and environment for statistical computing*. R Foundation for Statistical Computing, Vienna, Austria.
Wickham H. 2019. *lazyeval: Lazy (non-standard) evaluation*. R package version 0.2.2.
Wickham H. 2022. *stringr: Simple, consistent wrappers for common string operations*. R package version 1.5.0.
Wickham H, Chang W, Henry L, Pedersen TL, Takahashi K, Wilke C, Woo K, Yutani H, Dunnington D. 2023a. *ggplot2: Create elegant data visualisations using the grammar of graphics*. R package version 3.4.4.
Wickham H, Danenberg P, Csárdi G, Eugster M. 2022. *roxygen2: In-line documentation for R*. R package version 7.2.3.
Wickham H, François R, Henry L, Müller K, Vaughan D. 2023b. *dplyr: A grammar of data manipulation*. R package version 1.1.3.
Wickham H, Seidel D. 2022. *scales: Scale functions for visualization*. R package version 1.2.1.
Wickham H, Vaughan D, Girlich M. 2023c. *tidyr: Tidy messy data*. R package version 1.3.0.
Xie Y. 2023. *knitr: A general-purpose package for dynamic report generation in R*. R package version 1.43.

Academia

Parzen E. 1962. On estimation of a probability density function and mode. *The Annals of Mathematical Statistics*, 33: 1065–1076.
Roa-Ovalle F. 2019. Poliploidia e duplicação genômica nas leguminosas brasileiras. In: Rocha LL da (ed) Sociedade Botânica do Brasil.
Rosenblatt M. 1956. Remarks on some nonparametric estimates of a density function. *The Annals of Mathematical Statistics*, 27: 832–837.


dotsViolin documentation built on Nov. 2, 2023, 6:09 p.m.