Clean annotation files (CSV or TSV) for Pseudomonas aeruginosa from

link <- paste0(


Or add some extra columns and fill empty names with the corresponding locus tag.

tr_anno_cleaner(link, extra_cols = TRUE, fill_names = TRUE)


Takes a DESeq2 results object, and returns the significant DE genes, printing a message summarizing the comparison and number of significant genes when inform = TRUE (default).

ex_deseq_results <- 
  readRDS(system.file("extdata", "ex_deseq_results.rds", package = "tRavis"))

tr_clean_deseq2_result(ex_deseq_results)

The default filters applied to the data are: padj < 0.05 and abs(log2FoldChange) > log2(1.5).
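
As a rough illustration of what those two thresholds do, here is a base R sketch on a made-up table. The data frame `toy_results` and its values are invented for the example; only the column names `padj` and `log2FoldChange` come from the filters described above. Dropping rows with a missing `padj` is an assumption of this sketch, not a documented behaviour of the function.

```r
# Toy table with the two columns the default filters refer to (values are made up)
toy_results <- data.frame(
  gene = c("geneA", "geneB", "geneC", "geneD"),
  log2FoldChange = c(1.2, -0.9, 0.3, -2.1),
  padj = c(0.001, 0.03, 0.01, 0.2),
  stringsAsFactors = FALSE
)

# A row is kept only if it passes both thresholds;
# rows with a missing padj are dropped (assumption of this sketch)
keep <- !is.na(toy_results$padj) &
  toy_results$padj < 0.05 &
  abs(toy_results$log2FoldChange) > log2(1.5)

significant <- toy_results[keep, ]
significant$gene
```

Here geneC fails the fold change cutoff and geneD fails the adjusted p value cutoff, so only geneA and geneB remain.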


Compare two lists to find the common/unique elements, with an optional names argument to apply to the results.

tr_compare_lists(c(1, 2, 3, 4), c(3, 4, 5, 6), names = c("A", "B"))
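
The comparison itself boils down to standard set operations. The sketch below reproduces the idea in base R for the same two vectors; the list element names (`common`, `unique_A`, `unique_B`) are hypothetical and may not match what `tr_compare_lists()` actually returns.

```r
a <- c(1, 2, 3, 4)
b <- c(3, 4, 5, 6)

# Elements shared by both vectors, and elements found in only one of them
comparison <- list(
  common = intersect(a, b),
  unique_A = setdiff(a, b),
  unique_B = setdiff(b, a)
)

comparison
```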


Create a named list of files, easily piped into purrr::map(~read.csv(.x)) to generate a named list of data frames. Supports recursive searching, custom string/pattern removal, and date removal assuming a format like YYYYMMDD (can't contain punctuation/symbols).

tr_get_files(
  directory = system.file("extdata", package = "tRavis"),
  pattern = "test",
  date = TRUE,
  remove_string = "test_"
)
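
For readers curious about the general approach, a named file list can be built in base R roughly as follows. This is a sketch under stated assumptions, not the package's implementation: the file names are invented, the files are created in a temporary directory, and the date is assumed to be eight digits optionally followed by an underscore.

```r
# Create a scratch directory with two empty, hypothetically named files
demo_dir <- tempfile("tr_files_demo")
dir.create(demo_dir)
file.create(file.path(demo_dir, c("test_20210818_a.csv", "test_20210818_b.csv")))

# Find matching files, keeping full paths as the list values
files <- list.files(demo_dir, pattern = "test", full.names = TRUE)

# Derive names: drop the extension, the custom string, then the YYYYMMDD date
file_names <- tools::file_path_sans_ext(basename(files))
file_names <- gsub("test_", "", file_names, fixed = TRUE)
file_names <- gsub("[0-9]{8}_?", "", file_names)

named_files <- setNames(as.list(files), file_names)
names(named_files)
```

A list like this can then be passed to `lapply(named_files, read.csv)` or `purrr::map()` to obtain a named list of data frames.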


Generate RNA-Seq QC plots from MultiQC outputs. Currently only supports summary plots for FastQC (Phred scores and read counts), STAR, and HTSeq. Plots are created with ggplot2 for simplicity. A few arguments are provided to modify the overall font size, and toggle a threshold line at a given number of reads/counts:

multiqc_data <- system.file("extdata/tr_qc_plots_data", package = "tRavis")

qc_plot_output <- tr_qc_plots(
  directory = multiqc_data,
  font_size = 12,
  threshold_line = 5e6
)


All underlying data is also returned, so one can easily generate their own plots if further refinements are required.


Sort a column of alphanumeric strings in (non-binary) numerical order given an input data frame and desired column. You can use the column name or index, and it's compatible with pipes.

my_dataframe <- data.frame(
    colA = c("a11", "a1", "b1", "a2"),
    colB = c(3, 1, 4, 2)
)

tr_sort_alphanum(input_df = my_dataframe, sort_col = "colA")
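
To show what "numerical order" means here, the following base R sketch sorts the same values so that "a2" comes before "a11". The helper `natural_order()` is hypothetical and assumes every string is a non-digit prefix followed by digits; the package function may handle more general input.

```r
demo_df <- data.frame(
  colA = c("a11", "a1", "b1", "a2"),
  colB = c(3, 1, 4, 2),
  stringsAsFactors = FALSE
)

# Hypothetical helper: order by the text prefix, then by the trailing number
natural_order <- function(x) {
  prefix <- sub("[0-9]+$", "", x)              # "a11" -> "a"
  number <- as.numeric(sub("^[^0-9]*", "", x)) # "a11" -> 11
  order(prefix, number)
}

sorted_df <- demo_df[natural_order(demo_df$colA), ]
sorted_df$colA
```

A plain `sort()` would place "a11" before "a2" because it compares the strings character by character.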


Simple wrapper around Fisher's test for gene enrichment, which constructs the matrix for you and returns the p value.

all_genes <- paste0("gene", sample(1:10000, 5000))
de_genes <- sample(all_genes, 1500)
gene_set <- sample(all_genes, 100)

tr_test_enrichment(
  query_set = de_genes, 
  enrichment_set = gene_set, 
  total_genes = 5000
)
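
The matrix that such a wrapper has to build is a 2x2 contingency table. Below is a minimal base R sketch of that construction, passed to `fisher.test()` from the stats package. The row and column layout, the fixed seed, and the use of the default two-sided alternative are assumptions of this sketch and may differ from what the package does internally.

```r
set.seed(1)
demo_all <- paste0("gene", 1:5000)
demo_de <- sample(demo_all, 1500)
demo_set <- sample(demo_all, 100)

overlap <- length(intersect(demo_de, demo_set))

# Rows: DE or not DE; columns: in the gene set or not (filled column by column)
contingency <- matrix(
  c(
    overlap,                                       # DE, in set
    length(demo_set) - overlap,                    # not DE, in set
    length(demo_de) - overlap,                     # DE, not in set
    5000 - length(union(demo_de, demo_set))        # not DE, not in set
  ),
  nrow = 2,
  dimnames = list(c("de", "not_de"), c("in_set", "not_in_set"))
)

p_value <- fisher.test(contingency)$p.value
p_value
```

The four cells always sum to the total number of genes, which is a quick sanity check on the table.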


Clean themes for ggplot2 that improve on the default by increasing font size, changing the background to white, and adding a border. By default, it uses a minimal grid, but you can easily remove the grid entirely.

library(ggplot2)
ggplot(mtcars, aes(as.factor(cyl), mpg)) + geom_boxplot() + tr_theme()
ggplot(mtcars, aes(as.factor(cyl), mpg)) + geom_boxplot() + tr_theme(grid = FALSE)


Combines the items "greater" and "less" from the list output by gage into a single tidy data frame (tibble), and provides an option to filter the results based on q value.

gage_untidy <- 
  readRDS(system.file("extdata", "ex_gage_results.rds", package = "tRavis"))

lapply(gage_untidy, function(x) head(x, 3))

tr_tidy_gage(gage_untidy, qval = 1)


Simple function to truncate long strings without breaking them in the middle of a word. Useful for trimming long axis labels in a plot.

tr_trunc_neatly(
  x = "This is a long string that we want to break neatly",
  l = 40
)
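
The word-boundary idea can be sketched in base R as below. The helper `truncate_at_word()` is hypothetical, and appending "..." to a shortened string is an assumption of this sketch rather than a statement about the package's output.

```r
# Hypothetical helper: keep whole words while the running length stays within l
truncate_at_word <- function(x, l) {
  if (nchar(x) <= l) {
    return(x)
  }
  words <- strsplit(x, " ", fixed = TRUE)[[1]]
  # Length of the string made of the first k words, including the spaces between them
  running_length <- cumsum(nchar(words)) + seq_along(words) - 1
  kept <- words[running_length <= l]
  if (length(kept) == 0) {
    # The first word alone is too long, so fall back to a hard cut
    return(substr(x, 1, l))
  }
  paste0(paste(kept, collapse = " "), "...")
}

truncated <- truncate_at_word(
  "This is a long string that we want to break neatly",
  40
)
truncated
```

The helper works on a single string; for a character vector it can be applied with `vapply(strings, truncate_at_word, character(1), l = 40)`.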

It's also been designed for use inside of a mutate call:

ex_trunc_df <- dplyr::tibble(
  col1 = c(1, 2, 3),
  col2 = c(
    "This is a pretty long string",
    "This string is actually a bit longer",
    "Here is the longest string of them all, just!"
  )
)

dplyr::mutate(
  ex_trunc_df,
  col3 = purrr::map_chr(col2, ~tr_trunc_neatly(.x, l = 20))
)

