knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
library(tRavis)

tr_anno_cleaner

Clean annotation files (CSV or TSV) for Pseudomonas aeruginosa from https://pseudomonas.com.

# A URL can also be supplied, instead of a local file path
link <- paste0(
  "https://pseudomonas.com/downloads/pseudomonas/pgd_r_22_1/",
  "Pseudomonas_aeruginosa_PAO1_107/Pseudomonas_aeruginosa_PAO1_107.csv.gz"
)

tr_anno_cleaner(input_file = link)

Or add some extra columns and fill empty names with the corresponding locus tag.

tr_anno_cleaner(link, extra_cols = TRUE, fill_names = TRUE)

tr_clean_deseq2_result

Takes a DESeq2 results object and returns the significant DE genes. When inform = TRUE (the default), it also prints a message summarizing the comparison and the number of significant genes.

ex_deseq_results <- 
  readRDS(system.file("extdata", "ex_deseq_results.rds", package = "tRavis"))
ex_deseq_results
tr_clean_deseq2_result(ex_deseq_results)

The default filters applied to the data are: padj < 0.05 and abs(log2FoldChange) > log2(1.5).
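To make the thresholds concrete, here is a minimal base R sketch of the same two filters applied to a small, made-up results table. The data frame and its values are hypothetical, and this is not the package's internal code; it only illustrates what the cutoffs select.

```r
# Hypothetical results table; column names follow DESeq2's conventions
res <- data.frame(
  gene = c("g1", "g2", "g3", "g4"),
  log2FoldChange = c(1.2, -0.9, 0.3, 2.0),
  padj = c(0.001, 0.01, 0.02, 0.2),
  stringsAsFactors = FALSE
)

# Keep rows with padj < 0.05 and a fold change beyond 1.5x in either direction
keep <- !is.na(res$padj) &
  res$padj < 0.05 &
  abs(res$log2FoldChange) > log2(1.5)

sig_genes <- res[keep, ]
sig_genes$gene
```

Here g3 is dropped because its fold change is too small, and g4 because its adjusted p value is too high.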

tr_compare_lists

Compare two lists to find the common/unique elements, with an optional names argument to apply to the results.

tr_compare_lists(c(1, 2, 3, 4), c(3, 4, 5, 6), names = c("A", "B"))
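For reference, the common and unique elements can be computed by hand with base R set operations. This is only a sketch of the idea; the element names used in the list below are illustrative and may not match what tr_compare_lists returns.

```r
a <- c(1, 2, 3, 4)
b <- c(3, 4, 5, 6)

common   <- intersect(a, b)  # elements found in both vectors
unique_a <- setdiff(a, b)    # elements found only in a
unique_b <- setdiff(b, a)    # elements found only in b

# Illustrative names, chosen to mirror the names = c("A", "B") argument
comparison <- list(common = common, unique_A = unique_a, unique_B = unique_b)
comparison
```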

tr_get_files

Create a named list of files, which can be piped into purrr::map(~read.csv(.x)) to generate a named list of data frames. Supports recursive searching, removal of custom strings/patterns, and removal of dates in a format like YYYYMMDD (the date can't contain punctuation or symbols).

tr_get_files(
  directory = system.file("extdata", package = "tRavis"),
  pattern = "test",
  date = TRUE,
  remove_string = "test_"
)
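The general pattern (list files, derive clean names, then read each file) can be sketched in base R as follows. The temporary file names are hypothetical, and the exact way tr_get_files strips strings and dates may differ from the regular expressions assumed here.

```r
# Write two small CSV files to a temporary directory (hypothetical names)
tmp_dir <- file.path(tempdir(), "tr_get_files_sketch")
dir.create(tmp_dir, showWarnings = FALSE)
write.csv(
  data.frame(x = 1:2),
  file.path(tmp_dir, "test_20240101_sampleA.csv"),
  row.names = FALSE
)
write.csv(
  data.frame(x = 3:4),
  file.path(tmp_dir, "test_20240102_sampleB.csv"),
  row.names = FALSE
)

files <- list.files(tmp_dir, pattern = "test", full.names = TRUE)

# Build names: drop the extension, the "test_" string, and an 8-digit date
clean_names <- tools::file_path_sans_ext(basename(files))
clean_names <- sub("test_", "", clean_names, fixed = TRUE)
clean_names <- gsub("[0-9]{8}_?", "", clean_names)

# Named list of paths, then a named list of data frames
file_list <- setNames(as.list(files), clean_names)
data_list <- lapply(file_list, read.csv)
names(data_list)
```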

tr_qc_plots

Generate RNA-Seq QC plots from MultiQC outputs. Currently only supports summary plots for FastQC (Phred scores and read counts), STAR, and HTSeq. Plots are created with ggplot2 for simplicity. A few arguments are provided to modify the overall font size, set the limits, and toggle a threshold line at a given number of reads/counts:

multiqc_data <- system.file("extdata/tr_qc_plots_data", package = "tRavis")
list.files(multiqc_data)

qc_plot_output <- tr_qc_plots(
  directory = multiqc_data,
  threshold_line = 5e6,
  font_size = 14
)

qc_plot_output[["plots"]]

Alternate boxplots

The bar plots work well enough for relatively few samples, but quickly become unwieldy with lots of samples. Box plots can also be generated using the same function as follows:

qc_plot_output_box <- tr_qc_plots(
  directory = multiqc_data,
  type = "box",
  threshold_line = 5e6,
  font_size = 16
)

qc_plot_output_box[["plots"]][c("fastqc_reads", "star", "htseq")]

The points can be toggled on or off using the add_points argument.

Data {.tabset}

All the underlying tidy data is also returned, so you can generate your own plots or examine the data further as desired:

Phred scores

qc_plot_output[["data"]][["phred_scores"]]

FastQC reads

qc_plot_output[["data"]][["fastqc_reads"]]

STAR

qc_plot_output[["data"]][["star"]]

HTSeq

qc_plot_output[["data"]][["htseq"]]

tr_sort_alphanum

Sort a column of alphanumeric strings in (non-binary) numerical order, given an input data frame and the desired column. You can use the column name or index, and it's compatible with pipes.

df_unsorted <- data.frame(
  colA = c("a11", "a1", "b1", "a2"),
  colB = c(3, 1, 4, 2)
)

tr_sort_alphanum(input_df = df_unsorted, sort_col = "colA")
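To show what numerical ordering means here, the sketch below sorts the same strings in base R by splitting each one into a text prefix and a trailing number. It assumes every string has that simple "letters then digits" shape, and it is not how tr_sort_alphanum is implemented.

```r
x <- c("a11", "a1", "b1", "a2")

# Split each string into its non-digit prefix and its trailing number
prefix <- sub("[0-9]+$", "", x)
number <- as.numeric(sub("^[^0-9]*", "", x))

# Order by prefix first, then by the number as a number (so 2 comes before 11)
sorted <- x[order(prefix, number)]
sorted
```

A plain character sort would instead place "a11" ahead of "a2", because it compares the strings one character at a time.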

tr_test_enrichment

Simple wrapper around Fisher's exact test for gene enrichment, which constructs the contingency matrix for you and returns the p value.

all_genes <- paste0("gene", sample(1:10000, 5000))
de_genes <- sample(all_genes, 1500)
gene_set <- sample(all_genes, 100)

tr_test_enrichment(
  query_set = de_genes, 
  enrichment_set = gene_set, 
  total_genes = 5000
)
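For readers curious about the matrix being built, here is a sketch of a 2x2 contingency table for this kind of test, passed to stats::fisher.test. The layout and the default two-sided alternative are assumptions made for illustration; tr_test_enrichment may arrange the table or choose the alternative differently.

```r
set.seed(1)
all_genes <- paste0("gene", 1:5000)
de_genes <- sample(all_genes, 1500)
gene_set <- sample(all_genes, 100)

n_overlap <- length(intersect(de_genes, gene_set))

# Columns are filled first: DE genes, then non-DE genes
contingency <- matrix(
  c(
    n_overlap,                                   # in set, DE
    length(de_genes) - n_overlap,                # not in set, DE
    length(gene_set) - n_overlap,                # in set, not DE
    length(all_genes) - length(de_genes) -
      length(gene_set) + n_overlap               # not in set, not DE
  ),
  nrow = 2,
  dimnames = list(c("in_set", "not_in_set"), c("DE", "not_DE"))
)

p_value <- fisher.test(contingency)$p.value
```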

tr_theme

Clean themes for ggplot2 that improve on the default by increasing font size, changing the background to white, and adding a border. By default it uses a minimal grid, but you can easily remove the grid entirely.

library(ggplot2)
basic_box_plot <- ggplot(mtcars, aes(as.factor(cyl), mpg)) + geom_boxplot()

basic_box_plot + tr_theme()
basic_box_plot + tr_theme(grid = FALSE)

tr_tidy_gage

Combines the items "greater" and "less" from the list output by gage into a single tidy data frame (tibble), and provides an option to filter the results based on q value.

tibble_head <- function(x) {
  head(dplyr::as_tibble(x, rownames = "rownames"))
}

gage_untidy <- 
  readRDS(system.file("extdata", "ex_gage_results.rds", package = "tRavis"))

# Have a look at the original results
lapply(gage_untidy, tibble_head)

tr_tidy_gage(gage_untidy, qval = 1)
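The combining step can be illustrated with a small stand-in for a gage result. The list below is made up and heavily simplified (only two columns per matrix), and the base R code is a sketch of the idea rather than the function's actual implementation.

```r
# Hypothetical, simplified stand-in for the list returned by gage
gage_like <- list(
  greater = matrix(
    c(0.01, 0.5, 0.04, 0.9),
    nrow = 2,
    dimnames = list(c("pathwayA", "pathwayB"), c("p.val", "q.val"))
  ),
  less = matrix(
    c(0.8, 0.002, 0.95, 0.03),
    nrow = 2,
    dimnames = list(c("pathwayA", "pathwayB"), c("p.val", "q.val"))
  )
)

# Stack "greater" and "less" into one data frame, recording the direction
tidy <- do.call(rbind, lapply(c("greater", "less"), function(direction) {
  m <- gage_like[[direction]]
  data.frame(
    direction = direction,
    pathway = rownames(m),
    m,
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}))

# Optional filter on q value
tidy_sig <- tidy[tidy$q.val < 0.1, ]
```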

tr_trunc_neatly

Simple function to truncate long strings without breaking them in the middle of a word. Useful for trimming long axis labels in a plot.

tr_trunc_neatly(
  x = "This is a long string that we want to break neatly",
  l = 40
)

It can also be used inside of a mutate call:

ex_df <- data.frame(
  col1 = c(1, 2, 3),
  col2 = c(
    "This is a pretty long string",
    "This string is actually a bit longer",
    "Here is the longest string of them all, just!"
  )
)
dplyr::mutate(
  ex_df,
  col3 = purrr::map_chr(col2, ~tr_trunc_neatly(.x, l = 20))
)
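The idea of truncating at a word boundary can be sketched in base R as below. The helper trunc_at_word is hypothetical, and appending "..." to shortened strings is an assumption; the output of tr_trunc_neatly may be formatted differently.

```r
# Hypothetical helper: keep as many whole words as fit within l characters
trunc_at_word <- function(x, l) {
  if (nchar(x) <= l) {
    return(x)
  }
  words <- strsplit(x, " ", fixed = TRUE)[[1]]
  # Length of the string if only the first n words (and their spaces) are kept
  cum_len <- cumsum(nchar(words)) + seq_along(words) - 1
  n_keep <- max(1, sum(cum_len <= l))
  paste0(paste(words[seq_len(n_keep)], collapse = " "), "...")
}

short <- trunc_at_word("This is a long string that we want to break neatly", l = 40)
short
```

Strings that already fit within the limit are returned unchanged.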

Session information

sessionInfo()


travis-m-blimkie/tRavis documentation built on July 25, 2024, 9:07 p.m.