knitr::opts_chunk$set(collapse = TRUE, comment = "#>")
library(tRavis)
Clean annotation files (CSV or TSV) for Pseudomonas aeruginosa from https://pseudomonas.com.
# A URL can also be supplied instead of a local file path
link <- paste0(
  "https://pseudomonas.com/downloads/pseudomonas/pgd_r_22_1/",
  "Pseudomonas_aeruginosa_PAO1_107/Pseudomonas_aeruginosa_PAO1_107.csv.gz"
)
tr_anno_cleaner(input_file = link)
Or add some extra columns and fill empty names with the corresponding locus tag.
tr_anno_cleaner(link, extra_cols = TRUE, fill_names = TRUE)
Takes a DESeq2 results object and returns the significant DE genes, printing a message summarizing the comparison and number of significant genes when inform = TRUE (default).
ex_deseq_results <- readRDS(system.file("extdata", "ex_deseq_results.rds", package = "tRavis"))
ex_deseq_results
tr_clean_deseq2_result(ex_deseq_results)
The default filters applied to the data are: padj < 0.05 and abs(log2FoldChange) > log2(1.5).
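As a rough illustration of what those default thresholds mean, the same two cut-offs can be applied by hand in base R. The data frame `toy_res` and its values are made up for this example, and this is only a sketch of the stated filters, not the package's internal code.

```r
# Toy results table (hypothetical values, not shipped with tRavis)
toy_res <- data.frame(
  gene = c("g1", "g2", "g3", "g4"),
  log2FoldChange = c(2.1, 0.3, -1.2, 0.9),
  padj = c(0.001, 0.01, 0.04, 0.2),
  stringsAsFactors = FALSE
)

# Keep rows that pass both thresholds: adjusted p value and absolute fold change
toy_sig <- toy_res[
  toy_res$padj < 0.05 & abs(toy_res$log2FoldChange) > log2(1.5),
]
toy_sig$gene
```

Here g2 fails the fold change cut-off and g4 fails the adjusted p value cut-off, so only g1 and g3 remain.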
Compare two lists to find the common/unique elements, with an optional names
argument to apply to the results.
tr_compare_lists(c(1, 2, 3, 4), c(3, 4, 5, 6), names = c("A", "B"))
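For comparison, the same common/unique split can be sketched with base R set operations. The element names used in `manual_comparison` are chosen for this illustration only and may not match the structure that tr_compare_lists actually returns.

```r
a <- c(1, 2, 3, 4)
b <- c(3, 4, 5, 6)

# Hypothetical layout: shared elements plus those found in only one input
manual_comparison <- list(
  common = intersect(a, b),
  unique_A = setdiff(a, b),
  unique_B = setdiff(b, a)
)
manual_comparison
```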
Create a named list of files, easily piped into purrr::map(~read.csv(.x)) to generate a named list of data frames. Supports recursive searching, custom string/pattern removal, and date removal assuming a format like YYYYMMDD (the date can't contain punctuation/symbols).
tr_get_files(
  directory = system.file("extdata", package = "tRavis"),
  pattern = "test",
  date = TRUE,
  remove_string = "test_"
)
Generate RNA-Seq QC plots from MultiQC outputs. Currently only supports summary plots for FastQC (Phred scores and read counts), STAR, and HTSeq. Plots are created with ggplot2 for simplicity. A few arguments are provided to modify the overall font size, set the limits, and toggle a threshold line at a given number of reads/counts:
multiqc_data <- system.file("extdata/tr_qc_plots_data", package = "tRavis")
list.files(multiqc_data)

qc_plot_output <- tr_qc_plots(
  directory = multiqc_data,
  threshold_line = 5e6,
  font_size = 14
)
qc_plot_output[["plots"]]
The bar plots work well enough for relatively few samples, but quickly become unwieldy with lots of samples. Box plots can also be generated using the same function as follows:
qc_plot_output_box <- tr_qc_plots(
  directory = multiqc_data,
  type = "box",
  threshold_line = 5e6,
  font_size = 16
)
qc_plot_output_box[["plots"]][c("fastqc_reads", "star", "htseq")]
The points can be toggled on or off using the add_points argument.
All the underlying tidy data is also returned, so you can easily generate your own plots or examine the data further:
qc_plot_output[["data"]][["phred_scores"]]
qc_plot_output[["data"]][["fastqc_reads"]]
qc_plot_output[["data"]][["star"]]
qc_plot_output[["data"]][["htseq"]]
Sort a column of alphanumeric strings in (non-binary) numerical order given an input data frame and desired column. You can use the column name or index, and it's compatible with pipes.
df_unsorted <- data.frame(
  colA = c("a11", "a1", "b1", "a2"),
  colB = c(3, 1, 4, 2)
)
tr_sort_alphanum(input_df = df_unsorted, sort_col = "colA")
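To make the idea of numerical ordering concrete, here is a minimal base R sketch that orders the same strings by their letter prefix and then by their trailing number. It assumes every string is a non-digit prefix followed by digits, and it is not how tr_sort_alphanum is implemented.

```r
x <- c("a11", "a1", "b1", "a2")

# Split each string into its non-digit prefix and its trailing number
prefix <- sub("[0-9]+$", "", x)
number <- as.numeric(sub("^[^0-9]*", "", x))

# Order by prefix first, then by the numeric value rather than character order
sorted <- x[order(prefix, number)]
sorted
```

With this ordering "a2" is placed before "a11", because 2 is compared with 11 as a number instead of character by character.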
Simple wrapper around Fisher's test for gene enrichment, which constructs the matrix for you and returns the p value.
all_genes <- paste0("gene", sample(1:10000, 5000))
de_genes <- sample(all_genes, 1500)
gene_set <- sample(all_genes, 100)

tr_test_enrichment(
  query_set = de_genes,
  enrichment_set = gene_set,
  total_genes = 5000
)
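The matrix such a wrapper has to build can be sketched by hand with base R. The gene names and counts below are invented for illustration, and the call uses the default two-sided fisher.test; whether tr_test_enrichment uses the same layout or alternative is an assumption here.

```r
# Hypothetical inputs: 30 query genes, 40 genes in the set, 10 shared
query_set <- paste0("gene", 1:30)
enrichment_set <- paste0("gene", 21:60)
total_genes <- 1000

overlap <- length(intersect(query_set, enrichment_set))

# 2x2 contingency table, filled column by column
m <- matrix(
  c(
    overlap,                                                # in query, in set
    length(query_set) - overlap,                            # in query, not in set
    length(enrichment_set) - overlap,                       # not in query, in set
    total_genes - length(union(query_set, enrichment_set))  # in neither
  ),
  nrow = 2,
  dimnames = list(
    enrichment_set = c("in", "out"),
    query_set = c("in", "out")
  )
)

p_value <- fisher.test(m)$p.value
p_value
```

The four cells sum to total_genes, which is a quick sanity check that the table was built correctly.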
Clean themes for ggplot2 that improve on the default by increasing font size, changing the background to white, and adding a border. By default it uses a minimal grid, but you can easily remove the grid entirely.
library(ggplot2)

basic_box_plot <- ggplot(mtcars, aes(as.factor(cyl), mpg)) +
  geom_boxplot()

basic_box_plot + tr_theme()
basic_box_plot + tr_theme(grid = FALSE)
Combines the items "greater" and "less" from the list output by gage into a single tidy data frame (tibble), and provides an option to filter the results based on q value.
tibble_head <- function(x) {
  head(dplyr::as_tibble(x, rownames = "rownames"))
}

gage_untidy <- readRDS(system.file("extdata", "ex_gage_results.rds", package = "tRavis"))

# Have a look at the original results
lapply(gage_untidy, tibble_head)

tr_tidy_gage(gage_untidy, qval = 1)
Simple function to truncate long strings without breaking them in the middle of a word. Useful for trimming long axis labels in a plot.
tr_trunc_neatly(
  x = "This is a long string that we want to break neatly",
  l = 40
)
It can also be used inside a mutate call:
ex_df <- data.frame(
  col1 = c(1, 2, 3),
  col2 = c(
    "This is a pretty long string",
    "This string is actually a bit longer",
    "Here is the longest string of them all, just!"
  )
)

dplyr::mutate(
  ex_df,
  col3 = purrr::map_chr(col2, ~tr_trunc_neatly(.x, l = 20))
)
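The idea of truncating at a word boundary can be sketched in base R as follows. The helper `trunc_at_word` is hypothetical and written only for this illustration; the output of tr_trunc_neatly may differ, for example if it appends an ellipsis.

```r
# Hypothetical helper: keep whole words while the running length stays within l
trunc_at_word <- function(x, l) {
  if (nchar(x) <= l) {
    return(x)
  }
  words <- strsplit(x, " ", fixed = TRUE)[[1]]
  # Length of the string after adding each word, counting separating spaces
  running_length <- cumsum(nchar(words) + 1) - 1
  paste(words[running_length <= l], collapse = " ")
}

short_label <- trunc_at_word(
  "This is a long string that we want to break neatly",
  l = 40
)
short_label
```

Cutting on the running length, rather than with substr, is what prevents the result from ending partway through a word.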
sessionInfo()