PAC_trna: tsRNA analysis of PAC object

View source: R/PAC_trna.R

PAC_trnaR Documentation

tsRNA analysis of PAC object

Description

Analyzing and plotting tsRNAs between groups.

Usage

PAC_trna(
  PAC,
  norm = "cpm",
  filter = 100,
  join = FALSE,
  top = 15,
  log2fc = FALSE,
  pheno_target = NULL,
  anno_target_1 = NULL,
  ymax_1 = NULL,
  anno_target_2 = NULL,
  paired = FALSE,
  paired_IDs = NULL
)

Arguments

PAC

PAC-list object. The Anno object needs to contain at least two tsRNA classification columns, typically isotype (e.g. LysCTT, GlyGCC) and rangetype (5'tsRNA, 5'tsRNA-halves, i'-tsRNA, 3'tsRNA, 3'tsRNA-halves). These can be generated by first mapping sequences to tRNA reference using PAC_mapper and extract isotypes from tRNA names, and then obtain rangetype classifications using map_rangetype.

norm

Character indicating what type of data in PAC to be used as input. If norm="raw" the raw counts in Counts will be used. Given any other character string, the function will search for the string as a name on a dataframe stored in the PAC$norm list-folder (created for example by PAC_rpm), (default="rpm")

filter

Integer specifying the minimum coverage value (specified by norm) for a sequence to be included in the analysis. If filter=10 and norm="rpm", only sequences that have at least 10 rpm in 100 will be included. (default=NULL).

join

Logical, whether separate expression tables of anno_target_1 and anno_target_2 should be generated for each group in pheno_target (FALSE), or if pheno_target groups should be joined (TRUE).

top

Integer, specifying the number of the most highly expressed classifications in anno_target_1 that should be reported in the graphs.

log2fc

Logical, whether log2 fold change point-bars (independent pheno_target groups) or errorbars (paired pheno_target groups) should be reported.

pheno_target

List with: 1st object being a character vector of the target column name in Pheno, and the 2nd object being a character vector specifying the names of two target groups in the target column (1st object). Note, that this function will only work using two groups from the pheno_target column, like this: pheno_target=list(<Pheno_column_name>, c(<group_1>, <group_2>)). If there are more groups than two in the target column, unspecified groups will not be analyzed.

anno_target_1

List with: 1st object being a character of the target column name in Anno, 2nd object being a character vector of the target classifications in the 1st target Anno column. The anno_target_1 is typically used for isotype classification in tsRNA analysis (example: LysCTT, GlyGGC etc). Specifying only the column name, will automatically include all classifications in the anno_target column.

ymax_1

Integer that sets the maximum y for all mean plots (all plots gets the same y-axes). If ymax_1=NULL (default), then ggplot2 will automatically set ymax for each plot individually (different y-axes).

anno_target_2

Same as anno_target_2 but specifying the the second classification typically used for 3'-5' classification in tsRNA analysis (example: 5'tsRNA, 5'tsRNA-halves, i'-tsRNA, 3'tsRNA, 3'tsRNA-halves). Note, that anno_target_2[[2]] is order sensitive, meaning that the order will be preserved in the graphs.

paired

Logical, whether pheno_target is given as paired samples (e.g. before and after treatment of the same patient). Only works with paired_IDs.

paired_IDs

Character, specifying the name of a column in Pheno containing the paired sample names. For example, if pheno_target=list("treatment", c("before", "after")), and paired=TRUE, then paired_IDs="patient", must be a column in Pheno containing the same patient ID reported twice for each patient (before and after).

Details

Given a PAC and a PAC_map object generated by PAC_mapper this function will attempt to analyze and plot tsRNA over two pheno_target groups, using two tsRNA classifications (anno_target_1 and annot_target_2).

Value

List of ggplot2 plots and the data used for generating the plots. Use ls.str() to explore each level.

See Also

https://github.com/Danis102 for updates on the current package.

Other PAC analysis: PAC_covplot(), PAC_deseq(), PAC_filter(), PAC_filtsep(), PAC_gtf(), PAC_jitter(), PAC_mapper(), PAC_nbias(), PAC_norm(), PAC_pca(), PAC_pie(), PAC_saturation(), PAC_sizedist(), PAC_stackbar(), PAC_summary(), as.PAC(), filtsep_bin(), map_rangetype(), tRNA_class()

Examples



###########################################################
### tRNA analysis in seqpac 
# (More details see vignette and manuals.)

##-------------------------------##
## Setup environment for testing ##

# First create an annotation blanc PAC with group means
load(system.file("extdata", "drosophila_sRNA_pac_filt_anno.Rdata", 
                package = "seqpac", mustWork = TRUE))
anno(pac) <- anno(pac)[,1, drop=FALSE]
pac_trna <- PAC_summary(pac, norm = "cpm", type = "means", 
                       pheno_target=list("stage"), merge_pac = TRUE)


# Then load paths to example trna ref and ss files
trna_ref <- system.file("extdata/trna2", "tRNA2.fa", 
                       package = "seqpac", mustWork = TRUE)

ss_file <- system.file("extdata/trna2", "tRNA2.ss",
                      package = "seqpac", mustWork = TRUE)


##--------------------------------------##
## Create a map object using PAC_mapper ##
map_object <- PAC_mapper(pac_trna, ref=trna_ref, 
                        N_up = "NNN", N_down = "NNN",
                        mismatches=0, threads=2, 
                        report_string=TRUE, override = TRUE)
# Warning: override = TRUE, will remove everything in temporary output path.
# Note: bowtie indexes are not nedded for PAC_mapper.


##------------------------------------------##
## Coverage plots of tRNA using PAC_covplot ##

# Single tRNA targeting a summary data.frame 
PAC_covplot(pac_trna, map=map_object, summary_target= list("cpmMeans_stage"),
           map_target="tRNA-Ala-AGC-1-1")

# Find tRNAs with many fragments and then plot
n_tRFs <- unlist(lapply(map_object, function(x){nrow(x[[2]])}))
selct <- (names(map_object)[n_tRFs>1])[c(1, 16, 25, 43)]
cov_plt <- PAC_covplot(pac_trna, map=map_object, 
                      summary_target= list("cpmMeans_stage"), 
                      map_target=selct)
cowplot::plot_grid(plotlist=cov_plt, nrow=2, ncol=2)


##-------------------------------------------##
## Classify tRNA fragment with map_rangetype ## 

# Classify according to loop structure using ss file provided with seqpac
map_object_ss <- map_rangetype(map_object, type="ss", ss=ss_file, 
                              min_loop_width=4)          
# Note 1: You may download your own ss file at for example GtRNAdb
# Note 2: The ss file must match the reference used in creating the map_object

# Classify instead according to nucleotide position or percent intervals 
map_object_rng <- map_rangetype(map_object, type="nucleotides",
                               intervals=list(start=1:5, end=95:100))

map_object_prc <- map_rangetype(map_object, type="percent",
                               intervals=list(start=1:5, mid=45:50,
                                              end=95:100))

# Compare the output (same fragment different classifications)
map_object_ss[[2]]
map_object_prc[[2]]
map_object_rng[[2]]


##-------------------------------##
## Classify tRFs with tRNA_class ##
# Warning: this function currently only works using map objects 
# created using ss files with specific names for loop1/2/3 columns: 
colnames(map_object_ss[[2]][[2]])  

# Important: We added three "Ns" in PAC_mapper (above). If, terminal
# classification should correspond to the original tRNA sequences we need to
# account for these Ns when defining what is a terminal fragment. Here, we set
# "terminal=5", which means that PAC sequences will receive 5'/3'
# classification if they map to the first two or last two nucleotides of the
# original full-length tRNAs (2+NNN =5).

pac_trna <- tRNA_class(pac_trna, map_object_ss, terminal=5)

##---------------------------------##
## Plot some results with PAC_trna ##

# Here is one example of tRNA visualization using our PAC_trna function: 
trna_result <- PAC_trna(pac_trna, norm="cpm", filter = NULL,
                       join = TRUE, top = 15, log2fc = TRUE,
                       pheno_target = list("stage", c("Stage1", "Stage5")),
                       anno_target_1 = list("type"),
                       anno_target_2 = list("class"))

cowplot::plot_grid(trna_result$plots$Expression_Anno_1$Grand_means,
                  trna_result$plots$Log2FC_Anno_1,
                  trna_result$plots$Percent_bars$Grand_means,
                  nrow=1, ncol=3)
# Note: There are no 3ยด fragments in our test data.  

# By setting join = FALSE you will get group means instead of grand means:
trna_result <- PAC_trna(pac_trna, norm="cpm", filter = NULL,
                       join = FALSE, top = 15, log2fc = TRUE,
                       pheno_target = list("stage", c("Stage1", "Stage3")),
                       anno_target_1 = list("type"),
                       anno_target_2 = list("class"))

cowplot::plot_grid(trna_result$plots$Expression_Anno_1$Stage1,
                  trna_result$plots$Expression_Anno_1$Stage3,
                  trna_result$plots$Log2FC_Anno_1,
                  trna_result$plots$Percent_bars$Stage1,
                  trna_result$plots$Percent_bars$Stage3,
                  nrow=1, ncol=5)

##-----------------------------------------##
## Clean up temp folder                    ##
# (Sometimes needed for automated examples) 

closeAllConnections()
fls_temp  <- tempdir()
fls_temp  <- list.files(fls_temp, recursive=TRUE, 
                       full.names = TRUE)
suppressWarnings(file.remove(fls_temp)) 


Danis102/seqpac documentation built on Aug. 26, 2023, 10:15 a.m.