Package for analysis of Bliss data from Rad Lab
load_annotated_peaks() -> load a series of annotated peak files into the R environment
load_bams() -> load a series of alignment files into the R environment
add_cancer_gene_census() -> add Cancer Gene Census information to the annotation table
add_cpg_islands() -> add CpG island annotation to the annotation table
add_repeat_masker_table() -> add RepeatMasker table information to the annotation table
annotation_pie() -> create a pie plot of the annotation regions
coverage_plot() -> create a bar plot of the percentage of total reads in the top n peaks
plot_selection_frequencies() -> select a column of the annotation table and plot the frequency of its elements
venn_generator() -> create a Venn plot by peak overlap or by gene hits, using either all peaks (or genes) or only those with a specified characteristic
plot_density() -> plot the density of the BAM files over the chromosomes
# load_annotated_peaks(organism, blacklist_folder)
arguments:
organism -> "human", "hg", "homo sapiens" or "mouse", "mm", "mus musculus"; default is "human"
blacklist_folder -> the folder in which the blacklist files for human and mouse are stored
This function uses the sample.csv table in the working directory to choose, based on the organism, which files to load. These files should be stored in the ./data folder.
The output of the function is a data table with the coordinates of all the peaks and their annotation.
# load_bams(organism, blacklist_folder)
arguments:
organism -> "human", "hg", "homo sapiens" or "mouse", "mm", "mus musculus"; default is "human"
blacklist_folder -> the folder in which the blacklist files for human and mouse are stored
This function uses the sample.csv table in the working directory to choose, based on the organism, which files to load. These files should be stored in the ./data folder.
The output of the function is a data table with the coordinates of all the mapped reads.
# add_cancer_gene_census(samples, cgc_folder)
arguments:
samples -> a list of annotated peaks
cgc_folder -> the folder where the Cancer Gene Census table is stored, default is ./utils
This function adds to the annotation table whether or not the hit gene is a cancer gene listed in the Cancer Gene Census and, if so, the role of the gene in cancer and the type of tumor in which it is usually involved.
You can download the Cancer Gene Census table from: https://cancer.sanger.ac.uk/census#cl_search
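The lookup described above can be sketched in base R as follows. This is only an illustration under stated assumptions: the toy tables, the gene names and the column names (`gene`, `Gene.Symbol`, `Role.in.Cancer`, `cancer_gene`, `role_in_cancer`) are hypothetical and are not taken from the package or from the real census file.

```r
# Toy annotation table: one row per peak with the gene it hits (hypothetical columns).
peaks <- data.frame(
  chr   = c("chr1", "chr7", "chr17"),
  start = c(1000, 5000, 7500),
  end   = c(1200, 5300, 7900),
  gene  = c("GENE_A", "GENE_B", "GENE_C"),
  stringsAsFactors = FALSE
)

# Toy stand-in for the Cancer Gene Census table (hypothetical column names and values).
cgc <- data.frame(
  Gene.Symbol    = c("GENE_B", "GENE_C"),
  Tier           = c("1", "2"),
  Role.in.Cancer = c("oncogene", "TSG"),
  stringsAsFactors = FALSE
)

# Look up each hit gene in the census; genes that are not listed get NA.
idx <- match(peaks$gene, cgc$Gene.Symbol)

# Add the new annotation columns.
peaks$cancer_gene    <- ifelse(is.na(idx), "no", "yes")
peaks$Tier           <- ifelse(is.na(idx), "none", cgc$Tier[idx])
peaks$role_in_cancer <- cgc$Role.in.Cancer[idx]

print(peaks)
```

Using `match()` keeps one row per peak, which a plain `merge()` would not guarantee if a gene appeared more than once in the lookup table.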
# add_cpg_islands(samples, genome, cpg_annots)
arguments:
samples -> a list of annotated peaks
genome -> saved when you load the annotated peak files
cpg_annots -> also saved when you load the annotated peak files; both genome and cpg_annots depend on the organism
This function uses the Bioconductor library "annotatr" (https://www.bioconductor.org/packages/release/bioc/html/annotatr.html) to add a column called "cpg_island" whose value is "yes" if the peak coordinates fall inside a CpG island.
# add_repeat_masker_table(samples, table_folder)
arguments:
samples -> a list of annotated peaks
table_folder -> the folder where the RepeatMasker table is stored, default is ./utils
This function adds three columns to the annotation that say whether or not the sequence falls in a repeated region, such as a transposable element, and provide information about the classification of that repetitive element.
The RepeatMasker table can be downloaded from the UCSC Table Browser: http://www.genome.ucsc.edu/cgi-bin/hgTables
# plot_selection_frequencies(samples, column_name)
arguments:
samples -> a list of annotated peaks
column_name -> the name of the annotation column that you want to select
This function plots the percentage of each element found in that column.
example: plot_selection_frequencies(samples, "Tier") plots the frequency of the three elements of the Tier column, which are "1", "2" or "none"
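The percentage computation behind such a plot can be sketched in base R. The toy vector below is hypothetical, and base graphics are used only for illustration; the package may rely on a different plotting library.

```r
# Toy annotation column (hypothetical values, following the Tier example).
tier <- c("1", "none", "none", "2", "none", "1", "none", "none")

# Count each distinct value and convert the counts to percentages.
counts  <- table(tier)
percent <- 100 * prop.table(counts)
print(round(percent, 1))

# Draw one bar per value.
barplot(percent, xlab = "Tier", ylab = "% of peaks")
```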
# coverage_plot(samples, n_top, plot_dir)
arguments:
samples -> a list of annotated peaks
n_top -> the number of top peaks that you want to display in the plot, default is 100
plot_dir -> the folder where the plot will be saved, default is ./plots
This function draws a bar plot of the percentage of reads that fall in each of the n_top peaks.
example: coverage_plot(samples, 200, "plots") plots the percentage of reads in the top 200 peaks
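A minimal sketch of the quantity being plotted, in base R. The read counts are made up, and the denominator (the sum of reads over all peaks) is an assumption; the package may normalise by the total number of mapped reads instead.

```r
# Toy read counts per peak (hypothetical numbers).
reads_per_peak <- c(900, 500, 300, 120, 80, 60, 30, 10)
n_top <- 3

# Rank peaks by read count and keep the n_top largest.
n_keep <- min(n_top, length(reads_per_peak))
top    <- sort(reads_per_peak, decreasing = TRUE)[seq_len(n_keep)]

# Percentage of all reads (assumed here: reads in any peak) that fall in each top peak.
pct <- 100 * top / sum(reads_per_peak)
print(pct)

# One bar per top peak, highest coverage first.
barplot(pct, names.arg = seq_len(n_keep), xlab = "peak rank", ylab = "% of reads")
```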
# DSBs_per_gene_length(samples, plot_dir)
arguments:
samples -> a list of annotated peaks
plot_dir -> the folder where the plot will be stored, default is ./plots
This function draws a bar plot of the number of peaks per length window, where the length windows are the following:
- 0-100 bp
- 100-1,000 bp
- 1,000-10,000 bp
- 10,000-100,000 bp
- 100,000-200,000 bp
- more than 200,000 bp
example: DSBs_per_gene_length(samples, "plots")
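The binning into length windows can be sketched with `cut()` in base R. The lengths are toy values, and the treatment of boundary values (a length of exactly 100 bp goes into the lower window) is an assumption, since the window definitions above overlap at their edges.

```r
# Toy lengths in bp (hypothetical values).
lengths_bp <- c(50, 100, 850, 2500, 15000, 150000, 250000, 99)

# Window edges and labels matching the list above.
breaks <- c(0, 100, 1000, 10000, 100000, 200000, Inf)
labels <- c("0-100", "100-1,000", "1,000-10,000",
            "10,000-100,000", "100,000-200,000", ">200,000")

# right = TRUE assigns a boundary value (e.g. 100) to the lower window;
# include.lowest = TRUE keeps a length of exactly 0 in the first window.
window <- cut(lengths_bp, breaks = breaks, labels = labels,
              right = TRUE, include.lowest = TRUE)

# Number of entries per window; empty windows are reported as 0.
counts <- table(window)
print(counts)

barplot(counts, las = 2, ylab = "number of peaks")
```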
# plot_density(samples, plot_dir)
arguments:
samples -> a list of annotated peaks
plot_dir -> the folder where the plot will be stored, default is ./plots
This function plots the density over the chromosomes of the mapped reads loaded with 'load_bams()'. By default all the chromosomes are displayed; to display only specific chromosomes you currently have to change the code (I will implement an argument that allows you to do this).
example: plot_density(samples, plot_dir)
# annotation_pie(samples, plot_dir)
arguments:
samples -> a list of annotated peaks
plot_dir -> the folder where the plot will be stored, default is ./plots
This function draws a pie plot of the annotation regions (UTRs, promoters, exons, etc.).
# venn_generator(samples, by, selection)
arguments:
samples -> a list of annotated peaks
by -> either "peaks" or "genes". With "peaks" the Venn plot is built on overlapping peaks; with "genes" it is built on gene hits.
selection -> builds the Venn plot on a subset of the samples chosen by an annotation characteristic. For example, you could build a Venn plot in "genes" mode with only the genes that are involved in cancer. Default is "all"
example: venn_generator(samples, by = "genes", Tier==1)
NOTE: the syntax of the selection must follow the rule Column_name==Value
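One way a Column_name==Value selection could work in "genes" mode is sketched below in base R. The helper `select_genes()`, the toy samples and the `gene` column are hypothetical; this is not the package's implementation, only an illustration of filtering by an unquoted expression before comparing gene sets.

```r
# Hypothetical helper: gene hits of one sample, optionally restricted by a
# Column_name==Value expression evaluated against the sample's columns.
select_genes <- function(sample, selection = "all") {
  expr <- substitute(selection)          # capture the expression unevaluated
  if (identical(expr, "all")) {
    return(unique(sample$gene))
  }
  keep <- eval(expr, envir = sample, enclos = parent.frame())
  unique(sample$gene[!is.na(keep) & keep])
}

# Two toy samples (hypothetical genes and Tier values).
s1 <- data.frame(gene = c("G1", "G2", "G3", "G4"),
                 Tier = c("1", "none", "1", "2"),
                 stringsAsFactors = FALSE)
s2 <- data.frame(gene = c("G2", "G3", "G5"),
                 Tier = c("none", "1", "1"),
                 stringsAsFactors = FALSE)

# Genes shared by both samples, without and with a selection.
all_shared   <- intersect(select_genes(s1), select_genes(s2))
tier1_shared <- intersect(select_genes(s1, Tier == 1), select_genes(s2, Tier == 1))

print(all_shared)
print(tier1_shared)
```

The sizes of these intersections and of the remaining per-sample sets are the numbers a Venn plot would display.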