Finding Tn-Seq Essential genes (FiTnEss)
Poster: FiTnEss_poster.pdf
FiTnEss is an R package that uses transposon insertion sequencing (Tn-Seq) data to identify essential genes in a genome.
Original paper on bioRxiv: Defining the core essential genome of Pseudomonas aeruginosa
Publication: Defining the core essential genome of Pseudomonas aeruginosa
After installing the FiTnEss package, run the main function, FiTnEss_Run.
Its arguments are:
- strain: name of the strain being analyzed, e.g. "PA14"
- file_location: path and name of the tally file for the run:
e.g. "/home/your_folder/your_tally.txt"
- permissive_file: path and name of the non-permissive TA site file generated in the genomic pre-processing step:
e.g. "/home/your_folder/non_permissive_TA_sites.txt"
- homologous_file: path and name of the homologous TA site file generated in the pre-processing step:
e.g. "/home/your_folder/homologous_TA_sites.txt"
- gene_file: path and name of the GFF3 gene annotation file, which can be downloaded from, for example, the Pseudomonas Genome Database:
e.g. "/your/folder/location/your_gff3_file.txt"
- save_location: path and name of the file in which to save the final results:
e.g. "/home/results_folder/results.xlsx"
- repeat_time: number of times to run the pipeline to obtain the best results; by default, we run it 3 times.
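Because four of these arguments are file paths, it can help to confirm that the files exist before starting a run. The following is a minimal sketch using base R only. `missing_inputs` is a hypothetical helper, not part of FiTnEss, and the paths are the placeholder examples from the list above.

```r
# Hypothetical helper (not part of FiTnEss): return the names of the
# arguments whose files cannot be found on disk.
missing_inputs <- function(paths) {
  names(paths)[!file.exists(paths)]
}

# Placeholder paths taken from the argument list; replace with your own.
inputs <- c(
  file_location   = "/home/your_folder/your_tally.txt",
  permissive_file = "/home/your_folder/non_permissive_TA_sites.txt",
  homologous_file = "/home/your_folder/homologous_TA_sites.txt",
  gene_file       = "/your/folder/location/your_gff3_file.txt"
)

missing <- missing_inputs(inputs)
if (length(missing) > 0) {
  message("Missing input files: ", paste(missing, collapse = ", "))
}
```

save_location is left out of the check because it names a file that the run is expected to create.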
# Install FiTnEss from GitHub (requires devtools)
install.packages("devtools")
devtools::install_github("ruy204/FiTnEss")

# Load the supporting packages, then FiTnEss itself
Packages <- c("dplyr","fBasics","goftest","openxlsx","scales","stats","tidyr")
invisible(lapply(Packages, library, character.only = TRUE))
library(FiTnEss)

# Run the pipeline on the test data set; arguments follow the order listed above
FiTnEss_Run("PA14",
"/your/folder/location/Test_set_P_aeruginosa/sample_data/PA14_M9_rep1_tally.txt",
"/your/folder/location/Test_set_P_aeruginosa/TAsite_info/nonpermissive_TA_sites.txt",
"/your/folder/location/Test_set_P_aeruginosa/TAsite_info/homologous_TA_sites.txt",
"/your/folder/location/Test_set_P_aeruginosa/genome_info/PA14_gff.txt",
"/your/folder/location/Test_set_P_aeruginosa/sample_data/test_results.xlsx",
repeat_time = 3)
|Locus.CIA |gtot|Nta|pvalue  |padj|Ess_fwer|pfdr    |Ess_fdr|
|----------|----|---|--------|----|--------|--------|-------|
|PA14_00410|   5|  1|0.015989|   1| NE_fwer|0.093033| NE_fdr|
Each tab in the .xlsx file holds the results from one replicate. Each results table has 8 columns:
- Locus.CIA: gene index
- gtot: total reads for the gene
- Nta: number of TA sites in this gene
- pvalue: unadjusted p-value of being essential
- padj: FWER-adjusted p-value
- Ess_fwer: confident essential category
- pfdr: FDR-adjusted p-value
- Ess_fdr: candidate essential category
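To illustrate how these columns might be used downstream, here is a minimal sketch that keeps only the genes not labeled non-essential. It assumes that non-essential calls are labeled "NE_fdr" and "NE_fwer", as in the example row above. The labels used for essential genes are not shown here, so the filter deliberately avoids naming them. `not_nonessential` is a hypothetical helper, not a FiTnEss function.

```r
# Hypothetical helper (not part of FiTnEss): keep rows whose category is
# present and differs from the non-essential label.
not_nonessential <- function(results, column = "Ess_fdr", ne_label = "NE_fdr") {
  keep <- !is.na(results[[column]]) & results[[column]] != ne_label
  results[keep, , drop = FALSE]
}

# The single example row from the results table above.
example <- data.frame(
  Locus.CIA = "PA14_00410", gtot = 5, Nta = 1, pvalue = 0.015989,
  padj = 1, Ess_fwer = "NE_fwer", pfdr = 0.093033, Ess_fdr = "NE_fdr",
  stringsAsFactors = FALSE
)

candidates <- not_nonessential(example)
nrow(candidates)  # 0: the example gene is called non-essential

# With a real results file, each replicate tab could be read first, e.g.
# (assumes the openxlsx package is installed and the file exists):
# path   <- "/home/results_folder/results.xlsx"
# sheets <- openxlsx::getSheetNames(path)
# reps   <- lapply(sheets, function(s) openxlsx::read.xlsx(path, sheet = s))
```

Passing `column = "Ess_fwer"` and `ne_label = "NE_fwer"` applies the same filter to the stricter FWER-based category.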