View source: R/fusion_standardization.R
fusion_standardization | R Documentation |
Fusion callers report their results in different formats, which makes aggregating and filtering calls difficult. Standardizing each caller's output captures the columns required for downstream analysis.
fusion_standardization(
  fusion_calls,
  caller = c("STARFUSION", "ARRIBA", "CUSTOM"),
  tumorID = "tumorID",
  input_json_file = "No file exists"
)
fusion_calls |
A data frame of fusion calls from STAR-Fusion or Arriba (more callers to be added) |
caller |
String; one of STARFUSION, ARRIBA, or CUSTOM |
tumorID |
A string, or a character vector of the same length as fusion_calls |
input_json_file |
(Optional) JSON config file that maps input column headers to output column headers; required only when caller is CUSTOM |
Standardized fusion calls, ready for filtering.
Krutika S Gaonkar, Saksham Phul (phuls@chop.edu)
# read in arriba fusion file
fusionfileArriba <- read_arriba_calls(
  system.file("extdata", "arriba_example.tsv", package = "annoFuseData")
)
# read in starfusion file
fusionfileStarFusion <- read_starfusion_calls(
  system.file("extdata", "starfusion_example.tsv", package = "annoFuseData")
)
formattedArriba <- fusion_standardization(fusionfileArriba,
  caller = "ARRIBA",
  tumorID = "tumorID"
)
formattedStarFusion <- fusion_standardization(fusionfileStarFusion,
  caller = "STARFUSION",
  tumorID = "tumorID"
)
# build an example CUSTOM type data frame
fusionfileCustom <- data.frame(
  Sample = c("BS_WDC88K6G", "BS_6J9HGSSB", "BS_K62F9BCS"),
  FusionName = c("KIAA1549--BRAF", "TFG--GPR128", "SPECC1L--NTRK2"),
  Gene1A = c("KIAA1549", "TFG", "SPECC1L"),
  Gene1B = c("BRAF", "GPR128", "NTRK2"),
  Gene2A = c("", "", ""),
  Gene2B = c("", "", ""),
  Fusion_Type = c("", "", ""),
  annots = c(
    "[Cosmic,ChimerPub,ChimerSeq,chimerdb_pubmed,ChimerKB,INTRACHROMOSOMAL[chr7:1.74Mb]]",
    "[ChimerPub,GTEx,ChimerKB,Greger_Normal,ChimerSeq]",
    "[INTERCHROMOSOMAL[chr22--chr9]]"
  )
)
formattedCUSTOM <- fusion_standardization(fusionfileCustom,
  caller = "CUSTOM",
  tumorID = "All",
  input_json_file = system.file("extdata", "config", package = "annoFuseData")
)
# format of the input_json_file ("Input_header" : "Output_header")
# {
#   "CUSTOM": {
#     "Sample": "Sample_output",
#     "FusionName": "FusionName_output",
#     "Gene1A": "Gene1A_output",
#     "Gene1B": "Gene1B_output",
#     "Gene2A": "Gene2A_output",
#     "Gene2B": "Gene2B_output",
#     "Fusion_Type": "Fusion_Type_output",
#     "annots": "annots_output"
#   }
# }
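
The "Input_header" : "Output_header" pairs in the config describe a column renaming. The sketch below shows that idea in base R only. It is an illustration under assumptions, not the package's implementation: `rename_headers()` and `header_map` are hypothetical names, the mapping is written as a named character vector instead of being read from JSON, and leaving unmapped columns untouched is an assumed behavior.

```r
# Hypothetical illustration only: this is NOT annoFuse's internal code.
# header_map mirrors part of the "CUSTOM" entry of the JSON config:
# names are input headers, values are output headers.
header_map <- c(
  Sample     = "Sample_output",
  FusionName = "FusionName_output",
  Gene1A     = "Gene1A_output",
  Gene1B     = "Gene1B_output"
)

# Small stand-in for a custom caller's output.
custom_calls <- data.frame(
  Sample     = c("BS_WDC88K6G", "BS_6J9HGSSB"),
  FusionName = c("KIAA1549--BRAF", "TFG--GPR128"),
  Gene1A     = c("KIAA1549", "TFG"),
  Gene1B     = c("BRAF", "GPR128"),
  stringsAsFactors = FALSE
)

# rename_headers() is a hypothetical helper: it renames every column listed
# in `mapping` and stops if an expected input column is absent.
rename_headers <- function(df, mapping) {
  missing_cols <- setdiff(names(mapping), names(df))
  if (length(missing_cols) > 0) {
    stop("Missing input columns: ", paste(missing_cols, collapse = ", "))
  }
  # Locate each input header and overwrite it with its output header.
  idx <- match(names(mapping), names(df))
  names(df)[idx] <- unname(mapping)
  df
}

renamed <- rename_headers(custom_calls, header_map)
print(names(renamed))
```

Failing early on a missing input column is a design choice in this sketch: a header typo in the config surfaces at once instead of producing a partially renamed table.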