fusion_standardization: Standardizes fusion calls

View source: R/fusion_standardization.R

fusion_standardization    R Documentation

Standardizes fusion calls

Description

Fusion callers report their results in different formats, which makes aggregating and filtering calls difficult. Standardizing each caller's output captures the columns required for downstream analysis.

Usage

fusion_standardization(
  fusion_calls,
  caller = c("STARFUSION", "ARRIBA", "CUSTOM"),
  tumorID = "tumorID",
  input_json_file = "No file exists"
)

Arguments

fusion_calls

A data frame of fusion calls from STAR-Fusion or Arriba (more callers to be added)

caller

String; one of "STARFUSION", "ARRIBA", or "CUSTOM"

tumorID

string or character vector of same length as fusion_calls

input_json_file

(Optional) JSON config file that provides the input and output column headers. Required when caller is "CUSTOM"; not needed for the other callers.
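
As a minimal sketch of preparing such a config, the snippet below writes a JSON file in the "Input_header": "Output_header" layout described in the Examples section, using base R only. The temporary path and the variable name config_path are illustrative assumptions, and the final call is left commented out because it needs annoFuse installed.

```r
# Write a config in the "Input_header": "Output_header" layout to a temp file.
# config_path is an illustrative name, not something annoFuse defines.
config_path <- tempfile(fileext = ".json")
writeLines(c(
  '{',
  '  "CUSTOM": {',
  '    "Sample": "Sample_output",',
  '    "FusionName": "FusionName_output",',
  '    "Gene1A": "Gene1A_output",',
  '    "Gene1B": "Gene1B_output",',
  '    "Gene2A": "Gene2A_output",',
  '    "Gene2B": "Gene2B_output",',
  '    "Fusion_Type": "Fusion_Type_output",',
  '    "annots": "annots_output"',
  '  }',
  '}'
), config_path)

# The path could then be supplied as input_json_file (not run here):
# fusion_standardization(fusion_calls,
#   caller = "CUSTOM", tumorID = "All", input_json_file = config_path
# )
```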

Value

Standardized fusion calls ready for filtering

Author(s)

Krutika S Gaonkar, Saksham Phul (phuls@chop.edu)

Examples

# read in arriba fusion file
fusionfileArriba <- read_arriba_calls(
  system.file("extdata", "arriba_example.tsv", package = "annoFuseData")
)
# read in starfusion file
fusionfileStarFusion <- read_starfusion_calls(
  system.file("extdata", "starfusion_example.tsv", package = "annoFuseData")
)
formattedArriba <- fusion_standardization(fusionfileArriba,
  caller = "ARRIBA",
  tumorID = "tumorID"
)
formattedStarFusion <- fusion_standardization(fusionfileStarFusion,
  caller = "STARFUSION",
  tumorID = "tumorID"
)
# build a CUSTOM type data frame of fusion calls
fusionfileCustom <- data.frame(
  Sample = c("BS_WDC88K6G", "BS_6J9HGSSB", "BS_K62F9BCS"),
  FusionName = c("KIAA1549--BRAF", "TFG--GPR128", "SPECC1L--NTRK2"),
  Gene1A = c("KIAA1549", "TFG", "SPECC1L"),
  Gene1B = c("BRAF", "GPR128", "NTRK2"),
  Gene2A = c("", "", ""),
  Gene2B = c("", "", ""),
  Fusion_Type = c("", "", ""),
  annots = c(
    "[Cosmic,ChimerPub,ChimerSeq,chimerdb_pubmed,ChimerKB,INTRACHROMOSOMAL[chr7:1.74Mb]]", 
    "[ChimerPub,GTEx,ChimerKB,Greger_Normal,ChimerSeq]", 
    "[INTERCHROMOSOMAL[chr22--chr9]]")
)
formattedCUSTOM <- fusion_standardization(fusionfileCustom,
  caller = "CUSTOM",
  tumorID = "All",
  input_json_file = system.file("extdata", "config", package = "annoFuseData")
)
# format of the input_json_file ("Input_header": "Output_header")
# {
#   "CUSTOM": {
#     "Sample": "Sample_output",
#     "FusionName": "FusionName_output",
#     "Gene1A": "Gene1A_output",
#     "Gene1B": "Gene1B_output",
#     "Gene2A": "Gene2A_output",
#     "Gene2B": "Gene2B_output",
#     "Fusion_Type": "Fusion_Type_output",
#     "annots": "annots_output"
#   }
# }
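
To illustrate what an "Input_header" : "Output_header" mapping implies, here is a rough base R sketch that renames data frame columns from such a map. It is not annoFuse's implementation: the helper rename_with_map, the header_map vector, and the choice to keep only mapped columns are all assumptions made for illustration.

```r
# Hypothetical header map mirroring part of the "CUSTOM" config block
header_map <- c(
  Sample     = "Sample_output",
  FusionName = "FusionName_output",
  Gene1A     = "Gene1A_output",
  Gene1B     = "Gene1B_output"
)

calls <- data.frame(
  Sample = c("BS_WDC88K6G", "BS_6J9HGSSB"),
  FusionName = c("KIAA1549--BRAF", "TFG--GPR128"),
  Gene1A = c("KIAA1549", "TFG"),
  Gene1B = c("BRAF", "GPR128"),
  stringsAsFactors = FALSE
)

# Illustrative helper (not part of annoFuse): select the mapped input
# columns and rename them to their output headers.
rename_with_map <- function(df, map) {
  missing_cols <- setdiff(names(map), names(df))
  if (length(missing_cols) > 0) {
    stop("Missing input columns: ", paste(missing_cols, collapse = ", "))
  }
  out <- df[, names(map), drop = FALSE]
  names(out) <- unname(map)
  out
}

renamed <- rename_with_map(calls, header_map)
print(names(renamed))
```

Failing early on missing input columns is a design choice of this sketch; whether fusion_standardization performs a similar check is not stated here.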

d3b-center/annoFuse documentation built on Oct. 2, 2024, 4:17 a.m.