isoformSwitchAnalysisCombined: Isoform Switch Analysis Workflow: Extract, Annotate and...


View source: R/high_level_functions.R

Description

This high-level function takes a CuffSet object or a pre-existing switchAnalyzeRlist as input. If the input is a CuffSet, a switchAnalyzeRlist is created; otherwise the function uses the provided switchAnalyzeRlist.

Isoform switches are then identified and annotated with ORFs and intron retention. Next, functional consequences are identified and isoform switch analysis plots are generated for the top n isoform switches. Lastly, a plot summarizing the global effect of isoform switches with functional consequences is generated. To incorporate external analysis of protein domains (Pfam), coding potential (CPAT) or signal peptides (SignalP), use the combination of isoformSwitchAnalysisPart1 and isoformSwitchAnalysisPart2 instead.

Usage

isoformSwitchAnalysisCombined(
    switchAnalyzeRlist,
    alpha=0.05,
    dIFcutoff = 0.1,
    switchTestMethod='DEXSeq',
    n=NA,
    pathToOutput=getwd(),
    overwriteORF=FALSE,
    outputSequences=FALSE,
    genomeObject,
    orfMethod='longest',
    cds=NULL,
    consequencesToAnalyze=c('intron_retention','ORF_seq_similarity','NMD_status'),
    fileType='pdf',
    asFractionTotal=FALSE,
    outputPlots=TRUE,
    quiet=FALSE
)

Arguments

switchAnalyzeRlist

A switchAnalyzeRlist.

alpha

The cutoff that the (calibrated) FDR-corrected p-values must be smaller than for a switch to be called significant. Default is 0.05.

dIFcutoff

The cutoff that the (absolute) change in isoform usage must exceed before an isoform is considered switching. This cutoff can remove cases where isoforms with (very) low dIF values are deemed significant and thereby included in the downstream analysis. It is analogous to a cutoff on log2 fold change in a standard differential gene expression analysis, which ensures that the genes have a certain effect size. Default is 0.1 (10%).
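How alpha and dIFcutoff combine can be illustrated with a minimal base-R sketch. The data frame, its column names (isoform_id, dIF, q_value) and its values are hypothetical and chosen only for illustration; this is not the package's internal implementation.

```r
# Hypothetical per-isoform test results; column names are illustrative only.
isoforms <- data.frame(
    isoform_id = c("iso1", "iso2", "iso3", "iso4"),
    dIF        = c(0.35, -0.02, -0.20, 0.15),   # change in isoform fraction
    q_value    = c(0.001, 0.0001, 0.03, 0.20),  # FDR-corrected p-values
    stringsAsFactors = FALSE
)

alpha     <- 0.05
dIFcutoff <- 0.1

# An isoform counts as switching only if it passes both cutoffs:
# significant (q_value < alpha) AND a large enough effect size (|dIF| > dIFcutoff)
isoforms$switching <- isoforms$q_value < alpha & abs(isoforms$dIF) > dIFcutoff

isoforms[isoforms$switching, "isoform_id"]
```

In this toy table iso2 is highly significant but is dropped because its effect size is too small, which is the case the dIF cutoff is meant to catch.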

switchTestMethod

A string indicating which statistical method should be used for testing differential isoform usage. The following options are available:

  • 'DEXSeq' : Uses DEXSeq to perform the statistical test. See isoformSwitchTestDEXSeq. Default.

  • 'DRIMSeq' : Uses the DRIMSeq package to perform the statistical test. See isoformSwitchTestDRIMSeq.

  • 'none' : No statistical test is performed. Should only be used if a test has already been performed and should not be overwritten (e.g. when importing cuffdiff data).

n

The number of top genes (after filtering and sorted according to sortByQvals) that should be saved to each subfolder indicated by splitConditions, splitFunctionalConsequences. Use NA to create all. Default is NA (all).
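The meaning of n, including the NA case, can be sketched in base R. The helper topGenes and the gene table are hypothetical, and the sketch assumes genes are ranked by q-value; the package's own ordering depends on its sortByQvals setting.

```r
# Hypothetical gene-level q-values; names and values are illustrative only.
genes <- data.frame(
    gene_id = c("geneA", "geneB", "geneC", "geneD"),
    q_value = c(0.020, 0.001, 0.040, 0.010),
    stringsAsFactors = FALSE
)

# Return the top n genes by q-value; n = NA keeps every gene (hypothetical helper)
topGenes <- function(df, n = NA) {
    sorted <- df[order(df$q_value), ]
    if (is.na(n)) {
        return(sorted$gene_id)
    }
    head(sorted$gene_id, n)
}

topGenes(genes, n = 2)  # the two most significant genes
topGenes(genes)         # n = NA: all genes, most significant first
```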

pathToOutput

A path to the folder in which the plots should be saved. Default is the working directory ( getwd() ).

overwriteORF

A logical indicating whether to overwrite the ORF analysis already stored in the supplied switchAnalyzeRlist. Default is FALSE.

outputSequences

A logical indicating whether transcript nucleotide and amino acid sequences should be output to outputDestination. Default is FALSE.

genomeObject

A BSgenome object (for example Hsapiens for Homo sapiens).

orfMethod

A string indicating which of the 3 ORF identification methods should be used. The methods are:

  • longest : Identifies the longest ORF in the transcript. This approach is similar to what the CPAT tool uses in its analysis of coding potential

  • longestAnnotated : Identifies the longest ORF downstream of an annotated translation start site (supplied via the cds argument)

  • mostUpstreamAnnoated : Identifies the ORF downstream of the most upstream overlapping annotated translation start site (supplied via the cds argument)

Default is longest.
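The idea behind the longest method can be illustrated with a simplified base-R sketch. The function longestORF is hypothetical and is not how analyzeORF is implemented: it assumes a single forward-strand nucleotide string, the canonical ATG start codon and the three standard stop codons, and it reports coordinates that include the stop codon.

```r
# Simplified longest-ORF search on the forward strand of a toy sequence.
# Hypothetical helper for illustration; not the package's implementation.
longestORF <- function(nt, startCodon = "ATG",
                       stopCodons = c("TAA", "TAG", "TGA")) {
    best <- c(start = NA_integer_, end = NA_integer_)
    bestLength <- 0L
    n <- nchar(nt)
    if (n < 3L) return(best)
    for (start in seq_len(n - 2L)) {
        if (substr(nt, start, start + 2L) != startCodon) next
        pos <- start + 3L
        # Walk codon by codon in the same frame until a stop codon or sequence end
        while (pos + 2L <= n) {
            codon <- substr(nt, pos, pos + 2L)
            if (codon %in% stopCodons) {
                orfLength <- pos + 2L - start + 1L
                if (orfLength > bestLength) {
                    bestLength <- orfLength
                    best <- c(start = start, end = pos + 2L)
                }
                break
            }
            pos <- pos + 3L
        }
    }
    best
}

# ATG at position 3, in-frame stop codon TAG at positions 12-14
longestORF("GGATGAAACCCTAGTT")
```

The longestAnnotated and mostUpstreamAnnoated methods differ in that the start position is restricted to annotated translation start sites supplied via cds, rather than any start codon in the sequence.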

cds

A CDSSet object containing annotated coding regions, see ?CDSSet and ?getCDS for more information. Only necessary if the 'orfMethod' argument is 'longestAnnotated' or 'mostUpstreamAnnoated'.

consequencesToAnalyze

A vector of strings indicating what types of functional consequences to analyze. Note that there are bound to be some differences between transcripts (otherwise they would be identical). See details in analyzeSwitchConsequences for the full list of usable strings and their meaning. Default is c('intron_retention','ORF_seq_similarity','NMD_status') (corresponding to analysis of intron retention, ORF AA sequence similarity and NMD status).

fileType

A string indicating which file type is generated. Available options are 'pdf' and 'png'. Default is 'pdf'.

asFractionTotal

A logical indicating whether, in the plot summarizing the general consequences of all the isoform switches, the number of consequences should be shown as counts (if FALSE) or as a fraction of the total number of switches (if TRUE). Default is FALSE.

outputPlots

A logical indicating whether all isoform switch plots as well as the summary of functional consequences should be output to the directory specified by the pathToOutput argument. Default is TRUE.

quiet

A logical indicating whether to avoid printing progress messages (incl. progress bar). Default is FALSE.

Details

This function performs the full isoform switch analysis workflow through the following steps:

  1. Remove non-expressed isoforms and single-isoform genes (see preFilter)

  2. Predict switches (only if switches are not already annotated, see isoformSwitchTestDEXSeq)

  3. Analyze the isoforms for open reading frames (ORFs, see analyzeORF)

  4. Output fasta files containing the nucleotide and amino acid sequences, which enables external sequence analysis with CPAT, Pfam and SignalP (see extractSequence)

  5. Predict functional consequences of switching (see analyzeSwitchConsequences)

  6. Output isoform switch analysis plots for all genes with a significant switch (see switchPlot)

  7. Output a visualization of the general consequences of isoform switches.

Value

This function outputs:

  1. The supplied switchAnalyzeRlist, now annotated with all the analyses described above

  2. One folder per comparison of conditions containing the isoform switch analysis plots of all significant isoforms.

  3. A plot summarizing the overall consequences of all the isoform switches.

Author(s)

Kristoffer Vitting-Seerup

References

Vitting-Seerup et al. The Landscape of Isoform Switches in Human Cancers. Mol. Cancer Res. (2017).

See Also

isoformSwitchAnalysisPart1
isoformSwitchAnalysisPart2
preFilter
isoformSwitchTestDEXSeq
isoformSwitchTestDRIMSeq
analyzeORF
extractSwitchSummary
analyzeSwitchConsequences
switchPlotTopSwitches

Examples

data("exampleSwitchList")
exampleSwitchList

library(BSgenome.Hsapiens.UCSC.hg19)
exampleSwitchListAnalyzed <- isoformSwitchAnalysisCombined(
    switchAnalyzeRlist=exampleSwitchList,
    genomeObject = Hsapiens,
    dIFcutoff = 0.4,         # Set high for short runtime in example data
    outputSequences = FALSE, # keeps the function from outputting the fasta files from this example
    outputPlots = FALSE      # keeps the function from outputting the Isoform Switch AnalyzeR Plots from this example
)

exampleSwitchListAnalyzed

kvittingseerup/IsoformSwitchAnalyzeR documentation built on July 20, 2019, 8:54 a.m.