isoformSwitchAnalysisPart1: Isoform Switch Analysis Workflow Part 1: Extract Isoform...

View source: R/high_level_functions.R

isoformSwitchAnalysisPart1R Documentation

Isoform Switch Analysis Workflow Part 1: Extract Isoform Switches and Their Bio-sequences

Description

This high-level function takes a pre-existing switchAnalyzeRlist as input (see importRdata). Then part 1 of the workflow is performed. Specifically it is filtered to remove low expression, identifies significant isoform switches and ORF are predicted if not already annotated. Lastly the function extracts the nucleotide sequence and the ORF AA sequences of the isoforms involved in isoform switches. To enable external and internal sequence analysis these sequences are both saved to the computer (as fasta files) and added to the switchAnalyzeRlist.

This function is meant to be used as part 1 of the isoform switch analysis workflow, which can be followed by the second step via isoformSwitchAnalysisPart2.

Usage

isoformSwitchAnalysisPart1(
    ### Core arguments
    switchAnalyzeRlist,

    ### Annotation arguments
    genomeObject = NULL,
    pathToGTF = NULL,

    ### Output arguments
    prepareForWebServers,
    pathToOutput = getwd(),
    outputSequences = TRUE,

    ### Other arguments
    quiet = FALSE
)

Arguments

switchAnalyzeRlist

A switchAnalyzeRlist.

genomeObject

A BSgenome object (for example Hsapiens for Homo sapiens).

pathToGTF

A string indicating the full path to the (gziped or unpacked) GTF file which contains the the known annotation (aka from a official source) which was used to guided the transcript assembly (isoform deconvolution).

prepareForWebServers

A logical indicating whether the amino acid fasta files saved (if outputSequences=TRUE) should be prepared for the online web-services currently supported (as they have some limitations on what can submitted). See details. Default is FALSE (for backward compatibility).

pathToOutput

A path to the folder in which the plots should be made. Default is working directory ( getwd() ).

outputSequences

A logical indicating whether transcript nucleotide and amino acid sequences should be outputted to pathToOutput. Default is TRUE.

quiet

A logical indicating whether to avoid printing progress messages (incl. progress bar). Default is FALSE

Details

This function performs the first part of a Isoform Analysis Workflow by

  1. Remove low-expressed isoforms and single-isoform genes (see preFilter)

  2. Identify isoform switches. This uses satuRn if there are more than 5 replicates in any condition and else DEXseq.

  3. If no ORFs are annotated the isoforms are analyzed for open reading frames (ORFs, see analyzeORF)

  4. The isoform nucleotide and ORF amino acid sequences are extracted and saved to fasta files as well as added to the switchAnalyzeRlist enabling external sequence analysis with CPAT, Pfam and SignalP (see vignette for more info).

if prepareForWebServers=TRUE both the "removeLongAAseq" and "alsoSplitFastaFile" will be enabled in the extractSequence function.

Value

This function have two outputs. It returns a switchAnalyzeRlist object where information about the isoform switch test, ORF prediction and nt and aa sequences have been added. Secondly (if outputSequences=TRUE) the nucleotide and amino acid sequence of transcripts involved in switches are also save as fasta files enabling external sequence analysis.

Author(s)

Kristoffer Vitting-Seerup

References

Vitting-Seerup et al. The Landscape of Isoform Switches in Human Cancers. Mol. Cancer Res. (2017).

See Also

preFilter
isoformSwitchTestDEXSeq
isoformSwitchTestSatuRn
analyzeORF
extractSequence

Examples

### load example data
data("exampleSwitchList")

### Subset for quick runtime
exampleSwitchList <- subsetSwitchAnalyzeRlist(
    exampleSwitchList,
    abs(exampleSwitchList$isoformFeatures$dIF) > 0.4
)

### Show summary
summary(exampleSwitchList)

### Run Part 1
exampleSwitchList <- isoformSwitchAnalysisPart1(
    switchAnalyzeRlist=exampleSwitchList,
    prepareForWebServers = FALSE,
    outputSequences = FALSE # keeps the function from outputting the fasta files from this example
)

### Show summary
summary(exampleSwitchList)

kvittingseerup/IsoformSwitchAnalyzeR documentation built on Jan. 1, 2025, 9:08 p.m.