importPairedGSEA: Import PairedGSEA Results into IsoformSwitchAnalyzeR

importPairedGSEA    R Documentation

Import PairedGSEA Results into IsoformSwitchAnalyzeR

Description

Function for importing paired differential expression and splicing results from pairedGSEA, along with a (gzipped or unpacked) GTF file, into R as a switchAnalyzeRlist.

Usage

importPairedGSEA(
    splicing_results,
    diff_results,
    pathToGTF,
    isoCount = 10,
    min.Count.prop = 0.7,
    IFcutoff = 0.1,
    min.IF.prop = 0.5,
    acceptedGeneBiotype = NULL,
    acceptedIsoformClassCode = NULL,
    removeSingleIsoformGenes = TRUE,
    reduceToSwitchingGenes = FALSE,
    reduceFurtherToGenesWithConsequencePotential = FALSE,
    onlySigIsoforms = FALSE,
    keepIsoformInAllConditions = FALSE,
    alpha = 0.05,
    dIFcutoff = 0.1,
    detectUnwantedEffects = TRUE,
    addAnnotatedORFs = TRUE,
    onlyConsiderFullORF = FALSE,
    removeNonConvensionalChr = FALSE,
    ignoreAfterBar = TRUE,
    ignoreAfterSpace = TRUE,
    ignoreAfterPeriod = FALSE,
    removeTECgenes = TRUE,
    PTCDistance = 50,
    foldChangePseudoCount = 0.01,
    fixStringTieAnnotationProblem = TRUE,
    fixStringTieViaOverlapInMultiGenes = TRUE,
    fixStringTieMinOverlapSize = 50,
    fixStringTieMinOverlapFrac = 0.2,
    fixStringTieMinOverlapLog2RatioToContender = 0.65,
    estimateDifferentialGeneRange = TRUE,
    showProgress = TRUE,
    quiet = FALSE
)

Arguments

splicing_results

A DEXSeqResults object or a data frame from pairedGSEA that contains the splicing results, including isoform-level adjusted p-values (padj) and other splicing metrics. It is the intermediate isoform-level output of paired_diff() that is stored when you set store_results = TRUE. The object must contain columns such as groupID, featureID, and padj.

diff_results

A data.frame containing the paired differential expression results from pairedGSEA. It must include the following columns:

  • gene: Gene identifiers.

  • lfc_expression: Log fold change for expression.

  • pvalue_expression: Raw p-value for expression.

  • padj_expression: Adjusted p-value for expression.
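
It can be convenient to confirm that these columns are present before calling the function. The following is a minimal sketch in base R, not part of the package: check_diff_results() is a hypothetical helper, and the toy data frame holds made-up values purely for illustration.

```r
# Hypothetical helper (not part of IsoformSwitchAnalyzeR or pairedGSEA):
# stops with an informative message if a required column is missing.
check_diff_results <- function(diff_results) {
    required <- c(
        "gene", "lfc_expression", "pvalue_expression", "padj_expression"
    )
    missing_cols <- setdiff(required, colnames(diff_results))
    if (length(missing_cols) > 0) {
        stop(
            "diff_results is missing column(s): ",
            paste(missing_cols, collapse = ", ")
        )
    }
    invisible(TRUE)
}

# Toy input with made-up values, for illustration only
toy <- data.frame(
    gene              = c("geneA", "geneB"),
    lfc_expression    = c(1.2, -0.4),
    pvalue_expression = c(0.001, 0.30),
    padj_expression   = c(0.01, 0.45)
)
check_diff_results(toy)
```

The helper only checks column names; it does not validate the values themselves.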

pathToGTF

Can either be:

  • 1: A string indicating the full path to the (gzipped or unpacked) GTF file which has been quantified. If supplied, the exon structure and isoform annotation will be obtained from the GTF file. An example could be "myAnnotation/myGenome/isoformsQuantified.gtf".

  • 2: A string indicating the full path to the (gzipped or unpacked) RefSeq GFF file which has been quantified. If supplied, the exon structure and isoform annotation will be obtained from the GFF file. Please note that only RefSeq GFF files downloaded from ftp://ftp.ncbi.nlm.nih.gov/genomes/ are supported (see the database FAQ in the vignette for more info). An example could be "RefSeq/isoformsQuantified.gff".

isoCount, min.Count.prop, IFcutoff, min.IF.prop

Arguments for the filtering step. See preFilter for details.

acceptedGeneBiotype, acceptedIsoformClassCode, removeSingleIsoformGenes

Arguments to control which genes and isoforms are retained. See preFilter for details.

reduceToSwitchingGenes, reduceFurtherToGenesWithConsequencePotential, onlySigIsoforms, keepIsoformInAllConditions

Arguments for filtering based on isoform switches and downstream consequences. See preFilter for details.

alpha, dIFcutoff

Thresholds for significance and differential isoform fraction used in filtering. See preFilter for details.

detectUnwantedEffects, addAnnotatedORFs, onlyConsiderFullORF, removeNonConvensionalChr, ignoreAfterBar, ignoreAfterSpace, ignoreAfterPeriod, removeTECgenes, PTCDistance, foldChangePseudoCount, fixStringTieAnnotationProblem, fixStringTieViaOverlapInMultiGenes, fixStringTieMinOverlapSize, fixStringTieMinOverlapFrac, fixStringTieMinOverlapLog2RatioToContender, estimateDifferentialGeneRange

Advanced arguments for handling isoform annotations and data preprocessing. See importRdata for details.

showProgress

Logical, indicating whether progress messages should be displayed. Default is TRUE.

quiet

Logical, indicating whether to suppress all output messages. Default is FALSE.

Details

This function is specifically designed to import the paired differential gene expression and splicing analyses from pairedGSEA (particularly the paired_diff() function) into IsoformSwitchAnalyzeR. By integrating these results with GTF annotations, the function generates a switchAnalyzeRlist object that is ready for downstream analysis.

The function leverages the pre-built IsoformSwitchAnalyzeR functions importRdata() and preFilter() to streamline the integration and filtering process. Specifically, it:

  • Extracts and processes count and design matrices from splicing_results.

  • Integrates the gene-level differential expression results and the splicing analyses into the switchAnalyzeRlist.

  • Applies the preFilter() function to refine the dataset based on user-defined thresholds for significance, log fold change, expression levels, and isoform switching criteria.

If you encounter issues regarding specific arguments, refer to the documentation for importRdata and preFilter for detailed explanations of their functionality and parameters.
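
The workflow described above can be sketched as a single call. This is an illustrative wrapper only: run_import() and its argument gtf_path are hypothetical names, the inputs are assumed to have been produced by pairedGSEA beforehand, and the two thresholds are simply the defaults listed under Usage, written out explicitly.

```r
# Hypothetical wrapper, for illustration only. It forwards pairedGSEA
# results and an annotation path to importPairedGSEA().
run_import <- function(splicing_results, diff_results, gtf_path) {
    if (!requireNamespace("IsoformSwitchAnalyzeR", quietly = TRUE)) {
        stop("The IsoformSwitchAnalyzeR package is required for this sketch.")
    }
    IsoformSwitchAnalyzeR::importPairedGSEA(
        splicing_results = splicing_results, # isoform-level splicing results
        diff_results     = diff_results,     # gene-level expression results
        pathToGTF        = gtf_path,         # GTF or RefSeq GFF that was quantified
        alpha            = 0.05,             # default significance threshold
        dIFcutoff        = 0.1               # default dIF threshold
    )
}
```

All remaining arguments keep their defaults and are passed on to importRdata() and preFilter() as described in the Arguments section.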

Value

A switchAnalyzeRlist containing filtered and annotated gene and isoform information. See ?switchAnalyzeRlist for more details.

If no genes match after filtering, an empty switchAnalyzeRlist is returned with a warning.
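
Since the empty result is signalled through a warning rather than an error, a script may want to record that warning while still keeping the returned object. Below is a generic base R sketch of that pattern. with_warnings() and fake_import() are hypothetical names, and the warning text is invented for the demonstration; it is not the message the package emits.

```r
# Generic base R pattern: evaluate an expression, collect any warnings
# it raises, and still return its value.
with_warnings <- function(expr) {
    msgs <- character(0)
    value <- withCallingHandlers(
        expr,
        warning = function(w) {
            msgs <<- c(msgs, conditionMessage(w))
            invokeRestart("muffleWarning")
        }
    )
    list(value = value, warnings = msgs)
}

# Stand-in for the import call, used only to demonstrate the pattern.
# The warning text here is invented.
fake_import <- function() {
    warning("no genes left after filtering")
    list()
}

res <- with_warnings(fake_import())
if (length(res$warnings) > 0) {
    message("Import finished with warnings: ",
            paste(res$warnings, collapse = "; "))
}
```

In real use, the call to fake_import() would be replaced by the actual importPairedGSEA() call.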

Author(s)

Kristoffer Vitting-Seerup, Chunxu Han

References

Vitting-Seerup et al. The Landscape of Isoform Switches in Human Cancers. Mol. Cancer Res. (2017).

Dam, S.H., Olsen, L.R. & Vitting-Seerup, K. Expression and splicing mediate distinct biological signals. BMC Biol 21, 220 (2023).

See Also

importRdata
preFilter


kvittingseerup/IsoformSwitchAnalyzeR documentation built on Jan. 1, 2025, 9:08 p.m.