kissplice2counts: Conversion of 'KisSplice' or 'KisSplice2RefGenome' outputs

View source: R/kissplice2counts.R

kissplice2counts R Documentation

Conversion of KisSplice or KisSplice2RefGenome outputs

Description

Function that converts a KisSplice (.fa) output or a KisSplice2RefGenome (tab-delimited) output into a counts data frame that can be used by the other functions of the kissDE package.

Usage

kissplice2counts(fileName, counts = 2, pairedEnd = FALSE, order = NULL, 
    exonicReads = TRUE, k2rg = FALSE, keep = c("All"), remove = NULL)

Arguments

fileName

a string indicating the path to the KisSplice (.fa) or the KisSplice2RefGenome (tab-delimited) file.

counts

an integer (0, 1 or 2) corresponding to the KisSplice counts option used (see Details below).

pairedEnd

a logical indicating if the data is paired-end (FALSE, default). If set to TRUE, the counts from the two reads of each pair are summed. It can be used along with the counts option. By default, it is assumed that, in the KisSplice command line, the two read files of the same pair were given one right after the other. If this is not the case, see the order option.

order

a numeric vector indicating the actual order of the paired reads in the columns of the KisSplice output, so that the counts of each pair can be summed. This option goes along with pairedEnd = TRUE and is needed only if the read pairs are not in the expected order (see the pairedEnd option). It has one element per input read file, that is, per count column in the KisSplice output. For more information on this parameter, see Details below.

exonicReads

a logical indicating if exonic/intronic read counts will be kept (TRUE, default) or discarded (FALSE). This option only works if counts = 2.

k2rg

a logical indicating if the input file is a KisSplice2RefGenome (TRUE) output or a KisSplice (FALSE, default) output file.

keep

a character vector listing the names of the events to be kept for the statistical test (for k2rg = TRUE, all events are analysed by default). The test will be more sensitive for the selected events. Event names must be part of this list: deletion, insertion, indel, IR, ES, altA, altD, altAD, alt, - (for unclassified events). For more information on this parameter, see Details below.

remove

a character vector listing the names of the events to remove for the statistical test (for k2rg = TRUE, no event is removed by default). The test will be more sensitive for the non-selected events. Event names must be part of this list: deletion, insertion, indel, IR, ES, altA, altD, altAD, alt, - (for unclassified events), MULTI. This option cannot be used along with the keep option, unless ES is one of the events to be kept. In this case, the remove option will work on specific ES events. For more information on this parameter, see Details below.

Details

The counts parameter:

Setting counts to 0 assumes that KisSplice was run with no special counting option; note that the Usage section above lists counts = 2 as the default of kissplice2counts. Below is an example of the upper path counts format output by KisSplice when counts is set to 2:

|AS1_0|SB1_0|S1_0|ASSB1_0|AS2_27|SB2_41|S2_0|ASSB2_21|AS3_0|SB3_0|S3_0|ASSB3_0|AS4_7|SB4_8|S4_0|ASSB4_2

In a regular KisSplice output (counts = 0), it would be:

|C1_0|C2_47|C3_1|C4_13 (with 47 = 27+41+0-21 and 13 = 7+8+0-2)
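The arithmetic quoted above (AS + SB + S - ASSB for each sample) can be reproduced in base R. This is only an illustrative sketch and not the code used inside kissplice2counts; the variable names (header, fields, signed, collapsed) are chosen for this example.

```r
# Upper path counts as written by KisSplice with counts = 2 (copied from the text above).
header <- "|AS1_0|SB1_0|S1_0|ASSB1_0|AS2_27|SB2_41|S2_0|ASSB2_21|AS3_0|SB3_0|S3_0|ASSB3_0|AS4_7|SB4_8|S4_0|ASSB4_2"

# Split on "|" and drop the empty leading field.
fields <- strsplit(header, "|", fixed = TRUE)[[1]]
fields <- fields[nzchar(fields)]

# Each field has the form <type><sample>_<count>, e.g. "ASSB2_21".
parts  <- regmatches(fields, regexec("^([A-Z]+)([0-9]+)_([0-9]+)$", fields))
type   <- vapply(parts, `[`, character(1), 2)
sample <- as.integer(vapply(parts, `[`, character(1), 3))
count  <- as.integer(vapply(parts, `[`, character(1), 4))

# Following the arithmetic in the text, ASSB counts are subtracted
# and AS, SB and S counts are added.
signed    <- ifelse(type == "ASSB", -count, count)
collapsed <- sapply(split(signed, sample), sum)

collapsed[["2"]]  # 47 = 27 + 41 + 0 - 21
collapsed[["4"]]  # 13 = 7 + 8 + 0 - 2
```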

The order parameter:

If the reads corresponding to a paired-end fragment have not been passed to KisSplice next to each other, the order needs to be given explicitly to the kissplice2counts function. For instance, if there are two paired-end samples and the KisSplice input was: -r sample1_readPair1.fa -r sample2_readPair1.fa -r sample1_readPair2.fa -r sample2_readPair2.fa, the input is not organised with the reads of one pair next to each other. The order vector to give would be order = c(1, 2, 1, 2).
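To make the role of the order vector concrete, the sketch below sums the columns of a toy count matrix that share the same index. The matrix values and the names counts, pairOrder and summed are invented for this illustration; kissplice2counts may implement the summation differently.

```r
# Toy counts: 2 events (rows) x 4 read files (columns), in the order given to KisSplice:
# sample1_readPair1, sample2_readPair1, sample1_readPair2, sample2_readPair2.
counts <- matrix(c(10, 4, 12, 6,
                    0, 7,  1, 9),
                 nrow = 2, byrow = TRUE,
                 dimnames = list(c("event1", "event2"), NULL))

# Columns 1 and 3 belong to sample 1, columns 2 and 4 to sample 2.
pairOrder <- c(1, 2, 1, 2)

# rowsum() adds up rows by group, so transpose, sum by sample, and transpose back.
summed <- t(rowsum(t(counts), group = pairOrder))
summed
```

Each column of summed then holds the total count of one sample: for event1, 10 + 12 for sample 1 and 4 + 6 for sample 2.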

The keep and remove parameters:

The options keep and remove allow the user to select the type of alternative splicing events from KisSplice2RefGenome that have to be analysed. To work only with intron retention events, the vector should be: keep = c("IR"). To work on all events except insertions and deletions, the vector should be remove = c("insertion","deletion"). To work specifically on single exon skipping (ES) events (only one exon can be included or excluded), both keep and remove options must be used. The keep option should be set to c("ES") and the remove option should be set to c("alt","altA","altD","altAD","MULTI").
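The three selections described above can be written as calls along the following lines. The file path is a hypothetical placeholder, and the calls are only run if kissDE is installed and the file exists; other arguments (counts, pairedEnd, and so on) are left at their defaults here and should be adapted to the actual KisSplice run.

```r
# Hypothetical path to a KisSplice2RefGenome output file (placeholder, not a real file).
k2rgPath <- "path/to/k2rg_output.tsv"

# Argument sets matching the three cases described in the text.
keepIR   <- list(fileName = k2rgPath, k2rg = TRUE, keep = c("IR"))
noIndel  <- list(fileName = k2rgPath, k2rg = TRUE,
                 remove = c("insertion", "deletion"))
singleES <- list(fileName = k2rgPath, k2rg = TRUE, keep = c("ES"),
                 remove = c("alt", "altA", "altD", "altAD", "MULTI"))

# Run the conversions only when the package and the input file are available.
if (requireNamespace("kissDE", quietly = TRUE) && file.exists(k2rgPath)) {
  irCounts      <- do.call(kissDE::kissplice2counts, keepIR)
  noIndelCounts <- do.call(kissDE::kissplice2counts, noIndel)
  esCounts      <- do.call(kissDE::kissplice2counts, singleES)
}
```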

Value

kissplice2counts returns a list of 4 objects:

countsEvents

a data frame containing several columns: a first column (events.names) with the name of the event based on KisSplice notation, a second one (events.length) containing the length of the event, and the remaining columns (counts1 to countsN) with the counts corresponding to the replicates of the conditions.

psiInfo

a data frame containing information to compute the PSI values. This data frame is used only when counts is different from 0.

exonicReadsInfo

a logical indicating if exonicReads are used.

k2rgFile

a string containing the KisSplice2RefGenome path and file name. It is equal to NULL if the input file comes from KisSplice.

Only countsEvents is shown when kissplice2counts output is printed.

Examples

fpath <- system.file("extdata", "output_kissplice_SNV.fa", package="kissDE")
mySNVcounts <- kissplice2counts(fpath, counts = 0, pairedEnd=TRUE)
names(mySNVcounts)
str(mySNVcounts)
head(mySNVcounts$countsEvents)

aursiber/kissDE documentation built on Jan. 28, 2024, 10:01 a.m.