iso.switch | R Documentation |
This function searches for and scores transcript isoform switches in time-series expression data.
iso.switch(data.exp, mapping, times, rank = F, min.t.points = 2,
min.difference = 1, spline = F, spline.df = NULL, verbose = T)
data.exp |
Time-series isoform expression data with the first row indicating the replicate labels and the second row indicating the time points. The remaining rows contain isoform names in the first column followed by the expression values. All replicates of each time point should be grouped together, and the time points should follow sequential order. |
mapping |
gene and isoform mapping table with gene names in first column and transcript isoform names in the second column |
times |
a numeric vector of time labels for all replicated samples, e.g. 1,1,1,2,2,2,3,3,3,... |
rank |
logical (TRUE or FALSE). Should isoform expression be converted to ranks of isoform expression on a per-sample basis? |
min.t.points |
if the number of time points in all intervals < |
min.difference |
If the mean of differences of average isoform expression or spline fitted expression < |
spline |
logical, whether to use spline method to fit isoform expression (TRUE) or mean expression of time points (FALSE). |
spline.df |
the degree of freedom used in spline method. See |
verbose |
logical, whether to report running progress (TRUE) or not (FALSE). |
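The input shapes described above can be illustrated with a small toy data set. This is only a sketch: the object names, the isoform and gene labels, and the expression values are invented, and the replicate-label and time-point header rows are assumed to be carried separately (in the column names and in `times`). The rank conversion shown is one plausible reading of `rank = TRUE`, not necessarily what the function does internally.

```r
# Toy inputs; all names and values are made up for illustration.
n.rep <- 3                              # replicates per time point (assumed)
t.points <- 1:4                         # time points in sequential order
times <- rep(t.points, each = n.rep)    # 1,1,1,2,2,2,3,3,3,4,4,4

# Gene-to-isoform mapping: gene names first, isoform names second.
mapping <- data.frame(
  genes = c("G1", "G1"),
  isoforms = c("G1.1", "G1.2"),
  stringsAsFactors = FALSE
)

# Expression matrix: one row per isoform, one column per sample,
# with replicates of each time point grouped together.
set.seed(1)
data.exp <- matrix(
  runif(nrow(mapping) * length(times), min = 0, max = 10),
  nrow = nrow(mapping),
  dimnames = list(mapping$isoforms,
                  paste0("T", times, ".rep", seq_len(n.rep)))
)

# Assumed meaning of rank = TRUE: rank the isoforms within each sample.
data.rank <- apply(data.exp, 2, rank)
```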
The detailed steps:
Figure 1: Isoform switch analysis methods. Expression data with 3 replicates for each condition/time point is simulated for isoforms iso_i and iso_j. The points in the plots represent the samples and the black lines connect the average of samples. (A) is the iso-kTSP algorithm for comparisons of two conditions c_1 and c_2. Time-Series Isoform Switch (TSIS) tool is designed for detection and characterization of isoform switches for time series data shown in figure (B). The time-series with 6 time points is divided into 4 intervals by the intersection points of average expression.
Step 1: determine switch points.
Given that a pair of isoforms iso_i and iso_j may have a number of switches in a time-series, we have offered two approaches to search for the switch time points in TSIS:
The first approach takes the average values of the replicates for each time point for each transcript isoform. Then it searches for the cross points of the average values of two isoforms across the time points (see Figure 1(B)).
The second approach uses natural spline curves to fit the time-series data for each transcript isoform and finds the cross points of the fitted curves for each pair of isoforms.
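The first approach can be sketched as follows. The helper `cross.points` is hypothetical and not part of TSIS; it assumes the mean curves are joined by straight lines between consecutive time points and reports only strict sign changes of the difference, so an exact tie at a time point is not handled. The expression values are invented.

```r
# Hypothetical helper: intersections of two piecewise-linear mean curves.
cross.points <- function(t, y1, y2) {
  y1 <- as.numeric(y1)
  y2 <- as.numeric(y2)
  d <- y1 - y2
  out <- list()
  for (k in seq_len(length(t) - 1)) {
    # A sign change of the difference means a crossing in (t[k], t[k + 1]).
    if (d[k] * d[k + 1] < 0) {
      frac <- d[k] / (d[k] - d[k + 1])
      x <- t[k] + frac * (t[k + 1] - t[k])
      y <- y1[k] + frac * (y1[k + 1] - y1[k])
      out[[length(out) + 1]] <- c(x.value = x, y.value = y)
    }
  }
  do.call(rbind, out)   # NULL when the curves never cross
}

times <- rep(1:4, each = 3)
iso.i <- c(8, 9, 10,  6, 7, 8,  2, 3, 4,  1, 2, 3)
iso.j <- c(1, 2, 3,   3, 4, 5,  6, 7, 8,  8, 9, 10)

# Average the replicates of each time point.
mean.i <- tapply(iso.i, times, mean)
mean.j <- tapply(iso.j, times, mean)

sw <- cross.points(sort(unique(times)), mean.i, mean.j)
```

In this toy example the two mean curves cross once, between time points 2 and 3.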
In most cases, these two methods produce very similar results. However, average values of expression may lose precision because they do not use information from the preceding and following time points. The spline method fits the time-series of expression with control points (depending on the spline degrees of freedom provided) and weights of several neighbours to obtain the designed precision (Hastie and Tibshirani, 1990). The spline method is useful for finding global trends in the time-series data when the data is very noisy, but it may miss details of an isoform switch in a local region. It is recommended that users search for the switch points with both the average and the spline methods and examine the switches manually when the two methods produce inconsistent results.
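The spline approach can be sketched with `ns()` from the `splines` package and `lm()`. This is a minimal illustration under stated assumptions, not the TSIS implementation: the simulated data, the value of `spline.df`, and the grid-plus-`uniroot()` search for crossings are all choices made here for demonstration.

```r
library(splines)

# Simulated data: iso.i decreases and iso.j increases over 6 time points.
times <- rep(1:6, each = 3)
set.seed(42)
iso.i <- 10 - 1.5 * times + rnorm(length(times), sd = 0.3)
iso.j <- 1 + 1.5 * times + rnorm(length(times), sd = 0.3)

spline.df <- 4   # illustrative degrees of freedom
fit.i <- lm(iso.i ~ ns(times, df = spline.df))
fit.j <- lm(iso.j ~ ns(times, df = spline.df))

# Difference of the two fitted curves at arbitrary time x.
fit.diff <- function(x) {
  nd <- data.frame(times = x)
  predict(fit.i, newdata = nd) - predict(fit.j, newdata = nd)
}

# Scan a fine grid for sign changes, then refine each one with uniroot().
grid <- seq(min(times), max(times), length.out = 200)
d <- fit.diff(grid)
idx <- which(d[-1] * d[-length(d)] < 0)
x.cross <- vapply(
  idx,
  function(k) uniroot(fit.diff, c(grid[k], grid[k + 1]))$root,
  numeric(1)
)
```

Comparing `x.cross` with the cross points of the replicate averages is one way to carry out the consistency check recommended above.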
Step 2: define switch scoring measures
We define each transcript isoform switch by 1) the switch point P_i, 2) time points between switch points P_(i-1) and P_i as interval I_1 before switch P_i and 3) time points between switch points P_i and P_(i+1) as interval I_2 after the switch P_i (see Figure 1(B)). We defined five measurements to characterize each isoform switch. The first two are the probability/frequency of switch and the sum of average sample differences before and after switch, which are similar to Score 1 and Score 2 in the iso-kTSP method (Sebestyen et al., 2015) (see Figure 1(A)).
Firstly, for a switch point P_i of two isoforms iso_i and iso_j with interval I_1 before the switch and interval I_2 after the switch (Figure 1(B)), Score 1 is defined as
S_1(iso_i, iso_j | I_1, I_2) = |p(iso_i > iso_j | I_1) + p(iso_i < iso_j | I_2) - 1|,
where p(iso_i > iso_j | I_1) and p(iso_i < iso_j | I_2) are the frequencies/probabilities that the samples of one isoform are greater or less than those of the other in the corresponding intervals.
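As a worked illustration of Score 1, the frequencies can be computed as sample proportions within each interval. The function name `score1`, the logical interval indicators, and the data are assumptions made for this sketch.

```r
# Score 1: |p(iso_i > iso_j | I_1) + p(iso_i < iso_j | I_2) - 1|
score1 <- function(x.i, x.j, in.I1, in.I2) {
  p1 <- mean(x.i[in.I1] > x.j[in.I1])   # frequency of iso_i > iso_j in I_1
  p2 <- mean(x.i[in.I2] < x.j[in.I2])   # frequency of iso_i < iso_j in I_2
  abs(p1 + p2 - 1)
}

times <- rep(1:4, each = 3)
iso.i <- c(8, 9, 10,  6, 7, 8,  2, 3, 4,  1, 2, 3)
iso.j <- c(1, 2, 3,   3, 4, 5,  6, 7, 8,  8, 9, 10)

# Assume the switch lies between time points 2 and 3.
s1 <- score1(iso.i, iso.j, in.I1 = times <= 2, in.I2 = times >= 3)

# A pair that never switches: iso_i stays above the other isoform throughout.
s1.none <- score1(iso.i, iso.i - 1, in.I1 = times <= 2, in.I2 = times >= 3)
```

Here `s1` is 1 because every sample agrees with the switch, and `s1.none` is 0 because the ordering is the same on both sides.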
Secondly, instead of rank differences as in iso-kTSP to avoid possible ties, we directly use the average abundance differences. The sum of the mean differences of samples in intervals I_1 and I_2 is calculated as
S_2(iso_i, iso_j | I_1, I_2) = d(iso_i, iso_j | I_1) + d(iso_i, iso_j | I_2),
where d(iso_i, iso_j | I_k) is the average expression difference in interval I_k, k = 1, 2, defined as
d(iso_i, iso_j | I_k) = \frac{1}{|I_k|} \sum_{m_{I_k}} |exp(iso_i | s_{m_{I_k}}, I_k) - exp(iso_j | s_{m_{I_k}}, I_k)|,
where |I_k| is the number of samples in interval I_k and exp(iso_i | s_{m_{I_k}}, I_k) is the expression of iso_i in sample s_{m_{I_k}} of interval I_k.
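Score 2 follows directly from the definition of d. The function names `mean.abs.diff` and `score2` and the toy data below are hypothetical and serve only to show the arithmetic.

```r
# d(iso_i, iso_j | I_k): mean absolute expression difference over samples in I_k.
mean.abs.diff <- function(x.i, x.j, in.I) {
  mean(abs(x.i[in.I] - x.j[in.I]))
}

# Score 2: sum of the mean differences before and after the switch.
score2 <- function(x.i, x.j, in.I1, in.I2) {
  mean.abs.diff(x.i, x.j, in.I1) + mean.abs.diff(x.i, x.j, in.I2)
}

times <- rep(1:4, each = 3)
iso.i <- c(8, 9, 10,  6, 7, 8,  2, 3, 4,  1, 2, 3)
iso.j <- c(1, 2, 3,   3, 4, 5,  6, 7, 8,  8, 9, 10)

d1 <- mean.abs.diff(iso.i, iso.j, times <= 2)          # 5
d2 <- mean.abs.diff(iso.i, iso.j, times >= 3)          # 5.5
s2 <- score2(iso.i, iso.j, times <= 2, times >= 3)     # 10.5
```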
Thirdly, paired t-tests were carried out to test if there are significant differences for the two switched isoforms within the intervals before and after the switch.
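A paired t-test per interval can be run with `t.test(..., paired = TRUE)` from base R. The sketch below assumes that samples are paired by column (the same sample measured for both isoforms); the data and interval boundaries are invented.

```r
times <- rep(1:4, each = 3)
iso.i <- c(8, 9, 10,  6, 7, 8,  2, 3, 4,  1, 2, 3)
iso.j <- c(1, 2, 3,   3, 4, 5,  6, 7, 8,  8, 9, 10)

in.I1 <- times <= 2   # interval before the assumed switch point
in.I2 <- times >= 3   # interval after the assumed switch point

# Paired t-test of the two isoforms within each interval.
left.pval <- t.test(iso.i[in.I1], iso.j[in.I1], paired = TRUE)$p.value
right.pval <- t.test(iso.i[in.I2], iso.j[in.I2], paired = TRUE)$p.value
```

Note that `t.test` stops with an error when the paired differences are constant, so intervals with identical differences in every sample need separate handling.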
Fourthly, the numbers of time points in intervals I_1 and I_2 were also provided, which indicate whether this switch is transient or a long lived change.
Finally, isoforms with high negative correlations across the time points may indicate that splicing regulation has antagonistic effects on the switched isoforms. They are therefore of great interest for investigating alternative splicing regulation. As an additional score, we calculated the Pearson correlation of the two isoforms across the whole time series.
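The correlation score can be illustrated with `cor()`. Whether the correlation is taken over the per-time-point averages or over all samples is not stated here; the sketch assumes the averages, and the data are invented.

```r
times <- rep(1:4, each = 3)
iso.i <- c(8, 9, 10,  6, 7, 8,  2, 3, 4,  1, 2, 3)
iso.j <- c(1, 2, 3,   3, 4, 5,  6, 7, 8,  8, 9, 10)

# Average the replicates of each time point (assumption: correlate the means).
mean.i <- as.numeric(tapply(iso.i, times, mean))
mean.j <- as.numeric(tapply(iso.j, times, mean))

# Pearson correlation across the whole time series.
iso.cor <- cor(mean.i, mean.j, method = "pearson")
```

A value close to -1, as in this toy pair, corresponds to the antagonistic pattern described above.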
For TSIS analysis details and examples, please go to the user manual on GitHub: https://github.com/wyguo/TSIS.
a data frame of scores. The column names:
iso1,iso2: the isoform pairs.
iso1.mean.ratio, iso2.mean.ratio: the mean ratios of isoforms to their gene.
left.interval, right.interval: The intervals before and after switch points.
x.value, y.value: The values of x axis (time) and y axis (expression) coordinates of the switch points.
left.prob, right.prob: the frequencies/probabilities that the samples of an isoform are greater or less than those of the other in the left and right intervals, respectively.
left.diff, right.diff: the average sample differences in intervals before and after switch, respectively.
left.pval, right.pval: the paired t-test p-values of the samples in the intervals before and after switch points.
left.t.points, right.t.points: the number of time points in intervals before and after the switch points.
prob: Score1: the probability/frequency of switch.
diff: Score2: the sum of average sample differences before and after switch.
cor: Pearson correlation of two isoforms.
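The returned scores can be filtered with ordinary data frame subsetting. The table below is a hand-made stand-in with a subset of the columns listed above, and every cut-off is a hypothetical choice for illustration rather than a recommended default.

```r
# Toy score table; values are invented.
scores <- data.frame(
  iso1 = c("G1.1", "G2.1", "G3.1"),
  iso2 = c("G1.2", "G2.2", "G3.2"),
  left.pval = c(0.001, 0.20, 0.01),
  right.pval = c(0.004, 0.03, 0.02),
  left.t.points = c(3, 2, 1),
  right.t.points = c(3, 4, 5),
  prob = c(1.0, 0.6, 0.9),
  diff = c(10.5, 0.8, 4.2),
  cor = c(-0.95, 0.10, -0.40),
  stringsAsFactors = FALSE
)

# Hypothetical cut-offs; choose values suited to the data at hand.
keep <- with(scores,
  prob > 0.5 & diff > 1 &
  left.pval < 0.05 & right.pval < 0.05 &
  left.t.points >= 2 & right.t.points >= 2
)
significant <- scores[keep, ]
```

Only the first pair passes all filters in this example: the second fails on `diff` and `left.pval`, and the third has a single time point in its left interval.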
1. Sebestyen E, Zawisza M, Eyras E: Detection of recurrent alternative splicing switches in tumor samples reveals novel signatures of cancer. Nucleic Acids Res 2015, 43(3):1345-1356.
2. Hastie, T.J. and Tibshirani, R.J. Generalized additive models. Chapter 7 of Statistical Models in S eds. Wadsworth & Brooks/Cole 1990.