
Relative Frequency Sequence Plots (RFS plots) plot a selection of representative sequences as sequence index plots (see `seqIplot`). RFS plots proceed in several steps. First, a set of sequences is ordered according to a substantively meaningful principle, e.g. according to their score on the first factor derived by applying multidimensional scaling (the default), or according to a user-defined sorting variable such as the timing of a transition of interest. Then the sorted set of sequences is partitioned into *k* equal-sized frequency groups. For each frequency group, the medoid sequence is selected as a representative. The selected representatives are plotted as sequence index plots. RFS plots come with an additional distance-to-medoid box plot that visualizes the distances of all sequences in a frequency group to their respective medoid. Further, an R2 and an F statistic are given that indicate how well the selected medoids represent a given set of sequences.
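The steps described above can be sketched in base R. This is only an illustration under stated assumptions, not the code used by `seqplot.rf`: the data are made-up numeric profiles, `dist()` stands in for a sequence dissimilarity matrix such as one returned by `seqdist`, and the grouping rule (`ceiling(rank * k / n)`) is one plausible way to form near-equal-sized groups.

```r
## Stand-in data (assumption): 60 numeric profiles instead of sequences;
## dist() replaces a dissimilarity matrix computed on real sequences.
set.seed(42)
x <- matrix(rnorm(60 * 4), nrow = 60)
diss <- as.matrix(dist(x))
n <- nrow(diss)
k <- 6

## Step 1: order the cases by their score on the first MDS factor.
mds1 <- cmdscale(diss, k = 1)[, 1]
ord <- order(mds1)

## Step 2: cut the ordered cases into k (near-)equal-sized groups.
grp <- integer(n)
grp[ord] <- ceiling(seq_len(n) * k / n)

## Step 3: within each group, the medoid is the case with the smallest
## summed distance to all other members of the group.
medoids <- unname(sapply(split(seq_len(n), grp), function(idx) {
  idx[which.min(rowSums(diss[idx, idx, drop = FALSE]))]
}))

## Distance of every case to the medoid of its own group
## (the input of a distance-to-medoid box plot).
dist_to_medoid <- diss[cbind(seq_len(n), medoids[grp])]
boxplot(dist_to_medoid ~ grp, horizontal = TRUE,
        xlab = "Distance to medoid", ylab = "Frequency group")
```

Only the `k` medoids would then be drawn as a sequence index plot, one row per frequency group.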


`seqdata` |
a state sequence object created with the |

`k` |
integer: Number of groupings (frequency groups?) |

`diss` |
matrix of pairwise dissimilarities between sequences in |

`sortv` |
an optional sorting variable that may be used to compute the frequency groups. If |

`ylab` |
string. An optional label for the y-axis. If set as |

`yaxis` |
logical. Controls whether a y-axis is plotted. When set as |

`main` |
main graphic title. Default is |

`which.plot` |
string. One of |

`...` |
arguments passed to |

RFS plots are useful to visualize large sets of sequences that cannot be plotted with sequence index plots due to overplotting (see `seqIplot`). Because the sequences are partitioned into equal-sized frequency groups, each selected sequence represents an equal portion of the original sample, so the plot visually maintains the relative proportion of the different types of sequences along the sorting criterion. The ideal number *k* of frequency groups depends on the size of the original sample and the empirical distribution of the sequences. The larger the sample and the more heterogeneous the sequences, the higher the advisable value of *k*. To avoid overplotting, *k* should generally not be higher than 200.

Note that distance-to-medoid plots are meaningful only if there are at least 5-10 sequences in each frequency group. The distance-to-medoid plot is not only a quality criterion of how well each medoid represents its frequency group. It also provides additional substantive information about how large the variation of sequences is at a given location of the ordered sequences (see Fasang and Liao 2014).
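The two rules of thumb above (no more than 200 groups, and at least 5-10 sequences per group) can be combined into an upper bound for *k*. The helper `max_k` below is hypothetical and not part of `TraMineR` or `TraMineRextras`; it only encodes these guidelines and does not account for how heterogeneous the sequences are.

```r
## Hypothetical helper: largest k that keeps at least `min_size` sequences
## in each frequency group, capped at `cap` groups to limit overplotting.
max_k <- function(n, min_size = 10, cap = 200) {
  max(1, min(cap, floor(n / min_size)))
}

max_k(100)                # 10
max_k(100, min_size = 5)  # 20
max_k(5000)               # 200
```

With the 100 cases used in the example below, `k=12` gives about 8 sequences per group, which falls inside the 5-10 range.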

Since ties in `sortv` or in the MDS scores are randomly ordered, one has to set the seed to reproduce exactly the same plot (see `set.seed`).
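Why the seed matters can be shown with a small base R sketch. The sorting variable is made up, and `random_tie_order` is a hypothetical helper that breaks ties with a random secondary key; the way `seqplot.rf` breaks ties internally may differ.

```r
## Made-up sorting variable containing ties.
sortv <- c(3, 1, 2, 1, 3, 2, 1)

## Hypothetical helper: a random secondary key decides the order within ties.
random_tie_order <- function(v) order(v, sample(length(v)))

set.seed(123)
o1 <- random_tie_order(sortv)
set.seed(123)
o2 <- random_tie_order(sortv)

identical(o1, o2)        # TRUE: same seed, same order of tied cases
!is.unsorted(sortv[o1])  # TRUE: cases are still sorted by sortv
```

In practice this means calling `set.seed` with a fixed value immediately before `seqplot.rf`.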

Unlike the other `TraMineR` plotting functions, the `seqplot.rf` function ignores the `weights` and does not support the `group` argument.

Matthias Studer, Anette Eva Fasang and Tim Liao.

Fasang, Anette Eva and Tim F. Liao. 2014. "Visualizing Sequences in the Social Sciences: Relative Frequency Sequence Plots." Sociological Methods & Research 43(4):643-676.

```
## Load the package providing seqplot.rf (this also attaches TraMineR)
library(TraMineRextras)

## Defining a sequence object with the data in columns 10 to 25
## (family status from age 15 to 30) in the biofam data set
data(biofam)
biofam.lab <- c("Parent", "Left", "Married", "Left+Marr",
                "Child", "Left+Child", "Left+Marr+Child", "Divorced")

## Here, we use only 100 cases selected such that all elements
## of the alphabet are present.
## (More cases and a larger k would be necessary to get a meaningful example.)
biofam.seq <- seqdef(biofam[501:600, ], 10:25, labels=biofam.lab)
diss <- seqdist(biofam.seq, method="LCS")

## Using 12 groups and default MDS sorting
seqplot.rf(biofam.seq, diss=diss, k=12,
           main="Non meaningful example (n=100)")

## With a user specified sorting variable
## Here time spent in parental home
parentTime <- seqistatd(biofam.seq)[, 1]
seqplot.rf(biofam.seq, diss=diss, k=12, sortv=parentTime,
           main="Sorted by parent time")
```

```
Loading required package: TraMineR
TraMineR stable version 2.0-11.1 (Built: 2019-05-12)
Website: http://traminer.unige.ch
Please type 'citation("TraMineR")' for citation information.
TraMineRextras stable version 0.4.5 (Built: 2019-05-11)
Functions provided by this package are still in test
and subject to changes in future releases.
[>] 8 distinct states appear in the data:
1 = 0
2 = 1
3 = 2
4 = 3
5 = 4
6 = 5
7 = 6
8 = 7
[>] state coding:
[alphabet] [label] [long label]
1 0 0 Parent
2 1 1 Left
3 2 2 Married
4 3 3 Left+Marr
5 4 4 Child
6 5 5 Left+Child
7 6 6 Left+Marr+Child
8 7 7 Divorced
[>] 100 sequences in the data set
[>] min/max sequence length: 16/16
[>] 100 sequences with 8 distinct states
[>] creating a 'sm' with a substitution cost of 2
[>] creating 8x8 substitution-cost matrix using 2 as constant value
[>] 76 distinct sequences
[>] min/max sequence length: 16/16
[>] computing distances using the LCS metric
[>] elapsed time: 0.029 secs
[>] Using k=12 frequency groups
[>] Pseudo/median-based-R2: 0.5391125
[>] Pseudo/median-based-F statistic: 9.357815
[>] computing state distribution for 100 sequences ...
[>] Using k=12 frequency groups
[>] Pseudo/median-based-R2: 0.4666667
[>] Pseudo/median-based-F statistic: 7
```
