seqplot.rf | R Documentation |

Relative Frequency Sequence Plots (RFS plots) plot a selection of representative sequences as sequence index plots (see `seqIplot`). RFS plots proceed in several steps. First, a set of sequences is ordered according to a substantively meaningful principle, e.g. according to their score on the first factor derived by applying multidimensional scaling (default) or a user-defined sorting variable, such as the timing of a transition of interest. Then the sorted set of sequences is partitioned into k equal-sized frequency groups. For each frequency group, the medoid sequence is selected as a representative. The selected representatives are plotted as sequence index plots. RFS plots come with an additional distance-to-medoid box plot that visualizes the distances of all sequences in a frequency group to their respective medoid. Further, an R2 and an F-statistic are given that indicate how well the selected medoids represent a given set of sequences.
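The steps above can be sketched in base R. This is only an illustration under stated assumptions: the toy data, the Euclidean distance matrix standing in for a sequence dissimilarity matrix, and all variable names are hypothetical, and the code is not the internal implementation of `seqplot.rf`.

```r
## Illustrative sketch only: base R on toy data, not TraMineR's internal code.
set.seed(1)
x <- matrix(rnorm(60 * 4), nrow = 60)   # 60 toy cases (assumption)
d <- as.matrix(dist(x))                 # stand-in for a sequence dissimilarity matrix
k <- 6                                  # number of frequency groups

## Step 1: order cases by their score on the first MDS factor,
## breaking ties at random
mds1 <- cmdscale(d, k = 1)[, 1]
ord <- order(rank(mds1, ties.method = "random"))

## Step 2: partition the sorted cases into k equal-sized groups
grp <- integer(length(ord))
grp[ord] <- ceiling(seq_along(ord) * k / length(ord))

## Step 3: the medoid of a group is the case with the smallest
## summed distance to the other members of that group
medoids <- sapply(seq_len(k), function(g) {
  idx <- which(grp == g)
  idx[which.min(rowSums(d[idx, idx, drop = FALSE]))]
})

## Step 4: distance of every case to the medoid of its own group,
## summarized as one box per frequency group
dist.to.medoid <- d[cbind(seq_len(nrow(d)), medoids[grp])]
boxplot(dist.to.medoid ~ grp, horizontal = TRUE,
        xlab = "Distance to medoid", ylab = "Frequency group")
```

With 60 cases and `k = 6`, each group holds 10 cases, so every medoid stands for the same share of the sample.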

seqplot.rf(seqdata, k = floor(nrow(seqdata)/10), diss, sortv = NULL, ylab=NA, yaxis=FALSE, main=NULL, which.plot="both", plus.one = "first", ...)

`seqdata` |
a state sequence object created with the |

`k` |
integer: Number of groupings (frequency groups?) |

`diss` |
matrix of pairwise dissimilarities between sequences in |

`sortv` |
an optional sorting variable that may be used to compute the frequency groups. If |

`ylab` |
string. An optional label for the y-axis. If set as |

`yaxis` |
logical. Controls whether a y-axis is plotted. When set as |

`main` |
main graphic title. Default is |

`which.plot` |
string. One of |

`plus.one` |
character string. One of |

`...` |
arguments passed to |

RFS plots are useful to visualize large sets of sequences that cannot be plotted with sequence index plots due to overplotting (see `seqIplot`). Because the sequences are partitioned into equal-sized frequency groups, each selected sequence represents an equal portion of the original sample and thereby visually maintains the relative proportion of different types of sequences along the sorting criterion. The ideal number *k* of frequency groups depends on the size of the original sample and the empirical distribution of the sequences. The larger the sample and the more heterogeneous the sequences, the higher the advisable value of *k*. To avoid overplotting, *k* should generally not be higher than 200.

Note that distance-to-medoid plots are meaningful only if there are at least 5-10 sequences in each frequency group. The distance-to-medoid plot is not only a quality criterion of how well the medoids represent their respective frequency groups. It also provides additional substantive information about how large the variation of sequences is at a given location of the ordered sequences (see Fasang and Liao 2014).
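The two rules of thumb on group size and on the upper bound for *k* can be combined into a small helper. The function `suggest.k` and its argument names are hypothetical and not part of `TraMineR`; the sketch simply applies the default `floor(n/10)` from the usage line and caps the result at 200.

```r
## Hypothetical helper, not part of TraMineR.
## n        : number of sequences
## min.size : minimum number of sequences wanted per frequency group
## max.k    : upper bound on k to limit overplotting
suggest.k <- function(n, min.size = 10, max.k = 200) {
  max(1, min(floor(n / min.size), max.k))
}

suggest.k(100)    # 10 groups of 10 sequences
suggest.k(5000)   # capped at 200 groups
```

This is a starting value only; heterogeneous data may still call for a different *k*.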

Since ties in `sortv` or in the MDS scores are randomly ordered (see argument `ties.method="random"` of function `rank`), one has to set the seed to reproduce exactly the same plot (see `set.seed`).
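A short base R illustration of why the seed matters, using a made-up sorting vector with ties (the vector `sortv.toy` is an assumption for the example):

```r
## Toy sorting variable with ties (hypothetical values)
sortv.toy <- c(3, 1, 3, 2, 1)

## Same seed before each call: the random tie-breaking is reproduced
set.seed(42)
r1 <- rank(sortv.toy, ties.method = "random")
set.seed(42)
r2 <- rank(sortv.toy, ties.method = "random")
identical(r1, r2)   # TRUE

## Tied values always share the same set of ranks;
## only their order within that set is random
sort(r1[sortv.toy == 1])   # 1 2
sort(r1[sortv.toy == 3])   # 4 5
```

Without the calls to `set.seed`, the two rankings may differ from one run to the next, and so may the resulting frequency groups.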

Unlike the other `TraMineR` plotting functions, the `seqplot.rf` function ignores the `weights` and does not support the `group` argument.

A vector with the group membership (medoid of the group) of each sequence.

Matthias Studer, Anette Eva Fasang, Tim Liao, and Gilbert Ritschard.

Fasang, Anette Eva and Tim F. Liao. 2014. "Visualizing Sequences in the Social Sciences: Relative Frequency Sequence Plots." Sociological Methods & Research 43(4):643-676.

See also `seqplot` and `seqrep`.

    library(TraMineR)

    ## Defining a sequence object with the data in columns 10 to 25
    ## (family status from age 15 to 30) in the biofam data set
    data(biofam)
    biofam.lab <- c("Parent", "Left", "Married", "Left+Marr",
                    "Child", "Left+Child", "Left+Marr+Child", "Divorced")

    ## Here, we use only 100 cases selected such that all elements
    ## of the alphabet are present.
    ## (More cases and a larger k would be necessary to get a meaningful example.)
    biofam.seq <- seqdef(biofam[501:600, ], 10:25, labels=biofam.lab)
    diss <- seqdist(biofam.seq, method="LCS")

    ## Using 12 groups and default MDS sorting
    seqplot.rf(biofam.seq, diss=diss, k=12,
               main="Non meaningful example (n=100)")

    ## With a user specified sorting variable
    ## Here time spent in parental home: there are ties
    ## We set a seed because of random order in ties
    set.seed(123)
    parentTime <- seqistatd(biofam.seq)[, 1]
    seqplot.rf(biofam.seq, diss=diss, k=12, sortv=parentTime,
               main="Sorted by parent time")
