Plot state sequence objects
High-level plot functions for state sequence objects that can produce state distribution plots (chronograms), sequence frequency plots, sequence index plots, transversal entropy plots, plots of the sequence of modal states, mean time plots, and representative sequence plots.
seqplot(seqdata, group=NULL, type="i", title=NULL, cpal=NULL,
  missing.color=NULL, ylab=NULL, yaxis=TRUE, axes="all", xtlab=NULL,
  cex.plot=1, withlegend="auto", ltext=NULL, cex.legend=1,
  use.layout=(!is.null(group) | withlegend!=FALSE), legend.prop=NA,
  rows=NA, cols=NA, ...)

seqdplot(seqdata, group=NULL, title=NULL, ...)
seqfplot(seqdata, group=NULL, title=NULL, ...)
seqiplot(seqdata, group=NULL, title=NULL, ...)
seqIplot(seqdata, group=NULL, title=NULL, ...)
seqHtplot(seqdata, group=NULL, title=NULL, ...)
seqmsplot(seqdata, group=NULL, title=NULL, ...)
seqmtplot(seqdata, group=NULL, title=NULL, ...)
seqdata: a state sequence object created with the
group: Plots one plot for each level of the factor given as argument.
type: the type of the plot. Available types are
title: title for the graphic. Default is
cpal: Color palette used for the states. By default, the
missing.color: alternative color for representing missing values inside the sequences. By default, this color is taken from the
ylab: an optional label for the y-axis. If set to
yaxis: controls whether a y-axis is plotted. When set to
axes: if set to
xtlab: optional labels for the x-axis tick labels. If unspecified, the column names of the
cex.plot: expansion factor for setting the size of the font for the axis labels and names. The default value is 1. Values less than 1 reduce the size of the font; values greater than 1 increase it.
withlegend: defines if and where the legend of the state colors is plotted. The default value
ltext: optional description of the states to appear in the legend. Must be a vector of character strings with number of elements equal to the size of the alphabet. If unspecified, the
cex.legend: expansion factor for setting the size of the font for the labels in the legend. The default value is 1. Values less than 1 reduce the size of the font; values greater than 1 increase it.
legend.prop: sets the proportion of the graphic area used for plotting the legend when
rows, cols: optional arguments to arrange plots when
...: arguments to be passed to the function called to produce the appropriate statistics and the associated plot method (see details), or other graphical parameters. For example the
seqplot is the generic function for high-level plots of state sequence objects with group splits and automatic display of the color legend. Many different types of plots can be produced by means of the type argument. Except for sequence index plots, seqplot first calls the specific function producing the required statistics and then the plot method for objects produced by this function (see below). For sequence index plots, the state sequence object itself is plotted by calling the plot.stslist method. When splitting by groups and/or displaying the color legend, the layout function is used for arranging the plots.
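The arrangement described above, one panel per group level plus a panel for the color legend, can be sketched with base R graphics only. This is an illustration under stated assumptions and not the code used by seqplot: the toy sequences, the group factor, the colors and the panel widths are all hypothetical.

```r
# Hypothetical toy data: 6 sequences of length 4 over the states "A" and "B".
seqs <- matrix(c("A", "A", "B", "B",
                 "A", "B", "B", "B",
                 "A", "A", "A", "B",
                 "B", "B", "B", "B",
                 "A", "B", "A", "B",
                 "A", "A", "B", "A"), ncol = 4, byrow = TRUE)
grp <- factor(c("f", "f", "f", "m", "m", "m"))
states <- c("A", "B")
cols <- c("steelblue", "orange")

# One panel per group level, plus a narrower panel reserved for the legend.
n_panels <- nlevels(grp) + 1
n_fig <- layout(matrix(seq_len(n_panels), nrow = 1),
                widths = c(rep(3, nlevels(grp)), 1))

for (g in levels(grp)) {
  sub <- seqs[grp == g, , drop = FALSE]
  # Share of each state at each position within the group (states x positions).
  freq <- t(sapply(states, function(s) colMeans(sub == s)))
  barplot(freq, col = cols, space = 0, main = g,
          names.arg = seq_len(ncol(sub)))
}

# The legend gets its own empty panel.
plot.new()
legend("center", legend = states, fill = cols, bty = "n")
```

Reserving a separate panel for the legend keeps it from overlapping the bars, which is the reason a layout is needed as soon as a legend or a group split is requested.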
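The seqdplot, seqfplot, seqiplot, seqIplot, seqHtplot, seqmsplot, seqmtplot and seqrplot functions are aliases for calling seqplot with the type argument set respectively to "d", "f", "i", "I", "Ht", "ms", "mt" and "r".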
State distribution plots (type="d") represent the sequence of the cross-sectional state frequencies by position (time point) computed by the seqstatd function. Such plots are also known as chronograms.
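To make "cross-sectional state frequencies by position" concrete, here is a minimal base R sketch that computes them by hand on hypothetical toy data and draws a chronogram-like stacked bar plot. It does not call seqstatd and ignores weights and missing values.

```r
# Hypothetical toy data: 5 sequences observed at 3 positions.
seqs <- matrix(c("A", "A", "B",
                 "A", "B", "B",
                 "A", "B", "B",
                 "B", "B", "B",
                 "A", "A", "C"), ncol = 3, byrow = TRUE)
states <- c("A", "B", "C")

# For each state, the share of sequences in that state at each position.
# The result has one row per state and one column per position.
state_dist <- t(sapply(states, function(s) colMeans(seqs == s)))
colnames(state_dist) <- paste0("t", seq_len(ncol(seqs)))

# Stacked bars without gaps: each column is the distribution at one position.
barplot(state_dist, col = c("steelblue", "orange", "grey40"),
        space = 0, ylab = "Freq.", legend.text = states)
```

Each column of state_dist sums to 1, since every sequence is in exactly one state at every position.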
Sequence frequency plots (type="f") display the most frequent sequences, each one with a horizontal stacked bar of its successive states. Sequences are displayed bottom-up in decreasing order of their frequencies (computed by the seqtab function). The plot.stslist.freq plot method is called for producing the plot. The tlim optional argument may be specified for selecting the sequences to be plotted (default is 1:10, i.e. the 10 most frequent sequences). The width of the bars representing the sequences is by default proportional to their frequencies, but this can be disabled with the pbarw=FALSE optional argument. If weights have been specified when creating seqdata, weighted frequencies will be returned by seqtab since the default option is weighted=TRUE. See the examples below, the plot.stslist.freq manual page for a complete list of optional arguments, and Müller et al. (2008) for a description of sequence frequency plots.
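The difference between unweighted and weighted sequence frequencies can be illustrated without TraMineR. The sketch below uses hypothetical sequences, written as strings, and hypothetical case weights. It only mimics the idea of a frequency table sorted in decreasing order and is not the seqtab implementation.

```r
# Hypothetical data: six sequences and their case weights.
seq_strings <- c("A-A-B-B", "A-B-B-B", "A-A-B-B",
                 "B-B-B-B", "A-A-B-B", "A-B-B-B")
w <- c(1, 5, 1, 1, 1, 5)

# Unweighted frequencies: count identical sequences, most frequent first.
freq <- sort(table(seq_strings), decreasing = TRUE)

# Weighted frequencies: sum the weights of identical sequences instead.
wfreq <- sort(vapply(split(w, seq_strings), sum, numeric(1)),
              decreasing = TRUE)

# Keep at most the 10 most frequent sequences, as with the default 1:10.
top_unweighted <- head(freq, 10)
top_weighted <- head(wfreq, 10)

print(top_unweighted)
print(top_weighted)
```

With these weights the ranking changes: "A-A-B-B" is the most frequent sequence by count, while "A-B-B-B" comes first once the weights are summed.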
In sequence index plots (type="i" or type="I"), the requested individual sequences are rendered with horizontal stacked bars depicting the states over successive positions (time). The optional tlim argument specifies the indexes of the sequences to be plotted (when type="i", it defaults to the first ten sequences, i.e. tlim=1:10). To plot a (big) whole set nicely, one can use type="I", which is the same as using tlim=0 together with the additional graphical parameters border=NA and space=0 to suppress bar borders and space between bars. The sortv argument can be used to pass a vector of numerical values for sorting the sequences or to specify a sorting method. See plot.stslist for a complete list of optional arguments and their description.
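The effect of sorting sequences by a numeric vector before drawing a full index plot can be sketched in base R with image(), which draws one borderless cell per state. The data, the sorting variable and the colors below are hypothetical, and this is not how plot.stslist renders the bars.

```r
# Hypothetical toy data: 4 sequences of length 3 and a numeric sorting variable.
seqs <- matrix(c("A", "A", "B",
                 "B", "B", "A",
                 "A", "B", "B",
                 "B", "A", "A"), ncol = 3, byrow = TRUE)
sortv <- c(30, 10, 40, 20)
states <- c("A", "B")
cols <- c("steelblue", "orange")

# Reorder the sequences by increasing value of the sorting variable.
ord <- order(sortv)

# Recode the states as integers 1..k so that they can be mapped to colors.
codes <- matrix(match(seqs[ord, ], states), nrow = nrow(seqs))

# image() expects z with one row per x value, hence the transpose.
# The breaks place each integer code in its own color bin.
image(x = seq_len(ncol(codes)), y = seq_len(nrow(codes)), z = t(codes),
      col = cols, breaks = seq(0.5, length(states) + 0.5, by = 1),
      xlab = "Position", ylab = "Sequences (sorted)")
```

The first row drawn (at the bottom) is the sequence with the smallest value of the sorting variable.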
The interest of sequence index plots has, for instance, been stressed by Scherer (2001) and Brzinsky-Fay et al. (2006). Note that index plots for thousands of sequences result in very heavy PDF or PostScript graphic files. A dramatic reduction in file size may be achieved by saving the figures in bitmap format, for instance by using the png graphic device instead of a PDF or PostScript device.
The transversal entropy plot (type="Ht") displays the evolution over positions of the transversal entropies (Billari, 2001). Transversal entropies are computed by calling the seqstatd function and then plotted by calling the plot.stslist.statd plot method.
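As a rough illustration of a transversal entropy, the sketch below computes the Shannon entropy of the state distribution at each position of a hypothetical toy data set. Dividing by the logarithm of the alphabet size, so that values lie between 0 and 1, is an assumption made here for readability. The code does not call seqstatd.

```r
# Hypothetical toy data: 4 sequences, 3 positions, alphabet {A, B}.
seqs <- matrix(c("A", "A", "A",
                 "A", "A", "B",
                 "A", "B", "B",
                 "A", "B", "B"), ncol = 3, byrow = TRUE)
states <- c("A", "B")

# Cross-sectional state distribution: one row per state, one column per position.
state_dist <- t(sapply(states, function(s) colMeans(seqs == s)))

# Shannon entropy of each column; states with zero frequency contribute nothing.
entropy <- apply(state_dist, 2, function(p) {
  p <- p[p > 0]
  -sum(p * log(p))
})

# Assumption: normalize by the maximum possible entropy, log(alphabet size).
entropy_norm <- entropy / log(length(states))

plot(seq_along(entropy_norm), entropy_norm, type = "l", ylim = c(0, 1),
     xlab = "Position", ylab = "Entropy")
```

The entropy is 0 at the first position, where all sequences are in the same state, and reaches its maximum of 1 at the second position, where the two states are equally frequent.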
The modal state sequence plot (type="ms") displays the sequence of the modal states with each mode proportional to its frequency at the given position. The seqmodst function is called which returns the sequence and the result is plotted by calling the plot.stslist.modst plot method.
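The sequence of modal states and the frequency of each mode can be derived by hand from the cross-sectional distribution. The following base R sketch uses hypothetical toy data and colors. It is not the seqmodst implementation, and ties are resolved by taking the first state in alphabet order.

```r
# Hypothetical toy data: 5 sequences observed at 3 positions.
seqs <- matrix(c("A", "A", "B",
                 "A", "B", "B",
                 "A", "B", "B",
                 "B", "B", "B",
                 "A", "A", "C"), ncol = 3, byrow = TRUE)
states <- c("A", "B", "C")
cols <- c("steelblue", "orange", "grey40")

# Cross-sectional state distribution (states x positions).
state_dist <- t(sapply(states, function(s) colMeans(seqs == s)))

# Modal state at each position and its relative frequency.
modal_idx <- apply(state_dist, 2, which.max)
modal_state <- states[modal_idx]
modal_freq <- apply(state_dist, 2, max)

# One bar per position, colored by modal state, with height equal to its frequency.
barplot(modal_freq, col = cols[modal_idx], space = 0, ylim = c(0, 1),
        names.arg = seq_along(modal_freq), ylab = "Freq. of modal state")
```

Here the modal states are A, B and B, with frequencies 0.8, 0.6 and 0.8.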
The mean time plot (type="mt") displays the mean time spent in each state of the alphabet as computed by the seqmeant function. The plot.stslist.meant plot method is used to plot the resulting statistics. Set serr=TRUE to display error bars on the mean time plot.
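The mean time spent in each state is simply the average, over sequences, of the number of positions at which the state occurs. The sketch below computes it on hypothetical toy data and adds error bars of plus or minus one standard error. The choice of one standard error is an assumption for illustration, and the code does not call seqmeant.

```r
# Hypothetical toy data: 5 sequences observed at 3 positions.
seqs <- matrix(c("A", "A", "B",
                 "A", "B", "B",
                 "A", "B", "B",
                 "B", "B", "B",
                 "A", "A", "C"), ncol = 3, byrow = TRUE)
states <- c("A", "B", "C")

# Number of positions each sequence spends in each state (sequences x states).
time_in_state <- sapply(states, function(s) rowSums(seqs == s))

# Mean time per state and its standard error across sequences.
mean_time <- colMeans(time_in_state)
se <- apply(time_in_state, 2, sd) / sqrt(nrow(time_in_state))

# Bar plot of the means with error bars (assumption: +/- one standard error).
mid <- barplot(mean_time, ylim = c(0, max(mean_time + se) * 1.1),
               ylab = "Mean time (number of positions)")
arrows(mid, mean_time - se, mid, mean_time + se,
       angle = 90, code = 3, length = 0.05)
```

The mean times add up to the sequence length, 3 in this example, because every position of every sequence is counted exactly once.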
The representative sequence plot (type="r") displays a reduced, non-redundant set of representative sequences extracted from the provided state sequence object and sorted according to a representativeness criterion. The seqrep function is called to extract the representative set, which is then plotted by calling the plot.stslist.rep method. A distance matrix is required; it is either passed with the dist.matrix argument or computed by calling the seqdist function if none is supplied. The criterion argument sets the representativeness criterion used to sort the sequences. See the examples below, the plot.stslist.rep manual page for a complete list of optional arguments, and Gabadinho et al. (2009) for more details on the extraction of representative sets.
Alexis Gabadinho (with Gilbert Ritschard for the help page)
Billari, F. C. (2001). The analysis of early life courses: Complex description of the transition to adulthood. Journal of Population Research 18(2), 119-142.
Brzinsky-Fay, C., U. Kohler and M. Luniak (2006). Sequence Analysis with Stata. The Stata Journal 6(4), 435-460.
Gabadinho, A., G. Ritschard, N. S. Müller and M. Studer (2011). Analyzing and Visualizing State Sequences in R with TraMineR. Journal of Statistical Software 40(4), 1-37.
Gabadinho, A., G. Ritschard, M. Studer and N. S. Müller (2011). Extracting and Rendering Representative Sequences. In A. Fred, J. L. G. Dietz, K. Liu and J. Filipe (eds.), Knowledge Discovery, Knowledge Engineering and Knowledge Management, volume 128 of Communications in Computer and Information Science (CCIS), pp. 94-106. Springer-Verlag.
Müller, N. S., A. Gabadinho, G. Ritschard and M. Studer (2008). Extracting knowledge from life courses: Clustering and visualization. In Data Warehousing and Knowledge Discovery, 10th International Conference DaWaK 2008, Turin, Italy, September 2-5, LNCS 5182, Berlin: Springer, 176-185.
Scherer, S. (2001). Early Career Patterns: A Comparison of Great Britain and West Germany. European Sociological Review 17(2), 119-144.
## The examples below require the TraMineR package
library(TraMineR)

## ======================================================
## Creating state sequence objects from example data sets
## ======================================================

## biofam data set
data(biofam)
## We use only a sample of 300 cases
set.seed(10)
biofam <- biofam[sample(nrow(biofam),300),]
biofam.lab <- c("Parent", "Left", "Married", "Left+Marr",
                "Child", "Left+Child", "Left+Marr+Child", "Divorced")
biofam.seq <- seqdef(biofam, 10:25, labels=biofam.lab)

## actcal data set
data(actcal)
## We use only a sample of 300 cases
set.seed(1)
actcal <- actcal[sample(nrow(actcal),300),]
actcal.lab <- c("> 37 hours", "19-36 hours", "1-18 hours", "no work")
actcal.seq <- seqdef(actcal,13:24,labels=actcal.lab)

## ex1 using weights
data(ex1)
ex1.seq <- seqdef(ex1, 1:13, weights=ex1$weights)

## ========================
## Sequence frequency plots
## ========================

## Plot of the 10 most frequent sequences
seqplot(biofam.seq, type="f")

## Grouped by sex
seqfplot(actcal.seq, group=actcal$sex)

## Unweighted vs weighted frequencies
seqfplot(ex1.seq, weighted=FALSE)
seqfplot(ex1.seq, weighted=TRUE)

## =====================
## Modal states sequence
## =====================
seqplot(biofam.seq, type="ms")
## same as
seqmsplot(biofam.seq)

## ====================
## Representative plots
## ====================

## Computing a distance matrix
## with OM metric
costs <- seqsubm(biofam.seq, method="TRATE")
biofam.om <- seqdist(biofam.seq, method="OM", sm=costs)

## Plot of the representative sets grouped by sex
## using the default density criterion
seqrplot(biofam.seq, group=biofam$sex, dist.matrix=biofam.om)

## Plot of the representative sets grouped by sex
## using the "dist" (centrality) criterion
seqrplot(biofam.seq, group=biofam$sex, criterion="dist",
         dist.matrix=biofam.om)

## ====================
## Sequence index plots
## ====================

## First ten sequences
seqiplot(biofam.seq)

## All sequences sorted by age in 2000
## grouped by sex
## using 'border=NA' and 'space=0' options to have a nicer plot
seqiplot(actcal.seq, group=actcal$sex, tlim=0, border=NA, space=0,
         sortv=actcal$age00)

## =======================
## State distribution plot
## =======================

## biofam grouped by sex
## (the grouping factor must come from the same data set as the sequences)
seqplot(biofam.seq, type="d", group=biofam$sex)

## actcal grouped by sex
seqplot(actcal.seq, type="d", group=actcal$sex)

## ============================
## Cross-sectional entropy plot
## ============================
seqplot(biofam.seq, type="Ht", group=biofam$sex)

## ==============
## Mean time plot
## ==============

## actcal data set, grouped by sex
seqplot(actcal.seq, type="mt", group=actcal$sex)

## biofam data set, grouped by sex
seqmtplot(biofam.seq, group=biofam$sex)