| ggseqiplot | R Documentation |
Function for rendering sequence index plots with
ggplot2 \insertCite{wickham2016}{ggseqplot} instead
of base R's plot function that is used by
TraMineR::seqplot
\insertCite{gabadinho2011}{ggseqplot}.
ggseqiplot(
  seqdata,
  no.n = FALSE,
  group = NULL,
  sortv = NULL,
  weighted = TRUE,
  border = FALSE,
  ytlab = NULL,
  facet_scale = "free_y",
  facet_ncol = NULL,
  facet_nrow = NULL,
  ...
)
seqdata |
State sequence object (class |
no.n |
specifies if number of (weighted) sequences is shown as part of
the y-axis title or group/facet title (default is |
group |
A vector of the same length as the sequence data indicating group membership. When not NULL, a distinct plot is generated for each level of group. |
sortv |
Vector of numerical values sorting the sequences or a sorting
method (either |
weighted |
Controls if weights (specified in |
border |
if |
ytlab |
Specifies the type of y-axis labels. Options are:
When using |
facet_scale |
Specifies if y-scale in faceted plot should be free
( |
facet_ncol |
Number of columns in faceted (i.e. grouped) plot |
facet_nrow |
Number of rows in faceted (i.e. grouped) plot |
... |
if group is specified additional arguments of |
Sequence index plots were introduced by \insertCite{scherer2001;textual}{ggseqplot} and display each sequence as a horizontally stacked bar or line. For a more detailed discussion of this type of sequence visualization see, for example, \insertCite{brzinsky-fay2014;textual}{ggseqplot}, \insertCite{fasang2014;textual}{ggseqplot}, and \insertCite{raab2022;textual}{ggseqplot}.
The function uses TraMineR::seqformat
to reshape seqdata stored in wide format into a spell/episode format.
Then the data are further reshaped into the long format, in which
each row represents one position of one sequence.
For example, if we have 5 sequences of length 10, the long file
will have 50 rows. In the case of sequences of unequal length, not every
sequence will contribute the same number of rows to the long data.
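The row arithmetic above can be checked with a small base R sketch. It does not call TraMineR::seqformat and is not the function's internal code; the toy matrix `wide` and the data frame `long` are hypothetical names used only for illustration.

```r
# Toy wide-format data: 5 sequences (rows), each of length 10 (columns).
# The states "A", "B", "C" are arbitrary placeholders.
set.seed(1)
wide <- matrix(
  sample(c("A", "B", "C"), size = 50, replace = TRUE),
  nrow = 5,
  dimnames = list(paste0("s", 1:5), paste0("t", 1:10))
)

# Long format: one row per sequence and position.
# as.vector() reads the matrix column by column, so the ids cycle fastest.
long <- data.frame(
  id       = rep(rownames(wide), times = ncol(wide)),
  position = rep(seq_len(ncol(wide)), each = nrow(wide)),
  state    = as.vector(wide)
)

nrow(long)  # 5 sequences x 10 positions = 50 rows
```

With sequences of unequal length, the rows for positions beyond the end of a shorter sequence would simply be absent, so the number of rows per id would differ.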
The reshaped data are used as input for rendering the index plot using
ggplot2's geom_rect. ggseqiplot uses
geom_rect instead of geom_tile
because this allows for a straightforward implementation of weights.
If weights are specified for seqdata and weighted = TRUE,
the height of each sequence corresponds to its weight.
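To illustrate why rectangles make weighting easy, the following sketch stacks sequences with heights equal to their weights. It is a minimal example under assumed toy data, not the package's own implementation; `weights`, `rects`, and the spell boundaries are hypothetical.

```r
library(ggplot2)

# Hypothetical toy data: three sequences with weights 2, 1 and 0.5,
# each consisting of two spells.
weights <- c(2, 1, 0.5)
ymax <- cumsum(weights)   # upper edge of each sequence
ymin <- ymax - weights    # lower edge of each sequence

rects <- data.frame(
  xmin  = rep(c(0, 4), times = 3),   # spell start
  xmax  = rep(c(4, 10), times = 3),  # spell end
  ymin  = rep(ymin, each = 2),
  ymax  = rep(ymax, each = 2),
  state = c("A", "B", "B", "A", "A", "C")
)

# Each spell is one rectangle; its height equals the sequence weight.
p <- ggplot(rects) +
  geom_rect(aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax,
                fill = state))
p
```

Because geom_rect takes explicit lower and upper edges, unequal heights only require cumulative sums of the weights.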
When using grouped plots (i.e., when group is specified) with
facet_scale = "fixed", the function internally uses scales = "free_y"
in ggplot2::facet_wrap but applies
coord_cartesian with fixed ylim
to achieve the effect of a fixed y-scale across facets. This approach allows
for consistent y-axis ranges while maintaining flexibility in the labeling.
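The combination described above can be sketched in plain ggplot2 as follows. The data frame `rects` and its group sizes are hypothetical, and the snippet only illustrates the general technique rather than reproducing the function's internal code.

```r
library(ggplot2)

# Hypothetical toy data: group "a" has 3 sequences, group "b" has 6.
rects <- data.frame(
  group = rep(c("a", "b"), times = c(3, 6)),
  ymin  = c(0:2, 0:5),
  ymax  = c(1:3, 1:6),
  xmin  = 0,
  xmax  = 10
)

p <- ggplot(rects) +
  geom_rect(aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax),
            fill = "grey70", colour = "black") +
  # Each facet keeps its own y scale (and therefore its own breaks/labels) ...
  facet_wrap(~group, scales = "free_y") +
  # ... but every panel is displayed over the same y range.
  coord_cartesian(ylim = c(0, max(rects$ymax)))
p
```

In this sketch the smaller group occupies only the lower half of its panel, which is the visual effect of a fixed y-scale.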
When sortv is specified as a vector, the sequences are arranged in the order of
its values. With sortv = "from.start", sequence data are sorted
according to the states of the alphabet in ascending order, starting with the
first sequence position and drawing on succeeding positions in the case of
ties. Likewise, sortv = "from.end" sorts a reversed version of the
sequence data, starting with the final sequence position and turning to
preceding positions in case of ties.
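The tie-breaking logic of the two sorting methods can be mimicked in base R. This is a sketch under simplifying assumptions: the toy matrix `wide` is hypothetical, and states are compared alphabetically here, whereas ggseqiplot refers to the order of the alphabet of the sequence object.

```r
# Hypothetical toy data: three sequences of length 3.
wide <- rbind(
  s1 = c("B", "A", "A"),
  s2 = c("A", "B", "A"),
  s3 = c("A", "A", "B")
)

# One sort key per sequence position, first position first.
keys <- lapply(seq_len(ncol(wide)), function(j) wide[, j])

# "from.start"-like ordering: first position, ties broken by later positions.
ord_start <- do.call(order, keys)
# "from.end"-like ordering: last position, ties broken by earlier positions.
ord_end <- do.call(order, rev(keys))

rownames(wide)[ord_start]  # "s3" "s2" "s1"
rownames(wide)[ord_end]    # "s1" "s2" "s3"
```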
When ytlab is set to "id", "all", or "id-all", the
y-axis labeling behavior changes. With "all", all sequences are labeled
with sequential numbers (1, 2, 3, ...) instead of using pretty breaks. With
"id", the rownames of the sequence object are used as y-axis labels with
pretty breaks. With "id-all", all sequences are labeled with their rownames.
If the sequence object has no rownames, sequential numbers (1, 2, 3, ...) are used
as identifiers. These features are especially useful when working with sorted
sequences or when displaying specific cases with meaningful identifiers. Note that
with "all" and "id-all", overlapping labels are automatically prevented
to maintain readability. When there are many sequences and insufficient space, not
all labels may be displayed. In such cases, consider increasing the plot height if you
need every label to be displayed.
Note that the default aspect ratio of ggseqiplot is different from
TraMineR::seqIplot. This is most obvious
when border = TRUE. You can change the ratio either by adding ggplot2 code to
the ggseqiplot call or by specifying the ratio when saving the plot with
ggsave.
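Both routes can be sketched as follows. The aspect ratio, the file name, and the output dimensions are arbitrary placeholders chosen for illustration, and the subset of 50 cases only keeps the example fast.

```r
library(TraMineR)
library(ggseqplot)
library(ggplot2)

# Small state sequence object from the TraMineR example data
data(actcal)
actcal.seq <- seqdef(actcal[1:50, ], 13:24)

# Option 1: fix the panel aspect ratio (height / width) via a ggplot2 theme
p <- ggseqiplot(actcal.seq, border = TRUE) +
  theme(aspect.ratio = 0.6)

# Option 2: control the proportions of the saved file instead
out_file <- file.path(tempdir(), "iplot.png")
ggsave(out_file, p, width = 8, height = 5)
```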
A sequence index plot. If stored as an object, the resulting list object also contains the data (spell format) used for rendering the plot.
Marcel Raab
library(TraMineR)
library(ggseqplot)
# Use example data from TraMineR: actcal data set
data(actcal)
# We use only a sample of 300 cases
set.seed(1)
actcal <- actcal[sample(nrow(actcal), 300), ]
actcal.lab <- c("> 37 hours", "19-36 hours", "1-18 hours", "no work")
actcal.seq <- seqdef(actcal, 13:24, labels = actcal.lab)
# ex1 using weights
data(ex1)
ex1.seq <- seqdef(ex1, 1:13, weights = ex1$weights)
# sequences sorted by age in 2000 and grouped by sex
# with TraMineR::seqplot
seqIplot(actcal.seq, group = actcal$sex, sortv = actcal$age00)
# with ggseqplot
ggseqiplot(actcal.seq, group = actcal$sex, sortv = actcal$age00)
# sequences of unequal length with missing state, and weights
seqIplot(ex1.seq)
ggseqiplot(ex1.seq)
# ... turn weights off and add border
seqIplot(ex1.seq, weighted = FALSE, border = TRUE)
ggseqiplot(ex1.seq, weighted = FALSE, border = TRUE)
# Use sequence IDs as y-axis labels, and "fixed" y scale
ggseqiplot(ex1.seq, group = c(1, 1, 1, 2, 2, 2, 2),
           weighted = FALSE, border = TRUE, ytlab = "id", facet_scale = "fixed")
# Display all sequences with sequential numbers and with ids
ggseqiplot(actcal.seq[1:20, ], sortv = "from.end", ytlab = "all")
ggseqiplot(actcal.seq[1:20, ], sortv = "from.end", ytlab = "id-all")