| seqformat | R Documentation | 
Convert a sequence data set from one representation format to another.
seqformat(data, var = NULL, from, to, compress = FALSE, nrep = NULL, tevent,
  stsep = NULL, covar = NULL, SPS.in = list(xfix = "()", sdsep = ","),
  SPS.out = list(xfix = "()", sdsep = ","), id = 1, begin = 2, end = 3,
  status = 4, process = TRUE, pdata = NULL, pvar = NULL, limit = 100,
  overwrite = TRUE, fillblanks = NULL, tmin = NULL, tmax = NULL, missing = "*",
  with.missing = TRUE, right = "DEL", compressed, nr)
| data | Data frame, matrix, or state sequence object. A data frame or a matrix with sequence data in one or more columns when
 A data frame with at least four columns when from = "SPELL". A state sequence object when  | 
| var | 
 | 
| from | String.
Format of the input sequence data.
It can be  | 
| to | String.
Format of the output data.
It can be  | 
| compress | Logical.
Default: FALSE. | 
| nrep | Integer.
Number of shifted replications when to = "SRS". | 
| tevent | Matrix.
The transition-definition matrix when to = "TSE". | 
| stsep | 
 | 
| covar | List of Integers or Strings.
When  | 
| SPS.in | List.
Default: list(xfix = "()", sdsep = ","). | 
| SPS.out | List.
Default: list(xfix = "()", sdsep = ","). | 
| id | 
 When  When  When  | 
| begin | Integer or String.
Default: 2. | 
| end | Integer or String.
Default: 3. | 
| status | Integer or String.
Default: 4. | 
| process | Logical.
Default: TRUE. | 
| pdata | 
 To be used only with  If  If  A data frame containing the ID and the birth time of the individuals when
 | 
| pvar | List of Integers or Strings.
The indexes or names of the columns of the data frame pdata that contain the ID and the birth time of the individuals. | 
| limit | Integer.
Default: 100. | 
| overwrite | Logical.
Default: TRUE. | 
| fillblanks | Character.
Token used to fill gaps between episodes when from = "SPELL". | 
| tmin | 
 | 
| tmax | 
 | 
| missing | String.
Default: "*". | 
| with.missing | Logical.
Default: TRUE. | 
| right | One of  | 
| compressed | Deprecated. Use compress instead. | 
| nr | Deprecated. Use  | 
The seqformat function converts data from one format to
another. The input data is first converted into STS format and then
converted into the output format. Depending on input and output formats, some
information can be lost during the conversion process. The output is a matrix or
a data frame, NOT a sequence stslist object. To process, print, and plot
the sequences with TraMineR functions, you will have to first transform the returned data frame
into a stslist state sequence object with seqdef.
See Gabadinho et al. (2009) and Ritschard et al. (2009) for more
details on longitudinal data formats and conversion between them.
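As noted above, the value returned by seqformat is a plain matrix or data frame, so it has to go through seqdef before other TraMineR functions can use it. The following is a minimal sketch of that two-step workflow, reusing the bfspell20 and bfpdata20 data sets from the examples of this page; the object names bf.sts and bf.seq are chosen here for illustration only.

```r
library(TraMineR)

## SPELL data shipped with TraMineR (see the examples of this page)
data(bfspell)

## Step 1: convert the spells to STS format, aligned on ages
bf.sts <- seqformat(bfspell20, from = "SPELL", to = "STS",
  id = "id", begin = "begin", end = "end", status = "states",
  process = TRUE, pdata = bfpdata20, pvar = c("id", "when15"),
  limit = 16)

## The result is not yet a state sequence object
class(bf.sts)

## Step 2: build the stslist object expected by TraMineR functions
bf.seq <- seqdef(bf.sts)
class(bf.seq)
```

Only after the second step can the sequences be passed to functions that expect a state sequence object.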
When data is in "SPELL" format (from = "SPELL"), begin and end times are expected to be positions in the sequences. Therefore, they should be strictly positive integers.
With process=TRUE, the outcome sequences will be aligned on ages (process duration since birth), while with process=FALSE they will be aligned on dates (position on the calendar time). If process=TRUE, values in the begin and end columns of data are assumed to be ages when pdata is NULL and integer dates otherwise. If process=FALSE, begin and end values are assumed to be integer dates when pdata is NULL and ages otherwise.
To convert from person-period data, use from = "SPELL" and set both begin and end to the index or name of the time (period) column. Alternatively, use the reshape function from the stats package, which is more efficient.
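The reshape alternative can be sketched with base R only. The data frame pp below and its column names (id, period, state) are made up for illustration; they are not part of TraMineR. Each row is one individual in one period, and reshape spreads the states into one column per period, which is the layout of STS data.

```r
## Hypothetical person-period data: one row per individual and period
pp <- data.frame(
  id     = c(1, 1, 1, 2, 2, 2),
  period = c(1, 2, 3, 1, 2, 3),
  state  = c("A", "A", "B", "B", "B", "C"),
  stringsAsFactors = FALSE
)

## One row per individual, one state column per period
wide <- reshape(pp, idvar = "id", timevar = "period",
  v.names = "state", direction = "wide")

## Columns are id, state.1, state.2, state.3
print(wide)
```

The state columns of the resulting data frame could then be handed to seqdef in the same way as any other STS data.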
The function returns a data frame for SRS, TSE, and SPELL outcomes, and a matrix otherwise.
When from = "SPELL", the outcome has an attribute issues that holds the indexes of the sequences with issues (truncated sequences, missing start time, spells before birth year, ...).
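The issues attribute can be read with the base R function attr. The sketch below assumes the bfspell20 data from the examples of this page; the internal structure of the attribute is not described here, so str is used only to inspect whatever is stored, and no particular content is assumed.

```r
library(TraMineR)

## SPELL data shipped with TraMineR (see the examples of this page)
data(bfspell)

bf.sts <- seqformat(bfspell20, from = "SPELL", to = "STS",
  id = "id", begin = "begin", end = "end", status = "states",
  process = FALSE)

## Retrieve the attribute documented above and inspect it
issues <- attr(bf.sts, "issues")
str(issues)
```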
Gilbert Ritschard, Alexis Gabadinho, Pierre-Alexandre Fonta, Nicolas S. Müller, Matthias Studer
Gabadinho, A., G. Ritschard, M. Studer and N. S. Müller (2009). Mining
Sequence Data in R with the TraMineR package: A user's guide.
Department of Econometrics and Laboratory of Demography, University of Geneva.
Ritschard, G., A. Gabadinho, M. Studer and N. S. Müller (2009). Converting between various sequence representations. In Ras, Z. & Dardzinska, A. (eds.), Advances in Data Management, Springer, 223, 155-175.
seqdef, reshape
## ========================================
## Examples with raw STS sequences as input
## ========================================
## Loading a data frame with sequence data in the columns 13 to 24
data(actcal)
## Converting to SPS format
actcal.SPS.A <- seqformat(actcal, 13:24, from = "STS", to = "SPS")
head(actcal.SPS.A)
## Converting to compressed SPS format with no
## prefix/suffix and with "/" as state/duration separator
actcal.SPS.B <- seqformat(actcal, 13:24, from = "STS", to = "SPS",
  compress = TRUE, SPS.out = list(xfix = "", sdsep = "/"))
head(actcal.SPS.B)
## Converting to compressed DSS format
actcal.DSS <- seqformat(actcal, 13:24, from = "STS", to = "DSS",
  compress = TRUE)
head(actcal.DSS)
## ==============================================
## Examples with a state sequence object as input
## ==============================================
## Loading a data frame with sequence data in the columns 10 to 25
data(biofam)
## Limiting the number of considered cases to the first 20
biofam <- biofam[1:20, ]
## Creating a state sequence object
biofam.labs <- c("Parent", "Left", "Married", "Left/Married",
  "Child", "Left/Child", "Left/Married/Child", "Divorced")
biofam.short.labs <- c("P", "L", "M", "LM", "C", "LC", "LMC", "D")
biofam.seq <- seqdef(biofam, 10:25, alphabet = 0:7,
  states = biofam.short.labs, labels = biofam.labs)
## Converting to SPELL format
bf.spell <- seqformat(biofam.seq, from = "STS", to = "SPELL",
  pdata = biofam, pvar = c("idhous", "birthyr"))
head(bf.spell)
## Converting to shifted replicated sequences (SRS)
bf.srs <- seqformat(biofam, var = 10:25, from = "STS", to = "SRS",
  covar = c("sex", "plingu02"))
tail(bf.srs)
## ======================================
## Examples with SPELL sequences as input
## ======================================
## Loading two data frames: bfspell20 and bfpdata20
## bfspell20 contains the first 20 biofam sequences in SPELL format
## bfpdata20 contains the IDs and the years at which the
## considered individuals were aged 15
data(bfspell)
## Converting to STS format with alignment on calendar years
bf.sts.y <- seqformat(bfspell20, from = "SPELL", to = "STS",
  id = "id", begin = "begin", end = "end", status = "states",
  process = FALSE)
head(bf.sts.y)
## Converting to STS format with alignment on ages
bf.sts.a <- seqformat(bfspell20, from = "SPELL", to = "STS",
  id = "id", begin = "begin", end = "end", status = "states",
  process = TRUE, pdata = bfpdata20, pvar = c("id", "when15"),
  limit = 16)
names(bf.sts.a) <- paste0("a", 15:30)
head(bf.sts.a)
## ==================================
## Examples for TSE and SPELL output
## in presence of missing values
## ==================================
data(ex1) ## STS data with missing values
## Creating the state sequence object; by default,
## the missing values at the end are coded as void ('%')
sqex1 <- seqdef(ex1[,1:13])
as.matrix(sqex1)
## Creating state-event transition matrices
ttrans <- seqetm(sqex1, method='transition')
tstate <- seqetm(sqex1, method='state')
## Converting into time stamped events
seqformat(sqex1, from = "STS", to = "TSE", tevent = ttrans)
seqformat(sqex1, from = "STS", to = "TSE", tevent = tstate)
## Converting into vertical spell data
seqformat(sqex1, from = "STS", to = "SPELL", with.missing=TRUE)
seqformat(sqex1, from = "STS", to = "SPELL", with.missing=TRUE, right=NA)
seqformat(sqex1, from = "STS", to = "SPELL", with.missing=FALSE)