seqdomassoc: Measures of association between domains of sequence data

Description

The function computes pairwise domain association based on cross-tabulation of the states observed in the sequences of the two domains involved. The association measure returned can be Cramer's V or the likelihood ratio (LRT).

Usage

seqdomassoc(
  seqdata.dom,
  rep.method = "overall",
  assoc = c("LRT", "V"),
  diss.dom = NULL,
  wrange = NULL,
  p.value = TRUE,
  struct.zero = TRUE,
  cross.table = FALSE,
  with.missing = FALSE,
  weighted = TRUE,
  seqrep.args = list(coverage = 0.8, pradius = 0.1),
  seqrf.args = list(k = 20),
  dnames = names(seqdata.dom)
)

Arguments

seqdata.dom

List of stslist objects, one per domain (dimension).

rep.method

Character string. Method for determining the sequences on which the association is computed. One of "overall" (all sequences, the default), "rep" (representative sequences), or "eq.group" (medoids of equally spaced groups).

assoc

Character string. The association measure to be computed. One of "V" (Cramer's V) or "LRT", or a vector with both.

diss.dom

List of dissimilarity matrices used for selecting representatives. Ignored when rep.method="overall".

wrange

Vector of two integers. Window range for counting co-occurrences. A state at position p in the first domain is compared with the states at positions [p+wrange[1], p+wrange[2]] in the second domain.

p.value

Logical. Should p-values be returned?

struct.zero

Logical. Should zeros in cross tables be treated as structural zeros?

cross.table

Logical. Should cross tables be returned? If TRUE, cross tables are returned as the list attribute cross.tables.

with.missing

Logical. Should missing values be treated as a regular state?

weighted

Logical. Should sequence weights be taken into account when present in the sequence objects? When applicable, weights of the first domain are used.

seqrep.args

List of arguments passed to seqrep when rep.method="rep".

seqrf.args

List of arguments passed to seqrf when rep.method="eq.group".

dnames

String vector. Names of the domains (dimensions).
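
The window that wrange defines can be pictured with a small base R sketch. The helper window_crosstab below is hypothetical and not part of TraMineR; it works on plain character matrices (rows are sequences, columns are positions), ignores weights and missing values, and simply drops window positions that fall outside the sequence, which is an assumption about boundary handling rather than a description of what seqdomassoc does internally.

```r
## Hypothetical helper (not a TraMineR function): co-occurrence table
## of states at p in dom1 and states at p + wrange[1], ..., p + wrange[2] in dom2.
window_crosstab <- function(dom1, dom2, wrange = c(0, 0)) {
  stopifnot(all(dim(dom1) == dim(dom2)),
            length(wrange) == 2, wrange[1] <= wrange[2])
  len <- ncol(dom1)
  s1 <- character(0)
  s2 <- character(0)
  for (p in seq_len(len)) {
    ## positions of the second domain inside the window, clipped to 1..len
    q <- (p + wrange[1]):(p + wrange[2])
    q <- q[q >= 1 & q <= len]
    for (j in q) {
      s1 <- c(s1, dom1[, p])
      s2 <- c(s2, dom2[, j])
    }
  }
  table(dom1 = s1, dom2 = s2)
}

## Toy data: the second domain changes state one position after the first
dom1 <- rbind(c("a", "b", "b", "b"), c("a", "a", "b", "b"))
dom2 <- rbind(c("x", "x", "y", "y"), c("x", "x", "x", "y"))

tab.pos <- window_crosstab(dom1, dom2)            # strictly position-wise
tab.lag <- window_crosstab(dom1, dom2, c(1, 1))   # second domain one step later
```

In this toy case the position-wise table has off-diagonal counts, while the table built with a one-step lag does not, which is the kind of delayed association a window is meant to capture.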

Details

For each pair of domains, seqdomassoc cross-tabulates the position-wise states across domains. When rep.method = "overall", all sequences are used as observed. When rep.method = "rep", each observed sequence is first replaced by its closest representative sequence; when rep.method = "eq.group", each observed sequence is replaced by the medoid of its group. The selected association measures are then computed on the resulting cross tables.
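
As a rough illustration of the position-wise cross-tabulation and of the two measures, the following base R sketch uses the textbook formulas for Cramer's V and the likelihood ratio statistic on two toy domains stored as character matrices. It is only a sketch under simplifying assumptions: no weights, no missing values, no special treatment of structural zeros, and no claim that seqdomassoc computes the statistics in exactly this way.

```r
## Two toy domains: 4 sequences of length 5 (rows = sequences, columns = positions)
dom1 <- matrix(c("a", "a", "b", "b", "b",
                 "a", "b", "b", "b", "b",
                 "a", "a", "a", "b", "b",
                 "a", "a", "a", "a", "b"), nrow = 4, byrow = TRUE)
dom2 <- matrix(c("x", "x", "y", "y", "y",
                 "x", "y", "y", "y", "y",
                 "x", "x", "x", "y", "y",
                 "x", "x", "y", "y", "y"), nrow = 4, byrow = TRUE)

## Position-wise cross table: each (sequence, position) pair contributes
## one count to the cell (state in domain 1, state in domain 2)
tab <- table(dom1 = as.vector(dom1), dom2 = as.vector(dom2))

n <- sum(tab)
expected <- outer(rowSums(tab), colSums(tab)) / n

## Pearson chi-square and Cramer's V
chi2 <- sum((tab - expected)^2 / expected)
V <- sqrt(chi2 / (n * (min(dim(tab)) - 1)))

## Likelihood ratio statistic G^2 = 2 * sum(O * log(O / E));
## empty cells contribute zero and are skipped
nz <- tab > 0
LRT <- 2 * sum(tab[nz] * log(tab[nz] / expected[nz]))

## p-value from the chi-square distribution with (r - 1)(c - 1) degrees of freedom
df <- prod(dim(tab) - 1)
p.LRT <- pchisq(LRT, df = df, lower.tail = FALSE)
```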

The "overall" method measures strictly position-wise association and will not detect association that occurs after even a small time warp. With representative sequences, the same holds, but for the representatives only. When representatives are identified with dissimilarity measures that allow for time warp, observed sequences may differ from their representatives in the timing of the states. Using representatives instead of all sequences therefore somewhat relaxes the strict timing constraint.
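
The substitution of observed sequences by representatives can be sketched in base R as follows. The choice of representatives (sequences 1 and 4) is assumed here for illustration; in seqdomassoc it is delegated to seqrep or seqrf. The Hamming distance is used only to keep the sketch self-contained. It does not allow for time warp, so in practice the matrices supplied through diss.dom would typically come from a measure such as OM.

```r
## Toy domain: 5 sequences of length 4
seqs <- rbind(c("a", "a", "b", "b"),
              c("a", "a", "a", "b"),
              c("b", "b", "a", "b"),
              c("b", "b", "b", "b"),
              c("b", "b", "b", "a"))

## Hamming distance (number of mismatching positions), for simplicity only
n <- nrow(seqs)
d <- matrix(0, n, n)
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    d[i, j] <- sum(seqs[i, ] != seqs[j, ])
  }
}

## Assumption: sequences 1 and 4 were selected as representatives
reps <- c(1, 4)

## Assign each sequence to its nearest representative and substitute it
nearest <- reps[apply(d[, reps, drop = FALSE], 1, which.min)]
replaced <- seqs[nearest, , drop = FALSE]
```

The cross-tabulation is then carried out on matrices such as replaced, one per domain, rather than on the observed sequences.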

Value

An object of class sdomassoc, which is the table (matrix) of association statistics with the list of cross tables in attribute cross.tables.

The print method for sdomassoc objects prints only the table of association statistics.

Author(s)

Gilbert Ritschard

References

Ritschard, G., T.F. Liao, and E. Struffolino (2023). Strategies for multidomain sequence analysis in social research. Sociological Methodology, 53(2), 288-322. doi:10.1177/00811750231163833.

See Also

dissdomassoc

Examples

library(TraMineR)
data(biofam)

## Building one channel per type of event (left, children or married)
cases <- 1:50
bf <- as.matrix(biofam[cases, 10:25])
children <- bf == 4 | bf == 5 | bf == 6
married <- bf == 2 | bf == 3 | bf == 6
left <- bf == 1 | bf == 3 | bf == 5 | bf == 6

## Building sequence objects
child.seq <- seqdef(children, weights = biofam[cases, "wp00tbgs"])
marr.seq <- seqdef(married, weights = biofam[cases, "wp00tbgs"])
left.seq <- seqdef(left, weights = biofam[cases, "wp00tbgs"])

## Distances by channel
dchild <- seqdist(child.seq, method = "OM", sm = "INDELSLOG")
dmarr <- seqdist(marr.seq, method = "OM", sm = "INDELSLOG")
dleft <- seqdist(left.seq, method = "OM", sm = "INDELSLOG")
dbiofam <- list(dchild, dmarr, dleft)
dnames <- names(dbiofam) <- c("child", "marr", "left")

## Association computed on all sequences (default rep.method = "overall")
seqdomassoc(list(child.seq, marr.seq, left.seq), dnames = dnames)

## Cramer's V computed on representative sequences
seqdomassoc(list(child.seq, marr.seq, left.seq), diss.dom = dbiofam,
            rep.method = "rep", assoc = "V", dnames = dnames)

## Cramer's V computed on medoids of equally spaced groups
seqdomassoc(list(child.seq, marr.seq, left.seq), diss.dom = dbiofam,
            rep.method = "eq.group", assoc = "V", dnames = dnames)



TraMineR documentation built on Dec. 8, 2024, 3:01 p.m.