seqpolyads: Measuring the Degree of Within-Polyadic Similarities

seqpolyads R Documentation

Measuring the Degree of Within-Polyadic Similarities

Description

The function computes measures of the degree of similarities within polyadic member sequences compared to randomly assigned polyadic member sequences.

Usage

seqpolyads(seqlist, a=1, method="HAM", ...,
    w=rep(1,ncol(combn(1:length(seqlist),2))),
    s=36963, T=1000, core=1, replace=TRUE, weighted=TRUE,
    with.missing=FALSE, rand.weight.type=1, role.weights=NULL,
    show.time=FALSE)

Arguments

seqlist

A list of J > 1 state sequence objects (class stslist), each holding one set of polyad members. The state sequence objects in the list must all have the same number N of sequences and the same alphabet. Create each state sequence object with seqdef and combine them with list, e.g., list(gen1.seq, gen2.seq, gen3.seq).

a

Integer, 1 or 2. Random generation mechanism. If 1 (default), sequences are drawn at random from the observed set of sequences. If 2, states are additionally drawn at random within each randomly drawn sequence. See reference below for details.

method

String. Method for computing sequence distances. See seqdist. Additional arguments may be required depending on the method chosen.

...

Additional arguments passed to seqdist.

s

Integer. Default 36963. Using the same seed number on the same computer guarantees the same results each time. Set s=NULL if you don't want to set a seed. The random generator can be chosen with RNGkind.

w

Integer vector. Default is a vector of 1s, one per pair of polyad members. The weights assigned to the pairwise distances between polyad members. For dyadic sequences there is a single pair, so no weight is necessary and the default of 1 applies. For triadic sequences, there are three weights, in row-wise order: between the first and the second members, the first and the third members, and the second and the third members. See reference below.

T

Integer. Default 1,000. The number of randomized computations.

core

Integer. Default 1. Number of cores for the computation. When greater than 1, the procedure utilizes parallel processing.

replace

Logical. When a=2, should state sampling in each sequence be done with replacement? Default is TRUE. Ignored when a=1.

weighted

Logical. Should we account for the weights when present in the sequence objects? See details. Default is TRUE.

with.missing

Logical. Should the missing state be considered as a regular state? Default is FALSE.

rand.weight.type

Integer, 1 or 2. Ignored when weighted=FALSE. If 1 (default), weight of each randomized polyad is the average of original weights of its members. If 2, member weights are adjusted by dividing them by the sum of weights of all drawn members of the same type.

role.weights

NULL or vector of non-negative weights of same length as the list seqlist. Ignored when weighted=FALSE. If non null, role weights for determining the weights of the randomized polyads.

show.time

Logical. Should elapsed time be displayed? Default is FALSE.
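
The order of the pair weights in w can be read off the default shown in the Usage section, which is built with combn. The snippet below is a small base R illustration for a triad; the names J and pairs are used here for illustration only and are not arguments of seqpolyads.

```r
# Order of the pairwise weights in w, following the default in Usage:
# w = rep(1, ncol(combn(1:length(seqlist), 2)))
J <- 3                      # number of members per polyad (a triad)
pairs <- combn(1:J, 2)      # one column per pair of members
pairs
#      [,1] [,2] [,3]
# [1,]    1    1    2
# [2,]    2    3    3
w <- rep(1, ncol(pairs))    # default: equal weight for each pair
length(w)                   # number of pair weights needed for a triad
```

The columns give the pairs (1,2), (1,3) and (2,3), which matches the order described for the w argument.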

Details

The function computes the polyadic distance of the observed polyads, i.e., the (weighted) mean of the pairwise distances between members of the polyad. In addition, the following statistics are computed:
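
As a minimal sketch of this definition, the base R code below computes the weighted mean of pairwise distances for one toy triad. It uses a hand-written mismatch count in place of seqdist, and the helper names hamming and polyadic_dist are hypothetical, not part of the package. Dividing by sum(w) is an assumption about how the weighted mean is normalised.

```r
# Toy triad: three members observed over five positions (made-up data)
triad <- list(
  c("A", "A", "B", "B", "C"),
  c("A", "B", "B", "B", "C"),
  c("C", "A", "B", "A", "C")
)

# Hamming-type distance: number of positions where two sequences differ
hamming <- function(x, y) sum(x != y)

# Weighted mean of the pairwise distances between members of one polyad
polyadic_dist <- function(members, w = NULL) {
  pairs <- combn(seq_along(members), 2)          # pairs (1,2), (1,3), (2,3)
  if (is.null(w)) w <- rep(1, ncol(pairs))       # equal pair weights by default
  d <- apply(pairs, 2, function(p) hamming(members[[p[1]]], members[[p[2]]]))
  sum(w * d) / sum(w)
}

d_equal <- polyadic_dist(triad)                  # pairwise distances 1, 2, 3 -> 2
d_weighted <- polyadic_dist(triad, w = c(2, 1, 1))  # (2*1 + 2 + 3) / 4 = 1.75
```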

The U statistic measures, for each observed polyad, by how much its polyadic distance differs from the mean polyadic distance of the T randomized polyads. U.tp is the p-value for a two-tailed t-test of the U statistic.

The V statistic is, for each observed polyad, the proportion of T randomized polyads that have a greater polyadic distance. V.95 is an associated dummy that takes value 1 when the proportion V is greater than 95% and 0 otherwise.
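
The two statistics can be illustrated for a single observed polyad with simulated numbers. This is a sketch under assumptions, not the package's code: the distances are made up, the sign convention for U (mean random distance minus observed distance) is assumed, and the one-sample t.test is only one plausible way of obtaining a two-tailed p-value. The variable names n_rand, obs and rand are illustrative.

```r
set.seed(1)
n_rand <- 1000                           # plays the role of T
obs <- 2                                 # observed polyadic distance (made up)
rand <- rnorm(n_rand, mean = 5, sd = 1)  # simulated randomized polyadic distances

# U: difference between the mean random distance and the observed distance
# (assumed sign: positive when the observed polyad is more similar than chance)
U <- mean(rand) - obs

# Two-tailed one-sample t-test of the random distances against the observed one
U.tp <- t.test(rand, mu = obs)$p.value

# V: proportion of randomized polyads with a greater polyadic distance
V <- mean(rand > obs)

# V.95: dummy equal to 1 when V is greater than 0.95, 0 otherwise
V.95 <- as.integer(V > 0.95)
```

In seqpolyads these quantities are returned for each of the N observed polyads, so U, U.tp, V and V.95 are vectors of length N rather than single numbers.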

When the sequence objects in seqlist have weights and weighted=TRUE, the randomized sequences are sampled using the weights of the first element in the list. Each member of an observed polyad is assumed to have the same weight. This does not hold for the randomized polyads, which are obtained by sampling their members independently. The weight of each randomized polyad is set as the average of the weights of its members. When role weights are provided with role.weights, a weighted average of the member weights is used. When rand.weight.type=1, original member weights are used. When rand.weight.type=2, the weights of randomly selected members are adjusted by dividing them by the sum of weights of all randomly drawn members of the same type.
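
The averaging rule for rand.weight.type=1 can be sketched in base R as follows. The member weights are invented, and passing the role weights to weighted.mean is an assumption about how the weighted average is formed. The adjustment used with rand.weight.type=2 is not covered here.

```r
# Weights of the three members drawn for one randomized triad (made-up values)
member.w <- c(grandparent = 2, parent = 1, child = 3)

# Without role weights: plain average of the member weights
w_plain <- mean(member.w)                        # (2 + 1 + 3) / 3 = 2

# With role weights: weighted average of the member weights
role.weights <- c(1, 1, 2)                       # the third role counts double
w_role <- weighted.mean(member.w, role.weights)  # (2 + 1 + 6) / 4 = 2.25
```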

When core > 1, the function uses the doParallel package for parallel computation.

Value

The function outputs a list of seven objects:

mean.dist

Vector of length 2 with the average observed and random within-polyadic distances.

U

Vector of N U statistics, one per observed polyad (see reference).

U.tp

Vector of N p-values for a two-tailed t-test of the U statistic.

V

Vector of N V statistics, one per observed polyad (see reference).

V.95

Vector of N 1s or 0s: 1 if the corresponding V value is greater than 0.95, 0 otherwise.

observed.dist

Vector of within-polyadic distances for the observed polyadic members.

random.dist

Vector of within-polyadic distances for the T randomly matched polyads.

Author(s)

Tim Liao and Gilbert Ritschard

References

Tim F. Liao (2021), "Using Sequence Analysis to Quantify How Strongly Life Courses Are Linked." Sociological Science 8, 48-72, doi:10.15195/v8.a3.

Examples

library(TraMineR)
library(TraMineRextras)

data(polyads)
Gen <- polyads$Gen

## One state sequence object per generation (states are in columns 2 to 11)
seqGrandP <- seqdef(polyads[Gen=="1st Generation",2:11])
seqParent <- seqdef(polyads[Gen=="2nd Generation",2:11])
seqChild <- seqdef(polyads[Gen=="3rd Generation",2:11])

## Plot the sequences of all three generations
Seq <- rbind(seqGrandP,seqParent,seqChild)
slgth <- ncol(Seq)
colnames(Seq) <- 21:30
seqIplot(Seq,group=Gen,idxs=10:1,ylab="Triad",xlab="Age")

## List of polyad members; seqL[1:2] gives dyads, seqL gives triads
seqL <- list(seqGrandP,seqParent,seqChild)
core <- 1

## Hamming distance
seqG2.Tim <- seqpolyads(seqL[1:2],method="HAM",a=1,core=core,T=100)
seqG3.Tim <- seqpolyads(seqL,method="HAM",a=1,core=core,T=100)

## Chi-squared distance with step equal to the sequence length
seqG2.Dur <- seqpolyads(seqL[1:2],method="CHI2",step=slgth,core=core,T=100)
seqG3.Dur <- seqpolyads(seqL,method="CHI2",step=slgth,core=core,T=100)


TraMineRextras documentation built on Sept. 11, 2024, 6:52 p.m.