Identifying discriminating subsequences

Description

Identify and sort the most discriminating subsequences by their discriminating power.

Usage

seqecmpgroup(subseq, group, method="chisq", pvalue.limit=NULL,
             weighted = TRUE)

Arguments

subseq

A subseqelist object (list of subsequences) such as the one produced by seqefsub.

group

Group membership, i.e., a variable or factor defining the groups which we want to discriminate

method

The discrimination method; one of "bonferroni" or "chisq"

pvalue.limit

Optional p-value threshold used to filter the results. Only subsequences with a p-value lower than this limit are selected. If NULL, all subsequences are returned regardless of their p-values.

weighted

Logical. If TRUE, seqecmpgroup uses the weights specified in subseq (see seqefsub).

Details

The following discrimination tests are implemented: chisq, the Pearson chi-squared test of independence, and bonferroni, the same test with a Bonferroni correction of the p-values.
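The difference between the two methods can be illustrated with base R alone. The following is a minimal sketch on made-up data, not the code used inside seqecmpgroup: the group variable, the presence indicators A, B and C, and the choice of correct = FALSE are all assumptions made for illustration.

```r
## Hypothetical toy data: 100 men and 100 women, and for each person
## whether three made-up subsequences (A, B, C) occur in their sequence.
group <- factor(rep(c("man", "woman"), each = 100))
present <- list(
  A = c(rep(c(TRUE, FALSE), c(70, 30)), rep(c(TRUE, FALSE), c(30, 70))),
  B = c(rep(c(TRUE, FALSE), c(55, 45)), rep(c(TRUE, FALSE), c(40, 60))),
  C = c(rep(c(TRUE, FALSE), c(50, 50)), rep(c(TRUE, FALSE), c(50, 50)))
)

## One Pearson chi-squared independence test per subsequence
## (presence x group table). Disabling the continuity correction
## is an assumption of this sketch.
p_chisq <- sapply(present, function(x) {
  chisq.test(table(x, group), correct = FALSE)$p.value
})

## Bonferroni correction: multiply each p-value by the number of tests,
## capped at 1.
p_bonf <- p.adjust(p_chisq, method = "bonferroni")

## Sort by discriminating power (smallest p-value first)
res <- data.frame(p.chisq = p_chisq, p.bonferroni = p_bonf)
res[order(res$p.chisq), ]
```

In this toy example, B falls below 0.05 with the plain chi-squared test but not after the Bonferroni correction, which shows why the corrected method is the more conservative choice when many subsequences are tested at once.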

Value

An object of class subseqelistchisq (a subtype of subseqelist) with the following elements:

subseq

Sorted list of found discriminating subsequences

seqe

The event sequence object on which the tests were computed

constraint

Time constraints used for searching the subsequences (see seqeconstraint)

labels

Levels (value labels) of the target group variable

type

Type of test used

data

A data frame with columns support, index (original order of the subsequence) and a pair of frequency and Pearson residual columns for each group

Author(s)

Matthias Studer (with Gilbert Ritschard for the help page)

References

Studer, M., Müller, N.S., Ritschard, G. & Gabadinho, A. (2010), "Classer, discriminer et visualiser des séquences d'événements", In Extraction et gestion des connaissances (EGC 2010), Revue des nouvelles technologies de l'information RNTI. Vol. E-19, pp. 37-48.

See Also

plot.subseqelistchisq to plot the results.

Examples

data(actcal.tse)
actcal.seqe <- seqecreate(actcal.tse)

##Searching for frequent subsequences, that is, those appearing at least 20 times
fsubseq <- seqefsub(actcal.seqe, pMinSupport=0.01)

##Searching for the subsequences that best discriminate between men and women
data(actcal)
discr <- seqecmpgroup(fsubseq, group=actcal$sex, method="bonferroni")
##Printing discriminating subsequences
print(discr)
##Plotting the six most discriminating subsequences
plot(discr[1:6])
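
As a further illustration, the pvalue.limit argument and the data element described above could be used as follows. This is a sketch that assumes the TraMineR package is installed and that the argument names match those shown on this page (they may differ in other versions of the package); the object name discr05 and the 0.05 threshold are arbitrary choices for the example.

```r
library(TraMineR)

## Same data preparation as in the example above
data(actcal.tse)
data(actcal)
actcal.seqe <- seqecreate(actcal.tse)
fsubseq <- seqefsub(actcal.seqe, pMinSupport=0.01)

## Keep only the subsequences whose chi-squared p-value is below 0.05
## (the threshold is an arbitrary choice for illustration)
discr05 <- seqecmpgroup(fsubseq, group=actcal$sex, method="chisq",
                        pvalue.limit=0.05)

## Inspect the data frame of supports, frequencies and Pearson residuals
head(discr05$data)
```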
