| build_hypa | R Documentation |
Constructs a k-th order De Bruijn graph from sequential trajectory data and uses a hypergeometric null model to detect paths with anomalous frequencies. Paths occurring more or less often than expected under the null model are flagged as over- or under-represented.
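The scoring idea can be sketched in base R. This is a simplified illustration, not the package's implementation: it assumes first-order transitions only (k = 1), uses unfitted propensities (out-strength times in-strength), and treats every transition as feasible. All variable names are hypothetical.

```r
# Toy trajectories in the list-of-character-vectors form.
trajs <- list(c("A", "B", "C"), c("A", "B", "C"), c("A", "B", "C"),
              c("A", "B", "D"), c("C", "B", "D"), c("C", "B", "A"))

# Count observed first-order transitions (assumption: k = 1 for simplicity).
from   <- unlist(lapply(trajs, function(s) head(s, -1)))
to     <- unlist(lapply(trajs, function(s) tail(s, -1)))
states <- sort(unique(c(from, to)))
obs    <- table(factor(from, levels = states), factor(to, levels = states))

# Unfitted propensities: out-strength of the source times in-strength of the target.
xi <- outer(rowSums(obs), colSums(obs))
M  <- sum(xi)   # total number of balls in the urn
m  <- sum(obs)  # number of draws = total observed transitions

# Score = P(X <= observed) under a hypergeometric null, via stats::phyper().
score <- matrix(phyper(as.vector(obs), as.vector(xi), as.vector(M - xi), m),
                nrow = length(states), dimnames = list(states, states))
```

Scores near 0 indicate transitions seen less often than the null model expects, and scores near 1 indicate transitions seen more often.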
build_hypa(data, k = 2L, alpha = 0.05, min_count = 5L, p_adjust = "BH")
data |
A data.frame (rows = trajectories), list of character vectors,
|
k |
Integer. Order of the De Bruijn graph (default 2). Detects anomalies in paths of length k. |
alpha |
Numeric. Significance threshold for anomaly classification (default 0.05). Paths with HYPA score < alpha are under-represented; paths with score > 1-alpha are over-represented. |
min_count |
Integer. Minimum observed count for a path to be
classified as anomalous (default 5). Paths with fewer observations
are always classified as |
p_adjust |
Character. Method for multiple testing correction of
p-values. Default "BH". |
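How alpha, min_count and p_adjust interact can be sketched with base R. The scores, counts and labels below are made up for illustration, and approximating the over-representation p-value as 1 - score is an assumption rather than the package's documented behaviour.

```r
# Hypothetical raw HYPA scores (hypergeometric CDF values) and observed counts.
score     <- c(0.01, 0.50, 0.99, 0.999, 0.02)
observed  <- c(12L, 30L, 8L, 3L, 6L)
alpha     <- 0.05
min_count <- 5L

# Threshold rule from the alpha description: low scores are under-represented,
# high scores are over-represented. The labels are illustrative only.
label <- ifelse(score < alpha, "under",
         ifelse(score > 1 - alpha, "over", "none"))

# Paths observed fewer than min_count times are never flagged.
label[observed < min_count] <- "none"

# Multiple-testing correction with stats::p.adjust().
# Assumption: the over-representation p-value is approximated as 1 - score.
p_under_adj <- p.adjust(score, method = "BH")
p_over_adj  <- p.adjust(1 - score, method = "BH")
```

The fourth path has an extreme score but is not flagged, because its observed count falls below min_count.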
An object of class net_hypa with components:
Data frame with path, from, to, observed, expected,
ratio, p_value, p_adjusted_under, p_adjusted_over, anomaly
columns. The path column shows the full state sequence
(e.g., "A -> B -> C"); from is the context (conditioning
states); to is the next state; ratio is
observed / expected; p_value is the raw hypergeometric
CDF value; p_adjusted_under and p_adjusted_over
are the corrected p-values for under- and over-representation
tests respectively.
Weighted adjacency matrix of the De Bruijn graph.
Fitted propensity matrix.
Order of the De Bruijn graph.
Significance threshold used.
Multiple testing correction method used.
Number of anomalous paths detected.
Number of over-represented paths.
Number of under-represented paths.
Total number of edges.
Node names in the De Bruijn graph.
LaRock, T., Nanumyan, V., Scholtes, I., Casiraghi, G., Eliassi-Rad, T., & Schweitzer, F. (2020). HYPA: Efficient Detection of Path Anomalies in Time Series Data on Networks. SDM 2020, 460–468.
seqs <- list(c("A","B","C"), c("B","C","A"), c("A","C","B"), c("A","B","C"))
hyp <- build_hypa(seqs, k = 2)
trajs <- list(c("A","B","C"), c("A","B","C"), c("A","B","C"),
c("A","B","D"), c("C","B","D"), c("C","B","A"))
h <- build_hypa(trajs, k = 2)
print(h)