
Nestimate

Unified network estimation, analysis, and validation for behavioral, psychological, and panel data.


Nestimate is a comprehensive R package for estimating, validating, and comparing networks from behavioral sequence data, psychological scales, and longitudinal panel data. A single entry point — build_network() — dispatches to 13 built-in estimators. Every network type shares the same validation pipeline: bootstrap confidence intervals, permutation testing, split-half reliability, and centrality stability. The entire package has only 4 hard imports (ggplot2, glasso, data.table, cluster).

Installation

# From CRAN
install.packages("Nestimate")

# Development version
devtools::install_github("mohsaqr/Nestimate")

What Nestimate Covers

| Area | Key Functions |
|------|--------------|
| Dynamic / Transition Networks | build_network(), wtna(), cooccurrence() |
| Psychological Networks | build_network(method = "glasso/pcor/cor/ising/mgm") |
| Multilevel VAR | build_mlvar() |
| Idiographic Networks | build_gimme() |
| Cluster & Group Networks | build_clusters(), build_mmm(), build_mcml() |
| Higher-Order Networks | build_hon(), build_honem(), build_hypa(), build_mogen() |
| Topological Analysis | build_simplicial(), persistent_homology(), q_analysis() |
| Sequence Visualization | sequence_plot(), distribution_plot() |
| Sequence Pattern Comparison | sequence_compare() |
| Association Mining | association_rules() |
| Link Prediction | predict_links(), evaluate_links() |
| Markov Chain Analysis | markov_stability(), passage_time() |
| Statistical Validation | bootstrap_network(), permutation(), nct(), network_reliability(), centrality_stability() |

Dynamic Networks

All dynamic network methods use build_network(). Pass an event log with action, actor, and time columns — no preprocessing needed.

Estimation Methods

| Method | Aliases | Description |
|--------|---------|-------------|
| "relative" | "tna", "transition" | Transition probabilities (directed) |
| "frequency" | "ftna", "counts" | Raw transition counts (directed) |
| "attention" | "atna" | Decay-weighted transitions emphasising recent events (directed) |
| "co_occurrence" | "cna" | Co-occurrence from sequential data (undirected) |

library(Nestimate)
data(human_long)

net <- build_network(human_long, method = "tna",
                     action = "action", actor = "session_id", time = "time")

# Per-group networks in one call
group_nets <- build_network(human_long, method = "tna",
                            action = "action", actor = "session_id",
                            time = "time", group = "phase")
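
To make the "relative" and "frequency" methods concrete, here is a minimal base-R sketch of first-order transition estimation: count consecutive state pairs within each sequence, then divide each row by its total. The toy sequences and the helper name transition_matrix() are illustrative assumptions and are not part of Nestimate; the package's own estimators may treat ordering, missing values, and ties differently.

```r
# Toy sequences (hypothetical data): one character vector per actor
seqs <- list(
  a1 = c("plan", "code", "test", "code"),
  a2 = c("plan", "code", "code", "test")
)

# Hypothetical helper: first-order transition counts and probabilities
transition_matrix <- function(seqs) {
  states <- sort(unique(unlist(seqs)))
  counts <- matrix(0, length(states), length(states),
                   dimnames = list(from = states, to = states))
  for (s in seqs) {
    if (length(s) < 2) next
    from <- s[-length(s)]
    to   <- s[-1]
    for (i in seq_along(from)) {
      counts[from[i], to[i]] <- counts[from[i], to[i]] + 1
    }
  }
  row_tot <- rowSums(counts)
  # Divide each row by its total; rows with no outgoing transitions stay 0
  probs <- counts / ifelse(row_tot == 0, 1, row_tot)
  list(counts = counts, probs = probs)
}

tm <- transition_matrix(seqs)
tm$counts   # raw counts, the "frequency" view
tm$probs    # row-normalised, the "relative" view
```

Transitions are counted within each actor's sequence only, so the last event of one actor is never linked to the first event of the next.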

Window-Based TNA

wtna() builds networks from binary (one-hot) data using temporal windows — directed transitions between windows, undirected co-occurrence within windows, or a mixed network combining both.

data(learning_activities)
net_wtna  <- wtna(learning_activities, actor = "student",
                  method = "transition", type = "relative")
net_mixed <- wtna(learning_activities, actor = "student",
                  method = "both", type = "relative")

Co-occurrence Networks

cooccurrence() builds undirected co-occurrence networks from 6 input formats (delimited fields, long/bipartite, binary matrix, wide sequence, lists) with 8 similarity methods (Jaccard, cosine, association strength, Dice, and more).

# From a long-format data frame
net_co <- cooccurrence(human_long, field = "action", by = "session_id",
                       similarity = "jaccard", threshold = 0.1)
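
For intuition about what a Jaccard-weighted co-occurrence edge represents, the following base-R sketch computes it from a small binary incidence matrix and then applies a threshold. The matrix X and all variable names are hypothetical; this is a sketch of the general technique, not of cooccurrence() internals.

```r
# Hypothetical binary incidence matrix: rows = sessions, columns = actions
X <- rbind(
  s1 = c(plan = 1, code = 1, test = 0),
  s2 = c(plan = 1, code = 1, test = 1),
  s3 = c(plan = 0, code = 1, test = 1),
  s4 = c(plan = 1, code = 0, test = 0)
)

co      <- crossprod(X)                   # co[i, j] = sessions containing both i and j
freq    <- diag(co)                       # sessions containing each action
n_union <- outer(freq, freq, "+") - co    # sessions containing i or j
jaccard <- co / n_union                   # intersection over union
diag(jaccard) <- 0                        # drop self-loops

# Keep only edges at or above a threshold
threshold <- 0.5
adj <- jaccard * (jaccard >= threshold)
adj
```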

Psychological Networks

| Method | Description |
|--------|-------------|
| "cor" | Pearson correlations |
| "pcor" | Partial correlations (precision matrix inversion) |
| "glasso" | EBICglasso: sparse regularised partial correlations |
| "ising" | L1-regularised logistic regression for binary items |
| "mgm" | Mixed Graphical Model: continuous + categorical variables together |

All implemented from scratch with no dependency on igraph, qgraph, or bootnet.

data(srl_strategies)
net_gl  <- build_network(srl_strategies, method = "glasso")
net_mgm <- build_network(mixed_data, method = "mgm")   # scales + demographics

predictability(net_gl)   # R-squared per node from network structure
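
The "pcor" row above mentions precision matrix inversion. As a hedged illustration in base R, partial correlations can be read off the inverse of the correlation matrix, and one common definition of node predictability (R-squared of each variable regressed on all others) follows from its diagonal. The built-in mtcars data is used only as a stand-in; the glasso method additionally applies an L1 penalty that this sketch does not show, and predictability() may be computed differently in the package.

```r
# Built-in data used purely for illustration
x <- as.matrix(mtcars[, c("mpg", "hp", "wt", "qsec")])

S <- cor(x)      # Pearson correlation matrix
P <- solve(S)    # precision matrix (inverse of S)

# Partial correlation between i and j given all other variables:
# -p_ij / sqrt(p_ii * p_jj)
pcor <- -cov2cor(P)
diag(pcor) <- 0
round(pcor, 2)

# R-squared of each node regressed on all other nodes
# (valid because S is a correlation matrix, so each variable has unit variance)
r2_node <- 1 - 1 / diag(P)
round(r2_node, 2)
```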

Multilevel VAR

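build_mlvar() estimates three networks in a single call from ESM/EMA diary data: a temporal network (directed lagged effects), a contemporaneous network (undirected within-person partial correlations), and a between-persons network (undirected between-persons partial correlations).

Equivalence to mlVAR::mlVAR() at machine precision was validated across 25 real ESM datasets, and build_mlvar() runs 1.45× faster.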

data(chatgpt_srl)
fit <- build_mlvar(chatgpt_srl, vars = c("planning", "monitoring", "evaluation"),
                   id = "id", day = "day", beep = "beep")

fit$temporal          # directed network of lagged effects
fit$contemporaneous   # undirected within-person partial correlations
fit$between           # undirected between-persons partial correlations
coefs(fit)            # tidy data.frame: beta, SE, t, p, CI for every edge

Idiographic Networks

build_gimme() estimates a separate network for each person using the Group Iterative Multiple Model Estimation (GIMME) algorithm, then aggregates to a group-level picture. Use this when between-person heterogeneity matters and a single group network would average over meaningfully different individuals.

fit_g <- build_gimme(panel_data, vars = c("x1", "x2", "x3"), id = "id")
fit_g$group_network       # aggregated group-level paths
fit_g$individual_networks # one network per person

Cluster & Group Networks

Sequence Clustering

build_clusters() partitions sequences into k groups using pairwise distance matrices. Supports 9 distance metrics and 8 clustering algorithms. Both build_clusters() and build_mmm() results pass directly to build_network().

clust <- build_clusters(net, k = 3, dissimilarity = "hamming", method = "ward.D2")
plot(clust, type = "silhouette")
cluster_nets <- build_network(clust, method = "tna")

Mixed Markov Models

build_mmm() discovers latent subgroups of sequences that share similar transition dynamics via EM — without pre-labelling groups. BIC/AIC/ICL model selection via compare_mmm().

mmm   <- build_mmm(net, k = 3)
compare_mmm(net, k = 2:6)   # model selection plot + table
mmm_nets <- build_network(mmm, method = "tna")

MCML

build_mcml() decomposes a network into macro (between-cluster) and micro (within-cluster) layers when nodes belong to known groups.

clusters <- list(Metacognitive = c("Planning", "Monitoring"),
                 Cognitive     = c("Elaboration", "Organisation"))
mcml <- cluster_summary(net, clusters)
mcml$macro$weights

Higher-Order Networks

Capture dependencies beyond first-order transitions:

| Function | What it finds |
|----------|--------------|
| build_hon() | Variable-length memory paths |
| build_honem() | Higher-order network embedding |
| build_hypa() | Statistically anomalous paths (over/under-represented) |
| build_mogen() | Optimal Markov order per node |

hon  <- build_hon(net, max_order = 2)
pathways(hon)                      # arrow-notation path strings
hypa <- build_hypa(net)
hypa$over                          # over-represented paths with p-values

Topological Analysis

Go beyond edges — find cliques, holes, and high-order connectivity using tools from algebraic topology.

sc <- build_simplicial(net, method = "clique")
betti_numbers(sc)          # connected components, cycles, voids
euler_characteristic(sc)
ph <- persistent_homology(net)  # track topology across thresholds
plot(ph)
qa <- q_analysis(sc)       # Atkin's Q-connectivity structure vectors
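
As a rough illustration of what a clique-based simplicial complex contains, the base-R sketch below enumerates the cliques of a small hypothetical graph by brute force and computes the Euler characteristic as the alternating sum of simplex counts. The graph and the helper is_clique() are assumptions for illustration only; this is not how build_simplicial() is implemented, and brute-force enumeration only suits very small graphs.

```r
# Hypothetical undirected graph on five nodes
nodes <- letters[1:5]
A <- matrix(0, 5, 5, dimnames = list(nodes, nodes))
edges <- rbind(c("a", "b"), c("a", "c"), c("b", "c"),
               c("b", "d"), c("c", "d"), c("d", "e"))
for (k in seq_len(nrow(edges))) {
  A[edges[k, 1], edges[k, 2]] <- 1
  A[edges[k, 2], edges[k, 1]] <- 1
}

# Hypothetical helper: TRUE if every pair of nodes in v is connected
is_clique <- function(v) {
  sub <- A[v, v, drop = FALSE]
  all(sub[upper.tri(sub)] == 1)
}

# f[k] = number of cliques with k nodes, i.e. simplices of dimension k - 1
n <- nrow(A)
f <- sapply(seq_len(n), function(k) sum(combn(n, k, FUN = is_clique)))

# Euler characteristic: vertices - edges + triangles - ...
chi <- sum((-1)^(seq_along(f) - 1) * f)
f
chi
```

In this toy graph the two triangles share an edge and the remaining edge hangs off one corner, so the complex has a single connected component and no unfilled cycle.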

Sequence Visualization

Visualize raw sequence data as index plots or state distribution charts — before or after clustering.

# Sequence index plot: one row per person, coloured by state
sequence_plot(net)

# After clustering: faceted by cluster
clust <- build_clusters(net, k = 3)
sequence_plot(clust, type = "index")

# State distribution over time
distribution_plot(net, type = "area")
distribution_plot(clust, type = "bar")

Sequence Pattern Comparison

sequence_compare() extracts all k-gram patterns from grouped sequences, counts per-group frequencies, and tests statistical differences via permutation — answering the question "do these groups actually behave differently, and where?"

data(human_long)
net <- build_network(human_long, method = "tna",
                     action = "action", actor = "session_id",
                     time = "time", group = "phase")

res <- sequence_compare(net, sub = 2:4, test = "chisq", adjust = "fdr")
res$patterns                        # per-pattern frequencies + p-values
plot(res)                           # back-to-back pyramid chart
plot(res, style = "heatmap")        # heatmap for many patterns
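
The pattern-extraction step described above can be sketched in a few lines of base R: slide a window of length k over each sequence, paste the states together, and tabulate the result by group. The helper kgrams() and the toy groups are hypothetical and only illustrate the counting stage; the statistical comparison that sequence_compare() performs is not reproduced here.

```r
# Hypothetical helper: all contiguous k-grams of one sequence
kgrams <- function(s, k) {
  if (length(s) < k) return(character(0))
  starts <- seq_len(length(s) - k + 1)
  vapply(starts,
         function(i) paste(s[i:(i + k - 1)], collapse = " -> "),
         character(1))
}

# Toy grouped sequences (hypothetical data)
groups <- list(
  early = list(c("plan", "code", "test"), c("plan", "code", "code")),
  late  = list(c("code", "test", "code"), c("test", "code", "test"))
)

# Per-group bigram counts arranged as a pattern-by-group table
grams <- lapply(groups, function(g) unlist(lapply(g, kgrams, k = 2)))
long  <- data.frame(pattern = unlist(grams, use.names = FALSE),
                    group   = rep(names(grams), lengths(grams)))
tab <- table(long$pattern, long$group)
tab
```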

Association Rule Mining

association_rules() mines "if A then B" patterns from sequences or binary matrices using the Apriori algorithm. Returns support, confidence, lift, and conviction for every rule above a threshold.

rules <- association_rules(net, min_support = 0.05, min_confidence = 0.6)
rules$rules                       # tidy data.frame, sorted by lift
pathways(rules)                   # rules as arrow-notation strings

# From a raw binary matrix
rules2 <- association_rules(binary_mat, min_support = 0.1)
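
The four rule-quality measures named above have standard definitions, shown here for a single rule on a toy binary matrix. The matrix B and the helper rule_metrics() are hypothetical; the sketch scores one given rule and does not implement the Apriori search itself.

```r
# Hypothetical transactions: rows = sequences, columns = states present
B <- rbind(
  c(plan = 1, code = 1, test = 1),
  c(plan = 1, code = 1, test = 0),
  c(plan = 0, code = 1, test = 1),
  c(plan = 1, code = 1, test = 1),
  c(plan = 0, code = 0, test = 1)
)

# Hypothetical helper: quality measures for a single rule lhs -> rhs
rule_metrics <- function(B, lhs, rhs) {
  has_lhs <- rowSums(B[, lhs, drop = FALSE]) == length(lhs)
  has_rhs <- rowSums(B[, rhs, drop = FALSE]) == length(rhs)
  support    <- mean(has_lhs & has_rhs)           # P(lhs and rhs)
  confidence <- support / mean(has_lhs)           # P(rhs | lhs)
  lift       <- confidence / mean(has_rhs)        # confidence relative to P(rhs)
  conviction <- (1 - mean(has_rhs)) / (1 - confidence)
  c(support = support, confidence = confidence,
    lift = lift, conviction = conviction)
}

m <- rule_metrics(B, lhs = "plan", rhs = "test")
m
```

A lift below 1, as in this toy rule, means the consequent is less likely when the antecedent is present than it is overall. Conviction is infinite when confidence equals 1.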

Link Prediction

predict_links() scores all unobserved node pairs using structural similarity, identifying which missing connections are most likely to exist. evaluate_links() computes AUC, precision, and recall against held-out edges.

preds <- predict_links(net)       # common neighbours, Adamic-Adar, Katz, ...
head(preds$scores)                # sorted by predicted score

# Evaluate against known missing edges
eval  <- evaluate_links(net, held_out = test_edges)
eval$auc
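
Two of the structural similarity scores mentioned in the comment above, common neighbours and Adamic-Adar, can be computed with plain matrix products. The sketch below uses a hypothetical five-node undirected graph and base R only; it illustrates the scores themselves and makes no claim about how predict_links() computes or combines them.

```r
# Hypothetical undirected adjacency matrix
nodes <- letters[1:5]
A <- matrix(0, 5, 5, dimnames = list(nodes, nodes))
edges <- rbind(c("a", "b"), c("a", "c"), c("b", "c"),
               c("b", "d"), c("c", "d"), c("d", "e"))
for (k in seq_len(nrow(edges))) {
  A[edges[k, 1], edges[k, 2]] <- 1
  A[edges[k, 2], edges[k, 1]] <- 1
}

cn  <- A %*% A                          # cn[i, j] = number of shared neighbours
deg <- rowSums(A)
# Adamic-Adar: each shared neighbour contributes 1 / log(its degree)
w  <- ifelse(deg > 1, 1 / log(deg), 0)
aa <- A %*% diag(w) %*% A

# Score only unobserved pairs (upper triangle, no existing edge)
cand <- which(upper.tri(A) & A == 0, arr.ind = TRUE)
scores <- data.frame(
  from = rownames(A)[cand[, "row"]],
  to   = colnames(A)[cand[, "col"]],
  common_neighbours = cn[cand],
  adamic_adar       = aa[cand]
)
scores <- scores[order(-scores$common_neighbours), ]
scores
```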

Markov Chain Analysis

markov_stability() measures how stable a network partition is under random-walk dynamics at different time scales — a resolution-free way to find communities. passage_time() computes expected first-passage and return times between states.

stab <- markov_stability(net, times = seq(0.1, 10, 0.1))
plot(stab)                # stability vs time-scale curve

pt <- passage_time(net)
pt$first_passage          # expected steps to reach state j from state i
pt$return_time            # expected steps to return to the same state
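
For an irreducible Markov chain, expected first-passage times solve a small linear system and expected return times are the reciprocals of the stationary probabilities. The base-R sketch below works this through for a hypothetical three-state transition matrix; it shows the textbook computation and is not taken from the passage_time() source.

```r
# Hypothetical 3-state transition matrix (rows sum to 1)
P <- matrix(c(0.50, 0.50, 0.00,
              0.25, 0.50, 0.25,
              0.00, 0.50, 0.50),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

n <- nrow(P)
M <- matrix(0, n, n, dimnames = dimnames(P))
for (j in seq_len(n)) {
  keep <- setdiff(seq_len(n), j)
  # Expected steps to first reach j from every other state: (I - Q) m = 1,
  # where Q is P restricted to the states other than j
  Q <- P[keep, keep, drop = FALSE]
  M[keep, j] <- solve(diag(n - 1) - Q, rep(1, n - 1))
}

# Stationary distribution: left eigenvector of P for eigenvalue 1
ev     <- eigen(t(P))
lead   <- which.min(abs(ev$values - 1))
pi_hat <- Re(ev$vectors[, lead])
pi_hat <- pi_hat / sum(pi_hat)
names(pi_hat) <- rownames(P)

# Expected return time to each state
return_time <- 1 / pi_hat
diag(M) <- return_time
M
```

Here M[i, j] is the expected number of steps to reach state j starting from state i, with return times on the diagonal.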

Statistical Validation

Every network type shares the same validation pipeline.

# Bootstrap confidence intervals
boot <- bootstrap_network(net, iter = 1000)
summary(boot)

# Permutation test: are two networks different?
perm <- permutation(cluster_nets$`Cluster 1`, cluster_nets$`Cluster 2`)

# Network Comparison Test (NCT): formal test of structure + global strength
nct_res <- nct(data1, data2, iter = 500)
print(nct_res)            # M-statistic, S-statistic, per-edge p-values

# Split-half reliability
network_reliability(net)

# Centrality stability (CS-coefficient)
centrality_stability(net)

# Glasso-specific bootstrap (edge inclusion + centrality CIs)
boot_gl <- boot_glasso(net_gl, iter = 1000)
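
The idea behind bootstrap confidence intervals for edges can be sketched in base R: resample whole sequences with replacement, re-estimate the network each time, and take percentile limits per edge. Everything below (the random toy data, the helper trans_prob(), 500 resamples, the 95% percentile interval) is an assumption made for illustration; bootstrap_network() may use a different resampling unit, interval type, or p-value definition.

```r
set.seed(1)

# Hypothetical data: 40 random sequences over three states
states <- c("plan", "code", "test")
seqs <- replicate(40, sample(states, 8, replace = TRUE), simplify = FALSE)

# Hypothetical helper: row-normalised transition probabilities
trans_prob <- function(seqs, states) {
  from <- unlist(lapply(seqs, function(s) s[-length(s)]))
  to   <- unlist(lapply(seqs, function(s) s[-1]))
  counts <- unclass(table(factor(from, levels = states),
                          factor(to, levels = states)))
  counts / pmax(rowSums(counts), 1)
}

observed <- trans_prob(seqs, states)

# Resample whole sequences with replacement and re-estimate each time
n_boot <- 500
boot_arr <- replicate(n_boot, {
  idx <- sample(length(seqs), replace = TRUE)
  trans_prob(seqs[idx], states)
})                                   # array: state x state x n_boot

# Percentile 95% interval for every edge
lower <- apply(boot_arr, c(1, 2), quantile, probs = 0.025)
upper <- apply(boot_arr, c(1, 2), quantile, probs = 0.975)
round(lower, 2)
round(upper, 2)
```

Resampling sequences rather than individual transitions keeps the dependence between events of the same actor intact.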

| Function | Purpose |
|----------|---------|
| bootstrap_network() | Bootstrap CIs and p-values for each edge |
| permutation() | Edge-level comparison between two networks |
| nct() | Formal Network Comparison Test (global strength + structure) |
| network_reliability() | Split-half reliability of edge weights |
| centrality_stability() | CS-coefficient via case-dropping |
| boot_glasso() | Edge inclusion, centrality CIs, difference tests for glasso networks |

Bundled Datasets

| Dataset | Description |
|---------|-------------|
| human_long | 10,796 human actions across 429 human-AI coding sessions |
| ai_long | Matched AI actions from the same 429 sessions |
| human_cat | Same sessions coded at category level (9 types) |
| human_detailed | Same sessions at fine-grained code level |
| srl_strategies | SRL strategy frequencies: 250 students, 9 strategies |
| chatgpt_srl | ChatGPT-generated SRL scale scores for psychological networks |
| learning_activities | Binary learning activity indicators: 200 students × 30 timepoints |
| group_regulation_long | Group regulation sequences with covariates |
| trajectories | 138-student engagement trajectory matrix |

Documentation

Citation

If you use Nestimate in your research, please cite:

Saqr, M., Lopez-Pernas, S., Tormanen, T., Kaliisa, R., Misiejuk, K., & Tikka, S. (2025). Transition Network Analysis: A Novel Framework for Modeling, Visualizing, and Identifying the Temporal Patterns of Learners and Learning. Proceedings of the 15th Learning Analytics and Knowledge Conference. doi: 10.1145/3706468.3706513

Saqr, M., Beck, E., & Lopez-Pernas, S. (2024). Psychological Networks. In M. Saqr & S. Lopez-Pernas (Eds.), Learning Analytics Methods and Tutorials (pp. 513–546). Springer. doi: 10.1007/978-3-031-54464-4_19

License

MIT




Nestimate documentation built on April 20, 2026, 5:06 p.m.