Markov Stability Analysis
In Nestimate: Network Estimation, Bootstrap, and Higher-Order Analysis

knitr::opts_chunk$set(
  collapse  = TRUE,
  comment   = "#>",
  out.width = "100%",
  fig.width = 7,
  fig.height = 5,
  dpi       = 96,
  warning   = FALSE,
  message   = FALSE
)

library(Nestimate)

Nestimate provides two functions for Markov-chain stability analysis of transition networks:

passage_time() -- computes the full matrix of mean first passage times (MFPT). Entry M[i, j] is the expected number of steps to travel from state i to state j for the first time. The diagonal equals the mean recurrence time 1/pi.
markov_stability() -- computes per-state stability metrics: persistence, stationary probability, return time, sojourn time, and mean accessibility.

Both functions accept a netobject, cograph_network, tna object, row-stochastic matrix, or a raw wide sequence data frame.

Data

trajectories contains 138 learners recorded at 15 time-steps each with three engagement states: Active, Average, and Disengaged.

dim(trajectories)
table(as.vector(trajectories), useNA = "always")

Selecting mostly-active learners

We keep learners who were Active more than half the time (> 7 of 15 steps).

df <- as.data.frame(trajectories)
active_count  <- rowSums(df == "Active", na.rm = TRUE)
mostly_active <- active_count > ncol(df) / 2
cat(sum(mostly_active), "of", nrow(df), "learners qualify\n")
sub <- df[mostly_active, ]

Sequence plots

state_pal <- c(Active = "#1a7a1a", Average = "#E69F00", Disengaged = "#CC79A7")

sequence_plot(
  df,
  type         = "heatmap",
  sort         = "lcs",
  k            = 3,
  k_color      = "white",
  k_line_width = 2,
  state_colors = state_pal,
  na_color     = "grey88",
  main         = "Full sample - all 138 learners",
  time_label   = "Time-step",
  y_label      = "Learner",
  legend       = "bottom"
)

sequence_plot(
  sub,
  type         = "heatmap",
  sort         = "lcs",
  k            = 2,
  k_color      = "white",
  k_line_width = 2,
  state_colors = state_pal,
  na_color     = "grey88",
  main         = "Mostly-active learners - Active > 7 of 15 steps (n = 42)",
  time_label   = "Time-step",
  y_label      = "Learner",
  legend       = "bottom"
)

Transition networks

net_all <- build_network(df,  method = "relative")
net_sub <- build_network(sub, method = "relative")

round(net_all$weights, 3)
round(net_sub$weights, 3)

In the mostly-active group nearly every state transitions predominantly back to Active, and the probability of entering Disengaged is almost zero.

Mean First Passage Times

pt_all <- passage_time(net_all)
pt_sub <- passage_time(net_sub)

print(pt_all, digits = 2)

print(pt_sub, digits = 2)

Active to Disengaged rises from 10.4 to ~40 steps -- mostly-active learners are nearly four times harder to disengage. The diagonal equals the mean recurrence time (how many steps between consecutive visits to the same state).

plot(pt_all, title = "Full sample (n = 138)")

plot(pt_sub, title = "Mostly-active learners (n = 42)")

In the mostly-active heatmap the Active column is uniformly dark (quickly reachable from any state) and the Disengaged column is uniformly pale (nearly unreachable).

Stationary distribution

data.frame(
  State           = names(pt_all$stationary),
  `Full sample`   = paste0(round(pt_all$stationary * 100, 1), "%"),
  `Mostly active` = paste0(round(pt_sub$stationary * 100, 1), "%"),
  check.names     = FALSE
)

In the mostly-active group ~79% of long-run time is spent Active (vs 37% in the full sample).

Markov Stability

ms_all <- markov_stability(net_all)
ms_sub <- markov_stability(net_sub)

print(ms_all)
print(ms_sub)

The stability data frame contains:

| Column | Meaning | |--------|---------| | persistence | P[i,i] -- probability of staying at the next step | | stationary_prob | Long-run proportion of time in this state | | return_time | Expected steps between consecutive visits (1/pi) | | sojourn_time | Expected consecutive steps before leaving (1/(1-P[i,i])) | | avg_time_to_others | Mean MFPT leaving this state to all others | | avg_time_from_others | Mean MFPT from all other states arriving here |

plot(ms_all, title = "Full sample")

plot(ms_sub, title = "Mostly-active learners")

Active leads on persistence (0.81) and sojourn time (5.3 steps) -- once a learner is active they stay active. Average has the lowest avg_time_from_others (1.6 steps) -- the most accessible hub state. Disengaged has avg_time_from_others of ~39 steps -- effectively unreachable.