View source: R/rem_stat_recency.R

computeRecency    R Documentation

Compute Butts' (2008) Recency Network Statistic for Event Dyads in a Relational Event Sequence

Description

This function computes the recency network sufficient statistic for a relational event sequence (see Butts 2008; Vu et al. 2015; Meijerink-Bosman et al. 2022). The recency statistic captures the tendency for more recent events (e.g., an exchange between two medical doctors) to be more likely to reoccur than events that happened in the distant past (see Butts 2008 for a discussion). The function allows recency scores to be computed only for the sampled events, while basing the statistics on the full event sequence. Moreover, the function allows users to specify relational relevancy for the statistic and to employ a sliding windows framework for large relational sequences.

Usage

computeRecency(
  observed_time,
  observed_sender,
  observed_receiver,
  processed_time,
  processed_sender,
  processed_receiver,
  type = c("raw.diff", "inv.diff.plus1", "rank.ordered.count"),
  i_neighborhood = TRUE,
  dependency = FALSE,
  relationalTimeSpan = NULL,
  nopastEvents = NA,
  sliding_windows = FALSE,
  processed_seqIDs = NULL,
  window_size = NA
)

Arguments

observed_time

The vector of event times from the pre-processing event sequence.

observed_sender

The vector of event senders from the pre-processing event sequence.

observed_receiver

The vector of event receivers from the pre-processing event sequence.

processed_time

The vector of event times from the post-processing event sequence (i.e., the event sequence that contains the observed and null events).

processed_sender

The vector of event senders from the post-processing event sequence (i.e., the event sequence that contains the observed and null events).

processed_receiver

The vector of event receivers from the post-processing event sequence (i.e., the event sequence that contains the observed and null events).

type

A string value that specifies which recency formula will be used to compute the statistics. The options are "raw.diff", "inv.diff.plus1", "rank.ordered.count" (see the details section).

i_neighborhood

TRUE/FALSE. TRUE indicates that the recency statistic will be computed in reference to the sender’s past relational history (see the details section). FALSE indicates that the recency statistic will be computed in reference to the target’s past relational history (see the details section). Set to TRUE by default.

dependency

TRUE/FALSE. TRUE indicates that temporal relevancy will be modeled (see the details section). FALSE indicates that temporal relevancy will not be modeled, that is, all past events are relevant (see the details section). Set to FALSE by default.

relationalTimeSpan

If dependency = TRUE, a numerical value that corresponds to the temporal span for relational relevancy, which must be in the same measurement unit as the observed_time and processed_time objects. When dependency = TRUE, the relevant events are those that occurred between t - relationalTimeSpan and the current event time, t. For example, if time is measured as the number of days since the first event and relationalTimeSpan is set to 10, then only events that occurred in the past 10 days are included in the computation of the statistic.

nopastEvents

The numerical value assigned to events in which the sender has sent no past ties (i's neighborhood when i_neighborhood = TRUE) or has not received any past ties (j's neighborhood when i_neighborhood = FALSE). Set to NA by default.

sliding_windows

TRUE/FALSE. TRUE indicates that the sliding windows computational approach will be used to compute the network statistic, while FALSE indicates that the approach will not be used. Set to FALSE by default. Note that the sliding windows framework should only be used when the pre-processed event sequence is ‘big’, such as the pre-processed sequence of 360 million events used in Lerner and Lomi (2020), as it aims to reduce the computational burden of sorting ‘big’ datasets. In general, most pre-processed event sequences will not need the sliding windows approach. There is no strict cutoff for a ‘big’ dataset; the definition depends on both the size of the observed event sequence and the post-processing sampling dataset. For instance, according to our internal tests, when the event sequence is relatively large (e.g., 100,000 observed events), with the probability of sampling from the observed event sequence set to 0.05 and 10 controls per sampled event, the sliding windows framework for computing repetition is about 11% faster than the non-sliding windows framework. Yet, in a smaller dataset (e.g., 10,000 observed events), the sliding windows framework is about 25% slower than the non-sliding framework under the same conditions.

processed_seqIDs

If sliding_windows is set to TRUE, the vector of event sequence IDs from the post-processing event sequence. Each event sequence ID is the index of the event's position in the observed event sequence (e.g., the 5th event in the sequence will have a value of 5 in this vector).

window_size

If sliding_windows is set to TRUE, the sizes of the windows that are used for the sliding windows computational framework. If NA, the function internally divides the dataset into ten slices (may not be optimal).
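The idea of cutting a sequence of event IDs into windows can be sketched in base R. This is only an illustration under stated assumptions: the IDs in seq_ids are made up, the ten equal-width slices mimic the default described above, and treating window_size as a number of events per window is an assumption. The package's internal slicing may differ.

```r
# Hypothetical event sequence IDs (not produced by the package).
seq_ids <- 1:25

# Ten roughly equal slices, mirroring the default described for window_size = NA.
slice_id <- cut(seq_ids, breaks = 10, labels = FALSE)
slices <- split(seq_ids, slice_id)

# Assumed alternative: windows holding a fixed number of events each.
window_size <- 10
windows <- split(seq_ids, ceiling(seq_along(seq_ids) / window_size))

print(lengths(slices))
print(lengths(windows))
```

With 25 IDs and a window of 10 events, the last window holds only the remaining 5 IDs.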

Details

This function calculates the recency network sufficient statistic for a relational event based on Butts (2008), Vu et al. (2015), or Meijerink-Bosman et al. (2022). Depending on the type and neighborhood requested, different formulas will be used.

In the equations below, let (s, r, t) denote the sender, receiver, and time of the current event, and let E denote the observed event sequence. When i_neighborhood is TRUE:

t^{*} = \max \left\{ t' : (s',r',t') \in E \land s'= s \land r'= r \land t'<t \right\}

When i_neighborhood is FALSE, the following formula is used:

t^{*} = \max \left\{ t' : (s',r',t') \in E \land s'= r \land r'= s \land t'<t \right\}

The formula for recency for event e_i with type set to "raw.diff" and i_neighborhood is TRUE (Vu et al. 2015):

recency_{e_i} = t_{e_i} - t^{*}

where t^{*} is the time of the most recent past event that has the same sender and receiver as the current event. If there are no past events within the current dyad, then the value defaults to the nopastEvents argument.
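As a rough illustration of this formula, the following base R sketch computes the raw time difference on a toy sequence. The data and the helper raw_diff_recency() are hypothetical and written only for this example; they are not part of the package and ignore null events, sampling, and sliding windows.

```r
# Toy event sequence (hypothetical data, not from the package).
events <- data.frame(
  time     = c(1, 2, 4, 7, 9),
  sender   = c("A", "B", "A", "A", "B"),
  receiver = c("B", "C", "B", "B", "C")
)

# Hypothetical helper: time since the most recent earlier event with the
# same sender and receiver; returns `no_past` when no such event exists.
raw_diff_recency <- function(time, sender, receiver, no_past = NA_real_) {
  vapply(seq_along(time), function(k) {
    past <- time[sender == sender[k] & receiver == receiver[k] & time < time[k]]
    if (length(past) == 0) no_past else time[k] - max(past)
  }, numeric(1))
}

raw_recency <- raw_diff_recency(events$time, events$sender, events$receiver)
print(raw_recency)
```

The first A-to-B and B-to-C events have no dyadic history, so they receive the no_past value; the remaining events receive 3, 3, and 7.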

The formula for recency for event e_i with type set to "raw.diff" and i_neighborhood is FALSE (Vu et al. 2015):

recency_{e_i} = t_{e_i} - t^{*}

where t^{*} is the time of the most recent past event whose sender is the current event's receiver and whose receiver is the current event's sender. If there are no past events within the current dyad, then the value defaults to the nopastEvents argument.
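The reversed-dyad lookup can be sketched in the same spirit. The helper reciprocal_raw_diff() and the toy data are hypothetical; the only change from a same-dyad lookup is that past events are matched with sender and receiver swapped.

```r
# Toy event sequence (hypothetical data): A and B alternate as sender.
events <- data.frame(
  time     = c(1, 3, 6, 8),
  sender   = c("A", "B", "A", "B"),
  receiver = c("B", "A", "B", "A")
)

# Hypothetical helper: time since the most recent earlier event whose sender
# is the current receiver and whose receiver is the current sender.
reciprocal_raw_diff <- function(time, sender, receiver, no_past = NA_real_) {
  vapply(seq_along(time), function(k) {
    past <- time[sender == receiver[k] & receiver == sender[k] & time < time[k]]
    if (length(past) == 0) no_past else time[k] - max(past)
  }, numeric(1))
}

reciprocal_recency <- reciprocal_raw_diff(
  events$time, events$sender, events$receiver, no_past = 0
)
print(reciprocal_recency)
```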

The formula for recency for event e_i with type set to "inv.diff.plus1" and i_neighborhood is TRUE (Meijerink-Bosman et al. 2022):

recency_{e_i} =\frac{1}{t_{e_i} - t^{*} + 1}

where t^{*} is the time of the most recent past event that has the same sender and receiver as the current event. If there are no past events within the current dyad, then the value defaults to the nopastEvents argument.
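A short worked example may help show how the inverse transformation behaves. The times below are made up for illustration; the arithmetic simply applies the formula above to three alternative values of t^{*}.

```r
# Hypothetical current event time and three alternative values of t*.
t_current <- 10
t_star <- c(9, 6, 1)

# Inverse of the elapsed time plus one: more recent events score higher.
inv_recency <- 1 / (t_current - t_star + 1)
print(inv_recency)
```

Because t^{*} is strictly earlier than the current time, the statistic stays strictly between 0 and 1, and it shrinks as the last event in the dyad recedes into the past.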

The formula for recency for event e_i with type set to "inv.diff.plus1" and i_neighborhood is FALSE (Meijerink-Bosman et al. 2022):

recency_{e_i} = \frac{1}{t_{e_i} - t^{*} + 1}

where t^{*} is the time of the most recent past event whose sender is the current event's receiver and whose receiver is the current event's sender. If there are no past events within the current dyad, then the value defaults to the nopastEvents argument.

The formula for recency for event e_i with type set to "rank.ordered.count" and i_neighborhood is TRUE (Butts 2008):

recency_{e_i} = \rho(s(e_i), r(e_i), A_t)^{-1}

where \rho(s(e_i), r(e_i), A_t) is the current event receiver's rank amongst the current sender's recent relational events. That is, as Butts (2008: 174) argues, "\rho(s(e_i), r(e_i), A_t) is j’s recency rank among i’s in-neighborhood. Thus, if j is the last person to have called i, then \rho(s(e_i), r(e_i), A_t)^{-1} = 1. This falls to 1/2 if j is the second most recent person to call i, 1/3 if j is the third most recent person, etc." Moreover, if j is not in i's neighborhood, the value defaults to infinity. If there are no past events with the current sender, then the value defaults to the nopastEvents argument.
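The rank-ordered logic can be sketched as follows. This is an illustration only: the helper inverse_recency_rank() and the toy data are hypothetical, the neighborhood is assumed to be the distinct past receivers of sender i ordered by their most recent contact, and an absent j is assumed to have an infinite rank so that the inverse is 0. The values returned by the package itself may differ.

```r
# Toy data (hypothetical): all events are sent by "A".
time     <- c(1, 2, 3, 5)
sender   <- c("A", "A", "A", "A")
receiver <- c("B", "C", "D", "B")

# Hypothetical helper: inverse recency rank of `j` among the distinct past
# receivers of `i`, ordered from most to least recent contact before `t_now`.
inverse_recency_rank <- function(time, sender, receiver, i, j, t_now,
                                 no_past = NA_real_) {
  keep <- sender == i & time < t_now
  if (!any(keep)) return(no_past)
  # Most recent contact time for each distinct past receiver of i.
  last_contact <- vapply(split(time[keep], receiver[keep]), max, numeric(1))
  ranked <- names(sort(last_contact, decreasing = TRUE))
  rank_j <- match(j, ranked)
  if (is.na(rank_j)) rank_j <- Inf  # assumption: absent j has infinite rank
  1 / rank_j
}

rank_B <- inverse_recency_rank(time, sender, receiver, "A", "B", t_now = 6)
rank_D <- inverse_recency_rank(time, sender, receiver, "A", "D", t_now = 6)
rank_C <- inverse_recency_rank(time, sender, receiver, "A", "C", t_now = 6)
rank_E <- inverse_recency_rank(time, sender, receiver, "A", "E", t_now = 6)
rank_Z <- inverse_recency_rank(time, sender, receiver, "Z", "B", t_now = 6)
print(c(B = rank_B, D = rank_D, C = rank_C, E = rank_E, Z = rank_Z))
```

In this toy sequence B is A's most recent contact (1), followed by D (1/2) and C (1/3); E was never contacted by A, and Z has no past events at all.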

The formula for recency for event e_i with type set to "rank.ordered.count" and i_neighborhood is FALSE (Butts 2008):

recency_{e_i} = \rho(r(e_i), s(e_i), A_t)^{-1}

where \rho(r(e_i), s(e_i), A_t) is the current event sender's rank amongst the current receiver's recent relational events. That is, this measure is the same as above, with the dyadic pair flipped for the past relational events. Moreover, if j is not in i's neighborhood, the value defaults to infinity. If there are no past events with the current sender, then the value defaults to the nopastEvents argument.

Finally, researchers interested in modeling temporal relevancy (see Quintane, Wood, Dunn, and Falzon 2022) can specify the relational time span, that is, the length of time for which events are considered relationally relevant. This should be specified via the option relationalTimeSpan with dependency set to TRUE.
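The effect of a relational time span can be sketched by restricting the lookup of past events to a window before the current event. The helper windowed_raw_diff() and the toy data are hypothetical, and treating the lower bound t - span as inclusive is an assumption rather than documented package behavior.

```r
# Toy data (hypothetical): four events in the same dyad.
time     <- c(1, 5, 20, 26)
sender   <- rep("A", 4)
receiver <- rep("B", 4)

# Hypothetical helper: raw time difference using only past events that fall
# within `span` time units before the current event.
windowed_raw_diff <- function(time, sender, receiver, span,
                              no_past = NA_real_) {
  vapply(seq_along(time), function(k) {
    relevant <- sender == sender[k] & receiver == receiver[k] &
      time < time[k] & time >= time[k] - span
    if (!any(relevant)) no_past else time[k] - max(time[relevant])
  }, numeric(1))
}

windowed_recency <- windowed_raw_diff(time, sender, receiver, span = 10)
print(windowed_recency)
```

The third event (time 20) has dyadic history, but none of it falls within the previous 10 time units, so it is treated as having no relevant past events.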

Value

The vector of recency network statistics for the relational event sequence.

Author(s)

Kevin A. Carson kacarson@arizona.edu, Diego F. Leal dflc@arizona.edu

References

Butts, Carter T. 2008. "A relational event framework for social action." Sociological Methodology 38(1): 155-200.

Meijerink-Bosman, Marlyne, Roger Leenders, and Joris Mulder. 2022. "Dynamic relational event modeling: Testing, exploring, and applying." PLOS One 17(8): e0272309.

Quintane, Eric, Martin Wood, John Dunn, and Lucia Falzon. 2022. "Temporal Brokering: A Measure of Brokerage as a Behavioral Process." Organizational Research Methods 25(3): 459-489.

Vu, Duy, Philippa Pattison, and Garry Robins. 2015. "Relational event models for social learning in MOOCs." Social Networks 43: 121-135.

Examples



# Load the package that provides processOMEventSeq() and computeRecency()
library(dream)

# A Dummy One-Mode Event Dataset
events <- data.frame(time = 1:18,
                     eventID = 1:18,
                     sender = c("A", "B", "C",
                                "A", "D", "E",
                                "F", "B", "A",
                                "F", "D", "B",
                                "G", "B", "D",
                                "H", "A", "D"),
                     target = c("B", "C", "D",
                                "E", "A", "F",
                                "D", "A", "C",
                                "G", "B", "C",
                                "H", "J", "A",
                                "F", "C", "B"))

# Creating the Post-Processing Event Dataset with Null Events
eventSet <- processOMEventSeq(data = events,
                              time = events$time,
                              eventID = events$eventID,
                              sender = events$sender,
                              receiver = events$target,
                              p_samplingobserved = 1.00,
                              n_controls = 6,
                              seed = 9999)

# Compute the Raw Time-Difference Recency Statistic ("raw.diff") without the
# Sliding Windows Framework and with No Temporal Dependency
eventSet$recency_rawdiff <- computeRecency(
 observed_time = events$time,
 observed_receiver = events$target,
 observed_sender = events$sender,
 processed_time = eventSet$time,
 processed_receiver = eventSet$receiver,
 processed_sender = eventSet$sender,
 type = "raw.diff",
 dependency = FALSE,
 i_neighborhood = TRUE,
 nopastEvents = 0)

# Compute the Inverse Time-Difference Recency Statistic ("inv.diff.plus1") without
# the Sliding Windows Framework and with No Temporal Dependency
eventSet$recency_inv <- computeRecency(
 observed_time = events$time,
 observed_receiver = events$target,
 observed_sender = events$sender,
 processed_time = eventSet$time,
 processed_receiver = eventSet$receiver,
 processed_sender = eventSet$sender,
 type = "inv.diff.plus1",
 dependency = FALSE,
 i_neighborhood = TRUE,
 nopastEvents = 0)


# Compute the Rank-Ordered Recency Statistic ("rank.ordered.count") without the
# Sliding Windows Framework and with No Temporal Dependency
eventSet$recency_rank <- computeRecency(
 observed_time = events$time,
 observed_receiver = events$target,
 observed_sender = events$sender,
 processed_time = eventSet$time,
 processed_receiver = eventSet$receiver,
 processed_sender = eventSet$sender,
 type = "rank.ordered.count",
 dependency = FALSE,
 i_neighborhood = TRUE,
 nopastEvents = 0)

# Compute the Raw Time-Difference Recency Statistic ("raw.diff") with the
# Sliding Windows Framework and No Temporal Dependency
eventSet$recency_rawdiffSW <- computeRecency(
 observed_time = events$time,
 observed_receiver = events$target,
 observed_sender = events$sender,
 processed_time = eventSet$time,
 processed_receiver = eventSet$receiver,
 processed_sender = eventSet$sender,
 type = "raw.diff",
 dependency = FALSE,
 i_neighborhood = TRUE,
 sliding_windows = TRUE,
 processed_seqIDs = eventSet$sequenceID,
 nopastEvents = 0)


# Compute the Inverse Time-Difference Recency Statistic ("inv.diff.plus1") with
# the Sliding Windows Framework and No Temporal Dependency
eventSet$recency_invSW <- computeRecency(
 observed_time = events$time,
 observed_receiver = events$target,
 observed_sender = events$sender,
 processed_time = eventSet$time,
 processed_receiver = eventSet$receiver,
 processed_sender = eventSet$sender,
 type = "inv.diff.plus1",
 dependency = FALSE,
 i_neighborhood = TRUE,
 sliding_windows = TRUE,
 processed_seqIDs = eventSet$sequenceID,
 nopastEvents = 0)


# Compute the Rank-Ordered Recency Statistic ("rank.ordered.count") with the
# Sliding Windows Framework and No Temporal Dependency
eventSet$recency_rankSW <- computeRecency(
 observed_time = events$time,
 observed_receiver = events$target,
 observed_sender = events$sender,
 processed_time = eventSet$time,
 processed_receiver = eventSet$receiver,
 processed_sender = eventSet$sender,
 type = "rank.ordered.count",
 dependency = FALSE,
 i_neighborhood = TRUE,
 sliding_windows = TRUE,
 processed_seqIDs = eventSet$sequenceID,
 nopastEvents = 0)


dream documentation built on Aug. 8, 2025, 6:36 p.m.
