remstats_repetition: Compute Butts' (2008) Repetition Network Statistic for Event...

View source: R/rem_repetition.R

remstats_repetition R Documentation

Compute Butts' (2008) Repetition Network Statistic for Event Dyads in a Relational Event Sequence

Description

[Stable]

This function computes the repetition network sufficient statistic for a relational event sequence (see Lerner and Lomi 2020; Butts 2008). Repetition measures the increased tendency for events between a sender S and a receiver R to occur given that S and R have interacted in the past. The function can also compute repetition scores only for the sampled events while basing the weights on the full event sequence (see Lerner and Lomi 2020; Vu et al. 2015). In addition, users can choose between two weighting functions, return the counts of past events, reduce computational runtime, and specify a dyadic cutoff for relational relevancy.

Usage

remstats_repetition(
  time,
  sender,
  receiver,
  observed,
  sampled,
  halflife = 2,
  counts = FALSE,
  dyadic_weight = 0,
  exp_weight_form = FALSE
)

Arguments

time

The vector of event times from the post-processing event sequence.

sender

The vector of event senders from the post-processing event sequence.

receiver

The vector of event receivers from the post-processing event sequence.

observed

A vector for the post-processing event sequence where element i is equal to 1 if the dyadic event is observed and 0 if not.

sampled

A vector for the post-processing event sequence where element i is equal to 1 if the observed dyadic event is sampled and 0 if not.

halflife

A numerical value that is the halflife value to be used in the exponential weighting function (see details section). Preset to 2 (should be updated by the user based on substantive context).

counts

TRUE/FALSE. TRUE indicates that the counts of past events should be computed (see the details section). FALSE indicates that the temporal exponential weighting function should be used to downweight past events (see the details section). Set to FALSE by default.

dyadic_weight

A numerical value for the dyadic cutoff weight, that is, the cutoff for temporal relevancy based on the exponential weighting function. For example, a value of 0.01 indicates that any exponential weight less than 0.01 will be set to 0, so events with such weights will not be included in the sum of the past event weights (see the details section). Set to 0 by default.

exp_weight_form

TRUE/FALSE. TRUE indicates that the Lerner et al. (2013) exponential weighting function will be used (see the details section). FALSE indicates that the Lerner and Lomi (2020) exponential weighting function will be used (see the details section). Set to FALSE by default.

Details

This function calculates the repetition scores for relational event models based on the exponential weighting function used in either Lerner and Lomi (2020) or Lerner et al. (2013).

Following Lerner and Lomi (2020), the exponential weighting function in relational event models is:

w(s, r, t) = e^{-(t-t') \cdot \frac{\ln(2)}{T_{1/2}} }

Following Lerner et al. (2013), the exponential weighting function in relational event models is:

w(s, r, t) = e^{-(t-t') \cdot \frac{\ln(2)}{T_{1/2}} } \cdot \frac{\ln(2)}{T_{1/2}}

In both of the above equations, s is the current event sender, r is the current event receiver (target), t is the current event time, t' denotes the times of the past events that meet the weight subset (in this case, all past events that have the same sender and receiver), and T_{1/2} is the halflife parameter.
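The two weighting functions can be sketched in a few lines of base R. This is a minimal illustration of the formulas above, not the package's implementation; the helper name `exp_weight` and its `lerner2013` argument are hypothetical and chosen only for this example.

```r
# Hypothetical helper (not part of the dream package): exponential weight of a
# single past event at time t_past, evaluated at the current time t_now.
exp_weight <- function(t_now, t_past, halflife, lerner2013 = FALSE) {
  decay <- log(2) / halflife            # ln(2) / T_{1/2}
  w <- exp(-(t_now - t_past) * decay)   # Lerner and Lomi (2020) form
  if (lerner2013) {
    w <- w * decay                      # Lerner et al. (2013) form adds this factor
  }
  w
}

# One halflife has elapsed (t - t' = 2, halflife = 2), so the weight is 0.5
exp_weight(t_now = 10, t_past = 8, halflife = 2)

# Same event under the Lerner et al. (2013) form: 0.5 * ln(2) / 2, about 0.173
exp_weight(t_now = 10, t_past = 8, halflife = 2, lerner2013 = TRUE)
```

The only difference between the two forms is the constant factor ln(2) / T_{1/2}, so for a fixed halflife they rescale the weights without changing their ordering.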

The formula for repetition for event e_i is:

repetition_{e_{i}} = w(s, r, t)

Moreover, researchers interested in modeling temporal relevancy (see Quintane, Wood, Dunn, and Falzon 2022; Lerner and Lomi 2020) can specify the dyadic weight cutoff, that is, the minimum value for which the weight is considered relationally relevant. Users who do not know which dyadic cutoff value to use can use the remstats_dyadcut function.
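To make the interaction between the weights and the cutoff concrete, the following toy sketch computes a repetition score for every event in a small sequence. It assumes that the weights of all past events on the same dyad are summed (as the description of dyadic_weight suggests) and that weights strictly below the cutoff are set to 0. It uses the Lerner and Lomi (2020) weight only and ignores the observed and sampled indicators. The function `toy_repetition` and the data are hypothetical; this is not the package's code.

```r
# Hypothetical toy implementation for illustration only.
toy_repetition <- function(time, sender, receiver, halflife, dyadic_weight = 0) {
  decay <- log(2) / halflife
  vapply(seq_along(time), function(i) {
    # Past events (strictly earlier) with the same sender and receiver
    past <- time < time[i] & sender == sender[i] & receiver == receiver[i]
    w <- exp(-(time[i] - time[past]) * decay)
    # Weights below the cutoff are treated as 0 and drop out of the sum
    w[w < dyadic_weight] <- 0
    sum(w)
  }, numeric(1))
}

# Made-up event sequence: four events on dyad (a, x), one on dyad (b, x)
time     <- c(1, 2, 3, 5, 9)
sender   <- c("a", "a", "b", "a", "a")
receiver <- c("x", "x", "x", "x", "x")

# No cutoff: approximately 0, 0.707, 0, 0.604, 0.401
toy_repetition(time, sender, receiver, halflife = 2)

# Cutoff of 0.1: the two oldest events no longer contribute to the last score,
# giving approximately 0, 0.707, 0, 0.604, 0.25
toy_repetition(time, sender, receiver, halflife = 2, dyadic_weight = 0.1)
```

With a cutoff of 0, every past event on the dyad contributes, however old; a positive cutoff effectively limits the history to events recent enough to still carry a weight at or above that value.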

Following Butts (2008), if the counts of the past events are requested, the formula for repetition for event e_i is:

repetition_{e_{i}} = d(s = s', r = r', t')

Here, d() is the number of past events in which the event sender, s', is the current event sender, s, and the event receiver (target), r', is the current event receiver, r. The counting equation can also be used in tandem with relational relevancy by specifying the halflife parameter, the exponential weighting function, and the dyadic cutoff weight. If the user is not interested in modeling relational relevancy, those values should be left at their default values.
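The counting version can be sketched the same way. The example below is a minimal base R illustration under the assumption that "past" means strictly earlier in time. It leaves out the optional relational relevancy cutoff and the observed and sampled indicators. The function `toy_repetition_counts` and the data are hypothetical.

```r
# Hypothetical toy implementation for illustration only.
toy_repetition_counts <- function(time, sender, receiver) {
  vapply(seq_along(time), function(i) {
    # Number of strictly earlier events with the same sender and receiver
    sum(time < time[i] & sender == sender[i] & receiver == receiver[i])
  }, integer(1))
}

# Made-up event sequence: four events on dyad (a, x), one on dyad (b, x)
time     <- c(1, 2, 3, 5, 9)
sender   <- c("a", "a", "b", "a", "a")
receiver <- c("x", "x", "x", "x", "x")

# Counts of past same-dyad events: 0 1 0 2 3
toy_repetition_counts(time, sender, receiver)
```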

Value

The vector of repetition statistics for the relational event sequence.

Author(s)

Kevin A. Carson kacarson@arizona.edu, Diego F. Leal dflc@arizona.edu

References

Butts, Carter T. 2008. "A Relational Event Framework for Social Action." Sociological Methodology 38(1): 155-200.

Quintane, Eric, Martin Wood, John Dunn, and Lucia Falzon. 2022. "Temporal Brokering: A Measure of Brokerage as a Behavioral Process." Organizational Research Methods 25(3): 459-489.

Lerner, Jürgen and Alessandro Lomi. 2020. "Reliability of relational event model estimates under sampling: How to fit a relational event model to 360 million dyadic events." Network Science 8(1): 97-135.

Lerner, Jürgen, Margit Bussmann, Tom A.B. Snijders, and Ulrik Brandes. 2013. "Modeling Frequency and Type of Interaction in Event Networks." The Corvinus Journal of Sociology and Social Policy 4(1): 3-32.

Vu, Duy, Philippa Pattison, and Garry Robins. 2015. "Relational event models for social learning in MOOCs." Social Networks 43: 121-135.

Examples

data("WikiEvent2018.first100k")
WikiEvent2018 <- WikiEvent2018.first100k[1:10000,] #the first ten thousand events
WikiEvent2018$time <- as.numeric(WikiEvent2018$time) #making the variable numeric
### Creating the EventSet By Employing Case-Control Sampling With M = 8 and
### Sampling from the Observed Event Sequence with P = 0.01
EventSet <- create_riskset(type = "two-mode",
 time = WikiEvent2018$time, # The Time Variable
 eventID = WikiEvent2018$eventID, # The Event Sequence Variable
 sender = WikiEvent2018$user, # The Sender Variable
 receiver = WikiEvent2018$article, # The Receiver Variable
 p_samplingobserved = 0.01, # The Probability of Selection
 n_controls = 8, # The Number of Controls to Sample from the Full Risk Set
 combine = TRUE,
 seed = 9999) # The Seed for Replication

#Computing the repetition statistics for the relational event sequence with the
#weights of past events returned
rep_weights <- remstats_repetition(
   time = EventSet$time,
   sender = EventSet$sender,
   receiver = EventSet$receiver,
   sampled = EventSet$sampled,
   observed = EventSet$observed,
   halflife = 2.592e+09, #halflife parameter
   dyadic_weight = 0,
   exp_weight_form = FALSE)


#Computing the repetition statistics for the relational event sequence with the
#counts of events returned
rep_counts <- remstats_repetition(
   time = EventSet$time,
   sender = EventSet$sender,
   receiver = EventSet$receiver,
   sampled = EventSet$sampled,
   observed = EventSet$observed,
   halflife = 2.592e+09, #halflife parameter
   counts = TRUE, #return the counts of past events instead of the weights
   dyadic_weight = 0,
   exp_weight_form = FALSE)

cbind(rep_weights, rep_counts)



dream documentation built on Jan. 21, 2026, 1:06 a.m.