View source: R/rem_outgoingtwopaths.R
computeOTP | R Documentation |
The function computes the outgoing two paths (OTP) network sufficient statistic for a relational event sequence (see Lerner and Lomi 2020; Butts 2008). In essence, the outgoing two paths measure captures the tendency toward triadic closure in the network of past events, where closure is based upon the outgoing two paths structure (see Butts 2008 for an empirical example). OTP scores can be computed for the sampled events only, while the weights are based on the full event sequence (see Lerner and Lomi 2020; Vu et al. 2015). The function lets users choose between two weighting functions, reduce computational runtime, employ a sliding windows framework for large relational sequences, and specify a dyadic cutoff for relational relevancy.
computeOTP(
  observed_time,
  observed_sender,
  observed_receiver,
  processed_time,
  processed_sender,
  processed_receiver,
  sliding_windows = FALSE,
  processed_seqIDs = NULL,
  counts = FALSE,
  halflife = 2,
  dyadic_weight = 0,
  window_size = NA,
  Lerneretal_2013 = FALSE
)
observed_time |
The vector of event times from the pre-processing event sequence. |
observed_sender |
The vector of event senders from the pre-processing event sequence. |
observed_receiver |
The vector of event receivers from the pre-processing event sequence. |
processed_time |
The vector of event times from the post-processing event sequence (i.e., the event sequence that contains the observed and null events). |
processed_sender |
The vector of event senders from the post-processing event sequence (i.e., the event sequence that contains the observed and null events). |
processed_receiver |
The vector of event receivers from the post-processing event sequence (i.e., the event sequence that contains the observed and null events). |
sliding_windows |
TRUE/FALSE. TRUE indicates that the sliding windows computational approach will be used to compute the network statistic, while FALSE indicates the approach will not be used. Set to FALSE by default. The sliding windows framework should only be used when the pre-processed event sequence is ‘big’, such as the 360 million pre-processed event sequence used in Lerner and Lomi (2020), as it aims to reduce the computational burden of sorting ‘big’ datasets. In general, most pre-processed event sequences will not need the sliding windows approach. There is no strict cutoff for what counts as a ‘big’ dataset; it depends on both the size of the observed event sequence and the post-processing sampling dataset. For instance, according to our internal tests, when the event sequence is relatively large (i.e., 100,000 observed events) with the probability of sampling from the observed event sequence set to 0.05 and 10 controls per sampled event, the sliding windows framework for computing repetition is about 11% faster than the non-sliding windows framework. Yet, in a smaller dataset (i.e., 10,000 observed events), the sliding windows framework is about 25% slower than the non-sliding framework under the same conditions. |
processed_seqIDs |
If sliding_windows is set to TRUE, the vector of event sequence IDs from the post-processing event sequence. The event sequence IDs represent the index for when the event occurred in the observed event sequence (e.g., the 5th event in the sequence will have a value of 5 in this vector). |
counts |
TRUE/FALSE. TRUE indicates that the counts of past events should be computed (see the details section). FALSE indicates that the temporal exponential weighting function should be used to downweigh past events (see the details section). Set to FALSE by default. |
halflife |
A numerical value that is the halflife value to be used in the exponential weighting function (see the details section). Preset to 2 (should be updated by user). |
dyadic_weight |
A numerical value that is the dyadic cutoff weight, that is, the numerical cutoff value for temporal relevancy based on the exponential weighting function. For example, a value of 0.01 indicates that an exponential weight less than 0.01 will become 0 and will not be included in the sum of the past event weights (see the details section). Set to 0 by default. |
window_size |
If sliding_windows is set to TRUE, the sizes of the windows that are used for the sliding windows computational framework. If NA, the function internally divides the dataset into ten slices (may not be optimal). |
Lerneretal_2013 |
TRUE/FALSE. TRUE indicates that the Lerner et al. (2013) exponential weighting function will be used (see the details section). FALSE indicates that the Lerner and Lomi (2020) exponential weighting function will be used (see the details section). Set to FALSE by default. |
The function calculates the outgoing two paths statistic for relational event sequences based on the exponential weighting function used in either Lerner and Lomi (2020) or Lerner et al. (2013).
Following Lerner and Lomi (2020), the exponential weighting function in relational event models is:
w(s, r, t) = e^{-(t-t') \cdot \frac{\ln(2)}{T_{1/2}} }
Following Lerner et al. (2013), the exponential weighting function in relational event models is:
w(s, r, t) = e^{-(t-t') \cdot \frac{\ln(2)}{T_{1/2}} } \cdot \frac{\ln(2)}{T_{1/2}}
In both of the above equations, s is the current event sender, r is the current event receiver (target), t is the current event time, t' is the time of a past event that meets the weight subset, and T_{1/2} is the halflife parameter.
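As a rough illustration of the two weighting functions above, the following base R sketch evaluates both for a few elapsed times. The helper names weight_ll2020 and weight_l2013 are hypothetical and are not part of the package; only the formulas come from the text.

```r
# Hypothetical helpers (not package functions) for the two weighting formulas.
# elapsed = t - t', halflife = T_{1/2}
weight_ll2020 <- function(elapsed, halflife) {
  exp(-elapsed * log(2) / halflife)
}
weight_l2013 <- function(elapsed, halflife) {
  exp(-elapsed * log(2) / halflife) * log(2) / halflife
}

elapsed  <- c(0, 2, 4)  # time since each past event
halflife <- 2

weight_ll2020(elapsed, halflife)  # 1.00 0.50 0.25: halves every `halflife` units
weight_l2013(elapsed, halflife)   # same decay, scaled by log(2) / halflife
```

With the Lerner and Lomi (2020) form, an event that is exactly one halflife old contributes a weight of 0.5; the Lerner et al. (2013) form applies the same decay and then multiplies by the constant ln(2) / T_{1/2}.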
The general formula for outgoing two paths for event e_i is:
OTP_{e_{i}} = \sqrt{ \sum_h w(s, h, t) \cdot w(h, r, t) }
That is, as discussed in Butts (2008), outgoing two paths finds all past events where the current sender sends a relational tie to node h and the current target receives a relational tie from the same h node.
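To make the weighted formula concrete, here is a minimal base R sketch on toy data. It is not the package's implementation: the helpers dyad_weight and otp_weighted are hypothetical, the Lerner and Lomi (2020) weighting is used, and the sketch assumes that w(a, b, t) is the sum of the decayed weights of all past a -> b events that occurred before time t.

```r
# Toy past events (hypothetical data); all occur before the current event.
past <- data.frame(
  time     = c(1, 2, 3, 4),
  sender   = c("A", "C", "A", "D"),
  receiver = c("C", "B", "D", "B"),
  stringsAsFactors = FALSE
)

# Assumption: w(a, b, t) sums the decayed weights of all past a -> b events.
dyad_weight <- function(a, b, t, events, halflife) {
  hit <- events$sender == a & events$receiver == b & events$time < t
  sum(exp(-(t - events$time[hit]) * log(2) / halflife))
}

# Hypothetical helper: weighted OTP for a current event s -> r at time t.
otp_weighted <- function(s, r, t, events, halflife) {
  # Candidate intermediaries h: every actor s has sent to before time t.
  h_set <- unique(events$receiver[events$sender == s & events$time < t])
  products <- vapply(h_set, function(h) {
    dyad_weight(s, h, t, events, halflife) *
      dyad_weight(h, r, t, events, halflife)
  }, numeric(1))
  sqrt(sum(products))
}

# Current event A -> B at time 5; intermediaries are C (A -> C -> B)
# and D (A -> D -> B).
otp_weighted("A", "B", t = 5, events = past, halflife = 2)
```

The more recent path through D contributes more to the score than the older path through C, because both of its legs have decayed less.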
Moreover, researchers interested in modeling temporal relevancy (see Quintane, Wood, Dunn, and Falzon 2022; Lerner and Lomi 2020) can specify the dyadic weight cutoff, that is, the minimum value for which the weight is considered relationally relevant. Users who do not know which dyadic cutoff value to use can use the computeRemDyadCut function.
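The effect of the cutoff can be sketched in base R as follows. This is an illustration only, not package code: it assumes the Lerner and Lomi (2020) weighting and the rule described for the dyadic_weight argument (weights below the cutoff become 0), and the variable max_elapsed is introduced here purely for illustration.

```r
# Hypothetical illustration of the dyadic weight cutoff (not package code).
halflife      <- 2
dyadic_weight <- 0.01
elapsed       <- c(1, 5, 10, 15, 20)  # t - t' for five past events

weights <- exp(-elapsed * log(2) / halflife)
weights[weights < dyadic_weight] <- 0  # weights under the cutoff are dropped
weights

# Elapsed time after which a single event falls below the cutoff:
# solve exp(-x * log(2) / halflife) = dyadic_weight for x.
max_elapsed <- halflife * log(1 / dyadic_weight) / log(2)
max_elapsed
```

Here the two oldest events (15 and 20 time units old) fall below the cutoff and no longer contribute to the sum of past event weights.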
Following Butts (2008), if the counts of the past events are requested, the formula for outgoing two paths for event e_i is:
OTP_{e_{i}} = \sum_{h \in H} \min\left[d(s,h,t), d(h,r,t)\right]
where d() is the number of past events that meet the specific set operations: d(s,h,t) is the number of past events where the current event sender sent a tie to a third actor, h, and d(h,r,t) is the number of past events where the third actor h sent a tie to the current event receiver. The sum runs over the set H of all unique actors that have formed past outgoing two path structures with the current event sender and receiver.
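The count-based formula can be sketched in base R on toy data. The helpers d_count and otp_counts are hypothetical names used only for this illustration, and the sketch assumes that "past events" means events with a time strictly before the current event time.

```r
# Toy past events (hypothetical data).
past <- data.frame(
  time     = c(1, 2, 3, 4, 5),
  sender   = c("A", "A", "C", "A", "D"),
  receiver = c("C", "C", "B", "D", "B"),
  stringsAsFactors = FALSE
)

# d(a, b, t): number of past a -> b events before time t.
d_count <- function(a, b, t, events) {
  sum(events$sender == a & events$receiver == b & events$time < t)
}

# Hypothetical helper: count-based OTP for a current event s -> r at time t.
otp_counts <- function(s, r, t, events) {
  # Candidate intermediaries h: every actor s has sent to before time t.
  h_set <- unique(events$receiver[events$sender == s & events$time < t])
  per_h <- vapply(h_set, function(h) {
    min(d_count(s, h, t, events), d_count(h, r, t, events))
  }, numeric(1))
  sum(per_h)
}

# Current event A -> B at time 6:
# h = C gives min(2, 1) = 1 and h = D gives min(1, 1) = 1.
otp_counts("A", "B", t = 6, events = past)  # 2
```

Taking the minimum means that each intermediary contributes only as many two paths as its weaker leg allows.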
Moreover, the counting equation can be used in tandem with relational relevancy by specifying the halflife parameter, exponential weighting function, and the dyadic cutoff weight values. If the user is not interested in modeling relational relevancy, then those values should be left at their defaults.
The vector of outgoing two path statistics for the relational event sequence.
Kevin A. Carson kacarson@arizona.edu, Diego F. Leal dflc@arizona.edu
Butts, Carter T. 2008. "A Relational Event Framework for Social Action." Sociological Methodology 38(1): 155-200.
Quintane, Eric, Martin Wood, John Dunn, and Lucia Falzon. 2022. "Temporal Brokering: A Measure of Brokerage as a Behavioral Process." Organizational Research Methods 25(3): 459-489.
Lerner, Jürgen and Alessandro Lomi. 2020. "Reliability of relational event model estimates under sampling: How to fit a relational event model to 360 million dyadic events." Network Science 8(1): 97-135.
Lerner, Jürgen, Margit Bussmann, Tom A.B. Snijders, and Ulrik Brandes. 2013. "Modeling Frequency and Type of Interaction in Event Networks." The Corvinus Journal of Sociology and Social Policy 4(1): 3-32.
Vu, Duy, Philippa Pattison, and Garry Robins. 2015. "Relational event models for social learning in MOOCs." Social Networks 43: 121-135.
events <- data.frame(time = 1:18,
                     eventID = 1:18,
                     sender = c("A", "B", "C",
                                "A", "D", "E",
                                "F", "B", "A",
                                "F", "D", "B",
                                "G", "B", "D",
                                "H", "A", "D"),
                     target = c("B", "C", "D",
                                "E", "A", "F",
                                "D", "A", "C",
                                "G", "B", "C",
                                "H", "J", "A",
                                "F", "C", "B"))
eventSet <- processOMEventSeq(data = events,
                              time = events$time,
                              eventID = events$eventID,
                              sender = events$sender,
                              receiver = events$target,
                              p_samplingobserved = 1.00,
                              n_controls = 1,
                              seed = 9999)
# Computing Outgoing Two Paths Statistics without the sliding windows framework
eventSet$OTP <- computeOTP(
  observed_time = events$time,
  observed_sender = events$sender,
  observed_receiver = events$target,
  processed_time = eventSet$time,
  processed_sender = eventSet$sender,
  processed_receiver = eventSet$receiver,
  halflife = 2, # halflife parameter
  dyadic_weight = 0,
  Lerneretal_2013 = FALSE)
# Computing Outgoing Two Paths Statistics with the sliding windows framework
eventSet$OTP_SW <- computeOTP(
  observed_time = events$time,
  observed_sender = events$sender,
  observed_receiver = events$target,
  processed_time = eventSet$time,
  processed_sender = eventSet$sender,
  processed_receiver = eventSet$receiver,
  halflife = 2, # halflife parameter
  processed_seqIDs = eventSet$sequenceID,
  dyadic_weight = 0,
  sliding_windows = TRUE,
  Lerneretal_2013 = FALSE)
# The results with and without the sliding windows are the same (see correlation
# below). Using the sliding windows method is recommended when the data are
# 'big' so that memory allotment is more efficient.
cor(eventSet$OTP, eventSet$OTP_SW)
# Computing Outgoing Two Paths Statistics with the counts of events being returned
eventSet$OTPC <- computeOTP(
  observed_time = events$time,
  observed_sender = events$sender,
  observed_receiver = events$target,
  processed_time = eventSet$time,
  processed_sender = eventSet$sender,
  processed_receiver = eventSet$receiver,
  halflife = 2, # halflife parameter
  dyadic_weight = 0,
  sliding_windows = FALSE,
  counts = TRUE,
  Lerneretal_2013 = FALSE)
# Compare the three versions of the statistic side by side
cbind(eventSet$OTP,
      eventSet$OTP_SW,
      eventSet$OTPC)