computePairs: Builds pairs of devices and corresponding connections.


View source: R/computePairs.R

Description

Builds a data.table object that contains pairs of antenna IDs in the form "antennaID1-antennaID2", where antennaID1 is the antenna to which one device is connected and antennaID2 is the antenna to which another device is connected, for each time instant of the time period over which the network events are registered. These pairs are built for each distinct combination "deviceID1-deviceID2" of devices. Each row of the data.table object corresponds to a combination of devices and each column to a time instant. The first two columns contain the device IDs of each pair of devices; each of the remaining columns corresponds to a time instant and contains the pairs antennaID1-antennaID2 of antennas to which the two devices are connected at that time instant.
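The layout described above can be illustrated with a minimal base R sketch. It is not the package's implementation: the toy antenna IDs, the column names deviceID1 and deviceID2, and the use of a data.frame in place of a data.table are all assumptions made to keep the example dependency-free.

```r
# Toy connections matrix: 3 devices (rows) x 4 time instants (columns).
# The antenna IDs are invented for illustration.
connections <- matrix(
  c("A1", "A1", "A2", "A2",
    "A1", "A3", "A2", "A2",
    "A4", "A4", "A4", "A1"),
  nrow = 3, byrow = TRUE
)
ndevices <- nrow(connections)

# All distinct unordered combinations of device indices, one per column.
idx <- combn(ndevices, 2)

# For every combination and time instant, join the two antenna IDs.
# paste() returns a vector in column-major order, so matrix() restores
# the (combinations x time instants) shape.
pairLabels <- matrix(
  paste(connections[idx[1, ], ], connections[idx[2, ], ], sep = "-"),
  nrow = ncol(idx)
)

# First two columns: the devices; remaining columns: one per time instant.
# (Column names here are hypothetical, not those used by computePairs().)
pairs <- data.frame(deviceID1 = idx[1, ], deviceID2 = idx[2, ], pairLabels,
                    stringsAsFactors = FALSE)
print(pairs)
```

With 3 devices this yields 3 rows (one per combination) and 2 + 4 columns.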

Usage

computePairs(
  connections,
  ndevices,
  oneToOne = TRUE,
  P1 = 0,
  limit = 0.05,
  antennaNeighbors = NULL
)

Arguments

connections

A matrix with the IDs of the antennas to which the mobile devices are connected at every time instant. Each row corresponds to a device and each column to a time instant. This matrix is obtained by calling the getConnections() function.

ndevices

The number of devices registered by the network. A vector with device IDs can be obtained by calling the getDevices() function; the number of devices is simply the length of this vector.

oneToOne

If TRUE, the result is built for the "1to1" method of computing the duplicity probability for each device, which means that it contains all combinations of devices. If FALSE, the function uses the proximity of antennas, given by the parameter antennaNeighbors, to remove the combinations of devices that cannot belong to a single person. If at most time instants two devices are connected to two antennas that are not neighbors (i.e. their cells don't overlap), we consider that these two devices belong to different persons and remove this combination of devices from the result. The term "most of the time instants" is implemented as follows: for each combination of devices we count the time instants, over the whole time horizon, at which the two devices are connected to neighboring antennas, and then keep in the final result only those combinations for which this count is greater than the quantile of the sequence 0...(Number of time instants) with probability 1-P1-limit. In this way we reduce the time complexity of the duplicity probability computation.
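The filtering rule applied when oneToOne = FALSE can be sketched in base R as follows. This is only an illustration of the counting and quantile steps described above, not the package's code: the pair labels, the neighbors vector and the values of P1 and limit are invented, and the sketch assumes that a pair counts as "neighboring" exactly when its label appears in the neighbor list.

```r
# Hypothetical pair labels: 2 combinations of devices x 4 time instants.
pairLabels <- matrix(
  c("A1-A2", "A1-A2", "A1-A2", "A1-A2",
    "A3-A4", "A3-A4", "A3-A4", "A1-A2"),
  nrow = 2, byrow = TRUE
)
neighbors <- c("A1-A2")  # labels of neighboring antenna pairs (assumed)
P1 <- 0.2                # toy a priori duplicity probability
limit <- 0.05            # toy error allowance

nInstants <- ncol(pairLabels)

# Count, per combination, the time instants with neighboring antennas.
# %in% drops the dimensions, so matrix() restores them before rowSums().
isNeighbor <- matrix(pairLabels %in% neighbors, nrow = nrow(pairLabels))
counts <- rowSums(isNeighbor)

# Quantile of the sequence 0...(number of time instants) with
# probability 1 - P1 - limit (default quantile type of R).
threshold <- unname(quantile(0:nInstants, probs = 1 - P1 - limit))

# Keep only the combinations whose count exceeds the threshold.
keep <- counts > threshold
filtered <- pairLabels[keep, , drop = FALSE]
```

Here the first combination is connected to neighboring antennas at all 4 time instants and is kept, while the second one reaches the count only once and is dropped.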

P1

The a priori probability of duplicity. It is obtained by calling the aprioriDuplicityProb() function.

limit

A number that stands for the error in computing the a priori probability of duplicity.

antennaNeighbors

A data.table object with a single column nei that contains pairs of antenna IDs that are neighbors, in the form antennaID1-antennaID2. We consider that two antennas are neighbors if their coverage areas have a non-empty intersection.
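As a rough illustration of this structure, the sketch below derives a nei column from a toy coverage-overlap matrix in base R. The overlap matrix and antenna IDs are invented, a data.frame stands in for the data.table, and listing both orderings of a pair as well as each antenna paired with itself is an assumption, not something this page specifies.

```r
antennaIDs <- c("A1", "A2", "A3")

# Hypothetical overlap matrix: TRUE where two coverage areas intersect.
overlap <- matrix(
  c(TRUE,  TRUE,  FALSE,
    TRUE,  TRUE,  TRUE,
    FALSE, TRUE,  TRUE),
  nrow = 3, byrow = TRUE,
  dimnames = list(antennaIDs, antennaIDs)
)

# Row and column positions of every overlapping pair.
hits <- which(overlap, arr.ind = TRUE)

# Single column 'nei' with labels of the form antennaID1-antennaID2.
antennaNeighbors <- data.frame(
  nei = paste(antennaIDs[hits[, "row"]], antennaIDs[hits[, "col"]], sep = "-"),
  stringsAsFactors = FALSE
)
```

A1 and A3 do not overlap in this toy setup, so neither A1-A3 nor A3-A1 appears in nei.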

Value

A data.table object. The first two columns contain the device indices, while the remaining columns contain the pairs antennaID1-antennaID2 of the antenna IDs to which the devices are connected. There is one column for each time instant of the whole time horizon.


bogdanoancea/deduplication documentation built on Dec. 2, 2020, 11:22 p.m.