anonymize: Replace badge identifiers

View source: R/anonymize.R

anonymize    R Documentation

Replace badge identifiers

Description

Sociometric data contains the numeric Badge identifier. This function anonymizes the data by replacing the exported identifiers (which correspond to the unique Badge IDs) with alternative IDs. Replacement values can be provided by the caller; otherwise the function generates them.

Usage

anonymize(x, ids = NULL, replv = NULL, cols = NULL, decreasing = F)

## S3 method for class 'smtrx'
anonymize(
  x,
  ids = NULL,
  replv = NULL,
  cols = c("Badge.ID", "Other.ID"),
  decreasing = F
)

## S3 method for class 'ego'
anonymize(x, ids = NULL, replv = NULL, cols = c("Badge.ID"), decreasing = F)

Arguments

x

A data frame with one or more columns containing the IDs to anonymize. Usually these columns are "Badge.ID" and "Other.ID".

ids

Vector of values to be replaced. If NULL (the default), the function automatically gathers the unique values across all cols.

replv

Vector of replacement values. If NULL (the default), replacement values are generated as the sequence 1:n, where n is the number of unique elements.

cols

Vector of column names or indices in which values are replaced.

decreasing

Logical. If ids is NULL, determines the sort order of the IDs retrieved from the indicated columns. If ids is provided, this argument is ignored. See Details.

Details

Values can be replaced in two ways:

a) Explicit mapping: ids and replv are vectors of equal length. The first element of ids is replaced with the first element of replv, the second with the second, and so forth.

b) Automatic mapping: if ids=NULL, the function retrieves the unique IDs from the specified columns. The unique values are collected over all columns combined, not on a per-column basis.

In case b) the order of the unique IDs matters, because it determines which replacement value each ID receives. For the sorting of the unique IDs see unique_ids. The default decreasing=F sorts the unique IDs in ascending order, which matches the order of the automatically generated replacement values 1..n when replv=NULL.
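The mapping rules described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: sketch_anonymize is a hypothetical helper, and it uses sort() in place of the package's unique_ids, whose exact behavior is not shown here.

```r
# Hypothetical helper (NOT part of the package): mimics the documented
# mapping rules using base R only.
sketch_anonymize <- function(x, ids = NULL, replv = NULL, cols,
                             decreasing = FALSE) {
  if (is.null(ids)) {
    # Pool the values of all selected columns, then sort the unique values.
    pooled <- unlist(x[cols], use.names = FALSE)
    ids <- sort(unique(pooled), decreasing = decreasing)
  }
  if (is.null(replv)) {
    replv <- seq_along(ids)  # 1..n, one replacement per unique ID
  }
  stopifnot(length(ids) == length(replv))

  for (col in cols) {
    pos <- match(x[[col]], ids)  # position of each value in ids, NA if absent
    hit <- !is.na(pos)
    new <- x[[col]]
    new[hit] <- replv[pos[hit]]  # values not listed in ids stay unchanged
    x[[col]] <- new
  }
  x
}

df <- data.frame(Badge.ID = c(30, 10, 20), Other.ID = c(10, 40, 30))

# Automatic mapping, ascending: 10, 20, 30, 40 become 1, 2, 3, 4
auto <- sketch_anonymize(df, cols = c("Badge.ID", "Other.ID"))

# Automatic mapping, descending: 40, 30, 20, 10 become 1, 2, 3, 4
rev_auto <- sketch_anonymize(df, cols = c("Badge.ID", "Other.ID"),
                             decreasing = TRUE)

# Explicit mapping: 10 becomes "A", 40 becomes "B"; the column turns character
explicit <- sketch_anonymize(df, ids = c(10, 40), replv = c("A", "B"),
                             cols = "Other.ID")
```

In this sketch the same badge receives the same replacement in both columns because the unique IDs are pooled before sorting, which is the property the Details section emphasizes.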

Value

Data frame with replaced values in the indicated columns.

Methods (by class)

  • smtrx: Anonymize a sociometrics data frame with the two columns "Badge.ID" and "Other.ID" by default.

  • ego: Anonymize a sociometrics data frame with the single column "Badge.ID" by default.
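How the per-class defaults for cols come about can be illustrated with a small S3 sketch. The generic default_cols and the class vectors below are assumptions made for illustration; the package's actual objects may carry additional classes.

```r
# Hypothetical generic (NOT part of the package): returns the default
# columns that would be anonymized for each class.
default_cols <- function(x) UseMethod("default_cols")
default_cols.smtrx <- function(x) c("Badge.ID", "Other.ID")
default_cols.ego   <- function(x) "Badge.ID"

# Assumed class vectors, for illustration only.
pairs <- structure(data.frame(Badge.ID = 1:2, Other.ID = 2:1),
                   class = c("smtrx", "data.frame"))
single <- structure(data.frame(Badge.ID = 1:2),
                    class = c("ego", "data.frame"))

cols_pairs  <- default_cols(pairs)   # dispatches to default_cols.smtrx
cols_single <- default_cols(single)  # dispatches to default_cols.ego
```

Passing cols explicitly to anonymize overrides these defaults, as shown in the Usage section.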

Examples

x <- data.frame(a=c(1:15), b=c(11:25), c=sample(25:30, size=15, replace=TRUE))

# replace all 2 with "AA", all 13 with "-99" and all 25 with "--" in all three columns
# (mixing types in replv coerces the replacement values to character)
anonymize(x, ids=c(2,13,25), replv=c("AA",-99, "--"), cols=c("a", "b", "c"))

# replaces the sequence 11:25 with 15:1 in column "b"
anonymize(x, cols=c("b"), decreasing=TRUE)

# column "c" holds repeated values drawn from 25:30 (at most 6 unique values).
# They are replaced in increasing order with 1:n, n = number of unique values
anonymize(x, cols=c("c"), decreasing=FALSE)

# same, but in reversed (decreasing) order: the largest value becomes 1
anonymize(x, cols=c("c"), decreasing=TRUE)

# pool the unique values of two columns and reverse the order
anonymize(x, cols=c("b", "c"), decreasing=TRUE)



jmueller17/sociometrics documentation built on March 20, 2024, 1:04 a.m.