deidentify: Remove columns that could include identifiable information


deidentify R Documentation

Remove columns that could include identifiable information

Description

The deidentify() function removes columns from Qualtrics survey data that may contain identifiable information, such as IP address, location, or computer characteristics.

Usage

deidentify(x, strict = TRUE)

Arguments

x

Data frame (downloaded from Qualtrics).

strict

Logical indicating whether to use the strict (TRUE, the default) or non-strict (FALSE) level of deidentification. The strict level removes computer information columns in addition to IP address and location columns.

Details

The function offers two levels of deidentification. The default strict level removes the columns associated with IP address and location as well as the computer information columns (browser type and version, operating system, and screen resolution). The non-strict level removes only the columns associated with IP address and location and keeps the computer information columns.
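The two levels can be pictured as two sets of columns to drop. The base R sketch below is not the package's implementation: the helper drop_identifiable(), the toy data frame, and all column names are assumptions chosen for illustration, and real Qualtrics exports may name these columns differently.

```r
# Hypothetical column groups; actual Qualtrics column names may differ.
location_cols <- c("IPAddress", "LocationLatitude", "LocationLongitude")
computer_cols <- c("Browser", "Version", "OperatingSystem", "Resolution")

# Illustrative helper (not part of excluder): drop one or both groups.
drop_identifiable <- function(x, strict = TRUE) {
  to_drop <- if (strict) c(location_cols, computer_cols) else location_cols
  x[, setdiff(names(x), to_drop), drop = FALSE]
}

# Toy data standing in for a Qualtrics download.
survey <- data.frame(
  ResponseId = c("R_1", "R_2"),
  IPAddress = c("192.0.2.1", "192.0.2.2"),
  LocationLatitude = c(40.1, 41.2),
  LocationLongitude = c(-96.7, -95.9),
  Browser = c("Chrome", "Firefox"),
  Version = c("110", "109"),
  OperatingSystem = c("Windows", "Macintosh"),
  Resolution = c("1920x1080", "1440x900"),
  Q1 = c(3, 5),
  stringsAsFactors = FALSE
)

# Strict: location and computer information columns are both removed.
names(drop_identifiable(survey))

# Non-strict: only the location columns are removed.
names(drop_identifiable(survey, strict = FALSE))
```

In this sketch the strict call leaves only the response ID and the survey item, while the non-strict call also keeps the four computer information columns.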

Typically, deidentification should be the last step of a processing pipeline, so that these columns are still available to earlier steps that use them to exclude rows.
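The ordering matters because any exclusion step that relies on an identifiable column has to run before that column is dropped. The following base R sketch shows the idea with a toy data frame; the column names and the duplicate-IP rule are assumptions for illustration, not the package's own exclusion logic.

```r
# Toy data: the first two responses share an IP address.
survey <- data.frame(
  ResponseId = c("R_1", "R_2", "R_3"),
  IPAddress = c("192.0.2.1", "192.0.2.1", "192.0.2.7"),
  Q1 = c(3, 3, 5),
  stringsAsFactors = FALSE
)

# Step 1: exclude rows first, while the IP column is still present.
# Flag every row whose IP address appears more than once.
dup_ip <- duplicated(survey$IPAddress) |
  duplicated(survey$IPAddress, fromLast = TRUE)
kept <- survey[!dup_ip, , drop = FALSE]

# Step 2: only now drop the identifiable column.
deid <- kept[, setdiff(names(kept), "IPAddress"), drop = FALSE]
deid
```

Reversing the two steps would fail, because the IP column would already be gone when the exclusion rule tried to read it.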

Value

An object of the same type as x that excludes Qualtrics columns with identifiable information.

Examples

library(excluder)

# Column names before deidentification
names(qualtrics_numeric)

# Remove IP address, location, and computer information columns
deid <- deidentify(qualtrics_numeric)
names(deid)

# Remove only IP address and location columns
deid2 <- deidentify(qualtrics_numeric, strict = FALSE)
names(deid2)

excluder documentation built on Feb. 16, 2023, 7:09 p.m.