README.md

Critical Care Data Anonymisation

This code is used in conjunction with the Standard Operating Procedure for anonymising data for release for use by clinical researchers according to the terms of the end user license.

We have made this repository public in the interests of transparency.

Dependencies:

How to run:

  1. Prepare a YAML configuration file. You can use the following function to create a template for you.
template.conf("template.yaml")
  1. Using sdc.trial to find the most suitable parameters such as K-anonymity or L-diversity. If it is specified in the SOP, then this step can be skipped.

  2. Create the new ccdata using anonymisation function.

Template code

# Load libraries
# Note, either modify your .Rprofile to contain the location on non-standard libraries, or remember to add the lib.loc = "/data/RLibraries/" argument
library(cleanEHR)
library(ccanonym)
library(yaml)

# Load data
load("data/current_db")

# Run anonymiser
ccd <- anonymisation(
  alldata,
  conf="release-internal.yaml",
  verbose=T,
  remove.alive=FALSE,
  k.anon=5)

anon_conf <- yaml.load_file("release-internal.yaml")

# Now save both data and configuration into a single R data file
save(file="anon_internal.RData", list=c("ccd", "anon_conf"))


CC-HIC/ccanonym documentation built on May 6, 2019, 9:23 a.m.