This package is designed to make it easier to use SNOMED CT in R, including searching the SNOMED CT dictionary and relations, and creating and manipulating SNOMED CT codelists.

Basic introduction to SNOMED CT

SNOMED CT is a clinical terminology system which contains thousands of concepts, each of which has a distinct meaning. Each concept has a unique concept ID, and one or more descriptions (synonyms). It also contains a knowledge model (ontology), specifying which concepts are subtypes of another concept or associated in other ways.

The SNOMED dictionaries are contained within four key tables: CONCEPT, DESCRIPTION, RELATIONSHIP and STATEDRELATIONSHIP.

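Each SNOMED CT concept may have any number of synonyms, but two types are special: the Fully Specified Name, which is an unambiguous description of the concept, and the preferred term, which is the synonym recommended for display.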

Each concept also has a semantic tag, which denotes what type of concept it is (e.g. 'disorder' or 'organism'). This is currently recorded in parentheses at the end of the Fully Specified Name, but can be extracted using the semanticType function in this package.
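To illustrate where the semantic tag sits in a Fully Specified Name, here is a minimal base R sketch. The helper extract_semantic_tag is hypothetical and written only for this illustration; it is not the package's semanticType function, and the names are toy strings rather than entries read from a dictionary.

```r
# Toy Fully Specified Names (illustrative strings, not read from a dictionary)
fsn <- c("Heart failure (disorder)",
         "Heart structure (body structure)",
         "Escherichia coli (organism)")

# Hypothetical helper: return the text inside the final pair of parentheses.
# If a string has no trailing parentheses, sub() returns it unchanged.
extract_semantic_tag <- function(x) {
  sub("^.*\\(([^()]*)\\)[[:space:]]*$", "\\1", x)
}

extract_semantic_tag(fsn)
```

In practice the semanticType function should be preferred, because it works from the dictionary rather than from free text.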

Loading the SNOMED CT dictionaries

This package contains functions to import a set of SNOMED CT release files from NHS Digital TRUD (https://isd.digital.nhs.uk/trud3/user/guest/group/0/home). The International and UK release files are provided in separate downloads, and need to be combined to create the whole dictionary. This can be done using the loadSNOMED() function and specifying a vector of folder paths to load.

For most users the 'Snapshot' files, containing the current versions of the entries, will be the most useful, and the sample dictionaries in this package are based on the Snapshot file format. The 'Delta' files contain changes since the previous version, and the 'Full' files contain a history of the changes to each entry.
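The relationship between the three file types can be sketched with a toy table in base R. The column names id, effectiveTime and active follow the SNOMED CT release file layout as we understand it, but the rows, the identifiers and the previous release date are invented for illustration.

```r
# Toy 'Full' table: every historical version of each entry (invented data)
full <- data.frame(
  id = c("101", "101", "102", "102", "103"),
  effectiveTime = c("20020131", "20200131", "20020131", "20180731", "20190131"),
  active = c(1L, 0L, 1L, 1L, 1L),
  stringsAsFactors = FALSE
)

# 'Snapshot': the most recent version of each entry
full <- full[order(full$id, full$effectiveTime), ]
snapshot <- full[!duplicated(full$id, fromLast = TRUE), ]

# 'Delta': versions added since a hypothetical previous release date
previous_release <- "20190131"
delta <- full[full$effectiveTime > previous_release, ]

print(snapshot)
print(delta)
```

Note that the snapshot can still contain inactive entries (active equal to 0), which is why loadSNOMED has an active_only argument.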

The SNOMED CT dictionary files are loaded into an R 'environment', which is an object that can contain other objects. This is a convenient way in R to store a group of objects without cluttering up the global environment with too many individual objects. It also allows different versions of SNOMED CT dictionaries to be used side by side, by loading them into different environments. For ease of use, many of the functions in the package will search for the dictionaries in an environment named 'SNOMED' by default, but for programming use I recommend specifying the environment explicitly to avoid unexpected errors.
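For readers unfamiliar with environments, the following base R sketch shows the idea of keeping two dictionary versions side by side. The environments v1 and v2, their toy CONCEPT tables and the helper count_concepts are all hypothetical and are not created by the package.

```r
# Two environments standing in for two dictionary versions (toy data)
v1 <- new.env()
v2 <- new.env()
assign("CONCEPT", data.frame(id = c("1", "2")), envir = v1)
assign("CONCEPT", data.frame(id = c("1", "2", "3")), envir = v2)

# Hypothetical helper which takes the environment as an explicit argument,
# so there is no ambiguity about which version is being used
count_concepts <- function(SNOMED) {
  nrow(get("CONCEPT", envir = SNOMED))
}

count_concepts(v1)
count_concepts(v2)

# The global environment is not cluttered: the tables live inside v1 and v2
ls(v1)
```

Passing the environment explicitly, as in count_concepts, is the pattern recommended above for programming use.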

For the purpose of this vignette, we will create a sample set of SNOMED CT files from the sample dictionaries included with the package.

# The sampleSNOMED() function returns an environment containing
# the sample dictionaries
TEST <- sampleSNOMED()

# TEST is now the environment. Objects within the environment can be
# retrieved using the $ operator or the 'get' function. We will export
# the sample dictionaries to a temporary folder in order to retrieve
# them using loadSNOMED()

for (table in c('Concept', 'Description', 'Relationship',
  'StatedRelationship')){
  write.table(get(toupper(table), envir = TEST), paste0(tempdir(),
    '/sct_', table, '_test.txt'), row.names = FALSE, sep = '\t', quote = FALSE)
}

# Import using the loadSNOMED function
SNOMED <- loadSNOMED(tempdir(), active_only = FALSE)

SNOMED CT concept IDs in R

SNOMED CT concept IDs are long integers which need to be represented using the integer64 data type in R. This is not available in base R but is provided in the bit64 package which is automatically loaded with this package. They must not be stored as ordinary (32-bit) integer or numeric values, because these types cannot represent them precisely and the stored values may be incorrect.
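The precision problem can be demonstrated in base R without any dictionary. The 18-digit string below is used only as an example of an identifier of this length, and the value 2^53 is chosen because it is the point at which double precision numbers stop being able to represent every integer.

```r
# An 18-digit identifier held safely as a character string (example value)
id_chr <- "900000000000207008"

# Base R integers are 32-bit, so the largest value is about 2.1 billion
.Machine$integer.max

# Converting to integer fails and returns NA (warning suppressed here)
as_int <- suppressWarnings(as.integer(id_chr))
is.na(as_int)

# Numeric (double) values cannot distinguish neighbouring integers
# above 2^53, so two different IDs could silently become equal
big <- 2^53
big == big + 1
```

This is why the package uses a 64-bit integer class, and why character vectors are a safe format for supplying IDs.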

This package makes it easier to use SNOMED CT concept IDs because they can be supplied as character vectors and converted to 'SNOMEDconcept' vectors. The SNOMEDconcept class is a 64-bit integer class which can faithfully store SNOMED CT concept IDs, and is more memory-efficient than storing them as character vectors. If the SNOMED dictionary is available in the R environment, the concepts are displayed with their description in the default print method.

The function 'as.SNOMEDconcept' can be used to retrieve the SNOMED concept ID matching a description. This function also converts SNOMED CT concept IDs in other formats (numeric, 64-bit integer or character) into SNOMEDconcept objects.

# Make sure the SNOMED environment is available and contains the SNOMED dictionary
as.SNOMEDconcept('Heart failure', SNOMED = SNOMED)

# To use the sample SNOMED dictionary for testing
as.SNOMEDconcept('Heart failure', SNOMED = sampleSNOMED())

# If an object named SNOMED containing the SNOMED dictionary is available
# in the current environment, it does not need to be stated in the
# function call
SNOMED <- sampleSNOMED()
as.SNOMEDconcept('Heart failure')

# The argument 'exact' can be used to specify whether a regular expression
# search should be done, e.g.
as.SNOMEDconcept('Heart f', exact = FALSE)

# The 'description' function can be used to return the descriptions of
# the concepts found. It returns a data.table with the fully specified 
# name for each term.
description(as.SNOMEDconcept('Heart f', exact = FALSE))

# The 'semanticType' function returns the semantic type of the concept
# from the Fully Specified Name
semanticType(as.SNOMEDconcept('Heart failure'))

# Functions which expect a SNOMEDconcept object, such as semanticType,
# will automatically convert their argument to SNOMEDconcept using the
# function as.SNOMEDconcept
semanticType('Heart failure')

Set operations using SNOMEDconcept

This package provides versions of the set functions 'setdiff', 'intersect' and 'union' which work with SNOMEDconcept objects.

# A list of concepts with a description containing the term 'heart'
# (note that all synonyms are searched, not just the Fully Specified Names)
heart <- as.SNOMEDconcept('Heart|heart', exact = FALSE, SNOMED = sampleSNOMED())

# A list of concepts containing the term 'fail'
fail <- as.SNOMEDconcept('Fail|fail', exact = FALSE, SNOMED = sampleSNOMED())

# Concepts with heart and fail
intersect(heart, fail)

# Concepts with heart and not fail
setdiff(heart, fail)

# Concepts with heart or fail
union(heart, fail)

These set operations can be used to create lists of SNOMED CT concepts of interest, similar to the way researchers use Read codes. However, SNOMED CT also allows the use of hierarchies and relationships to locate concepts by their meaning.
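The logic of building a concept list from text searches can be sketched in base R with a toy description table. The table, the concept IDs and the helper find_concepts are invented for illustration, and the base R set functions operate here on character vectors rather than SNOMEDconcept objects.

```r
# Toy description table: one row per synonym (invented IDs and terms)
descriptions <- data.frame(
  conceptId = c("1001", "1001", "1002", "1003", "1004"),
  term = c("Heart failure", "Cardiac failure", "Heart disease",
           "Renal failure", "Asthma"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: IDs of concepts with any synonym matching a pattern
find_concepts <- function(pattern) {
  unique(descriptions$conceptId[grepl(pattern, descriptions$term)])
}

heart <- find_concepts("Heart|heart")
fail <- find_concepts("Fail|fail")

intersect(heart, fail)  # concepts matching both patterns
setdiff(heart, fail)    # concepts matching heart but not fail
union(heart, fail)      # concepts matching either pattern
```

A concept is matched if any one of its synonyms matches, which is why concept 1001 is found by both searches even though only one of its synonyms contains 'Heart'.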

Using relationships between SNOMED CT concepts

The most important relationship is the 'Is a' relationship, also known as parent and child. This makes it easy to find all specific concepts that are a subtype of a more general concept. The relationship functions (parents, ancestors, children and descendants) all take a SNOMEDconcept object as input, or attempt to convert their argument to SNOMEDconcept.

SNOMED <- sampleSNOMED()

# Parents (immediate ancestors)
parents('Acute heart failure')

# Ancestors
ancestors('Acute heart failure')

# Children (immediate descendants)
children('Acute heart failure')

# Descendants
descendants('Acute heart failure')
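The difference between immediate and transitive relatives can be sketched with a toy 'Is a' table in base R. The table and the toy_ functions below are hypothetical and only mirror the idea behind parents, ancestors, children and descendants; they make no claim about how the package implements them.

```r
# Toy 'Is a' table: each row says that 'child' is a subtype of 'parent'
isa <- data.frame(
  child = c("acute_hf", "chronic_hf", "hf", "acute_left_hf"),
  parent = c("hf", "hf", "heart_disease", "acute_hf"),
  stringsAsFactors = FALSE
)

# One step up or down the hierarchy
toy_parents <- function(x) unique(isa$parent[isa$child %in% x])
toy_children <- function(x) unique(isa$child[isa$parent %in% x])

# Repeat a one-step function until no new concepts are found
toy_closure <- function(x, step) {
  found <- character(0)
  frontier <- step(x)
  while (length(frontier) > 0) {
    found <- union(found, frontier)
    frontier <- setdiff(step(frontier), found)
  }
  found
}

toy_ancestors <- function(x) toy_closure(x, toy_parents)
toy_descendants <- function(x) toy_closure(x, toy_children)

toy_parents("acute_hf")
toy_ancestors("acute_hf")
toy_children("hf")
toy_descendants("hf")
```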

Attributes of SNOMED CT concepts

The 'hasAttributes' function can be used to find terms with particular attributes. For example, to find all disorders with a finding site of the heart, we can use the relatedConcepts function, which retrieves relationships from the RELATIONSHIP and STATEDRELATIONSHIP tables.

In order to find the finding site of a disorder, we use the 'forward' relationship. In order to find disorders with a particular finding site, we use the relationship in the 'reverse' direction.

SNOMED <- sampleSNOMED()

# List all the attributes of a concept
print(attrConcept('Heart failure'))

# 'Finding site' of a particular disorder
relatedConcepts('Heart failure','Finding site')

# Disorders with a 'Finding site' of 'Heart'
relatedConcepts('Heart', 'Finding site', reverse = TRUE)
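The forward and reverse directions can be sketched as two lookups on the same toy relationship table. The column names sourceId, typeId and destinationId follow the SNOMED CT relationship file layout as we understand it; the rows and the helper toy_related are invented and do not reproduce relatedConcepts.

```r
# Toy relationship table (invented rows with readable labels instead of IDs)
rel <- data.frame(
  sourceId = c("heart_failure", "myocarditis", "asthma"),
  typeId = c("finding_site", "finding_site", "finding_site"),
  destinationId = c("heart", "heart", "lung"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: follow a relationship forwards (source to
# destination) or in reverse (destination to source)
toy_related <- function(concept, type, reverse = FALSE) {
  if (reverse) {
    rel$sourceId[rel$destinationId %in% concept & rel$typeId %in% type]
  } else {
    rel$destinationId[rel$sourceId %in% concept & rel$typeId %in% type]
  }
}

# Forward: the finding site of a disorder
toy_related("heart_failure", "finding_site")

# Reverse: disorders with a given finding site
toy_related("heart", "finding_site", reverse = TRUE)
```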

SNOMED CT codelists

The SNOMED CT codelist data type allows a list of SNOMED CT concepts to be curated. It allows codelists to be expressed in a contracted (simplified) form with descendants implicitly included.

SNOMED <- sampleSNOMED()

# Create a codelist containing all the descendants of 'Heart failure'
my_heart_failure_codelist <- SNOMEDcodelist(as.SNOMEDconcept('Heart failure'), include_desc = TRUE)

# Original codelist
print(my_heart_failure_codelist)

# Expanded codelist
expanded <- expandSNOMED(my_heart_failure_codelist)
print(expanded)

# Contract codelist
contracted <- contractSNOMED(expanded)
print(contracted)
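The idea behind the contracted and expanded forms can be sketched on a toy hierarchy in base R. This is a simplification under the assumption that every concept in the list implicitly includes all of its descendants; the functions toy_expand and toy_contract are hypothetical and do not reproduce the behaviour of expandSNOMED and contractSNOMED in detail.

```r
# Toy 'Is a' table: each row says that 'child' is a subtype of 'parent'
isa <- data.frame(
  child = c("acute_hf", "chronic_hf", "acute_left_hf"),
  parent = c("hf", "hf", "acute_hf"),
  stringsAsFactors = FALSE
)

# All descendants of one or more concepts, found by repeated lookup
toy_descendants <- function(x) {
  found <- character(0)
  frontier <- unique(isa$child[isa$parent %in% x])
  while (length(frontier) > 0) {
    found <- union(found, frontier)
    frontier <- setdiff(isa$child[isa$parent %in% frontier], found)
  }
  found
}

# Expanded form: every concept is listed explicitly
toy_expand <- function(codes) union(codes, toy_descendants(codes))

# Contracted form: drop concepts already implied by another listed concept
toy_contract <- function(codes) setdiff(codes, toy_descendants(codes))

expanded_toy <- toy_expand("hf")
print(expanded_toy)
print(toy_contract(expanded_toy))
```

The contracted form is shorter and easier to review, while the expanded form is what is needed to match codes in a dataset.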

More information

For more information about SNOMED CT, visit the SNOMED CT international website: https://www.snomed.org/

SNOMED CT (UK edition) can be downloaded from the NHS Digital site: https://isd.digital.nhs.uk/trud3/user/guest/group/0/home

The NHS Digital terminology browser can be used to search for terms interactively: https://termbrowser.nhs.uk/



anoopshah/Rdiagnosislist documentation built on April 21, 2023, 11:49 p.m.