Introduction

The European Data Format (EDF) is a simple and flexible format for exchange and storage of multichannel biological and physical signals. It was developed by a few European 'medical' engineers who first met at the 1987 international Sleep Congress in Copenhagen. See http://www.edfplus.info/

The original EDF specification has been expanded in several ways. EDF+ supports the addition of annotations and non-continuous recordings. The BioSemi Data Format (BDF) uses 24 bits per sample (instead of the 16 bits per sample in EDF). BDF+ is an EDF+ like extension of BDF.

This package supports all these variants.
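To make the difference in sample width concrete, the following base R sketch decodes raw bytes as little-endian two's complement integers, which is how both formats store their samples. The helper decodeSamples is hypothetical and is not part of edfReader; the package does this decoding internally.

```r
# decodeSamples is a hypothetical helper, not an edfReader function.
# It decodes little-endian two's complement samples from a raw vector.
decodeSamples <- function (bytes, bytesPerSample) {
    m        <- matrix (as.integer (bytes), nrow = bytesPerSample)  # one column per sample
    weights  <- 256 ^ (0:(bytesPerSample - 1))                      # least significant byte first
    unsigned <- as.vector (weights %*% m)
    full     <- 256 ^ bytesPerSample
    ifelse (unsigned >= full / 2, unsigned - full, unsigned)        # apply the sign
}

edfSamples <- decodeSamples (as.raw (c(0xff, 0xff, 0x01, 0x00)), 2)              # 16 bits per sample
bdfSamples <- decodeSamples (as.raw (c(0xff, 0xff, 0xff, 0x00, 0x00, 0x80)), 3)  # 24 bits per sample
edfSamples   # -1  1
bdfSamples   # -1  -8388608
```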

Both EDF and BDF files consist of a header followed by one or more data records with the recorded signals, either ordinary signals or annotation signals.

This package follows this structure by providing two basic functions: readEdfHeader and readEdfSignals (see the help pages for details).
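As an illustration of the header part of this structure, the sketch below parses the fixed 256 byte part of a header with base R only. The field widths are those of the EDF specification. The function parseFixedHeader, the field name startDate and all demo values are hypothetical; in practice readEdfHeader does this work.

```r
# Field widths (in bytes) of the fixed part of an EDF / BDF header.
fieldWidths <- c (version = 8, patient = 80, recordingId = 80, startDate = 8,
                  startTime = 8, headerLength = 8, reserved = 44,
                  nRecords = 8, recordDuration = 8, nSignals = 4)

# parseFixedHeader is a hypothetical helper, not an edfReader function.
parseFixedHeader <- function (bytes) {
    stopifnot (length (bytes) >= sum (fieldWidths))
    ends   <- cumsum (fieldWidths)
    starts <- ends - fieldWidths + 1
    fields <- mapply (function (s, e) trimws (rawToChar (bytes[s:e])), starts, ends)
    as.list (fields)                       # all values are still character strings
}

# A made-up header for a file with 2 signals and 10 data records of 1 second.
demoValues <- c ("0", "X X X X", "Startdate X X X X", "01.01.17", "12.00.00",
                 "768", "EDF+C", "10", "1", "2")
padded     <- mapply (function (text, width) formatC (text, width = width, flag = "-"),
                      demoValues, fieldWidths)
demoBytes  <- charToRaw (paste (padded, collapse = ""))

hdr <- parseFixedHeader (demoBytes)
hdr$nSignals        # "2"
hdr$headerLength    # "768", i.e. 256 bytes plus 256 bytes per signal
```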

The examples below are based on the following files:

The BFile, CFile and DFile are derived from the "test_generator_2" test files from http://www.teuniz.net/edf_bdf_testfiles. The AFile is derived from the "test_generator8" file received from Teunis van Beelen by private communication.

libDir <- system.file ("extdata", package="edfReader")
AFile <- paste (libDir, '/edfAnnonC.edf', sep='') # a file with 2 annotation signals
BFile <- paste (libDir, '/bdfPlusC.bdf' , sep='') # a continuously recorded BDF file
CFile <- paste (libDir, '/edfPlusC.edf' , sep='') # a continuously recorded EDF file
DFile <- paste (libDir, '/edfPlusD.edf' , sep='') # a discontinuously recorded EDF file

EDF header objects

Introduction

The readEdfHeader function returns a list of class 'ebdfHeader' with all the data from the EDF or BDF file header. Part of this list is a data frame of class 'ebdfSHeader' which contains the signal headers.

The ebdfHeader

A file header can be read with readEdfHeader()

require (edfReader)
AHdr  <- readEdfHeader (AFile)
BHdr  <- readEdfHeader (BFile)
CHdr  <- readEdfHeader (CFile)            
DHdr  <- readEdfHeader (DFile)                  

Summaries of the header data can be shown with the S3 summary () and print() functions

BHdr
summary (AHdr)

The summary() provides somewhat more information than the print() function. For all details use str().

edfAnnonC.edf contains two annotation signals. These signals must have the same label. As edfReader names signals after their labels (see below), distinct names are created by appending '-1' and '-2' to the labels.

NOTE: Actually, the startTime fraction .7 is not read from the edf/bdf file header but from the first data record. For details, see the section “Samples, time and periods”.
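The naming rule for signals with identical labels can be sketched in base R as follows. This is only an illustration of the rule described above; uniqueNames is a hypothetical helper and not necessarily how edfReader implements it.

```r
# uniqueNames is a hypothetical helper, not an edfReader function.
# Labels that occur more than once get '-1', '-2', ... appended.
uniqueNames <- function (labels) {
    isDuplicated <- labels %in% labels[duplicated (labels)]
    rank         <- ave (seq_along (labels), labels, FUN = seq_along)  # 1, 2, ... per label
    ifelse (isDuplicated, paste (labels, rank, sep = "-"), labels)
}

signalNames <- uniqueNames (c("EDF Annotations", "pulse", "EDF Annotations"))
signalNames   # "EDF Annotations-1" "pulse" "EDF Annotations-2"
```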

The ebdfSHeader

Summaries of the signal header data can be shown with the S3 print() and summary () functions.

AHdr$sHeader
summary (CHdr$sHeader)

EDF signal objects

Introduction

The readEdfSignals function with simplify=FALSE returns a list of class 'ebdfSignals' with the signals selected from the EDF / BDF file.

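The signals in this list are objects of one of the following classes:

- 'ebdfASignal' for annotation signals

- 'ebdfFSignal' for ordinary signals stored in fragments

- 'ebdfCSignal' for ordinary signals stored as one continuous signal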

Reading the whole recording of all signals

The signals in an EDF or BDF file can be read with the readEdfSignals function.

ASignals <- readEdfSignals (AHdr)
ASignals

NOTE 1: See below in case the file contains only one signal.

NOTE 2: The annotation signal number '1,3' indicates that the 'ebdfASignal' annotation object contains the annotations from both signals 1 and 3. This can be prevented by using the 'mergeASignals=FALSE' option.

DSignals <- readEdfSignals (DHdr)
DSignals

By default all fragments in a discontinuously recorded signal will be concatenated into one signal with the gaps filled with NA values. This can be prevented by using the 'fragments=TRUE' option.

DSignalsF <- readEdfSignals (DHdr, fragments = TRUE)
DSignalsF
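The effect of the default behaviour can be sketched with base R. The function fillGaps and the two made-up fragments are hypothetical; the element names fromSample and signal follow the fragment attributes described later in this document.

```r
# fillGaps is a hypothetical helper, not an edfReader function.
# It places every fragment at its sample position and leaves NA in the gaps.
fillGaps <- function (fragments, totalSamples) {
    signal <- rep (NA_real_, totalSamples)
    for (f in fragments) {
        idx         <- seq (from = f$fromSample, length.out = length (f$signal))
        signal[idx] <- f$signal
    }
    signal
}

demoFragments <- list (list (fromSample = 1, signal = c(1, 2, 3)),
                       list (fromSample = 6, signal = c(4, 5)))
filled <- fillGaps (demoFragments, totalSamples = 7)
filled   # 1 2 3 NA NA 4 5
```

With long gaps most of such a vector consists of NA values, which is why reading the fragments separately can be the better choice.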

A summary of the list of signals can be shown with the S3 summary () function. For these 'ebdfSignals' objects it shows the same data as the S3 print function (see above). For object details use the str() function.

Reading a selection of signals

The reading of signals can be restricted to a specific set by using the signals parameter. Signals can be identified by their signal number, label, name, or signal type ('Ordinary' or 'Annotations'). The list may also contain duplicates. For example, the following three designations all refer to the same signal.

CSignals8 <- readEdfSignals (CHdr, signals=c(8, "8", "sine 8.5 Hz"))
CSignals8

NOTE: As in this case only one signal was read, the list of one was simplified to a single 'ebdfCSignal' object. This could be prevented by using the 'simplify=FALSE' parameter.

EXAMPLE: readEdfSignals (CHdr, signals=7, simplify=FALSE)[[1]] and readEdfSignals (CHdr, signals=7) will return the same object.

Reading a selected period

If required, the reading can also be restricted to a particular period. A period is identified with the 'from' and 'till' parameters, which specify a time in seconds relative to the start of the recording.

ASignalsPeriod    <- readEdfSignals (AHdr, from=0.7, till=1.8)
ASignalsPeriod

NOTE 1: The annotations included are those with an onset up to and including 1.8 sec. The onset for an annotation may be outside the recorded period.

NOTE 2: Because the recorded period is only 1.2 sec (see above or use summary(ASignalsPeriod)), the period read will not be [0.7, 1.8) but only [0.7, 1.2).

NOTE 3: For time and rounding details see the section “Samples, time and periods”.

Summaries of a list of signals can be shown with the S3 print() and summary () functions.

Ordinary signals, continuously recorded

Summaries of a continuously recorded ordinary signal can be shown with the S3 print() and summary () functions.

CSignals <- readEdfSignals (CHdr)
summary (CSignals$pulse)         # edfReader names signals after their label

Ordinary signals, not continuously recorded

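Ordinary signals that are not continuously recorded can be read in two different ways:

- as one continuous signal of class 'ebdfCSignal', with the gaps filled with NA values (the default), or

- as a list of fragments in an object of class 'ebdfFSignal' (with fragments=TRUE).

The latter method uses a more complex data structure; the former may result in a very large object.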

CDSignals <- readEdfSignals (DHdr, from=5.1, till=18)
FDSignals <- readEdfSignals (DHdr, fragments=TRUE)

The objects of class 'ebdfCSignal' are printed and summarised in the same way as continuously recorded signals.

Summaries of a not continuously recorded ordinary signal stored in fragments can be shown with the S3 print() and summary () functions.

summary (FDSignals$`sine 8.5 Hz`)         # note the "`" quotes for a name with spaces.

NOTE: In case of more than 10 fragments, the list of fragment summaries will be limited to 5, followed by the number of omitted summaries.

Annotation signals

Summaries of an annotation signal (ASignal) can be shown with the S3 print() and summary () functions.

CSignals$`EDF Annotations`
summary(ASignalsPeriod$`EDF Annotations`)

NOTE 1: Because both annotation signals are merged into one, the merged signal is named EDF Annotations again.

NOTE 2: The "Record start specs = 0" indicates that the record start specifications are not included, i.e. readEdfSignals was used with the parameter recordStarts = FALSE.

Samples, time and periods

The start time

In case of an EDF or BDF file, the startTime attribute is based on the startdate and starttime in the EDF/BDF header.

The starttime in an EDF/BDF header is specified up to the second.

In EDF+ and BDF+ files a sub-second start time can be specified as the start time of the first data record. See the last paragraph of section 2.2.4 of the EDF+ specification.

See, e.g., the startTime for the AFile:

format (AHdr$startTime, format="%Y-%m-%d %H:%M:%OS3",  usetz = FALSE)
ASignalsPlusStartTimes <- readEdfSignals(AHdr, signals='EDF Annotations-1', recordStarts=TRUE)
annots <- ASignalsPlusStartTimes$annotations
annots[annots$isRecordStart==TRUE,'onset'][1]

For EDF+ and BDF+ files, the startTime shown by edfReader is based on both the data in the header and the start time of the first data record.
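The combination of the two sources can be sketched in base R. Both values below are made up for illustration: headerStart stands for the start date and time from the header (whole seconds only) and startSecondFraction for the onset found in the first data record.

```r
# Hypothetical values, not read from any file.
headerStart         <- as.POSIXct ("2017-01-01 12:00:00", tz = "UTC")
startSecondFraction <- 0.7

# The full start time is the header time plus the sub-second fraction.
startTime <- headerStart + startSecondFraction
format (startTime, format = "%Y-%m-%d %H:%M:%OS3", usetz = FALSE)
```

Because a date-time is stored as a floating point number of seconds, the formatted fraction may not show exactly the digits that were added.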

Samples and time

As usual, a recording starts at time 0 with sample 1.

Consequently sample n will be at time (n-1)/sRate, where sRate denotes the sample rate.
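In R this relation can be written as a one line function. The name sampleTime is chosen for this sketch only, and the sample rate of 200 Hz is an arbitrary example value.

```r
# Time (in seconds from the start of the recording) of sample number n.
sampleTime <- function (n, sRate) (n - 1) / sRate

sampleTime (1, 200)      # 0: the first sample is taken at time 0
sampleTime (201, 200)    # 1: sample 201 is taken after exactly one second
```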

Samples and periods

Apart from rounding errors, a from - till period in readEdfSignals will be the period [from,till), i.e. starting at from and up to but not including till.

This may sound strange, but this convention has the following properties
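Two properties that generally follow from such a half-open convention are that adjacent periods do not share samples, and that joining adjacent periods gives the same samples as reading the whole period at once. The sketch below illustrates this, assuming the same rounding as used in the plotEdfSignals example in this document; periodSamples is a hypothetical helper and not an edfReader function.

```r
# Sample numbers in the period [from, till) for a signal with sample rate sRate.
periodSamples <- function (from, till, sRate) {
    first <- ceiling (sRate * from) + 1
    last  <- ceiling (sRate * till)
    if (last < first) integer (0) else first:last
}

a <- periodSamples (0, 1, 200)    # samples 1 to 200
b <- periodSamples (1, 2, 200)    # samples 201 to 400
length (a)                        # 200, i.e. sRate * (till - from)
length (intersect (a, b))         # 0, adjacent periods do not overlap
```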

Alignment

The problem

In an EDF+/BDF+ file the start of the recording of the signals in a data record is specified in its first annotations signal. For a +D file the recording in a subsequent data record may start at any time after the start of the previous data record plus the record duration time.

The gap between two +D file data records may not be an exact multiple of the sample period of its signals. This raises a question about the alignment of samples in these subsequent data records.

Two basic models

The alignment of ordinary signals can be modelled in two different ways:

A. with an interrupted clock
In which case the first sample for every recorded signal is taken at the start of the recording for that data record.

B. with a continuously running clock
In which case the clock starts at the start of the first record and all sampling is based on the individual signal sample rate(s). In other words, the sample time for a signal sample in the data records is aligned to its sample rate.

It should be noted that the model with the continuously running clock is (implicitly) required if one wants to map all fragments of a signal into a single signal (with NA values in the gaps).

The edfReader supports both models: for any fragment it provides both the aligned data according to the "continuously running clock" model and the record start time from the data record.

Alignment details

The first sample in every +D record can be aligned as follows:

n = ceiling (sRate * recordStartTime) + 1

and the

sampleTime = (n - 1) / sRate

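Where n is the sample number, sRate the sample rate of the signal, and recordStartTime the start time of the data record relative to the start of the recording.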

However, in order to avoid unnecessary rounding errors, the first formula is actually implemented as follows:

n = ceiling (sRate * (recordStartTime - maxTErr)) + 1

Where maxTErr = 5 * .Machine$double.eps (= normally 5 * 2.220446e-16 = 1.110223e-15 )
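The effect of the tolerance can be checked with a small base R sketch. The formulas are the ones given above; the function names alignedSample and sampleTime and the example values are chosen for this sketch only and are not taken from the package source.

```r
maxTErr <- 5 * .Machine$double.eps

# First sample number of a record that starts at recordStartTime.
alignedSample <- function (sRate, recordStartTime, tolerance = maxTErr) {
    ceiling (sRate * (recordStartTime - tolerance)) + 1
}
sampleTime <- function (n, sRate) (n - 1) / sRate

# 0.1 + 0.2 is slightly more than 0.3 in floating point arithmetic.
recordStartTime <- 0.1 + 0.2
alignedSample (10, recordStartTime, tolerance = 0)   # 5: without tolerance one sample too far
alignedSample (10, recordStartTime)                  # 4: the sample at 0.3 s
sampleTime (alignedSample (10, 0.35), 10)            # 0.4: a start at 0.35 s is aligned to 0.4 s
```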

Parameters involved

For any signal object and any fragment:

Object details

Header details

Header attributes

The header data encompass the following:

str (CHdr,  max.level=1)

The fields version, patient, recordingId, startTime, headerLength, reserved, nRecords, recordDuration, and nSignals are from the file header.

The startSecondFraction attribute contains the sub-second start data specified in the first data record. See the 'The start time' section above.

NOTE: In order to avoid annoying rounding errors, the sub-second start time is stored separately. trunc (startTime, 'sec') is less accurate.

sHeader is detailed below. The other attributes are derived ones.

Signal header attributes

The signal header data encompass the following:

str (CHdr$sHeader, max.level=1)

The fields label, transducerType, physicalDim, physicalMin, physicalMax, digitalMin, digitalMax, preFilter, samplesPerRecord, and reserved are from the file header. The others are derived. Gain and offset are used to map the digital sample values to the range of physical values.

For annotation signals the only relevant fields are "label" which must have the value "EDF Annotations" (or "BDF Annotations") and "samplesPerRecord".

Signal details

Ordinary signal objects of class 'ebdfCSignal'

The data for ordinary signal objects of class 'ebdfCSignal' encompass the following:

str(CSignals$pulse, max.level=1) 

The attributes startTime, signalNumber, label, isContinuous, isAnnotation, recordedPeriod, totalPeriod, transducerType, sampleBits, sRate, range, and preFilter are (derived) from the header data.

For a continuously recorded signal the totalPeriod is equal to the recordedPeriod. For a not continuously recorded signal the total period equals the start of the last data record plus its duration.

The attributes from and till contain the values of the corresponding actual readEdfSignals parameters. The default values are 0 and Inf.

The start attribute contains the start time and is always zero. The fromSample attribute contains the first sample number and is always 1. By including these attributes, signals and fragments share the same time/sample attributes.

The signal attribute contains the sample data from the EDF / BDF data records. If read with the readEdfSignals parameter physical=TRUE, the default, the digital sample values are mapped to physical values. With physical=FALSE, signal contains the digital sample values.

The physical values are calculated as follows:
physicalValue = gain * digitalValue + offset
with:
gain = (physicalMax - physicalMin) / (digitalMax - digitalMin)
offset = physicalMax - gain * digitalMax
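The mapping can be tried out with the following base R sketch. The function toPhysical is hypothetical and the header values are made up (a 16 bit signal with a physical range from -1000 to 1000); edfReader applies the mapping itself when physical=TRUE.

```r
# toPhysical is a hypothetical helper, not an edfReader function.
toPhysical <- function (digitalValue, physicalMin, physicalMax, digitalMin, digitalMax) {
    gain   <- (physicalMax - physicalMin) / (digitalMax - digitalMin)
    offset <- physicalMax - gain * digitalMax
    gain * digitalValue + offset
}

# The digital extremes map to the physical extremes (apart from rounding).
extremes <- toPhysical (c(-32768, 32767), physicalMin = -1000, physicalMax = 1000,
                        digitalMin = -32768, digitalMax = 32767)
extremes
```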

Ordinary signal objects of class 'ebdfFSignal'

The data for ordinary signal objects of class 'ebdfFSignal' encompass the following:

str(FDSignals$`sine 8.5 Hz`, max.level=1) 

For the attributes startTime, signalNumber, label, isContinuous, isAnnotation, recordedPeriod, totalPeriod, from, till, start, fromSample, transducerType, sampleBits, sRate, range and preFilter see the previous section.

For a not continuously recorded signal the total period equals the start of the last data record plus its duration.

The fragments attribute contains the list of recorded fragments.

Signal fragment data

The data of a signal fragment in objects of class 'ebdfFSignal' encompass the following:

str(FDSignals$`sine 8.5 Hz`$fragments[[1]], max.level=1) 

The fromSample attribute contains the sample number of the first sample in this fragment (as if it were a continuous recording). The start attribute contains the sample time for this sample.

The 'signal' attribute contains the fragment's sample values. These may be physical values (the default) or digital values (see above).

Annotation signals

The data for annotation signal objects of class 'ebdfASignal' encompass the following:

str(ASignals$`EDF Annotations`, max.level=1) 

For the attributes startTime, signalNumber, label, isContinuous, isAnnotation, totalPeriod, from, and till see the section for objects of class 'ebdfCSignal'.

The annotations attribute contains a data frame with the individual annotations.

Annotation data

The data for a single annotation encompass the following:

str(ASignals$`EDF Annotations`$annotations, max.level=1) 

The record attribute refers to the data record the annotation was read from.

The onset attribute contains the time of the annotation relative to the start of the recording.

The duration attribute contains the duration of the annotated event.

The isRecordStart attribute indicates whether or not this annotation is the first one in a data record (and indicates the start time of the recording of that record).

The annotation attribute contains the annotations associated with the onset and duration.

The fromSignal attribute, if present, refers to the signal that contains the annotation. This attribute is present only if the annotations from different signals were merged into one 'ebdfASignal' object.
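Because the annotations are kept in a data frame, they can be filtered with ordinary base R subsetting. The sketch below uses a made-up data frame with the columns described above; the values are hypothetical and only serve as illustration.

```r
# A hypothetical annotations data frame with the columns described above.
annotations <- data.frame (
    record        = c(1, 1, 2, 2),
    onset         = c(0.0, 0.3, 1.0, 1.4),
    duration      = c(NA, 0.1, NA, NA),
    isRecordStart = c(TRUE, FALSE, TRUE, FALSE),
    annotation    = c("", "event A", "", "event B"),
    stringsAsFactors = FALSE
)

# Drop the record start annotations and keep the events with an onset in [0.2, 1.2).
events   <- annotations[!annotations$isRecordStart, ]
inPeriod <- events[events$onset >= 0.2 & events$onset < 1.2, ]
inPeriod$annotation   # "event A"
```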

Next step: a quick look

One of the first things you may want to do with the imported signals is to have a look at them.

This can be achieved with one of the several plot packages available, e.g.:

- 'plot' (included in the base installation)

- 'lattice', or

- 'ggplot2'

As an example, the function plotEdfSignals() uses ggplot2 and will plot one or more signals over some period of time. The plot shown presents the period from 0.2 till 0.5 seconds of the 'sine 8.5 Hz' and the 'sine 50 Hz' signals in CSignals.

plotEdfSignals <- function (signals, labels, from=0, till=Inf) {
    nLabels <- length (labels)
    sRate   <- numeric (length = nLabels)
    fromS   <- integer (length = nLabels)
    tillS   <- integer (length = nLabels)
    sLength <- integer (length = nLabels)
    # per signal: the first and the last sample number in the period [from, till)
    for (i in seq_len (nLabels)) {
        sRate[i]    <- signals[[labels[i]]]$sRate
        fromS[i]    <- ceiling (sRate[i] * max (from, 0)) + 1
        tillS[i]    <- ceiling (sRate[i] * till)
        tillS[i]    <- min (tillS[i], length (signals[[labels[i]]]$signal))
        sLength[i]  <- tillS[i] - fromS[i] + 1
    }
    totLength  <- sum (sLength)
    cat (" totLength=",  totLength)
    # collect the selected samples of all signals in 'long' format, one row per sample
    time    <- numeric   (length = totLength)
    signal  <- numeric   (length = totLength)
    label   <- character (length = totLength)
    first <- 1                          # first row for the current signal
    for (i in seq_len (nLabels)) {
        last <- first + sLength[i] - 1  # last row for the current signal
        time  [first:last]  <- seq (from=fromS[i]-1, to=(tillS[i]-1)) / sRate[i]
        signal[first:last]  <- signals[[labels[i]]]$signal[fromS[i]:tillS[i]]
        label [first:last]  <- rep (labels[i], sLength[i])
        first <- last + 1
    }
    cat (" | rows filled=", first-1, '\n')

    # one line per signal; requires ggplot2 to be attached
    ggplotDF <- data.frame (time=time, signal=signal, label=label)
    ggplot (ggplotDF, aes(x=time, y=signal, colour=label)) + geom_line()
}

if (require(ggplot2)) {
    CSignals <- readEdfSignals (CHdr)
    plotEdfSignals (CSignals, labels=c('sine 8.5 Hz', 'sine 50 Hz'), from=.2, till=0.5)
}

For details about ggplot2 see e.g. the 'R Graphics Cookbook'.

Enjoy.

Acknowledgement

This package has used code from

References

  1. Specification of EDF
    http://www.edfplus.info/specs/edf.html

  2. Specification of EDF+
    http://www.edfplus.info/specs/edfplus.html

  3. Specification of EDF++
    http://195.154.67.227/en/contribute/edf/

  4. Specification of BDF
    see 'Which file format does BioSemi use' at
    http://www.biosemi.com/faq/file_format.htm

  5. Specification of BDF+
    http://www.teuniz.net/edfbrowser/bdfplus%20format%20description.html

Other useful EDF related sources can be found at:
http://www.edfplus.info/downloads/



vagnerfonseca/edf documentation built on May 3, 2019, 2:41 p.m.