knitr::opts_chunk$set( collapse = TRUE, comment = "#>" ) options(width = 80)
The package eeguana provides a framework for doing simple pre-processing with specialized functions and manipulating EEG data with dplyr verbs (e.g., eeg_mutate
, eeg_filter
, eeg_summarize
) extended to a new class eeg_lst
, and ggplot
wrapper functions. The new class is inspired by tidyverse principles but it's not really "tidy" (due to space considerations), it's a list of (i) a wide table that contains the signal amplitudes at every sample point of the EEG, (ii) an events table with information about markers (or triggers), blinks and other exported information, and (iii) a long table with experimental information, such as participant (recording
), conditions, etc.
While it's possible to transform the eeg_lst
to a data.frame
, data.table
or a tibble
(with as.data.frame()
, as.data.table()
and as_tibble()
), the motivation for manipulating the data in the eeg_lst
format has to do with size considerations. In this case, the original file was 113 MB, converting it to a long format entails a lot of repetition and generates an object of 556 MB. While this will still work here, a long format quickly becomes prohibitive in real settings with longer recordings.
Dplyr-like verbs always return an eeg_lst
object, which allows us to use magrittr
's pipe, %>%
(see ?`dplyr-eeguana`
) or the relatively new R pipe |>
.
In addition, eeg_
/ch_
functions will also return an eeg_lst
unless they have a suffix _tbl
that indicates they return a data frame. In general, we will work with the eeg_lst
, unless we want to modify the channels information (with channels_tbl()
) or the events table containing markers, artifacts ans custom annotations (with events_tbl
).
Here, I exemplify the use of eeguana with raw EEG data exported in the format of BrainVision 1.0. The data belong to a simple experiment where a participant was presented 100 faces and 100 assorted images in random order. The task of the experiment was to mentally count the number of faces.
First we download the data:
# Run the following or just download the files from raw_faces folder in https://osf.io/tbwvz/ library(httr) GET("https://osf.io/wv5hp//?action=download", write_disk("./s1_faces.vhdr", overwrite = TRUE), progress() ) GET("https://osf.io/c8wea//?action=download", write_disk("./s1_faces.vmrk", overwrite = TRUE), progress() ) GET("https://osf.io/uhsde//?action=download", write_disk("./s1_faces.eeg", overwrite = TRUE), progress() )
BrainVision 1.0 exports three files: s1_faces.vhdr
, s1_faces.vmrk
, and
s1_faces.eeg
. The file s1_faces.vhdr
contains the metadata and links to the other
two files, s1_faces.vmrk
contains the triggers and other events in the samples,
and s1_faces.eeg
contains the signals at every sample for every channel recorded.
We load the relevant packages first. Here, I use tidytable
which is a faster alternative to dplyr
with a very similar interface.
library(tidytable) #or library(dplyr) library(ggplot2) library(stringr) library(eeguana) set.seed(123) # ICA will always find the same components
We first need to read the data:
faces <- read_vhdr("s1_faces.vhdr")
The function read_vhdr()
creates a list with data frames for the signal, events,
segments information, and incorporates in its attributes generic EEG information.
faces
summary(faces)
We see that there is no electrode positions in the object, but since we know that the layout was a standard 10/20, we'll add this layout to the object, using the dataset layout_32_1020
. This is required to be able to create topographic plots.
channels_tbl(faces) ## In case the order of the electrodes is different, we do a left_join instead of replacing the table: channels_tbl(faces) <- select(channels_tbl(faces), .channel) |> left_join(layout_32_1020)
The plots that eeguana produce are ggplot
objects that can be modified like regular ggplot
.
plot(faces) + ggtitle("All the experiment")
Here the preprocessing is done with only one EEG dataset. If the EEG data of different participants is available, one can repeat a similar preprocessing pipeline for each dataset and bind the EEG datasets with bind()
.
In this dataset, the signal from all electrodes is monopolar and referenced to the left mastoid (M1). We want the signal (excluding the EOG channels) to be referenced to linked (left and right) mastoids (M1 and M2).
faces <- eeg_rereference(faces, -VEOG, -HEOG, .ref = c("M1", "M2"))
We apply a band pass filter of 0.1 to 30 Hz to all the channels except the EOG channels. We don't segment yet, because discontinuities in the signal create artifacts on the edges.
faces_filt <- eeg_filt_band_pass(faces, -HEOG, -VEOG, .freq = c(.1, 30))
We want to apply ICA to as much data as possible, but to "representative" data: to data of the experiment and not when the participants were moving or reading the instructions. For the same reason, we want to the ignore artifacts that we are sure are not representing brain activity, because of the extreme amplitudes in the signal.
We first cut a large segment that excludes data before and after the experiment was ran using eeg_segment()
(that is, only data from the marker "s111" to "s121"), and then we mark differences of 200 microvolts between peaks with eeg_artif_minmax()
.
faces_ls <- eeg_segment(faces_filt, .description == "s111", .end = .description == "s121") |> eeg_artif_minmax(-HEOG, -VEOG, .threshold = 200, .window = 200, .unit = "ms")
eeg_artif_minimax()
only adds the artifacts in the events table and doesn't modify the signal. We can have a look at the type of artifacts that were detected, by plotting the signal and the events with annotate_events()
. It would be a good idea to just check a couple of seconds of the signal, for that we use the dplyr-like function eeg_filter()
.
faces_ls |> eeg_select(-HEOG, -VEOG)|> eeg_filter(as_time(.sample, .unit = "s") |> between(0, 90)) |> plot() + annotate_events() + theme(legend.position = "bottom")
Now we can run ICA, removing the EOG and reference electrodes, and ignoring the artifacts.
## By default, it will ignore artifacts faces_ica <- faces_ls |> eeg_ica(-HEOG, -VEOG, -M1, -M2, .method = adapt_fast_ICA, .ignore = .type == "artifact")
We can now check the different topographic plots of the ICAs with plot_components()
, and their correlation with the EOG channels looking a the summary of the object.
faces_ica |> plot_components() eeg_ica_summary_tbl(faces_ica)
We'll look closer at the ICA that are more likely to be related to eye movements, and we'll compare with signals that look like blinks and saccades:
faces_ica <- faces_ica |> eeg_artif_step(HEOG, VEOG, .threshold = 30, .window = 200, .unit = "ms", .freq = c(1, 10)) |> eeg_artif_peak(VEOG, .threshold = 100, .freq = c(1, 10))
In order to investigate the ICA, we are using an experimental function not yet exported (that's why we use the :::). This function will probably converted into a shiny app. It's possible to get the same plots by using a combination of plot_topo
, eeg_ica_show
, eeg_ica_summary_tbl
, and some ggplot
and cowplot
functions.
We'll see how the components explaining the most variance and most correlated to the EOG channels behave:
#samples with blinks s_peaks <- filter(events_tbl(faces_ica), str_starts(.description, "peak")) |> pull(.initial) eeguana:::plot_ica.eeg_ica_lst(faces_ica, samples = seq(s_peaks[1]-2000,s_peaks[1]+2000))
The clearest ones is ICA5, and so we'll just remove that one.
faces_icaed <- faces_ica |> eeg_ica_keep(-ICA5) |> as_eeg_lst()
Now we'll segment the data to appropriate check if there are still artifacts.
events_tbl(faces_icaed) <- events_tbl(faces_icaed) |> filter(!.type %in% "artifact") faces_seg <- faces_icaed |> eeg_select(-description, -type) |> eeg_segment(.description %in% c("s70", "s71"), .lim = c(-.1, .5)) faces_seg_artif <- faces_seg |> eeg_artif_minmax(-HEOG, -VEOG, .threshold = 100, .window = 150, .unit = "ms") |> eeg_artif_step(-HEOG, -VEOG, .threshold = 50, .window = 200, .unit = "ms") ## extracts the ids of the segments with artifacts bad <- filter(events_tbl(faces_seg_artif), .type == "artifact") |> pull(.id) |> unique() ## Show the segment with artifact and one before and after: faces_seg_artif |> eeg_filter(.id %in% c(bad-1, bad, bad+1)) |> eeg_select(-VEOG, -HEOG) |> plot() + annotate_events() + theme(legend.position = "bottom") faces_seg <- faces_seg_artif |> eeg_events_to_NA(.type == "artifact", .entire_seg = TRUE, .drop_events = TRUE) summary(faces_seg)
Finally, we can baseline the segments:
faces_seg <- faces_seg |> eeg_baseline()
We edit the segmentation information and add more descriptive labels.
faces_seg <- faces_seg |> eeg_mutate( condition = ifelse(description == "s70", "faces", "non-faces") ) |> eeg_select(-type) faces_seg
With some ggplot
skills, we can create customized plots. ggplot
is overloaded to work on an eeg_lst
by first downsampling the signal when necessary, and converting it to a long-format data frame that is feed into ggplot
. This object can then be customized. (Notice that the channels, or component names are in a .key
column and their amplitude in .value
column, and instead of samples there are now .time
in seconds).
## ggplot uses internally a table that looks like this: faces_seg |> eeg_filter(.id %in% 1:3) |> as.data.frame() faces_seg |> eeg_select(O1, O2, P7, P8) |> ggplot(aes(x = .time, y = .value)) + geom_line(alpha = .1, aes(group = .id, color = condition)) + stat_summary( fun = "mean", geom = "line", alpha = 1, size = 1.5, aes(color = condition) ) + facet_wrap(~.key) + geom_vline(xintercept = 0, linetype = "dashed") + geom_vline(xintercept = .17, linetype = "dotted") + theme(legend.position = "bottom")
We can see here the N170 component in the faces condition for the average trial on a single participant. We can investigate the signal by averaging the channels of the occipital and parietal lobes using transmute()
, and the special function, chs_mean()
, a wrapper for rowMeans()
, which takes as arguments the relevant channels and whether missing values should be omitted from the calculations.
faces_seg |> eeg_transmute( Occipital = chs_mean(across(starts_with("O")), na.rm = TRUE), #O1, O2, Oz Parietal = chs_mean(across(starts_with("O")), na.rm = TRUE) # P3, P4, P7, P8, Pz ) |> ggplot(aes(x = .time, y = .value)) + geom_line(alpha = .1, aes(group = .id, color = condition)) + stat_summary( fun = "mean", geom = "line", alpha = 1, size = 1, aes(color = condition) ) + facet_wrap(~.key) + theme(legend.position = "bottom")
We can also calculate the ERPs and then plot them in their layout (and we'll add the same theme that plot uses for eeg_lst
s):
ERP_faces <- faces_seg |> eeg_group_by(.sample, condition) |> eeg_summarize(across_ch(mean, na.rm = TRUE)) ERP_plot <- ERP_faces |> ggplot(aes(x = .time, y = .value)) + geom_line(aes(color = condition)) + facet_wrap(~.key) + theme(legend.position = "bottom") + ggtitle("ERPs for faces vs non-faces") + theme_eeguana() ERP_plot |> plot_in_layout()
Another possibility is to create a topographic plot of the two conditions, by first making segments that include only the interval .1-.2 s after the onset of the stimuli, creating a table with interpolated amplitudes and using the ggplot wrapper plot_topo()
.
faces_seg |> eeg_filter(between(as_time(.sample, .unit = "s"), .1, .2)) |> eeg_group_by(condition) |> eeg_summarize(across_ch(mean, na.rm = TRUE)) |> plot_topo() + annotate_head() + geom_contour() + annotate_electrodes(colour = "black") + facet_grid(~condition)
For more specialized plots or analyses, it might be necessary to extract the data in long data frame format first. Even though, we can do this without transforming the object, here, we transform the data first and then we visualize independent t-tests at every electrode and time point.
df <- faces_seg |> eeg_select(O1, O2, P7, P8) |> as_tidytable() |> #or alternatively as_tibble() if one uses dplyr # We can use regular dplyr/tidytable functions now group_by(.key, .time) |> summarize( `t-value` = t.test( .value[condition == "faces"], .value[condition == "non-faces"] )$statistic ) df
Then we just load the data frame into ggplot
.
ggplot(df, aes(x = .time, y = `t-value`)) + geom_line() + facet_wrap(~.key)
However, this can be also done in the without transforming the eeg_lst
object:
faces_seg_t <- faces_seg |> eeg_select(O1, O2, P7, P8) |> eeg_group_by(.sample) |> eeg_summarize(across_ch(list(t = ~t.test( .[condition == "faces"], .[condition == "non-faces"] )$statistic))) faces_seg_t |> ggplot(aes(x = .time, y = .value)) + geom_line(alpha = .1, aes(group = .id)) + stat_summary(fun = "mean", geom = "line", alpha = 1, size = 1) + facet_wrap(~.key) + theme(legend.position = "bottom")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.