knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
I started writing this vignette a while ago, before I knew object-oriented programming (OOP) in R. It may therefore be interesting for you if you don't know OOP but want to learn more about the internals of pyramidi. If you just want to see some use cases, or if you already know R6 well, the other vignettes might be a better place to start.
First load some libraries:
pyramidi::install_miditapyr(envname = "r-reticulate")
library(pyramidi)
library(dplyr)
library(tidyr)
library(purrr)
library(ggplot2)
library(zeallot)
We'll extract the information of a midi file into a dataframe. We'll use the midi file that ships with the package:
midi_file_str <- system.file("extdata", "test_midi_file.mid", package = "pyramidi")
midifile <- mido$MidiFile(midi_file_str)
ticks_per_beat <- midifile$ticks_per_beat
Now we can load the information of the midi file into a dataframe:
dfc <- miditapyr$frame_midi(midifile)
head(dfc, 20)
This dataframe contains the columns i_track (the track index), meta (whether the midi event is a meta event), and msg, which contains named lists of further midi event information.
The MidiFile() function of mido also yields the ticks_per_beat of the file:
ticks_per_beat
The miditapyr$unnest_midi() function transforms the msg column of the dataframe to a wide format, where every new column name corresponds to one of the names in the lists in msg (like tidyr::unnest_wider()):
df <- miditapyr$unnest_midi(dfc) %>%
  as_tibble()
head(df, 20)
Except for the name column, this seems to give the same result as:
dfc %>% unnest_wider(msg)
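To make the unnesting step concrete, here is a minimal sketch on hand-made data rather than the package's midi file. The object names toy_nested and toy_flat and the event values are made up for illustration; only tidyr::unnest_wider() itself is taken from the text above.

```r
library(tidyr)

# Hypothetical toy data: two events whose details live in named lists,
# mimicking the shape of the msg column described above.
toy_nested <- tibble::tibble(
  i_track = c(0, 0),
  meta    = c(TRUE, FALSE),
  msg     = list(
    list(type = "set_tempo", tempo = 500000, time = 0),
    list(type = "note_on", note = 60, velocity = 100, time = 0)
  )
)

# Every name that occurs in the lists becomes its own column;
# fields missing for an event are filled with NA.
toy_flat <- unnest_wider(toy_nested, msg)
toy_flat
```

Events of different types carry different fields, which is why the wide dataframe contains many NA values.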
In the midi format, time is treated as relative increments between events (measured in ticks).
In order to derive the total time passed, you can use the function tab_measures():
dfm <- tab_measures(df, ticks_per_beat, c("m", "b")) %>%
  # create a variable `track` with the track name (in order to have it in the plot below)
  mutate(track = ifelse(purrr::map_chr(name, typeof) != "character", list(NA_character_), name)) %>%
  unnest(cols = track) %>%
  fill(track)
dfm
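The idea of turning relative increments into totals can be sketched in base R on toy data. This is not the implementation of tab_measures(); it is an illustration that assumes a constant tempo of 500000 microseconds per beat (the midi default, 120 bpm) and a 4/4 time signature. All toy_ names and values are hypothetical.

```r
# Hypothetical toy events: `time` holds the ticks since the previous
# event on the same track.
toy_events <- data.frame(
  i_track = c(0, 0, 0, 1, 1),
  time    = c(0, 480, 240, 0, 960)
)
toy_ticks_per_beat <- 480
toy_tempo_us <- 500000  # assumed constant tempo, microseconds per beat

# Total ticks passed: cumulative sum of the increments within each track.
toy_events$ticks <- ave(toy_events$time, toy_events$i_track, FUN = cumsum)

# Beats, seconds and (assuming 4 beats per bar) measures follow from the ticks.
toy_events$b <- toy_events$ticks / toy_ticks_per_beat
toy_events$t <- toy_events$ticks * toy_tempo_us / (toy_ticks_per_beat * 1e6)
toy_events$m <- toy_events$b / 4
toy_events
```

With real files, tab_measures() derives these totals for you.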
This function adds further columns:
- ticks: the total ticks passed,
- t: the total time in seconds passed,
- m: the total measures (bars) passed,
- b: the total beats passed,
- i_note: a unique ascending index for every track and midi note in the midi file.
You can split the dataframe in two by whether the events are meta or not:
dfm %>% miditapyr$split_df() %->% c(df_meta, df_notes)
df_meta %>% as_tibble()
df_notes %>% as_tibble()
Each note in the midi file is characterized by a note_on and a note_off event.
In order to generate a piano roll plot with ggplot2, we need to tidyr::pivot_wider() those events.
This can be done with the function pivot_wide_notes():
df_not_notes <- df_notes %>%
  dplyr::filter(!stringr::str_detect(type, "^note_o[nf]f?$"))
df_notes_wide <- df_notes %>%
  dplyr::filter(stringr::str_detect(type, "^note_o[nf]f?$")) %>%
  # tab_measures(df_meta, df_notes, ticks_per_beat) %>%
  pivot_wide_notes() %>%
  left_join(pyramidi::midi_defs)
df_notes_wide
In the new format, the data has half the number of rows.
The columns m, b, t, ticks, time and velocity are each replaced by two columns with the suffixes _note_on and _note_off.
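The halving of rows and the suffixed columns can be reproduced on toy data with tidyr::pivot_wider(). This is only a sketch of the reshaping idea, not the code of pivot_wide_notes(); the toy_ objects and their values are hypothetical.

```r
library(tidyr)

# Hypothetical toy data: two notes, each with a note_on and a note_off row.
toy_long <- data.frame(
  i_note   = c(1, 1, 2, 2),
  note     = c(60, 60, 64, 64),
  type     = c("note_on", "note_off", "note_on", "note_off"),
  ticks    = c(0, 480, 480, 960),
  velocity = c(100, 0, 80, 0)
)

# One row per note; each value column is split into a *_note_on and a
# *_note_off column.
toy_wide <- pivot_wider(
  toy_long,
  id_cols     = c(i_note, note),
  names_from  = type,
  values_from = c(ticks, velocity)
)
toy_wide
```

Having the start and the end of a note in the same row is what allows drawing each note as one segment.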
Now we have the midi data in the right format for the piano roll plot:
df_notes_wide %>%
  ggplot() +
  geom_segment(
    aes(
      x = m_note_on,
      y = note_name,
      xend = m_note_off,
      yend = note_name,
      color = velocity_note_on
    )
  ) +
  # each midi track is printed into its own facet:
  facet_wrap( ~ track, ncol = 1, scales = "free_y") +
  guides(color = guide_colorbar(title = "Note velocity")) +
  labs(
    title = "Piano roll of the note events in the midi file",
    subtitle = "Only notes played are shown."
  ) +
  xlab("Measures") +
  scale_x_continuous(breaks = seq(0, 16, 4), minor_breaks = 0:16) +
  scale_colour_gradient() +
  theme_minimal()
The new format also makes it easy to manipulate the midi data. For instance, let's set the volume (called velocity in midi) of the first beat in every bar to the maximum (127), and to half of its original value otherwise:
df_notes_wide_mod <- df_notes_wide %>%
  mutate(
    velocity_note_on = ifelse(
      # As it's a 4/4 beat, the first beat of each bar is a multiple of 4:
      b_note_on %% 4 == 0,
      127,
      velocity_note_on / 2
    )
  )
Let's compare the modified value to the original one:
df_notes_wide %>%
  select(b_note_on, velocity_note_on) %>%
  bind_cols(
    new = df_notes_wide_mod$velocity_note_on
  )
With an ifelse() statement, we modified the volume of the midi notes, depending on whether they fall on the first beat of the measure or not.
Other possible manipulations could be for instance:
- round()ing the note_on/note_off times,
- group_by(floor(m_note_on)) - summarize() logic, or
- group_by(floor(m_note_on)) - mutate() logic.
We can transform the wide midi data back to the long format:
df_notes_long <- pivot_long_notes(df_notes_wide)
We can now add the non-note events:
df_midi_out <- merge_midi_frames(df_meta, df_notes_long, df_not_notes)
df_midi_out
The time value in midi format is given by the number of ticks passed between events.
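The relation between total ticks and the relative time value can be sketched in base R as follows. This only illustrates the arithmetic on hypothetical toy data (toy_abs and its values are made up); it does not show how the package functions compute it internally.

```r
# Hypothetical toy events with the total ticks passed within each track.
toy_abs <- data.frame(
  i_track = c(0, 0, 0, 1, 1),
  ticks   = c(0, 480, 720, 0, 960)
)

# Relative time: the difference to the previous event on the same track;
# the first event of a track keeps its total value.
toy_abs$time <- ave(
  toy_abs$ticks, toy_abs$i_track,
  FUN = function(x) c(x[1], diff(x))
)
toy_abs
```

Taking the cumulative sum of time within each track gives back the ticks column, so the two representations carry the same information.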
Now we can transform the data back to a dataframe of the same format as the one we got with miditapyr$frame_midi():
dfc2 <- df_midi_out %>%
  # When reticulate converts R dataframes to pandas, there are complications
  # with character columns containing missing values.
  # repair_reticulate_conversion = TRUE repairs that in the miditapyr python
  # code:
  miditapyr$nest_midi(repair_reticulate_conversion = TRUE)
as_tibble(dfc2)
And we can save it back to a midi file:
miditapyr$write_midi(dfc2, ticks_per_beat, "test.mid")