hcorp: Music corpora for harmonic analysis

This R package provides several datasets of chord sequences. These datasets are expressly for research purposes only.

For more details, see the package’s documentation (e.g. ?classical_1).

Installation

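You can install the current version of hcorp from GitHub by entering the following commands into R:

# Install devtools first if it is not already available
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
# Install hcorp from the pmcharrison/hcorp repository
devtools::install_github("pmcharrison/hcorp")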

Example usage

The hcorp package is best used in tandem with the hrep package. The hrep package provides the underlying representations for the corpora in hcorp, as well as methods for manipulating and visualising them.

You can load these packages into the global namespace as follows:

library(hcorp)
library(hrep)
library(magrittr) # Provides the pipe operator, %>%

The hcorp package currently contains three corpora:

classical_1
#> 
#> A corpus of 1022 sequences 
#>   total size = 199254 symbols 
#>   symbol type = 'pc_chord'
#>   coded = true 
#>  (Metadata available)

popular_1
#> 
#> A corpus of 739 sequences 
#>   total size = 74093 symbols 
#>   symbol type = 'pc_chord'
#>   coded = true 
#>  (Metadata available)

jazz_1
#> 
#> A corpus of 1186 sequences 
#>   total size = 42822 symbols 
#>   symbol type = 'pc_chord'
#>   coded = true 
#>  (Metadata available)

Internally, a corpus is a list of encoded vectors.

classical_1[1:3] %>% as.list
#> [[1]]
#> Encoded vector of type 'pc_chord', length = 53 (metadata available)
#> 
#> [[2]]
#> Encoded vector of type 'pc_chord', length = 47 (metadata available)
#> 
#> [[3]]
#> Encoded vector of type 'pc_chord', length = 39 (metadata available)

Encoded vectors are objects of class coded_vec.

classical_1[[1]] %>% class
#> [1] "coded_vec_pc_chord" "coded_vec"          "integer"

Internally, encoded vectors are sequences of integers. This is good for memory efficiency, and useful for certain modelling approaches.

classical_1[[1]] %>% as.integer %>% head
#> [1] 14481  8473 12553 14481  4245  8465
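
As a rough illustration of why integer codes are convenient for modelling, the sketch below counts chord frequencies and chord-to-chord transitions using only base R. To stay self-contained it copies the six integers printed above instead of loading the corpus; the variable names are illustrative and not part of hcorp or hrep.

```r
# First six integer codes of classical_1[[1]], copied from the output above
codes <- c(14481L, 8473L, 12553L, 14481L, 4245L, 8465L)

# Frequency of each distinct chord code
unigrams <- table(codes)

# Pair each chord with its successor and count the transitions
bigrams <- table(from = head(codes, -1), to = tail(codes, -1))

unigrams
bigrams
```

With the packages loaded, the same two `table()` calls should apply to the result of `as.integer()` on a full sequence.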

These vectors can be decoded with the function decode.

classical_1[[1]][1:3] %>% decode
#> Vector of type 'pc_chord', length = 3 (metadata available)

classical_1[[1]][1:3] %>% decode %>% as.list
#> [[1]]
#> Pitch-class chord: [7] 2 11
#> 
#> [[2]]
#> Pitch-class chord: [4] 0 7 11
#> 
#> [[3]]
#> Pitch-class chord: [6] 2 9
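
To make the decoded output easier to read, the following base R sketch translates pitch-class integers into note names. It assumes the usual convention that 0 = C, and it reads the bracketed number in the printed output as the bass pitch class; both are assumptions stated here for illustration. `pc_to_names` is a hypothetical helper, not an hrep function.

```r
# Pitch-class names, with 0 = C (sharps chosen arbitrarily for accidentals)
pc_labels <- c("C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B")

# Hypothetical helper: translate integer pitch classes (0-11) to note names
pc_to_names <- function(pcs) pc_labels[pcs + 1]

# The three chords printed above, typed in by hand (bracketed number first)
chords <- list(c(7, 2, 11), c(4, 0, 7, 11), c(6, 2, 9))
lapply(chords, pc_to_names)
```

Under these assumptions the first chord reads as G, D, B, which is a G major triad.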

Corpora and sequences can optionally store metadata.

metadata(classical_1)
#> $description
#> [1] "A selection of common-practice Western tonal music"

metadata(classical_1[[1]])
#> $description
#> [1] "bach-chor001"
#> 
#> $keysig
#> [1] 1
#> 
#> $mode
#> [1] 0

Corpora and sequences can be subsetted and combined like lists.

classical_1[1:3]
#> 
#> A corpus of 3 sequences 
#>   total size = 139 symbols 
#>   symbol type = 'pc_chord'
#>   coded = true 
#>  (Metadata available)

classical_1[[1]]
#> Encoded vector of type 'pc_chord', length = 53 (metadata available)

classical_1[[1]][1:3]
#> Encoded vector of type 'pc_chord', length = 3 (metadata available)

c(classical_1[1:3],
  popular_1[1:3])
#> 
#> A corpus of 6 sequences 
#>   total size = 313 symbols 
#>   symbol type = 'pc_chord'
#>   coded = true

Pardo & Birmingham templates

Several of these corpora were converted into chord sequences using Pardo & Birmingham’s (2002) algorithm with an extended template dictionary. This extended dictionary is provided here:

| Pitch classes | Name    | Weight |
|---------------|---------|--------|
| 0 4 7         | maj     | 0.436  |
| 0 4 7 10      | dom7    | 0.219  |
| 0 3 7         | min     | 0.194  |
| 0 3 6 9       | dim7    | 0.044  |
| 0 3 6 10      | hdim7   | 0.037  |
| 0 3 6         | dim     | 0.018  |
| 0 4 7 11      | maj7    | 0.2    |
| 0 3 7 10      | min7    | 0.2    |
| 0 4 8         | aug     | 0.02   |
| 0 7           | no3     | 0.05   |
| 0 7 10        | min7no3 | 0.05   |

Note: only the first six templates (maj to dim) appear in Pardo & Birmingham’s original paper; the remaining five were added for this work.
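
For readers who want to work with the dictionary programmatically, here is a minimal base R sketch that stores the templates in a data frame and transposes one of them to a given root. It assumes the listed pitch classes are relative to a root at 0; the object `templates` and the helper `transpose_template` are hypothetical names introduced for illustration and are not exported by hcorp. The sketch does not implement the Pardo & Birmingham scoring step.

```r
# Template dictionary copied from the table above
templates <- data.frame(
  name = c("maj", "dom7", "min", "dim7", "hdim7", "dim",
           "maj7", "min7", "aug", "no3", "min7no3"),
  weight = c(0.436, 0.219, 0.194, 0.044, 0.037, 0.018,
             0.2, 0.2, 0.02, 0.05, 0.05),
  stringsAsFactors = FALSE
)

# Pitch classes of each template, stored as a list column
templates$pcs <- list(
  c(0, 4, 7), c(0, 4, 7, 10), c(0, 3, 7), c(0, 3, 6, 9), c(0, 3, 6, 10),
  c(0, 3, 6), c(0, 4, 7, 11), c(0, 3, 7, 10), c(0, 4, 8), c(0, 7), c(0, 7, 10)
)

# Hypothetical helper: shift a template so that its root is `root` (mod 12)
transpose_template <- function(pcs, root) sort((pcs + root) %% 12)

# Example: the "maj" template rooted on pitch class 7
transpose_template(templates$pcs[[1]], root = 7)
```

The example call returns the pitch classes 2, 7 and 11, the same set as the first decoded chord of classical_1[[1]] shown earlier.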

References

Broze, Y., & Shanahan, D. (2013). Diachronic changes in jazz harmony: A cognitive perspective. Music Perception, 31(1), 32–45. https://doi.org/10.1525/mp.2013.31.1.32

Harrison, P. M. C., & Pearce, M. T. (2018). An energy-based generative sequence model for testing sensory theories of Western harmony. In Proceedings of the 19th International Society for Music Information Retrieval Conference (pp. 160–167). Paris, France.

Pardo, B., & Birmingham, W. P. (2002). Algorithms for chordal analysis. Computer Music Journal, 26(2), 27–49. https://doi.org/10.1162/014892602760137167



pmcharrison/hcorp documentation built on March 16, 2023, 8:46 a.m.