Welcome to "Getting started with humdrumR"! This article provides a quick introduction to the basics of humdrumR, getting you started loading humdrum data and performing (very) simple analyses of it. Before you continue, make sure humdrumR is installed (see how to install humdrumR).
Once it's installed, you can open an R session and load the library using the command library(humdrumR). Now you are ready to rock!

This article, like all of our articles, closely parallels information in humdrumR's detailed code documentation, which can be found in the "Reference" section of the humdrumR homepage. Once humdrumR is installed and loaded, the code documentation can also be accessed directly within an R session by using the ? command, like ?humdrumR. Anywhere in one of our articles where you see a function name followed by parentheses, like kern() or recip(), you can call ?kern or ?recip to see the corresponding documentation.

library(humdrumR)
humdrumR(syntaxHighlight = FALSE)

Quick Start

Let's just dive right in!

To illustrate how humdrumR works, we'll need some humdrum data to work with. Fortunately, humdrumR comes packaged with a small number of humdrum data files just for you to play around with. These files are stored in the directory where your computer installed humdrumR, in a subdirectory called "HumdrumData". You can move your R session to this directory using R's "set working directory" command: setwd(humdrumRroot). Once you're in the humdrumR directory, you can use the base R dir function to see what humdrum data is available to you.

library(humdrumR)

setwd(humdrumRroot)

dir('HumdrumData')

The output lists the directories of humdrum data available to you. Using dir again, we can look inside one: let's start with the "BachChorales" directory.

dir('HumdrumData/BachChorales')

There are ten files in the directory, named "chor001.krn", "chor002.krn", etc. These are humdrum plain-text files, representing ten chorales by J.S. Bach; each file contains four spines (columns) of **kern data, representing musical pitch and rhythm (among other things). Take a minute to find the files in your computer's finder/explorer and open them up with a simple text editor. One of the core philosophies of humdrumR is that we maintain a direct, transparent relationship with our symbolic data, so always take the time to look at your data! You can also do this within RStudio's "Files" pane; in fact, RStudio makes this extra easy, because within the Files pane you can click "More" > "Go To Working Directory" to quickly find the files.

Reading humdrum data

Now that we've found some humdrum data to look at, let's read it into humdrumR. We can do this using humdrumR's readHumdrum() command. Try this:

readHumdrum('HumdrumData/BachChorales/chor001.krn') -> chor1

This command does two things:

1. The readHumdrum() function will read the "chor001.krn" file into R and create a humdrumR data object from it.
2. This new object will be saved to a variable called chor1. (The name 'chor1' is just a name I chose; you are welcome to give it a different name if you want.)

Once we've created our chor1 object (or whatever you chose to call it), we can take a quick look at what it is by just typing its name on the command line and pressing enter:

chor1

(In R, when you enter something on the command line, R "prints" it out for you to read.) The print-out you see shows you the name of the file, the contents of the file, and some stuff about "Data fields" that you will learn about in our next article.

Cool! Still, looking at a single humdrum file is not really that exciting. The whole point of using computers is that they allow us to work with large amounts of data. Luckily, humdrumR makes this very easy. Check out this next command:

readHumdrum('HumdrumData/BachChorales/chor0') -> chorales

Instead of writing 'chor001.krn', I wrote 'chor0'. When we feed the string 'chor0' to readHumdrum(), it won't just look for a file called "chor0"; it will read any file in that directory whose name contains the sub-string "chor0"---which in this case is all ten files! Try printing the new chorales object to see how it is different:

chorales

Wow! We've now got a "humdrumR corpus of ten pieces", and that's nothing: readHumdrum() will work just as well reading hundreds or thousands of files! Notice that when you print a humdrumR object, humdrumR shows you the beginning of the first file and the end of the last file, as well as telling you how many files there are in total.

readHumdrum() has a number of other cool options which you can read about in more detail in our humdrumR read/write tutorial.
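
The sub-string matching described above can be pictured with a few lines of base R. This is only a sketch of the idea, not how readHumdrum() is implemented: the file names are hypothetical stand-ins for a directory listing, and grepl() with fixed = TRUE is used here simply because it tests whether each name contains a given sub-string.

```r
# Hypothetical file names standing in for a directory listing.
files <- c(sprintf("chor%03d.krn", 1:10), "notes.txt")

# Keep every name that contains the sub-string "chor0".
# fixed = TRUE treats "chor0" as literal text, not a regular expression.
matched <- files[grepl("chor0", files, fixed = TRUE)]

print(matched)
```

Because "chor010.krn" also contains "chor0", all ten chorale names are kept, while the unrelated file is dropped.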

Counting Things

Once we have some data loaded, the next thing a good computational musicologist does is start counting! To count the contents of data, we can use the count() function.

chorales |> 
  count()

That's quite a mess! What have we done? When we pass our chorales data to count(), it counts all the unique data tokens in the data (ignoring non-data tokens, like barlines and interpretations). There are a lot of unique tokens in this data, so the result is not super helpful. Maybe we'd like to look at just the twenty most common tokens in the chorales? To do this, we can pass the count through two base R functions, sort() and head():

chorales |> count() |> sort() |> head(n = 1) -> most

chorales |> 
  count() |>
  sort() |>
  head(n = 20)

Ah, that's more promising! We see that the most common token is a quarter-note E4 (4e); the table shows how many times it occurs.
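
If the count-then-sort-then-trim pattern is new to you, here is a minimal base R analogy on a small, made-up vector of tokens. It uses table() rather than humdrumR's count(), so the object it produces is not the same type as the one above. Note also that base R's sort() sorts in ascending order by default, which is why this sketch asks for decreasing = TRUE before taking the head().

```r
# A made-up vector of **kern-like tokens (not real chorale data).
tokens <- c("4e", "4d", "4e", "8f#", "4e", "4d", "2g")

# Tally each unique token.
tally <- table(tokens)

# Sort from most to least common, then keep the top two.
top <- head(sort(tally, decreasing = TRUE), n = 2)

print(top)
```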

Separating pitch and rhythm

To make our tallies more useful, we might want to count only the pitch or rhythm part of the **kern data. To do this, we need to be able to extract the pitch/rhythm from the original **kern tokens, which we can do using humdrumR's suite of pitch and rhythm functions. For example, let's try the pitch() function:

chorales |> 
  pitch()

The pitch() function takes the original **kern tokens, reads the pitch part of each token, and translates it to scientific pitch notation. Let's pass that to count:

chorales |>
  pitch() |>
  count()

Pretty cool, but still quite a big table. Maybe we'd like to ignore octave information for now? Luckily, humdrumR's pitch functions have a "simple" argument, which can be used to ask for only simple pitch information (no octave).

chorales |>
  pitch(simple = TRUE) |>
  count()

We can make a plot of our nice simple-pitch table, using humdrumR's draw() function:

chorales |>
  pitch(simple = TRUE) |>
  count() |>
  draw()
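
To picture what dropping octave information does to a tally, here is a minimal base R sketch on a hypothetical vector of scientific-pitch labels. It is not how pitch(simple = TRUE) is implemented; it only strips the trailing octave number with sub(), so that notes such as E4 and E5 are counted together.

```r
# Hypothetical scientific-pitch labels (not real chorale data).
pitches <- c("E4", "F#4", "G3", "Bb2", "E5")

# Remove the trailing octave number, leaving only the pitch name.
simple <- sub("-?[0-9]+$", "", pitches)

# Tally the octave-free pitch names.
print(table(simple))
```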

Instead of pitch, we could do the same sort of counting of rhythm information, for example, using the notehead() function:

chorales |>
  notehead() |>
  count() |>
  draw()
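
Both examples in this section rely on reading one part of each **kern token and ignoring the rest. As a rough illustration of that idea, the base R sketch below splits a few made-up tokens into a rhythm part and a pitch part using regular expressions. The patterns are a simplifying assumption that only covers plain tokens like these; humdrumR's own functions handle far more of the **kern syntax.

```r
# Made-up **kern-like tokens: a duration followed by a pitch.
tokens <- c("4e", "8f#", "2.g", "4cc")

# Rhythm part: leading digits, optionally followed by augmentation dots.
rhythm <- regmatches(tokens, regexpr("[0-9]+\\.*", tokens))

# Pitch part: one or more note letters, optionally followed by accidentals.
pitch <- regmatches(tokens, regexpr("[a-gA-G]+[#-]*", tokens))

print(data.frame(token = tokens, rhythm = rhythm, pitch = pitch))
```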

Filtering

Sometimes, we might only want to look at a subset of our data. For example, maybe we only want to count notes sung by the soprano. In the Bach chorale data we are working with, the soprano voice is always in the fourth spine (column). We can use the filter() function to indicate a subset we'd like to study:

chorales |> 
  pitch(simple = TRUE) |> 
  filter(Spine == 4) |>
  count()

Let's try something even cooler. Notice that, in the chorale data, there are tandem interpretations that look like *G: and *E:. These are indications of the key. Anytime you read humdrum data that has these key interpretations, humdrumR will read them into a "field" called Key. We could, for example, count all the notes sung when the key is G major like this:

chorales |>
  filter(Key == 'G:') |>
  pitch(simple = TRUE) |>
  count()
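
The two filters above follow the same pattern: keep only the rows where some field meets a condition, then tally what is left. The sketch below shows that pattern on a small, made-up base R data frame with columns named Spine, Key, and Pitch. A humdrumR object is not a plain data frame, so treat this as an analogy for the logic rather than a description of how filter() works internally.

```r
# A made-up table of notes; the values are illustrative only.
notes <- data.frame(
  Spine = c(1, 2, 3, 4, 1, 2, 3, 4),
  Key   = c("G:", "G:", "G:", "G:", "E:", "E:", "E:", "E:"),
  Pitch = c("G", "B", "D", "G", "E", "G#", "B", "E")
)

# Keep only the notes in spine 4, then tally their pitches.
soprano <- notes[notes$Spine == 4, ]
print(table(soprano$Pitch))

# Keep only the notes where the key is G major, then tally their pitches.
g_major <- subset(notes, Key == "G:")
print(table(g_major$Pitch))
```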

Guess what? There are a bunch more "fields" hidden in your humdrumR data object that you can use...and you can make your own! Check out our next article, on humdrumR's data fields, to learn more.

What next?

You've gotten started, but there is much more to learn! To keep learning, check out the other articles on the humdrumR homepage. If you want to continue along the path we've started here, the next articles to check out are probably HumdrumR data fields, Getting to know your data, Filtering humdrum data, and Working with humdrum data. Since most musicological analysis involves pitch or rhythm, you'll probably want to learn about relevant ideas from the Pitch and tonality and Rhythm and meter articles.

If the humdrum data you are working with is complex (e.g., including multiple different exclusive interpretations, spine paths, or multi-stops), you'll probably find you need to check out the Shaping humdrum data article, which will give you tools to deal with more complex humdrum data sets.

Computational-Cognitive-Musicology-Lab/humdrumR documentation built on Oct. 22, 2024, 9:28 a.m.
