subset.humdrumR | R Documentation |
HumdrumR defines subset() (base R) and filter() (tidyverse) methods for humdrumR data; these two .humdrumR methods are synonymous and work exactly the same way.
They are used to "filter" the contents of the underlying humdrum table.
R's standard indexing operators ([] and [[]]) can also be used to filter data (these indexing options are documented separately); however, subset()/filter() can express much more sophisticated filtering than the indexing methods.
Filtering with subset()/filter() is (by default) not destructive, allowing you to recover the filtered data using removeSubset() or unfilter() (which are also synonyms).
## S3 method for class 'humdrumR'
subset(x, ..., dataTypes = "D", .by = NULL, removeEmptyPieces = TRUE)
## S3 method for class 'humdrumR'
filter(.data, ..., dataTypes = "D", .by = NULL, removeEmptyPieces = TRUE)
removeEmptyFiles(x)
removeEmptyPieces(x)
removeEmptySpines(x)
removeEmptyPaths(x)
removeEmptyRecords(x)
removeEmptyStops(x)
removeSubset(humdrumR, fields = dataFields(humdrumR), complement = NULL)
unfilter(humdrumR, fields = dataFields(humdrumR), complement = NULL)
complement(humdrumR, fields = dataFields(humdrumR))
x , .data , humdrumR |
HumdrumR data. Must be a humdrumR data object. |
... |
Arbitrary expressions passed to with(in). The "within" expression(s) must evaluate to either scalar or full-length logical vectors. |
dataTypes |
Which types of humdrum records to include. Defaults to "D". Must be a single character string. |
.by |
Optional grouping fields; an alternative to using group_by(). Defaults to Must be If not |
removeEmptyPieces |
Should empty pieces be removed? Defaults to TRUE. Must be a singleton logical value. |
fields |
Which fields to unfilter or complement? Defaults to all data fields in the Must be |
complement |
Which field to use as the subset complement to restore? By default Must be a single |
subset() and filter() are passed one or more expressions, which are evaluated using the fields of the humdrum table via a call to within.
This evaluation can thus include all of within.humdrumR()'s functionality (and arguments), including group-apply.
The only requirement is that the expressions/functions fed to subset()/filter() must return a logical (TRUE/FALSE) vector (NA values are treated as FALSE).
The returned vector must either be scalar (length 1), or be the same length as the input data (the number of rows in the humdrum table).
If the logical result is scalar, it will be recycled to match the input length. This is useful in combination with group_by(): for example, you can split the data into groups, then return a single TRUE or FALSE for each group, causing the whole group to be kept or filtered out.
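The recycling rule described above can be sketched in base R, without humdrumR itself. This is only an illustration under stated assumptions: the toy data frame `tab` and the names `keepBar`, `keep`, and `condAsFilter` are hypothetical and do not reflect humdrumR's internal structures.

```r
# Hypothetical toy table standing in for a humdrum table
# (not humdrumR's real internal structure).
tab <- data.frame(Bar   = c(1, 1, 2, 2, 3),
                  Token = c("4c", "4d", "4e", "4f", "2g"),
                  stringsAsFactors = FALSE)

# One scalar logical per group: TRUE for odd-numbered bars.
keepBar <- tapply(tab$Bar, tab$Bar, function(b) b[1] %% 2 == 1)

# Recycle each group's scalar to every row of that group.
keep <- as.vector(keepBar[as.character(tab$Bar)])

# NA results are treated as FALSE when used as a filter.
cond <- c(TRUE, NA, FALSE)
condAsFilter <- !is.na(cond) & cond
```

Here every row of bars 1 and 3 is kept and every row of bar 2 is dropped, which mirrors what a grouped subset() call returning one value per group would do.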
Note that subset()/filter() are incompatible with contextual windows; if your data has contextual windows defined, they will be removed (with a warning message) before filtering.
When using subset()/filter(), humdrumR doesn't actually delete the data you filter out.
Instead, these functions set all filtered data fields to NA (null) values and change their data type to "d".
This ensures that the humdrum syntax of the data is not broken by filtering!
Thus, when you print a filtered humdrumR object, you'll see all the filtered data points turned to null data (.).
Since most humdrumR functions ignore null data (d) by default, the data is effectively filtered out for most practical purposes.
However, if you need to use those null ('d') data points (for example, with ditto()), they can be accessed by setting dataTypes = 'Dd' in many functions.
See the ditto() documentation for examples.
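The idea of nulling rather than deleting can be sketched in base R. The table `tab` and its `Type` column are hypothetical stand-ins chosen for illustration; they are an assumption, not humdrumR's actual implementation.

```r
# Hypothetical toy table: each row is one data token of type "D".
tab <- data.frame(Spine = c(1, 2, 1, 2),
                  Token = c("4c", "4e", "4d", "4f"),
                  Type  = "D",
                  stringsAsFactors = FALSE)

# The filter condition: keep only spines above 1.
keep <- tab$Spine > 1

# Instead of dropping rows, null the filtered tokens and mark them as "d".
tab$Token[!keep] <- NA
tab$Type[!keep]  <- "d"

# The table keeps all of its rows, so the overall layout is preserved.
nRows <- nrow(tab)
```

Because no rows are removed, every record still has the same number of columns, which is why the humdrum syntax survives filtering.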
In many cases, filtering out large parts of your data leaves a bunch of empty null data points (".") in your printout, which may be difficult to read.
If you want to actually remove these filtered data points, you can call removeEmptyFiles(), removeEmptyPieces(), removeEmptySpines(), removeEmptyPaths(), removeEmptyRecords(), or removeEmptyStops().
These functions will safely remove null data without breaking the humdrum syntax.
They do this by going through each piece/spine/path/record and checking if all the data in that region is null; if, and only if, all the data is null, that portion of data will be removed.
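The "remove only if everything is null" check can be sketched in base R as follows. The toy table and the names `allNull`, `emptySpines`, and `kept` are hypothetical, and this is an illustration of the rule rather than the package's own code.

```r
# Hypothetical toy table: spine 1 is entirely null, spine 2 only partly.
tab <- data.frame(Spine = c(1, 2, 1, 2),
                  Token = c(NA, "4e", NA, NA),
                  stringsAsFactors = FALSE)

# For each spine, check whether ALL of its tokens are null.
allNull <- tapply(is.na(tab$Token), tab$Spine, all)

# Only spines that are completely null are candidates for removal.
emptySpines <- names(allNull)[allNull]

# Drop those spines; partially null spines are left intact.
kept <- tab[!as.character(tab$Spine) %in% emptySpines, ]
```

Spine 2 survives even though one of its tokens is null, since removing just that token would leave its record with a missing column.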
By default, subset.humdrumR() automatically calls removeEmptyPieces() before returning.
However, you can stop this by specifying removeEmptyPieces = FALSE.
If filtered pieces, files, or spines are removed from a corpus (using removeEmptyPieces() or removeEmptySpines()), the File, Piece, Record and/or Spine fields are renumbered to represent the remaining regions, starting from 1.
For example, if you have a corpus of 10 pieces and remove the first piece (Piece == 1), the remaining pieces are renumbered from 2:10 to 1:9.
Spine/record renumbering works the same, except it is done independently within each piece.
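The renumbering described above amounts to replacing each remaining value with its rank among the surviving values. A minimal base R sketch, with hypothetical vectors standing in for the Piece and Spine fields:

```r
# Pieces 2 through 10 remain after piece 1 has been removed.
Piece <- 2:10
newPiece <- match(Piece, sort(unique(Piece)))

# Spines are renumbered independently within each piece.
# Hypothetical leftovers: piece 1 kept spines 2 and 3, piece 2 kept spines 1 and 4.
pieceOfRow <- c(1, 1, 1, 2, 2)
Spine      <- c(2, 3, 3, 1, 4)
newSpine <- ave(Spine, pieceOfRow,
                FUN = function(s) match(s, sort(unique(s))))
```

In this sketch `newPiece` runs from 1 to 9, and within each piece the surviving spines are numbered consecutively from 1.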
When subset() is applied, humdrumR retains the complement of the subset of each data field (unless an explicit removeEmpty...() function is called).
The removeSubset() or unfilter() functions can be used to restore the original data, by combining the subset with the complement.
The fields argument can be used to control which data fields are unfiltered; by default, all data fields are unfiltered.
Normally, each data field is restored with its own complement data.
However, the complement argument can be used to specify a field to use as the complement.
This allows you to, for instance, combine different parts of separate fields into a single field.
The complement() function will directly swap the data-field subsets with their complements.
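The relationship between a subset, its complement, and the restored data can be sketched in base R. The vectors below are hypothetical; the sketch only illustrates the idea of storing the filtered-out values alongside the kept ones, and makes no claim about how humdrumR stores them internally.

```r
# Hypothetical field values and a filter condition.
original <- c("4c", "4e", "4d", "4f")
keep     <- c(FALSE, TRUE, FALSE, TRUE)

# The subset holds the kept values; the complement holds the rest.
subsetField     <- ifelse(keep, original, NA)
complementField <- ifelse(keep, NA, original)

# "Unfiltering": fill the nulls in the subset from the complement.
restored <- ifelse(is.na(subsetField), complementField, subsetField)

# "Complementing": swap the roles of subset and complement.
swappedSubset     <- complementField
swappedComplement <- subsetField
```

Using a different field's values as `complementField` in the restoring step corresponds to the idea behind the complement argument: the kept part of one field is merged with the filtered-out part of another.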
The indexing operators [] and [[]] can be used as shortcuts for common subset calls.
library(humdrumR)
humData <- readHumdrum(humdrumRroot, "HumdrumData/BachChorales/chor00[1-4].krn")
# remove spine 1 (non destructive)
humData |> subset(Spine > 1)
# remove spine 1 (destructive)
humData |> subset(Spine > 1) |> removeEmptySpines()
# keep odd numbered bars (even numbered bars are filtered out)
humData |> group_by(Bar) |> subset(Bar[1] %% 2 == 1)
# unfiltering and complement
humData |> filter(Spine %in% 1:2) |> complement()
humData |> filter(Spine %in% 1:2) |> unfilter()
humData |> filter(Spine %in% 1:2) |> solfa() |> unfilter(complement = 'Token')