knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
library(mpactr)
mpactr is built on the R6 class system, which uses reference semantics: data is updated in place. Unlike a shallow copy, where only pointers to the data are copied, or a deep copy, where the entire data object is copied in memory, no copy is made at all. Any change made to the data object, even when the result is assigned to a new name, also changes the original object. We can see this below.
data2 <- import_data(
  example_path("cultures_peak_table.csv"),
  example_path("cultures_metadata.csv"),
  format = "Progenesis"
)
get_peak_table(data2)[, 1:5]
The raw data object has `r nrow(get_peak_table(data2))` ions in the feature table.
data2_mispicked <- filter_mispicked_ions(data2,
  ringwin = 0.5,
  isowin = 0.01,
  trwin = 0.005,
  max_iso_shift = 3,
  merge_peaks = TRUE,
  merge_method = "sum",
  copy_object = FALSE
)
get_peak_table(data2_mispicked)[, 1:5]
Running the `filter_mispicked_ions` filter with the default setting `copy_object = FALSE` (which operates on reference semantics) results in `r nrow(get_peak_table(data2_mispicked))` ions in the feature table.
Even though we created an object called `data2_mispicked`, the original `data2` object was also updated and now has `r nrow(get_peak_table(data2))` ions in the feature table:
get_peak_table(data2)[, 1:5]
We recommend using the default `copy_object = FALSE`, as this makes for a fast and memory-efficient way to chain mpactr filters together (see the Filter article). However, if you would like to run the filters individually with traditional R-style objects, you can set `copy_object` to `TRUE`, as shown in the filter examples.
The R6 class system operates on reference semantics, in which data is updated in place. No shallow copy (pointers only) or deep copy (the whole object in memory) is made, so a change made through any name that refers to the object changes the original. R6 accomplishes this by taking advantage of R's environment system. In R, every object lives inside an environment; in an interactive session, the objects you create are stored in the global environment, and the functions from attached packages are found through its parent environments. Using R6 classes allows us to easily add this functionality to our R package.
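To make the role of environments concrete, here is a minimal sketch in base R only (no R6 or mpactr). The names `counts`, `counts_alias`, `plain`, and `plain_copy` are hypothetical and chosen for illustration. It shows that assigning an environment to a second name does not copy it, whereas an ordinary list behaves like a value.

```r
# Base R only: an environment is a mutable, reference-type object.
# All object names here are hypothetical, for illustration.
counts <- new.env()
counts$n_ions <- 100

# Assigning the environment to a new name does not copy it.
counts_alias <- counts
counts_alias$n_ions <- 42

counts$n_ions                    # the change is visible through the original name
identical(counts, counts_alias)  # both names refer to the same environment

# An ordinary list, by contrast, behaves like a value.
plain <- list(n_ions = 100)
plain_copy <- plain
plain_copy$n_ions <- 42
plain$n_ions                     # the original list is unchanged
```

R6 objects are built on environments, which is why an mpactr object behaves like `counts` rather than like `plain` here.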
Environments themselves follow reference semantics: because an environment can be a container for a large amount of data, R does not copy it when it is assigned to a new name or passed around. In a normal R session, the global environment is the top-level workspace, and its chain of parent environments gives you access to everything you need.
Reference semantics become noticeable when you pass a variable to a function. R functions follow call-by-value semantics: a function treats its arguments (the values supplied in the call) as local variables. Anything you do to such a variable inside the function has no effect on the variable outside. This is traditional copy-by-value behavior, and for ordinary objects R gives you no way to pass a variable by reference. You can think of each function call as creating a temporary environment. What makes environments so powerful is that when you pass an environment to a function, R does not copy it. This allows you to pass data by reference to functions. R6 classes rely on this, and mpactr uses it for speedy execution.
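The difference between the two calling behaviors can be sketched in base R. The functions `drop_first_by_value()` and `drop_first_by_reference()` and the example values are hypothetical, written only for this illustration; they are not part of mpactr.

```r
# Hypothetical helpers for illustration; not part of mpactr.

# A list argument is treated as a local value inside the function.
drop_first_by_value <- function(peaks) {
  peaks$ions <- peaks$ions[-1]
  invisible(peaks)
}

# An environment argument is not copied, so the caller sees the change.
drop_first_by_reference <- function(peaks_env) {
  peaks_env$ions <- peaks_env$ions[-1]
  invisible(peaks_env)
}

peak_list <- list(ions = c(101.1, 202.2, 303.3))
drop_first_by_value(peak_list)
length(peak_list$ions)   # 3: the caller's list is unchanged

peak_env <- new.env()
peak_env$ions <- c(101.1, 202.2, 303.3)
drop_first_by_reference(peak_env)
length(peak_env$ions)    # 2: the same environment was modified in place
```

To keep the result of the by-value version, the caller has to assign the return value to a name; the by-reference version needs no assignment.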
The memory savings are most apparent when you compare R6 classes with a traditional copy-by-value workflow. In a traditional workflow, the data may be copied each time a function modifies it; with R6 classes we avoid these copies, improving performance for large datasets.