Building models is an essential skill for scientists or practioneers
when working with data. In some research areas, such as it is for
medical data, contains sensitive information which should not get
shared. However, most R
models, such as lm
, contains the original
data used for training the model.
This package provides mechanisms to train algorithms without storing the
data or removing model data from a model. For simpler model objects
(e.g. lm
) it is sufficient to remove parts from the object to ensure
privacy. More complex algorithms (e.g. partykit::cforest
), on the
other side, requires adaptions of the fitting and prediction algorithm
to provide full privacy and functionality without storing data.
The functions you get by installing this package are: - A generic
rmdata
method for simpler models to just remove data - Functions to
train and predict more complex models. Available algorithms: - cforest
- traforest
Using a more complex function overwrites the original one and returns
the same object with a suffix .nodat
. For example, training a model
using partykit::cforest
returns an object of class cforest.nodat
.
For this class, special predict methods are implemented and used instead
of the original partykit:::predict.cforest
method.
In order to detect data in an R
object, rmmodeldata
provides a
function searchObjectForValue
which searches through an object for a
specific value.
It is suggested to use the namespace rmmodeldata
for calling
functions to ensure the usage of the correct method. A message is
printed when a method of rmmodeldata
is used. Otherwise, it cannot
be distinguished between the original and the new method.
Searching through objects ignores functions and environments. Thus, if data are stored within an environment it is not detected.
remotes::install_github("difuture-lmu/rmmodeldata")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.