parsnip uses three concepts to describe models:
- The model type specifies its mathematical structure. For example, linear_reg() is for models that predict a numeric outcome using a linear combination of predictors and coefficients.
- The model mode reflects the type of prediction outcome. Possible values are "classification", "regression", and "censored regression".
- The model engine is a designation of how the model should be fit. This is often a package name (e.g., "ranger" for the ranger package).
There are parsnip extension packages that use the parsnip model functions to define new engines. For example, the poissonreg package has engines for the poisson_reg() function.
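As a minimal sketch of how the three concepts fit together, the snippet below builds a model specification one concept at a time. It assumes parsnip is installed and uses the "lm" engine for linear_reg() purely as an illustration.

```r
library(parsnip)

# Model type: a linear combination of predictors and coefficients.
spec <- linear_reg()

# Model engine: how the model should be fit (here, via stats::lm()).
spec <- set_engine(spec, "lm")

# Model mode: the type of prediction outcome.
spec <- set_mode(spec, "regression")

print(spec)

# Engines currently registered for a model type in this R session.
show_engines("linear_reg")
```

Loading an extension package registers additional engines, so the output of show_engines() depends on which packages are attached.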
There are many combinations of type/engine/mode available in parsnip. We keep track of these values for packages that have their model definitions in parsnip and fully adhere to the tidymodels APIs. A tab-delimited file with these values is in the package (called models.tsv).
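To inspect the recorded combinations, something like the following may work. It is a sketch that assumes models.tsv sits at the top level of the installed package (so that system.file() can find it); the column layout is not assumed, only that the file is tab-delimited.

```r
# Assumption: models.tsv is installed at the top level of the parsnip package.
path <- system.file("models.tsv", package = "parsnip")

if (nzchar(path)) {
  # read.delim() reads tab-delimited text with a header row by default.
  model_info <- read.delim(path, stringsAsFactors = FALSE)
  str(model_info)
  print(head(model_info))
} else {
  message("models.tsv was not found in the installed parsnip package.")
}
```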
Each modeling function defined in parsnip has a documentation file (with extension .Rd).
Also, each combination of engine and model type has a corresponding .Rd file (a.k.a the "engine-specific" documentation files). The list of known engines is also shown in the .Rd file for the main function.
If you are creating a new engine documentation file, the steps are:
1. Add an .R file that generates the .Rd file that R's help system uses.
2. Write an .Rmd file that has the details of the engine.
3. Knit the .Rmd file to .md so the .Rd file from step 1 can link it.
4. Run devtools::document() (or use the RStudio IDE) to build the help files.
Before you read the rest, you probably don't have to generate the help files in this way yourself:
- This process only applies for engines whose model function is in parsnip. If your R package defines a new model, then you don't have to do all of this (unless you want to).
- If you make a PR with a new engine, you can work on the .Rmd file and the tidymodels folks can go through the process of merging it in (but let us know in the PR if you want us to do that).
.Rd files
Once created by document(), the .Rd files live in the man directory. They are not handwritten but are automatically generated from .R files in the R directory.
How do we generate these .Rd files? We'll use an example with poisson_reg() and the "zeroinfl" engine.
Each model/engine combination has its own .R file with a naming convention reflecting the contents (poisson_reg_zeroinfl.R). This file has a description of the type of model and the underlying function that is used for that engine:
[pscl::zeroinfl()] uses maximum likelihood estimation to fit a model for count data that has separate model terms for predicting the counts and for predicting the probability of a zero count.
Next comes a roxygen comment including a specific markdown file (notice we use @includeRmd but we actually include markdown):
@includeRmd man/rmd/poisson_reg_zeroinfl.md details
as well as a directive for the .Rd file name to be created:
@name details_poisson_reg_zeroinfl
The engine markdown file (poisson_reg_zeroinfl.md) is made by the developer offline.
Take a look at the actual file if you want to see it all.
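The pieces described above can be put together as follows. This is only a sketch: engine_doc_stub() is a hypothetical helper, not a parsnip function, and the title line, the `@keywords internal` tag, and the trailing NULL are assumptions about what a complete roxygen block would need rather than a copy of the actual file.

```r
# Hypothetical helper (not part of parsnip): assemble the roxygen lines for an
# engine-specific .R file from the model type, engine, and a description.
engine_doc_stub <- function(model, engine, title, description) {
  stem <- paste(model, engine, sep = "_")
  c(
    paste("#'", title),                            # assumed title line
    "#'",
    paste("#'", strwrap(description, width = 76)), # model/engine description
    "#'",
    paste0("#' @includeRmd man/rmd/", stem, ".md details"),
    "#'",
    paste0("#' @name details_", stem),
    "#' @keywords internal",                       # assumption, not from the text
    "NULL"                                         # object for roxygen to attach to
  )
}

stub <- engine_doc_stub(
  model = "poisson_reg",
  engine = "zeroinfl",
  title = "Poisson regression via pscl",           # illustrative title
  description = paste(
    "[pscl::zeroinfl()] uses maximum likelihood estimation to fit a model for",
    "count data that has separate model terms for predicting the counts and",
    "for predicting the probability of a zero count."
  )
)

cat(stub, sep = "\n")
```

Writing the result to R/poisson_reg_zeroinfl.R (for example with writeLines()) would follow the naming convention described above.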
To summarize: the relationship between these .R files and the .md files that they include is:
:
├── R
│   ├── :
│   ├── :
│   ├── bag_tree.R          <- model definition code that has the Sexprs
│   ├── bag_tree_C5.0.R     <- wrapper around bag_tree_C5.0.md (below)
│   ├── bag_tree_rpart.R    <- wrapper around bag_tree_rpart.md (below)
:   :
:   :
├── man
│   ├── rmd
│   │   ├── :
│   │   ├── :
│   │   ├── aaa.Rmd            <- sourced by all other Rmd files
│   │   ├── :
│   │   ├── :
│   │   ├── bag_tree_C5.0.Rmd  <- engine details for the C5.0 engine only
│   │   ├── bag_tree_C5.0.md
│   │   ├── bag_tree_rpart.Rmd
│   │   ├── bag_tree_rpart.md
:   :   :
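The layout above follows a predictable naming pattern, which can be sketched in a few lines. engine_doc_files() is a hypothetical helper written for illustration; it only encodes the file names shown in the tree and the details_* topic name used by the @name directive.

```r
# Hypothetical helper: map a model type and engine to the files involved,
# relative to the root of the package source.
engine_doc_files <- function(model, engine) {
  stem <- paste(model, engine, sep = "_")
  list(
    r_file   = file.path("R", paste0(stem, ".R")),
    rmd_file = file.path("man", "rmd", paste0(stem, ".Rmd")),
    md_file  = file.path("man", "rmd", paste0(stem, ".md")),
    rd_topic = paste0("details_", stem)
  )
}

files <- engine_doc_files("bag_tree", "rpart")
str(files)

# From the package root, report which of the expected files are present.
paths <- unlist(files[c("r_file", "rmd_file", "md_file")])
print(file.exists(paths))
```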
.md files for .Rmd files
How do we make these markdown files? These are created by corresponding .Rmd files contained in parsnip/man/rmd/. There are .Rmd files for the engines defined in parsnip as well as the extension packages listed by parsnip:::extensions().
Each .Rmd file uses parsnip/man/rmd/aaa.Rmd as a child document. This file defines helper functions for the engine-specific documentation and loads some specific packages.
The .Rmd files use packages that are not formally parsnip dependencies (these are listed in aaa.Rmd). They also require the parsnip extension packages defined in parsnip:::extensions().
There is an internal helper function (parsnip:::install_engine_packages()) that will install the extension packages as well as a few others that are required to build all of the documentation files.
The .Rmd files have a consistent structure and there are numerous examples of these files in the package. The main sections are:
To convert the .Rmd files to .md, use the function knit_engine_docs(). After this, use devtools::document() to create the engine-specific .Rd files.
Please look at the output of the knitting; if there are errors, they will show up there.
To test the results, do a hard restart of the R session (i.e., do not use load_all()).
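The knit-then-document sequence can be wrapped up as shown below. This is a sketch under a few assumptions: rebuild_engine_docs() is a hypothetical wrapper, it is meant to be run from the root of a parsnip source checkout, and the functions are reached with `:::` so that the call works whether or not they are exported.

```r
# Hypothetical wrapper; run from the root of a parsnip source checkout.
rebuild_engine_docs <- function(install = FALSE) {
  if (install) {
    # Installs the extension packages plus the others needed to build the docs.
    parsnip:::install_engine_packages()
  }

  # Convert the engine .Rmd files to .md; review the output for errors.
  knit_results <- parsnip:::knit_engine_docs()

  # Regenerate the .Rd files from the roxygen comments in the .R files.
  devtools::document()

  invisible(knit_results)
}

# Not run here, since it modifies files in the working directory:
# rebuild_engine_docs(install = TRUE)
```

After this finishes, restart the R session before checking the rendered help pages.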
.Rd files
This section has information that is probably only needed by the core developers.
Recall that the .Rd files are created from the .R files. There are some fancy things that we do in the .R files to list the engines. For example, poisson_reg.R has the line:
#' \Sexpr[stage=render,results=rd]{parsnip:::make_engine_list("poisson_reg")}
This finds the relevant engine .Rd files and creates the corresponding .Rd markup:
There are different ways to fit this model, and the method of estimation is
chosen by setting the model \emph{engine}. The engine-specific pages for this
model are listed below.
\itemize{
\item \code{\link[parsnip:details_poisson_reg_glm]{glm}¹²}
\item \code{\link[parsnip:details_poisson_reg_gee]{gee}²}
\item \code{\link[parsnip:details_poisson_reg_glmer]{glmer}²}
\item \code{\link[parsnip:details_poisson_reg_glmnet]{glmnet}²}
\item \code{\link[parsnip:details_poisson_reg_hurdle]{hurdle}²}
\item \code{\link[parsnip:details_poisson_reg_stan]{stan}²}
\item \code{\link[parsnip:details_poisson_reg_stan_glmer]{stan_glmer}²}
\item \code{\link[parsnip:details_poisson_reg_zeroinfl]{zeroinfl}²}
}
¹ The default engine. ² Requires a parsnip extension package.
There is a similar line at the bottom of the files that creates the See Also list:
#' @seealso \Sexpr[stage=render,results=rd]{parsnip:::make_seealso_list("poisson_reg")}
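To preview what these \Sexpr calls will insert without rebuilding the help pages, the helpers can be called directly. This sketch assumes parsnip is installed and that both functions return the Rd markup as character strings; the exact entries depend on the installed version.

```r
# Assumption: the helper returns the Rd markup as a character string.
engine_rd <- parsnip:::make_engine_list("poisson_reg")
cat(engine_rd, sep = "\n")

# The See Also list is produced the same way.
seealso_rd <- parsnip:::make_seealso_list("poisson_reg")
cat(seealso_rd, sep = "\n")
```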
This is also for core developers.
As previously mentioned, the parsnip package contains a file models.tsv. To create this file:
1. Load the packages listed in parsnip:::extensions().
2. Run parsnip::update_model_info_file(). Note that the file should never have fewer lines than the current version.
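The "never fewer lines" rule is easy to check mechanically before committing a regenerated file. The helper below is hypothetical (it is not part of parsnip) and is demonstrated on two toy temporary files rather than on the real models.tsv.

```r
# Hypothetical check: TRUE when the new file has at least as many lines
# as the old one.
has_no_fewer_lines <- function(old_path, new_path) {
  length(readLines(new_path)) >= length(readLines(old_path))
}

# Toy stand-ins for the previous and regenerated files.
old_file <- tempfile(fileext = ".tsv")
new_file <- tempfile(fileext = ".tsv")
writeLines(c("a\tb", "1\t2"), old_file)
writeLines(c("a\tb", "1\t2", "3\t4"), new_file)

ok_forward <- has_no_fewer_lines(old_file, new_file)   # new file grew
ok_backward <- has_no_fewer_lines(new_file, old_file)  # file would shrink
print(c(forward = ok_forward, backward = ok_backward))
```

In practice the old path could point at the committed copy of models.tsv and the new path at the file written by the update step.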