CONTRIBUTING.md

The package is modular, so it would be great for contributors to provide new methods for other tree objects.

There are only a few requirements to meet.

Code file name

The code for the methods of a tree class needs to go in a file named after that class, with the .R extension, in the R folder, e.g. for the rpart tree class: rpart.R. Code for node prediction algorithms currently goes in the R/node_prediction.R file. More files and a proper function name template will probably be needed in the future.

Method guidelines

Method documentation

At the moment, the documentation for the methods is presented in the help page of the generic tidy_tree() function. Once the number of methods increases, we may want to change this. Method-specific documentation is created by adding a Roxygen comment with just the @describeIn tag and, optionally, the @examples tag.
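As a rough illustration of such a Roxygen comment, here is a minimal sketch for a hypothetical tree class called `mytree`. The class name, the method signature, the placeholder body, and the stand-in generic are all assumptions made so that the snippet runs on its own; in the package the tidy_tree() generic already exists and should not be redefined.

```r
# Stand-in generic so the sketch is self-contained (assumption: the real
# generic lives in the package and may take different arguments).
tidy_tree <- function(tree, ...) UseMethod("tidy_tree")

#' @describeIn tidy_tree Method for trees of the hypothetical class `mytree`.
#'
#' @examples
#' # Hypothetical usage: `fit` would be a fitted `mytree` object.
#' # tidy_tree(fit)
tidy_tree.mytree <- function(tree, ...) {
  # Placeholder body: a real method would return one row per node.
  data.frame(node = integer(0), rule = character(0))
}
```

With @describeIn, the method's description is appended to the help page of the generic instead of generating a separate page.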

Testing

It is wise to add unit tests for each method using the testthat infrastructure. I tried to add the most logical tests to the package, based on the problems I encountered during development; some tests may seem silly, but one can never be too safe. Some tests should work independently of the original tree object, so I added a set of general testing functions in the tests/testthat/helpers.R file. The tests themselves should be placed in a file in the tests/testthat/ folder following the test-.R convention (to work with devtools::test()), where class-specific tests can be added too.

In general, I added the following tests:

* Method tests: check that the correct method is called for a given input tree and that the method arguments follow the specification. Defined in the perform_method_tests() function.
* Rules test: check that the rules are as expected from the original tree class. This is class-specific, so there is no general function. The idea is to detect whether the rule editing in the method breaks something, whether an update in the tree package breaks something, or to check compatibility if you develop a rule extraction method that is faster than those exposed in the tree package.
* Nodes test: check that all the nodes of the tree are included. Class-specific function.
* Stump test: check that a zero-row tibble and a warning are returned. Defined in the perform_stump_test() function.
* Output tests: these tests are grouped by the kind of outcome (continuous or discrete) and need an ideal result to compare the output to. This template result can be created by calling dput() on the output of your method once you are satisfied with it, so that later changes can be checked for unexpected breakage. Performed by the perform_output_tests() function.
  - Number of rows test: test that there is one row per node, or one row per node/y.level for multinomial trees.
  - Column names test: test that the column names are as expected given the ideal template and the various arguments.
  - Content test: test that the content of the output is consistent with the ideal template.
* Number of observations per node test: test that the number of observations in n.obs is as expected. This test is class-specific, but since it must be repeated for every outcome type, I put it into a class-specific function, perform_n.obs_test().
* Prediction test: compare the estimates made by the default estimation function with the expected values obtained by filtering the training data with the eval_ready rules. As a bonus, it also tests whether the filtering produces an n.obs consistent with the original tree.
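The stump and output tests described above can be sketched with plain testthat calls. This is only an illustration under stated assumptions: `toy_tidy()`, `toy_tree`, `toy_stump`, and the column names are hypothetical stand-ins, a base data frame is used instead of a tibble to keep the snippet dependency-free, and the package's own helpers (perform_stump_test(), perform_output_tests(), and so on) are deliberately not called because their signatures are not shown here.

```r
library(testthat)

# Hypothetical stand-in for a tidy_tree() method: one row per node, or a
# zero-row data frame plus a warning when the tree is a stump.
toy_tidy <- function(tree) {
  if (length(tree$nodes) == 0) {
    warning("The tree is a stump: no rules to extract.")
    return(data.frame(node = integer(0), rule = character(0)))
  }
  data.frame(node = tree$nodes, rule = tree$rules)
}

# Toy inputs (assumptions, not real tree objects).
toy_tree <- list(
  nodes = 1:3,
  rules = c("x < 2", "x >= 2 & y < 5", "x >= 2 & y >= 5")
)
toy_stump <- list(nodes = integer(0), rules = character(0))

# Template result. In practice you would run dput(toy_tidy(toy_tree)) once
# you trust the output and paste what it prints; an equivalent data.frame()
# call is written out here for readability.
expected <- data.frame(
  node = 1:3,
  rule = c("x < 2", "x >= 2 & y < 5", "x >= 2 & y >= 5")
)

test_that("a stump returns zero rows and a warning", {
  expect_warning(res <- toy_tidy(toy_stump))
  expect_equal(nrow(res), 0)
})

test_that("the output matches the template", {
  res <- toy_tidy(toy_tree)
  expect_equal(nrow(res), length(toy_tree$nodes)) # one row per node
  expect_named(res, names(expected))              # column names
  expect_equal(res, expected)                     # content
})
```

Keeping the template as a literal expression in the test file means that any later change to the method's output shows up as a failing comparison rather than passing silently.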



bakaburg1/tidytrees documentation built on Dec. 19, 2021, 6:40 a.m.