SRCL_2_train_neural_network: Training the monotonistic neural network

Description Usage Arguments Details Examples

View source: R/SRCL_functions.R

Description

This function trains the monotonistic neural network. The model is fitted in a step-wise procedure, one individual at a time: the model estimates the individual's risk of the disease outcome, computes the residual error of the prediction, and adjusts the model parameters to reduce this error. By iterating through all individuals for multiple epochs (one complete iteration through all individuals is called an epoch), we end up with model parameters whose errors are as small as possible for the full population. The model fit follows the linear expectation that synergism is a combined effect larger than the sum of the independent effects. The initial values, derivatives, and learning rates are described in further detail in the Supplementary material. The monotonistic model ensures that the predicted value cannot be negative. The model does not prevent estimating probabilities above 1, but this would be unlikely, since risks of disease and mortality are in general far below 1 even for high-risk groups. A test dataset does not seem to help in deciding on the optimal number of epochs, possibly because of the constraints imposed by the monotonicity assumption. We suggest splitting the data into a training and a test dataset, such that findings from the training dataset can be confirmed in the test dataset before hypotheses are developed.
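
To make the step-wise procedure above concrete, the following is a minimal, schematic sketch in R of the per-individual update for a one-hidden-layer network with non-negative weights. It is not the package's implementation (see R/SRCL_functions.R for that); the sizes, starting values, and the plain squared-error gradient step are illustrative assumptions, whereas the actual initial values, derivatives, and learning rates are those described in the Supplementary material.

# Schematic sketch only: one hidden layer with ReLU, non-negative weights,
# and a per-individual squared-error gradient step.
set.seed(1)
n <- 200; p <- 3; n_hidden <- 5                    # illustrative sizes
X <- matrix(rbinom(n * p, 1, 0.3), n, p)           # binary exposures
Y <- rbinom(n, 1, 0.05 + 0.10 * X[, 1] * X[, 2])   # outcome with a synergy

relu <- function(z) pmax(z, 0)

W1 <- matrix(abs(rnorm(p * n_hidden, sd = 0.01)), p, n_hidden)  # kept >= 0
b1 <- rep(0, n_hidden)
W2 <- abs(rnorm(n_hidden, sd = 0.01))                           # kept >= 0
Rb <- mean(Y)                                                   # baseline risk
lr <- 0.01

for (epoch in 1:10) {
  for (i in sample(n)) {                  # one individual at a time
    x <- X[i, ]
    H <- relu(drop(x %*% W1) + b1)        # hidden layer values
    pred <- Rb + sum(H * W2)              # predicted risk, >= Rb by construction
    err <- pred - Y[i]                    # residual error of the prediction
    grad_H <- err * W2 * (H > 0)          # backpropagated error at the hidden layer
    W2 <- pmax(W2 - lr * err * H, 0)      # adjust parameters, truncating at zero
    W1 <- pmax(W1 - lr * outer(x, grad_H), 0)
    b1 <- b1 - lr * grad_H
    Rb <- max(Rb - lr * err, 0)
  }
}

In the package itself the corresponding behaviour is controlled through the lr, epochs, and patience arguments documented below.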

Usage

SRCL_2_train_neural_network(
  X,
  Y,
  model,
  lr = 0.01,
  epochs = 50000,
  patience = 500,
  plot_and_evaluation_frequency = 50,
  IPCW = NA
)

Arguments

X

The exposure data

Y

The outcome data

model

The monotonistic neural network to be trained

lr

Learning rate

epochs

The maximum number of epochs

patience

The number of epochs allowed without an improvement in performance.

plot_and_evaluation_frequency

The interval (in epochs) at which the performance is plotted and the patience criterion is checked

IPCW

Inverse probability of censoring weights (Warning: not yet correctly implemented)

Details

For each individual:

P(Y=1|X^+) = R^b + \sum_i R^X_i

The procedure below is conducted for all individuals, one at a time. The baseline risk, $R^b$, is simply a parameter of the model. The decomposition of the risk contributions for the exposures, $R^X_i$, takes three steps:

Step 1 - Subtract the baseline risk, $R^b$:

R^X_k = P(Y=1|X^+)-R^b

Step 2 - Decompose to the hidden layer:

R^X_j = \frac{H_j w_{j,k}}{\sum_j(H_j w_{j,k})} R^X_k

where $H_j$ is the value taken by each of the $ReLU()_j$ functions for the specific individual.

Step 3 - Hidden layer to exposures:

R^X_i = \sum_j \Big(\frac{X_i^+ w_{i,j}}{\sum_i(X_i^+ w_{i,j})} R^X_j\Big)

This creates a dataset with dimensions equal to the number of individuals times the number of exposures plus one baseline risk value, which can be termed a risk contribution matrix. Instead of exposure values, each individual is assigned risk contributions, $R^X_i$.
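
As a concrete illustration of the three steps, here is a small hand-worked sketch in R for a single individual; the exposure vector, weights, and baseline risk are invented for the example, and this is not the package's internal code.

# Illustrative numbers only: three exposures and two hidden nodes
x  <- c(1, 0, 1)                               # exposure vector X^+
W1 <- matrix(c(0.02, 0.00, 0.03,
               0.01, 0.04, 0.00), nrow = 3)    # exposure-to-hidden weights w_{i,j}
b1 <- c(0, 0)
W2 <- c(0.5, 0.8)                              # hidden-to-output weights w_{j,k}
Rb <- 0.05                                     # baseline risk R^b

H    <- pmax(drop(x %*% W1) + b1, 0)           # ReLU()_j values H_j
pred <- Rb + sum(H * W2)                       # P(Y=1|X^+)

# Step 1: subtract the baseline risk
R_k <- pred - Rb

# Step 2: decompose onto the hidden layer
R_j <- (H * W2) / sum(H * W2) * R_k

# Step 3: decompose from the hidden layer onto the exposures
contrib <- x * W1                              # X_i^+ * w_{i,j}
frac    <- sweep(contrib, 2, colSums(contrib), "/")
R_i     <- drop(frac %*% R_j)                  # risk contributions R^X_i

all.equal(sum(R_i), R_k)                       # contributions sum to the excess risk

Adding $R^b$ back to the summed contributions recovers the predicted risk, which is what makes each row of the risk contribution matrix interpretable.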

Examples

#See the example under SRCL_0_synthetic_data
