Description Usage Arguments Details Value References See Also Examples

`higrad`

is used to implement hierarchical incremental gradient descent (HiGrad), an algorithm that conducts statistical inference for online learning.

1 2 3 4 |

`x` |
input matrix of features. Each row is an observation vector, and each column is a feature. |

`y` |
response variable. Quantitative for |

`model` |
type of model to fit. Currently only linear regression ( |

`nsteps` |
total number of steps. This is equivalent to the number of queries made to get a noisy evaluation of the gradient. |

`nsplits` |
number of splits in the HiGrad tree. |

`nthreads` |
numbers of threads each previous thread is split into. Either a number (equal split size throughout) or a vector. |

`step.ratio` |
ratio of the lengths of the threads from the two adjacent levels (the latter one divided by the previous). Either a number (equal ratio throughout) or a vector. |

`n0` |
length of the 0th-level thread. |

`skip` |
number of steps to skip when estimating the coefficients by averaging. |

`eta` |
constant in front of the step size. See Details for the formula of the step size. |

`alpha` |
exponent of the step size. See Details for the formula of the step size. |

`burnin` |
number of steps as the burn-in period. The burn-in period is not accounted for in the total budget |

`start` |
starting values of the coefficients. |

`replace` |
logical; whether or not to sample the data with replacement. |

`track` |
logical; whether or not to store the entire path for plotting. |

HiGrad is designed to conduct statistical inference for online learning, without incurring additional computational cost compared with the vanilla stochastic gradient descent (SGD).
The HiGrad procedure begins by performing SGD iterations for a while and then split the single thread into a few, and this procedure hierarchically operates in this fashion along each thread.
With predictions provided by multiple threads in place, a t-based confidence interval is constructed by de-correlating predictions using covariance structures given by the Ruppert–Polyak averaging scheme.
In order to implement HiGrad, a configuration of the tree structure needs to be specified. The default setting is a binary tree with 2 splits.
The step size is set to be `eta*t^(-alpha)`

.

An object with S3 class `higrad`

.

`coefficients` |
estimate of the coefficients. |

`coefficients.bootstrap` |
matrix of estimates of the coefficients along each HiGrad threads. |

`model` |
model type. |

`Sigma0` |
covariance structure |

`track` |
entire path of the estimates along each thread. Can be used for diagnostic and check for convergence. |

Weijie Su and Yuancheng Zhu. (2018) *Statistical Inference for Online Learning and Stochastic Approximation via Hierarchical Incremental Gradient Descent*. https://arxiv.org/abs/1802.04876.

See `print.higrad`

, `plot.higrad`

, `predict.higrad`

for other methods for the `higrad`

class.

1 2 3 4 5 6 7 8 9 10 11 12 13 | ```
# fitting linear regression on a simulated dataset
n <- 1e3
d <- 10
sigma <- 0.1
theta <- rep(1, d)
x <- matrix(rnorm(n * d), n, d)
y <- as.numeric(x %*% theta + rnorm(n, 0, sigma))
fit <- higrad(x, y, model = "lm")
print(fit)
# predict for 10 new samples
newx <- matrix(rnorm(10 * d), 10, d)
pred <- predict(fit, newx)
pred
``` |

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.