LinCDE.boost: LinCDE.boost

In ZijunGao/LinCDE: Conditional Density Estimation via Lindsey's Method and Boosting

View source: R/LinCDE.boost.R


Description

This function implements LinCDE boosting: a boosting algorithm for conditional density estimation that uses shallow LinCDE trees as base learners.

Usage

LinCDE.boost(
  y,
  X = NULL,
  splitPoint = 20,
  basis = "nsTransform",
  splineDf = 10,
  minY = NULL,
  maxY = NULL,
  numberBin = 40,
  df = 4,
  penalty = NULL,
  prior = "Gaussian",
  depth = 1,
  n.trees = 100,
  shrinkage = 0.1,
  terminalSize = 20,
  alpha = 0.2,
  subsample = 1,
  centering = FALSE,
  centeringMethod = "randomForest",
  verbose = TRUE,
  ...
)
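
A minimal sketch of a call using the signature above, on simulated data. The data-generating model and the choice `n.trees = 50` are illustrative assumptions, not recommendations from the package authors. The fit is guarded so that the snippet still runs where the LinCDE package (a GitHub package) is not installed.

```r
# Simulated data: both the mean and the spread of y depend on the first covariate.
set.seed(1)
n <- 200
X <- matrix(runif(n * 3, min = -1, max = 1), nrow = n, ncol = 3)
colnames(X) <- c("x1", "x2", "x3")
y <- X[, 1] + (0.5 + 0.5 * abs(X[, 1])) * rnorm(n)

# Fit only if the LinCDE package is available in this R session.
if (requireNamespace("LinCDE", quietly = TRUE)) {
  fit <- LinCDE::LinCDE.boost(
    y = y, X = X,
    basis = "nsTransform", splineDf = 10, df = 4,
    depth = 1, n.trees = 50, shrinkage = 0.1,
    verbose = FALSE
  )
  # Components listed in the Value section of this page.
  print(length(fit$trees))
  print(fit$importanceScore)
}
```

With the default `alpha = 0.2`, early stopping may apply, so the number of fitted trees is not necessarily equal to `n.trees`.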

Arguments

`y`	response vector, of length nobs.
`X`	input matrix, of dimension nobs x nvars; each row represents an observation vector.
`splitPoint`	a list of candidate splits of length nvars, or a scalar/vector giving the number of candidate splits. If `splitPoint` is a list, each element is a vector of one variable's candidate splits (including the left and right endpoints), and the elements should be ordered the same as X's columns. Alternatively, supply the number of candidate splits: a scalar if all variables share the same number, or a vector of length nvars if the numbers differ. In that case, each variable's range is divided into `splitPoint-1` intervals containing approximately the same number of observations. If a variable has fewer unique values than the desired number of intervals, split intervals corresponding to the unique values are created. The minimal accepted `splitPoint` is 3. Default is 20.
`basis`	a character string or a function specifying the sufficient statistics, i.e., the spline basis. For `basis = "Gaussian"`, y and y^2 are used. For `basis = "nsTransform"`, transformed natural cubic splines are used. If `basis` is a function, it should take a vector of response values and output a basis matrix: each row stands for a response value and each column stands for a basis function. Default is "nsTransform".
`splineDf`	the number of sufficient statistics (spline basis functions). If `basis = "Gaussian"`, `splineDf` is set to 2. Default is 10.
`minY`	the user-provided left end of the response range. If `centering` is `TRUE`, `minY` is ignored. Default is NULL.
`maxY`	the user-provided right end of the response range. If `centering` is `TRUE`, `maxY` is ignored. Default is NULL.
`numberBin`	the number of bins for response discretization. Default is 40. The response range is divided into `numberBin` equal-width bins.
`df`	approximate degrees of freedom. `df` is used for determining the ridge regularization parameter. If `basis = "Gaussian"`, no penalization is implemented. If `df = splineDf`, there is no ridge penalization. Default is 4.
`penalty`	vector of penalties applied to the coefficient of each sufficient statistic. Default is NULL.
`prior`	a character string or a function specifying the initial carrier density. For `prior = "uniform"`, the uniform distribution over the response range is used. For `prior = "Gaussian"`, the Gaussian distribution with the marginal response mean and standard deviation is used. For `prior = "LindseyMarginal"`, the marginal response density estimated by Lindsey's method based on all responses is used. The argument `prior` can also be a homogeneous or heterogeneous conditional density function. The conditional density function should take a covariate matrix X and a response vector y, and output a vector of conditional densities f(yi | Xi). See the LinCDE vignette for examples. Default is "Gaussian".
`depth`	the number of splits of each LinCDE tree. The number of terminal nodes is `depth + 1`. If `depth = 1`, an additive model is fitted. Default is 1.
`n.trees`	the number of trees to fit. Default is 100.
`shrinkage`	the shrinkage parameter applied to each tree in the expansion, value in (0,1]. Default is 0.1.
`terminalSize`	the minimum number of observations in a terminal node. Default is 20.
`alpha`	a hyperparameter in (0,1] that controls early stopping of the boosting. A smaller `alpha` is more likely to induce early stopping. If `alpha = 1`, no early stopping is conducted. Default is 0.2.
`subsample`	subsample ratio of the training samples in (0,1]. Default is 1.
`centering`	a logical value. If `TRUE`, a conditional mean model is fitted first, and LinCDE boosting is applied to the residuals. Centering is recommended for responses whose conditional support varies wildly. See the LinCDE vignette for examples. Default is `FALSE`.
`centeringMethod`	a character string or a function specifying the conditional mean estimator. If `centeringMethod = "linearRegression"`, a linear regression model is fitted to the response. If `centeringMethod = "randomForest"`, a random forest model is fitted. Hyperparameters used by the centering method can be passed directly to LinCDE.boost, such as `nodesize = 10` for `centeringMethod = "randomForest"`. If `centeringMethod` is a function, the call `centeringMethod(y, X)` should return a conditional mean model with a predict function. Applies only when `centering = TRUE`. Default is "randomForest".
`verbose`	a logical value. If `TRUE`, progress and performance are printed. Default is `TRUE`.
`...`	other parameters, such as hyperparameters to be passed to the conditional mean estimator.
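
The discretization described for `splitPoint` and `numberBin` can be sketched in base R. The helpers `candidate_splits` and `response_bins` are hypothetical names, not part of the package, and the quantile rule is only an assumption about how intervals with roughly equal counts might be formed; the package's internal code may differ.

```r
# Hypothetical helper: candidate splits at empirical quantiles of one covariate,
# including the left and right endpoints. type = 1 avoids interpolation, so
# every candidate is an observed value; unique() collapses repeated values.
candidate_splits <- function(x, splitPoint = 20) {
  stopifnot(splitPoint >= 3)
  probs <- seq(0, 1, length.out = splitPoint)
  unique(unname(quantile(x, probs = probs, type = 1)))
}

# Hypothetical helper: numberBin equal-width bins over [minY, maxY],
# with bin mid-points and per-bin counts.
response_bins <- function(y, numberBin = 40, minY = min(y), maxY = max(y)) {
  breaks <- seq(minY, maxY, length.out = numberBin + 1)
  mids <- (breaks[-1] + breaks[-length(breaks)]) / 2
  idx <- findInterval(y, breaks, rightmost.closed = TRUE, all.inside = TRUE)
  list(breaks = breaks, mids = mids, counts = tabulate(idx, nbins = numberBin))
}

set.seed(1)
x <- rnorm(500)
splits <- candidate_splits(x, splitPoint = 20)  # 20 points, hence 19 intervals
bins <- response_bins(x, numberBin = 40)
```

In this sketch the bin mid-points play the role that `splitMidPointY` has in the returned object, under the assumption that the package bins the response in the same equal-width way.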

Value

This function returns a LinCDE object consisting of a list of values.

trees: a list of LinCDE trees.
importanceScore: a named vector measuring the contribution of each covariate to the objective.
splitMidPointY: the vector of discretized bins' mid-points.
z: the spline basis matrix.
zTransformMatrix: the transformation matrix (of dimension splineDf x splineDf) applied to the standard natural cubic spline basis when basis = "nsTransform".
prior: the prior function. The call prior(X, Y) should return a vector of prior conditional densities f(yi | Xi).
basis/depth/shrinkage/centering/centeringMethod: values inherited from the input arguments. If centering is FALSE, no centeringMethod is returned.
centeringModel: a centering model with a predict function. If centering is FALSE, no centeringModel is returned.
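
The function-valued inputs (`basis`, `prior`, `centeringMethod`) follow simple contracts stated on this page: a basis maps a response vector to a matrix, a prior maps (X, y) to a vector of densities, and a centering method maps (y, X) to a model with a predict function. The sketch below illustrates those shapes in base R. All names (`poly_basis`, `gauss_prior`, `center_lm`, the class `centerLm`) are hypothetical, and how LinCDE.boost calls `predict` on the centering model is an assumption, so check the LinCDE vignette before passing such functions to the package.

```r
# Hypothetical basis: one row per response value, one column per basis function.
poly_basis <- function(y) {
  cbind(y, y^2, y^3)
}

# Hypothetical heterogeneous prior: Gaussian density of y centered at the first
# covariate, returning one density value f(yi | Xi) per observation.
gauss_prior <- function(X, y) {
  dnorm(y, mean = as.matrix(X)[, 1], sd = 1)
}

# Hypothetical centering method: least squares with an intercept, returned as
# an S3 object so that predict() dispatches to the method defined below.
center_lm <- function(y, X) {
  coefs <- lm.fit(cbind(1, as.matrix(X)), y)$coefficients
  structure(list(coefs = coefs), class = "centerLm")
}

# Assumed calling convention: predict(model, newdata) with a covariate matrix.
predict.centerLm <- function(object, newdata, ...) {
  drop(cbind(1, as.matrix(newdata)) %*% object$coefs)
}

set.seed(2)
X <- matrix(rnorm(100), nrow = 50, ncol = 2)
y <- 1 + 2 * X[, 1] + rnorm(50)

B <- poly_basis(y)         # 50 x 3 basis matrix
p <- gauss_prior(X, y)     # 50 prior conditional densities
model <- center_lm(y, X)
yhat <- predict(model, X)  # 50 fitted conditional means
```

The custom basis here uses fixed polynomial terms rather than data-dependent knots, so it returns the same columns for any input vector; that choice is for illustration only.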

ZijunGao/LinCDE documentation built on Jan. 2, 2023, 11:14 p.m.
