Double Descent
Belkin and others have shown that some machine learning algorithms exhibit surprising behavior in overfitting settings. The classic U-shape of mean loss plotted against model complexity may be followed by a surprising second "mini-U."
Alternatively, one might hold model complexity fixed while varying the number of data points n, including over a region in which n is smaller than the complexity of the model. The surprise here is that mean loss may actually increase with n in the overfitting region.
The function doubleD facilitates easy exploration of this phenomenon.
doubleD(qeFtnCall,xPts,nReps,makeDummies=NULL,classif=FALSE)
qeFtnCall: Quoted string; somewhere it should include 'xPts[i]' (see the sketch following this list).

xPts: Range of values to be used in the experiments, e.g. a vector of degrees for polynomial models.

nReps: Number of repetitions for each experiment, typically the number of points in the holdout set.
makeDummies: If non-NULL, call factorsToDummies on the data before running the experiments; this avoids the problem of some factor levels appearing in the holdout set but not the training set.

classif: Set TRUE if this is a classification problem.
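To illustrate the form of qeFtnCall: the string is ordinary R code, evaluated once per repetition, with xPts[i] supplying the value currently under study. A minimal sketch, assuming a k-NN fit via the qe-series function qeKNN (used here only as an example; any qe-series call containing 'xPts[i]' works the same way):

data(mlb1)
hw <- mlb1[,2:3]
# vary the number of nearest neighbors k over xPts
doubleD('qeKNN(hw,"Weight",k=xPts[i])',seq(2,50,2),100)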
The function will run the code in qeFtnCall nReps times for each level specified in xPts, recording the test and training error in each case. Thus, for each level, we will have a mean test error and a mean training error.
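Conceptually, the computation amounts to the following simplified sketch (not the package's actual code; getTestErr and getTrainErr are hypothetical stand-ins for however the two errors are extracted from the fitted object):

res <- matrix(nrow=length(xPts),ncol=2)
for (i in seq_along(xPts)) {
   te <- tr <- numeric(nReps)
   for (r in 1:nReps) {
      fit <- eval(parse(text=qeFtnCall))  # run the quoted call; xPts[i] enters here
      te[r] <- getTestErr(fit)   # hypothetical extractor: test error
      tr[r] <- getTrainErr(fit)  # hypothetical extractor: training error
   }
   res[i,] <- c(mean(te),mean(tr))  # one row per level in xPts
}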
Each level in xPts results in one row of the matrix returned by doubleD. The return matrix can then be plotted, using the generic plot.doubleD. Mean test (red) and training (blue) accuracy will be plotted against xPts.
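For example (a sketch, with hw as defined in the Examples section below):

dd <- doubleD('qePolyLin(hw,"Weight",deg=xPts[i])',1:20,250)
plot(dd)  # dispatches to plot.doubleD; red = mean test, blue = mean training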
Norm Matloff
## Not run:
data(mlb1)
hw <- mlb1[,2:3]
doubleD('qePolyLin(hw,"Weight",deg=xPts[i])',1:20,250)
## End(Not run)
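A hypothetical variant of the above, illustrating the second kind of experiment mentioned in the description (model complexity held fixed, number of data points n varied); the random subsetting scheme is this sketch's assumption, not part of the doubleD interface:

## Not run:
data(mlb1)
hw <- mlb1[,2:3]
# fix the polynomial degree, vary n via xPts[i]
doubleD('qePolyLin(hw[sample(nrow(hw),xPts[i]),],"Weight",deg=10)',seq(25,250,25),100)
## End(Not run)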