Shiny dashboard "Statistical foundations of machine learning"

Parametric identification and validation

Goal: show the relation between parametric identification and optimization

Common left panel:

Linear least-squares

Data generating process: ${\bf y}=\beta_0+\beta_1 x+{\bf w}$ where ${\bf w} \sim {\mathcal N} (0,\sigma_w^2)$.
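A minimal R sketch of this data-generating process and its least-squares fit follows; the sample size, noise level, and true parameter values are illustrative assumptions, not settings taken from the dashboard:

```r
## Simulate the linear DGP and estimate (beta0, beta1) by least squares.
## N, sigma.w and the true betas below are illustrative assumptions.
set.seed(0)
N <- 100; beta0 <- 1; beta1 <- -2; sigma.w <- 0.5
x <- runif(N, -1, 1)
y <- beta0 + beta1 * x + rnorm(N, sd = sigma.w)
fit <- lm(y ~ x)   # least-squares fit
coef(fit)          # hat(beta0), hat(beta1)
```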

Top sliders:

Bottom left panel: training data set, the true regression function (in green), and the estimated regression function $h(x)=\hat{\beta}_0+\hat{\beta}_1 x$ (in red).

Bottom right panel: convex empirical risk function $J(\hat{\beta}_0,\hat{\beta}_1)$. The green dot denotes the pair of true parameters $(\beta_0,\beta_1)$; the red dot denotes the pair of estimates $(\hat{\beta}_0,\hat{\beta}_1)$.
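Continuing the sketch above, the risk surface of the bottom right panel can be approximated on a grid with base-R graphics (the grid limits are an arbitrary choice):

```r
## Empirical risk J(b0, b1) = mean squared residual on the training set.
J <- function(b0, b1) mean((y - b0 - b1 * x)^2)
b0.grid <- seq(beta0 - 2, beta0 + 2, length.out = 50)  # grid limits: assumption
b1.grid <- seq(beta1 - 2, beta1 + 2, length.out = 50)
Jgrid <- outer(b0.grid, b1.grid, Vectorize(J))
contour(b0.grid, b1.grid, Jgrid, xlab = "beta0", ylab = "beta1")
points(beta0, beta1, col = "green", pch = 19)             # true parameters
points(coef(fit)[1], coef(fit)[2], col = "red", pch = 19) # estimates
```

Because $J$ is quadratic in the parameters and hence convex, the red dot sits at its unique minimum up to numerical precision.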

Suggested manipulations:

NNET least-squares

Data generating process: ${\bf y}=\sin (\pi x) + {\bf w}$ where ${\bf w} \sim {\mathcal N} (0,\sigma_w^2)$.

Hypothesis function: $y=w_7 {\mathcal s}(w_2 x+ w_1)+ w_8 {\mathcal s}(w_4 x+ w_3)+w_9 {\mathcal s}(w_6 x+ w_5) +w_{10}$ where ${\mathcal s}(z)=\frac{1}{1+e^{-z}}$ stands for the sigmoid function.
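As a minimal sketch, the ten weights can be identified by numerically minimizing the empirical squared-error risk with `optim`; the sample size, noise level, initialization, and optimizer choice are all assumptions. In contrast with the linear case, this risk function is non-convex, so different random initializations may return different local minima:

```r
## Simulate the DGP and fit the 10-weight single-hidden-layer hypothesis.
## N, sigma.w and the optimizer settings are illustrative assumptions.
set.seed(0)
N <- 50; sigma.w <- 0.1
x <- runif(N, -1, 1)
y <- sin(pi * x) + rnorm(N, sd = sigma.w)
s <- function(z) 1 / (1 + exp(-z))                  # sigmoid activation
h <- function(w, x)                                 # hypothesis function
  w[7] * s(w[2] * x + w[1]) + w[8] * s(w[4] * x + w[3]) +
  w[9] * s(w[6] * x + w[5]) + w[10]
J <- function(w) mean((y - h(w, x))^2)              # empirical risk
w.hat <- optim(rnorm(10), J, method = "BFGS")$par   # a local minimum
plot(x, y); curve(h(w.hat, x), add = TRUE, col = "red")
```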

Top sliders:

Suggested manipulations:

KNN cross-validation

Data generating process: ${\bf y}=\sin (\pi x) + {\bf w}$ where ${\bf w} \sim {\mathcal N} (0,\sigma_w^2)$.

Hypothesis function: K-nearest neighbors (KNN)

Bottom left panel: training data set. At each click of the "CV step" button, the points belonging to the current test fold are shown in green and the corresponding predictions in red.

Bottom right panel: shows, for each input sample, the associated CV error; the figure is updated as we proceed through the cross-validation folds.
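A minimal sketch of the cross-validation loop this dashboard animates is given below; the number of neighbors K, the number of folds, and the sample settings are illustrative assumptions:

```r
## K-fold cross-validation of a KNN regressor on the sin(pi x) DGP.
## K, nfolds, N and sigma.w are illustrative assumptions.
set.seed(0)
N <- 100; sigma.w <- 0.1; K <- 3; nfolds <- 10
x <- runif(N, -1, 1)
y <- sin(pi * x) + rnorm(N, sd = sigma.w)
knn.pred <- function(x0, x.tr, y.tr, K)             # average of the K
  mean(y.tr[order(abs(x.tr - x0))[1:K]])            # nearest outputs
fold <- sample(rep(1:nfolds, length.out = N))       # random fold labels
cv.err <- numeric(N)
for (f in 1:nfolds) {
  ts <- which(fold == f)                            # test fold ("green" points)
  pred <- sapply(x[ts], knn.pred, x[-ts], y[-ts], K)  # "red" predictions
  cv.err[ts] <- (y[ts] - pred)^2                    # per-sample CV error
}
mean(cv.err)                                        # cross-validated MSE
```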

Suggested manipulations:


