knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
options(np.messages = FALSE)

This vignette is meant to be the smallest useful package-side introduction to
np. The emphasis is on one clean workflow that users can run after
installation: choose a bandwidth, fit a model, inspect the result, and plot it.
Broader worked examples, package comparisons, and method-specific articles are
better carried by the gallery site.
In np, the bandwidth object is often the key object in the analysis.

library(np)
data(cps71, package = "np")

bw <- npregbw(logwage ~ age, data = cps71)
summary(bw)

fit <- npreg(bws = bw)
summary(fit)

plot(cps71$age, cps71$logwage, cex = 0.25, col = "grey")
lines(cps71$age, fitted(fit), col = 2, lwd = 2)

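To see why the bandwidth carries so much of the analysis, the following base-R sketch fits a local-constant (Nadaraya-Watson) estimator with a Gaussian kernel at two bandwidths. It is an illustration only and does not reproduce np's internals; `nw_fit` and `rough` are hypothetical helpers, and the simulated data are assumptions made for the example.

```r
# Hypothetical helper: local-constant (Nadaraya-Watson) fit, Gaussian kernel.
# Not np's implementation; it only illustrates the role of the bandwidth h.
nw_fit <- function(x, y, h, at = x) {
  vapply(at, function(x0) {
    w <- dnorm((x - x0) / h)   # kernel weights centred at x0
    sum(w * y) / sum(w)        # weighted average of the responses
  }, numeric(1))
}

set.seed(1)
x <- sort(runif(100))
y <- sin(2 * pi * x) + rnorm(100, sd = 0.2)

fit_small <- nw_fit(x, y, h = 0.02)  # small bandwidth: follows the noise
fit_large <- nw_fit(x, y, h = 0.50)  # large bandwidth: close to a flat line

# Crude roughness measure: sum of squared successive differences.
rough <- function(f) sum(diff(f)^2)
c(small_h = rough(fit_small), large_h = rough(fit_large))
```

The fitted curve changes far more with `h` than with most other choices, which is why np asks you to settle the bandwidth first (here via `npregbw()`) and then pass it on to the estimator.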
One important feature of np is that it handles mixed data directly. Variable
class matters: unordered categorical variables should be factors, and ordered
categorical variables should be ordered factors when appropriate.

set.seed(42)
mydat <- data.frame(
  y = rnorm(200),
  x_cont = runif(200),
  x_unordered = factor(sample(c("a", "b", "c"), 200, replace = TRUE)),
  x_ordered = ordered(sample(1:4, 200, replace = TRUE))
)

bw_mixed <- npregbw(y ~ x_cont + x_unordered + x_ordered, data = mydat)
fit_mixed <- npreg(bws = bw_mixed)
summary(fit_mixed)

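Because variable class matters, it can help to check and fix column classes before selecting bandwidths. The sketch below uses base R only; the data frame `raw` and its columns are hypothetical.

```r
# Hypothetical raw data: a character column and an integer-coded ordinal column.
raw <- data.frame(
  y      = c(1.2, 0.7, 1.9, 1.1),
  region = c("a", "b", "a", "c"),
  rating = c(1L, 3L, 2L, 3L),
  stringsAsFactors = FALSE
)

# Declare the intended types explicitly.
raw$region <- factor(raw$region)                  # unordered categorical
raw$rating <- ordered(raw$rating, levels = 1:3)   # ordered categorical

# Report the leading class of every column before fitting anything.
classes <- vapply(raw, function(col) class(col)[1], character(1))
classes
```

A column that still reports `integer` or `character` at this point has not been declared as categorical, so it is worth fixing before it reaches a bandwidth selector.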
For local-polynomial-capable methods, np now supports joint selection of
polynomial order and bandwidth. The modern route is to use
search.engine = "nomad+powell" when you want the search to choose both
together.
If you want the recommended route without spelling out all of the LP tuning
arguments, use nomad = TRUE. This is a documented convenience preset, not a
generic optimizer alias: it fills only missing values among the LP degree-search
controls and leaves compatible explicit overrides in place. This route uses the
optional NOMAD backend provided by the suggested package crs, so install
crs first if you want to use nomad = TRUE or
search.engine = "nomad"/"nomad+powell".

if (requireNamespace("crs", quietly = TRUE) &&
    utils::packageVersion("crs") >= package_version("0.15-41")) {
  set.seed(7)
  n <- 120
  x <- runif(n, -1, 1)
  y <- x + 0.4 * x^2 + rnorm(n, sd = 0.18)

  fit_nomad <- npreg(y ~ x, nomad = TRUE, degree.max = 1L, nmulti = 1L)
  fit_nomad$bws$nomad.shortcut

  # Tune one component explicitly while leaving the rest of the preset in place.
  fit_nomad_direct <- npreg(
    y ~ x,
    nomad = TRUE,
    search.engine = "nomad",
    degree.max = 1L,
    nmulti = 1L
  )
}

The same convenience entry point is available for the other LP-capable families:
npcdens, npcdist, npplreg, npscoef, and npindex, together with their
corresponding *bw constructors.
Keep the first run modest and runnable. Fuller worked examples belong on the gallery rather than in this package vignette.
In np, the formula interface tells the function which variables are the
response and regressors. It is not imposing an ordinary linear-additive model.
It is also important not to pass blocks of 0/1 dummies as if this were a
standard linear-model workflow. If the underlying variable is categorical, it
is usually better to keep it as one factor or ordered variable.
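If the data arrive with a categorical variable already expanded into 0/1 dummies, one way to collapse the block back into a single factor is sketched below in base R. The data frame `dummies` and its column names are hypothetical, and the sketch assumes exactly one dummy equals 1 in every row.

```r
# Hypothetical data: a three-level region variable stored as 0/1 dummies.
dummies <- data.frame(
  north = c(1, 0, 0, 1, 0),
  south = c(0, 1, 0, 0, 0),
  west  = c(0, 0, 1, 0, 1)
)

# Guard the assumption that each row has exactly one active dummy.
stopifnot(all(rowSums(dummies) == 1))

# Recover the level label from the position of the 1 in each row.
idx <- max.col(as.matrix(dummies), ties.method = "first")
region <- factor(names(dummies)[idx], levels = names(dummies))
region
```

The resulting `region` column can then enter the formula as one unordered regressor instead of three separate numeric columns.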
This vignette keeps the package-side introduction intentionally narrow. Other common first routes are:
- ?npudens and ?npudist for unconditional density and distribution work,
- ?npcdens, ?npcdist, and ?npqreg for conditional density, distribution, and quantiles,
- ?npconmode for classification and conditional mode estimation,
- ?npplreg, ?npindex, and ?npscoef for semiparametric models.

Those broader branches are better carried by help pages and website articles than by a single shipped vignette.

- vignette("np_entropy_tests", package = "np") for a compact package-side testing overview
- ?npreg, ?npregbw, ?npudens, and ?npcdens for core help pages