Basic, introductory example to illustrate how Quantile Regression works using the package QRMon.
For detailed explanations see the vignette "Rapid making of Quantile Regression workflows".
Here is a diagram showing the concepts in a QRMon pipeline (in Mathematica notation).
The package/library QRMon can be installed with the command:
devtools::install_github("antononcube/QRMon-R")
Then we load that package with:
library(QRMon)
Sometimes I have to explicitly load the dependency libraries:
library(splines)
library(quantreg)
library(purrr)
library(magrittr)
library(ggplot2)
Those libraries can be installed with the command:
install.packages( c("quantreg", "purrr", "magrittr", "ggplot2") )
Below the curves produced by Quantile Regression are called "regression quantiles".
A QRMon monad object is an S3 object and it is constructed with QRMonUnit.
Here are the S3 object element names:
names(QRMonUnit())
Here is the class attribute:
class(QRMonUnit())
Remarks:
The class attribute is not used/respected in QRMon's functions because they use the prefix "QRMon".
Some of QRMon's functions can put additional elements into the monad object.
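The remarks above can be illustrated with a minimal base R sketch. All "ToyMon" names and the element names used here are hypothetical and only mimic the pattern; they are not QRMon functions and say nothing about QRMon's internals. The sketch shows a list-based S3 object whose functions are found by a name prefix rather than by S3 dispatch, and a pipeline function that adds an extra element to the object.

```r
# Hypothetical "ToyMon" names, for illustration only; not part of QRMon.

# Constructor: a plain list with a class attribute.
ToyMonUnit <- function(data = NULL) {
  res <- list(Value = NULL, Data = data)
  attr(res, "class") <- "ToyMon"
  res
}

# A pipeline function: takes the object, returns a modified object.
# It is called by its prefixed name, so the class attribute is not needed.
ToyMonSummarize <- function(obj) {
  obj$Value <- mean(obj$Data)
  obj$Summary <- range(obj$Data)  # an additional element put into the object
  obj
}

# Take the current pipeline value out of the object.
ToyMonTakeValue <- function(obj) obj$Value

obj <- ToyMonSummarize(ToyMonUnit(c(1, 2, 3, 6)))
names(obj)
class(obj)
ToyMonTakeValue(obj)
```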
Here we compute the fractions of the points separated by the regression quantiles with the following pipeline:
qFracs <-
  QRMonUnit( setNames(dfTemperatureData, c("Regressor", "Value")) ) %>%    # Get data
  QRMonQuantileRegression( df = 12, probabilities = seq(0.2,0.8,0.2) ) %>% # Quantile Regression with B-splines
  QRMonPlot %>%                                                            # Plot data and regression quantiles
  QRMonSeparateToFractions %>%                                             # Separate the points and find fractions
  QRMonTakeValue                                                           # Take the value of the monad object
qFracs
The above result should:
illustrate what Quantile Regression does, and
convince us that the concrete QRMon implementation works.
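The same kind of check can be sketched without QRMon, using quantreg and splines directly. This is a minimal sketch over synthetic data (the data, the variable names, and the one-fit-per-probability loop are assumptions made for illustration; it is not QRMon's implementation). For each probability we fit one regression quantile with a B-spline basis and compute the fraction of points on or below the fitted curve; each fraction should be close to its probability.

```r
library(splines)
library(quantreg)

# Synthetic, heteroscedastic data (an assumption; not dfTemperatureData).
set.seed(1)
n <- 1000
x <- seq(0, 10, length.out = n)
y <- sin(x) + rnorm(n, sd = 0.3 + 0.05 * x)

probs <- seq(0.2, 0.8, 0.2)

# Fit one regression quantile per probability and compute
# the fraction of points on or below the fitted curve.
fracs <- sapply(probs, function(p) {
  fit <- rq(y ~ bs(x, df = 12), tau = p)
  mean(y <= fitted(fit))
})
names(fracs) <- probs
fracs
```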
Consider the application of the points separation process for finding (and defining) outliers.
qrObj <-
  QRMonUnit( setNames(dfTemperatureData, c("Regressor", "Value")) ) %>%
  QRMonQuantileRegression( df = 16, probabilities = c(0.01,0.98) ) %>%
  QRMonOutliers %>%
  QRMonOutliersPlot
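The idea of defining outliers through points separation can be sketched with quantreg alone. This is a minimal sketch over synthetic data, assuming that "outlier" means a point strictly below the 0.01 regression quantile or strictly above the 0.98 regression quantile; it is not necessarily how QRMonOutliers is implemented.

```r
library(splines)
library(quantreg)

# Synthetic data (an assumption; not dfTemperatureData).
set.seed(2)
n <- 1000
x <- seq(0, 10, length.out = n)
y <- sin(x) + rnorm(n, sd = 0.3)

# Bottom and top regression quantiles, evaluated at the data points.
bottom <- fitted(rq(y ~ bs(x, df = 16), tau = 0.01))
top    <- fitted(rq(y ~ bs(x, df = 16), tau = 0.98))

# Points outside the band between the two curves are taken to be outliers.
isOutlier <- y < bottom | y > top
outliers <- data.frame(Regressor = x[isOutlier], Value = y[isOutlier])

# Fraction of the points flagged as outliers.
nrow(outliers) / n
```

The choice of the two probabilities controls how many points are flagged: roughly 1% at the bottom and 2% at the top, minus the points that lie exactly on the fitted curves.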
Let us make a more interesting example by plotting the points separated by the regression quantiles with different colors.
First we compute a non-cumulative point separation:
qFracPoints <-
  QRMonUnit( setNames( dfTemperatureData, c("Time", "Value") ) ) %>%
  QRMonQuantileRegression( df = 16, probabilities = seq(0.2,0.8,0.2) ) %>%
  QRMonPlot(datePlotQ = TRUE, dateOrigin = "1900-01-01") %>%  # Make a date-axis plot
  QRMonSeparate( cumulativeQ = FALSE ) %>%                    # Non-cumulative point sets
  QRMonTakeValue()
The following result shows that the point sets found have roughly the same number of elements, in agreement with the selected quantile probabilities.
rbind( purrr::map_df(qFracPoints, nrow), purrr::map_df(qFracPoints, nrow) / nrow(dfTemperatureData) )
Here we plot the separated points with different colors:
qDF <- dplyr::bind_rows( qFracPoints , .id = "Quantile")
qDF$Time <- as.POSIXct( qDF$Regressor, origin = "1900-01-01" )
ggplot(qDF) +
  geom_point(aes(x = Time, y = Value, color = Quantile) )
One of the unique applications of Quantile Regression is to do "realistic" time series simulations.
Let us first do a Quantile Regression fit of the time series data:
qrmon <-
  QRMonUnit( setNames(dfTemperatureData, c("Time", "Value") )) %>%
  QRMonQuantileRegression( df = 16, probabilities = c( 0.01, seq(0.1,0.9,0.1), 0.99) ) %>%
  QRMonPlot(datePlotQ = TRUE, dateOrigin = "1900-01-01" )
Here with the obtained monad object we do several time series simulations over 1000 regular grid points:
set.seed(2223)
qDF <- rbind(
  cbind( Type = "Original", qrmon %>% QRMonTakeData() ),
  cbind( Type = "Simulated.1", as.data.frame( qrmon %>% QRMonSimulate(1000) %>% QRMonTakeValue() )),
  cbind( Type = "Simulated.2", as.data.frame( qrmon %>% QRMonSimulate(1000) %>% QRMonTakeValue() )),
  cbind( Type = "Simulated.3", as.data.frame( qrmon %>% QRMonSimulate(1000) %>% QRMonTakeValue() ))
)
qDF$Regressor <- as.POSIXct( qDF$Regressor, origin = "1900-01-01" )
ggplot( qDF ) +
  geom_line( aes( x = Regressor, y = Value ), color = "lightblue" ) +
  facet_wrap( ~Type, ncol=1)
Simulations like these can be used in some Operations Research applications.
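One way such a simulation can work is sketched below with quantreg alone. This is a sketch under stated assumptions, not a description of QRMonSimulate: the data is synthetic, and the sampling scheme (a uniform draw between the smallest and largest probability, followed by linear interpolation between the neighboring regression quantiles) is one plausible choice. At each grid point the fitted regression quantiles give an approximate conditional inverse CDF, and sampling from it produces a simulated value.

```r
library(splines)
library(quantreg)

# Synthetic data (an assumption; not dfTemperatureData).
set.seed(3)
n <- 1000
dat <- data.frame(x = seq(0, 10, length.out = n))
dat$y <- sin(dat$x) + rnorm(n, sd = 0.3)

probs <- c(0.01, seq(0.1, 0.9, 0.1), 0.99)

# Regular grid of 1000 points within the range of the regressor.
grid <- data.frame(x = seq(min(dat$x), max(dat$x), length.out = 1000))

# One column of regression quantile values per probability, evaluated on the grid.
qvals <- sapply(probs, function(p) {
  predict(rq(y ~ bs(x, df = 16), tau = p, data = dat), newdata = grid)
})

# For each grid point draw a probability u and interpolate linearly
# between the regression quantile values at the neighboring probabilities.
u <- runif(nrow(grid), min(probs), max(probs))
simulated <- vapply(seq_len(nrow(grid)), function(i) {
  approx(x = probs, y = qvals[i, ], xout = u[i])$y
}, numeric(1))

simDF <- data.frame(Regressor = grid$x, Value = simulated)
head(simDF)
```

Because each simulated value is interpolated between fitted quantile curves, the simulated series stays within the band spanned by the lowest and highest regression quantiles.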
qrmon2 <-
  qrmon %>%
  QRMonConditionalCDFPlot( sample( dfTemperatureData$Time, 12),
                           quantileGridLinesQ = TRUE,
                           dateOrigin = "1900-01-01",
                           ncol = 3, scales = "free_x" )