| GasolineYield | R Documentation |
Operational data on the proportion of crude oil converted to gasoline after distillation and fractionation processes.
GasolineYield
A data frame with 32 observations on 6 variables:
yield: numeric. Proportion of crude oil converted to gasoline after distillation and fractionation (response variable).
gravity: numeric. Crude oil gravity in degrees API (American Petroleum Institute scale).
pressure: numeric. Vapor pressure of crude oil in pounds per square inch (psi).
temp10: numeric. Temperature in degrees Fahrenheit at which 10% of the crude oil has vaporized.
temp: numeric. Temperature in degrees Fahrenheit at which all gasoline has vaporized (end point).
batch: factor. Batch indicator distinguishing the 10 different crude oils used in the experiment.
This dataset was collected by Prater (1956) to study gasoline yield from crude oil. The dependent variable is the proportion of crude oil converted to gasoline after distillation and fractionation. Atkinson (1985) analyzed this dataset using linear regression and noted that there is "indication that the error distribution is not quite symmetrical, giving rise to some unduly large and small residuals".
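The asymmetry Atkinson mentions can be inspected with an ordinary least squares fit. The following is a minimal sketch, assuming the data are loaded from gkwreg as in the examples and that the four continuous covariates enter additively; the exact specification used in Atkinson (1985) is not stated here, so the formula is an illustrative assumption.

```r
library(gkwreg)
data(GasolineYield)

# Ordinary least squares on the raw proportion. The covariate set is an
# assumption for illustration, not Atkinson's documented model.
fit_lm <- lm(yield ~ gravity + pressure + temp10 + temp, data = GasolineYield)

# Studentized residuals: an asymmetric error distribution shows up as
# unequal lower and upper tails.
res <- rstudent(fit_lm)
summary(res)

# Normal QQ plot to look for departures in either tail.
qqnorm(res)
qqline(res)
```

Because the response is a proportion on (0, 1), a linear model can also produce fitted values outside the unit interval, which is one motivation for the bounded-response regressions in the examples.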
The dataset contains 32 observations. It has been noted (Daniel and Wood, 1971,
Chapter 8) that there are only ten distinct sets of values of the first three
explanatory variables (gravity, pressure and temp10), corresponding to ten
different crudes subjected to experimentally controlled distillation conditions.
These ten crudes are identified by the variable batch, and the data are ordered
by ascending temp10.
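The structure described above can be checked directly. This is a small sketch, assuming the data are loaded from gkwreg as in the examples; the object names crudes, key and combos_per_batch are illustrative.

```r
library(gkwreg)
data(GasolineYield)

# Distinct combinations of the first three explanatory variables
crudes <- unique(GasolineYield[, c("gravity", "pressure", "temp10")])
nrow(crudes)

# Each level of batch should correspond to a single combination
key <- with(GasolineYield, paste(gravity, pressure, temp10))
combos_per_batch <- tapply(key, GasolineYield$batch, function(k) length(unique(k)))
combos_per_batch

# Number of distillation runs per crude
table(GasolineYield$batch)

# FALSE here means temp10 is in non-decreasing order
is.unsorted(GasolineYield$temp10)
```

Since batch is fully determined by gravity, pressure and temp10, a model should include either batch or those three covariates, not both.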
Taken from Prater (1956).
Atkinson, A.C. (1985). Plots, Transformations and Regression: An Introduction to Graphical Methods of Diagnostic Regression Analysis. New York: Oxford University Press.
Cribari-Neto, F., and Zeileis, A. (2010). Beta Regression in R. Journal of Statistical Software, 34(2), 1–24. doi:10.18637/jss.v034.i02
Daniel, C., and Wood, F.S. (1971). Fitting Equations to Data. New York: John Wiley and Sons.
Ferrari, S.L.P., and Cribari-Neto, F. (2004). Beta Regression for Modeling Rates and Proportions. Journal of Applied Statistics, 31(7), 799–815.
Prater, N.H. (1956). Estimate Gasoline Yields from Crudes. Petroleum Refiner, 35(5), 236–238.
require(gkwreg)
require(gkwdist)
data(GasolineYield)
# Example 1: Kumaraswamy regression with batch effects
# Model mean yield as function of batch and temperature
# Allow precision to vary with temperature (heteroscedasticity)
fit_kw <- gkwreg(
  yield ~ batch + temp | temp,
  data = GasolineYield,
  family = "kw"
)
summary(fit_kw)
# Interpretation:
# - Alpha (mean): Different batches have different baseline yields;
#   temperature affects yield transformation
# - Beta (precision): Higher temperatures may produce more variable yields
# Example 2: Full model with all physical-chemical properties
fit_kw_full <- gkwreg(
  yield ~ gravity + pressure + temp10 + temp |
    temp10 + temp,
  data = GasolineYield,
  family = "kw"
)
summary(fit_kw_full)
# Interpretation:
# - Mean model captures effects of crude oil properties
# - Precision varies with vaporization temperatures
# Example 3: Exponentiated Kumaraswamy for extreme yields
# Some batches may produce unusually high/low yields
fit_ekw <- gkwreg(
  yield ~ batch + temp | # alpha: batch effects
    temp |               # beta: temperature precision
    batch,               # lambda: batch-specific tail behavior
  data = GasolineYield,
  family = "ekw"
)
summary(fit_ekw)
# Interpretation:
# - Lambda varies by batch: Some crude oils have more extreme
#   yield distributions (heavy tails for very high/low yields)
# Model comparison: Does tail flexibility improve fit?
anova(fit_kw, fit_ekw)
# Diagnostic plots
par(mfrow = c(2, 2))
plot(fit_kw, which = c(1, 2, 4, 5))
par(mfrow = c(1, 1))
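The "kw" family used in the examples is the Kumaraswamy distribution. The example comments read alpha and beta loosely as mean and precision; in the standard two-parameter form both are shape parameters. The base R sketch below assumes that standard form, f(y) = alpha * beta * y^(alpha - 1) * (1 - y^alpha)^(beta - 1) on (0, 1). The helpers dkw_sketch and kw_median are hypothetical names written for illustration, not functions from gkwreg or gkwdist, and the parameterization used internally by gkwreg should be checked against its own documentation.

```r
# Kumaraswamy density on (0, 1) with shape parameters alpha > 0, beta > 0.
# Hypothetical helper for illustration only.
dkw_sketch <- function(y, alpha, beta) {
  stopifnot(alpha > 0, beta > 0)
  inside <- y > 0 & y < 1
  dens <- numeric(length(y))
  dens[inside] <- alpha * beta * y[inside]^(alpha - 1) *
    (1 - y[inside]^alpha)^(beta - 1)
  dens
}

# Closed-form median, obtained by solving 1 - (1 - m^alpha)^beta = 1/2
kw_median <- function(alpha, beta) (1 - 2^(-1 / beta))^(1 / alpha)

# Sanity check: the density should integrate to one
total <- integrate(dkw_sketch, lower = 0, upper = 1, alpha = 2, beta = 5)$value
total

# Probability mass below the closed-form median
half <- integrate(dkw_sketch, lower = 0, upper = kw_median(2, 5),
                  alpha = 2, beta = 5)$value
half

# Visual comparison of two parameter settings
curve(dkw_sketch(x, alpha = 2, beta = 5), from = 0, to = 1, ylab = "density")
curve(dkw_sketch(x, alpha = 5, beta = 2), add = TRUE, lty = 2)
```

Because the distribution function has a closed form, quantiles such as the median are available without numerical inversion, which is one practical reason to model bounded responses with this family.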