# Hall-Yarborough correlation In zFactor: Calculate the Compressibility Factor 'z' for Hydrocarbon Gases

knitr::opts_chunk\$set(echo=T, comment=NA, error=T, warning=F, message = F, fig.align = 'center', results="hold")


## The Hall-Yarborough correlation

Kenneth Hall and Lyman Yarborough used the hard-sphere equation as the basis for the equation of state. They tested the correlation with 12 reservoir gas reservoir systems up to Ppr as high as 20.5. The Standing-Katz chart only extends to Ppr=15. At that moment the Standing-Katz chart had 30 years of existance. See [@Hall1973].

## Get z at selected Ppr and Tpr=2.0

Use the the corelation to calculate z. From the Standing-Katz chart we obtain a digitized point at the given Tpr and Ppr.

# get a z value using HY
library(zFactor)

ppr <- 1.5
tpr <- 2.0

z.calc <- z.HallYarborough(pres.pr = ppr, temp.pr = tpr)

# get a z value from the SK chart at the same Ppr and Tpr
z.chart <- getStandingKatzMatrix(tpr_vector = tpr,
pprRange = "lp")[1, as.character(ppr)]

# calculate the APE (Average Percentage Error)
ape <- abs((z.calc - z.chart) / z.chart) * 100

df <- as.data.frame(list(Ppr = ppr,  z.calc =z.calc, z.chart = z.chart, ape=ape))
rownames(df) <- tpr
df


## Get z at selected Ppr and Tpr = 1.1

From the Standing-Katz chart we obtain a digitized point:

library(zFactor)
ppr <- 1.5
tpr <- 1.1

z.calc <- z.HallYarborough(pres.pr = ppr, temp.pr = tpr)

# From the Standing-Katz chart we obtain a digitized point:
z.chart <- getStandingKatzMatrix(tpr_vector = tpr,
pprRange = "lp")[1, as.character(ppr)]

# calculate the APE
ape <- abs((z.calc - z.chart) / z.chart) * 100

df <- as.data.frame(list(Ppr = ppr,  z.calc =z.calc, z.chart = z.chart, ape=ape))
rownames(df) <- tpr
df


We see here a noticeable difference between the values of z from the HY correlation and the value read from the Standing-Katz chart. This is expected at these low pressures and temperatures. This area is a challenge for all of the correlations as well.

## Get values of z for combinations of Ppr and Tpr

In this example we provide vectors instead of a single point. With the same ppr and tpr vectors that we use for the correlation, we do the same for the Standing-Katz chart. We want to compare both and find the absolute percentage error.

library(zFactor)

ppr <- c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5)
tpr <- c(1.05, 1.1, 1.7, 2)

# calculate using the correlation
z.calc <- z.HallYarborough(ppr, tpr)

# With the same ppr and tpr vector, we do the same for the Standing-Katz chart
z.chart <- getStandingKatzMatrix(ppr_vector = ppr, tpr_vector = tpr)
ape <- abs((z.calc - z.chart) / z.chart) * 100

# calculate the APE
cat("z.correlation \n"); print(z.calc)
cat("\n z.chart \n"); print(z.chart)
cat("\n APE \n"); print(ape)


You can see errors of 22.48% and 12.09% in the isotherm Tpr=1.05 at Ppr=1.5 and 2.5, respectively. Other errors, greater than one, can also be found at the isothermTpr=1.1. Then, the rest of the curves are fine.

## Analyze the error at the isotherms

We apply the function summary over the transpose of the matrix. Applying the transpose will alow us to see the statistics at each of the isotherms.

sum_t_ape <- summary(t(ape))
sum_t_ape


We see that the errors in z are considerable with a r sum_t_ape[1,1]% and r sum_t_ape[6,1]% for Tpr=1.05, and a r sum_t_ape[1,2]% and r sum_t_ape[6,2]% for Tpr=1.10.

The Hall-Yarborough correlation shows a very high error at values of Tpr lower or equal than 1.1, being Tpr=1.05 the worst curve to calculate z values from. Keep that in mind. We will explore later a comparative tile chart where we confirm these early calculations.

## Analyze the error for greater values of Tpr

Let's see the numbers now for higher values of Tpr at various values of Ppr.

library(zFactor)
# enter vectors for Tpr and Ppr
tpr2 <- c(1.2, 1.3, 1.5, 2.0, 3.0)
ppr2 <- c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5)
# get z values from the SK chart
z.chart <- getStandingKatzMatrix(ppr_vector = ppr2, tpr_vector = tpr2, pprRange = "lp")

# We do the same with the HY correlation:
# calculate z values at lower values of Tpr
z.calc <- z.HallYarborough(pres.pr = ppr2, temp.pr = tpr2)
ape <- abs((z.calc - z.chart) / z.chart) * 100

# calculate the APE
cat("z.correlation \n"); print(z.calc)
cat("\n z.chart \n"); print(z.chart)
cat("\n APE \n"); print(ape)


At Tpr above or equal to 1.2 the HY correlation behaves very well.

## Analyze the error at the isotherms

Now, let's apply the function summary over the transpose of the matrix to observe the error of the correlation at each isotherm.

sum_t_ape <- summary(t(ape))
sum_t_ape


In all, the error of z at these isotherms is less than 0.82%. Pretty good.

## Prepare to plot SK chart values vs HY correlation

Now, we will be plotting the difference between the z values in the Standing-Katz and the z values calculated by Hall-Yarborough. For the moment let's not pay attention to the numerical error but to the error bars in the plot (the orange bars).

library(zFactor)
library(tibble)
library(ggplot2)

tpr2 <- c(1.05, 1.1, 1.2, 1.3)
ppr2 <- c(0.5, 1.0, 1.5, 2, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5)

sk_corr_2 <- createTidyFromMatrix(ppr2, tpr2, correlation = "HY")
as_tibble(sk_corr_2)

p <- ggplot(sk_corr_2, aes(x=Ppr, y=z.calc, group=Tpr, color=Tpr)) +
geom_line() +
geom_point() +
geom_errorbar(aes(ymin=z.calc-abs(dif), ymax=z.calc+abs(dif)), width=.4,
position=position_dodge(0.05))
print(p)


As we were expecting, the errors can be found on Tpr=1.05 and Tpr=1.1

## Analyzing the error for all the Tpr curves

In this last example, we compare the values of z at all the isotherms. We use the function getCurvesDigitized in the zfactor package to obtain all the isotherms or Tpr curves in the Standing-Katz chart that have been digitized. The next function createTidyFromMatrix calculates z using the correlation and prepares a tidy dataset ready to plot.

library(ggplot2)
library(tibble)

# get all lp Tpr curves
tpr_all <- getStandingKatzTpr(pprRange = "lp")
ppr <- c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5)
sk_corr_all <- createTidyFromMatrix(ppr, tpr_all, correlation = "HY")
as_tibble(sk_corr_all)

p <- ggplot(sk_corr_all, aes(x=Ppr, y=z.calc, group=Tpr, color=Tpr)) +
geom_line() +
geom_point() +
geom_errorbar(aes(ymin=z.calc-dif, ymax=z.calc+dif), width=.4,
position=position_dodge(0.05))
print(p)


As we saw before, we confirm that the greatest errors are localized in two of the Tpr curves: at 1.05 and 1.1.

## Range of applicability of the correlation

What we will see here is the distribution of the Mean Average Percentage Error or MAPE accross all the Tpr isotherms and Ppr grids. Remember from the README: red and yellow are bad.

# MSE: Mean Squared Error
# RMSE: Root Mean Sqyared Error
# RSS: residual sum of square
# RMSLE: Root Mean Squared Logarithmic Error. Penalizes understimation.
# MAPE: Mean Absolute Percentage Error = AARE
# MPE: Mean Percentage error = ARE
# MAE: Mean Absolute Error
library(dplyr)
grouped <- group_by(sk_corr_all, Tpr, Ppr)
smry_tpr_ppr <- summarise(grouped,
RMSE= sqrt(mean((z.chart-z.calc)^2)),
MPE = sum((z.calc - z.chart) / z.chart) * 100 / n(),
MAPE = sum(abs((z.calc - z.chart) / z.chart)) * 100 / n(),
MSE = sum((z.calc - z.chart)^2) / n(),
MAE = sum(abs(z.calc - z.chart)) / n(),
RMLSE = sqrt(1/n()*sum((log(z.calc +1)-log(z.chart +1))^2))
)

ggplot(smry_tpr_ppr, aes(Ppr, Tpr)) +
geom_tile(data=smry_tpr_ppr, aes(fill=MAPE), color="white") +
scale_fill_gradient2(low="blue", high="red", mid="yellow", na.value = "pink",
midpoint=12.5, limit=c(0, 25), name="MAPE") +
theme(axis.text.x = element_text(angle=45, vjust=1, size=11, hjust=1)) +
coord_equal() +
ggtitle("Hall-Yarborough", subtitle = "HY")


## Plotting the Tpr and Ppr values that show more error

This is just plotting the couple of isotherms where we see the largest errors.

library(dplyr)

sk_corr_all %>%
filter(Tpr %in% c("1.05", "1.1")) %>%
ggplot(aes(x = z.chart, y=z.calc, group = Tpr, color = Tpr)) +
geom_point(size = 3) +
geom_line(aes(x = z.chart, y = z.chart), color = "black") +
facet_grid(. ~ Tpr) +
geom_errorbar(aes(ymin=z.calc-abs(dif), ymax=z.calc+abs(dif)),
position=position_dodge(0.05))


## Looking numerically at the errors

Finally, the dataframe with the calculated errors between the z from the correlation and the z read from the chart:

as_tibble(smry_tpr_ppr)


## RMSE: Root Mean Square Error

$$RMSE = \sqrt {\frac {1}{n} \sum_{i=1}^{n} (y_i - \hat y_i)^2}$$

## Appendix

getStandingKatzMatrix is equivalent to using the sapply function with the internal function .z.HallYarborough (the dot means it's internal), which we call adding the prefix zFactor:::. That is, the package name and three colons.

# test HY with 1st-derivative using the values from the paper

ppr <- c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5)
tpr <- c(1.3, 1.5, 1.7, 2)

hy <- sapply(ppr, function(x)
sapply(tpr, function(y) zFactor:::.z.HallYarborough(pres.pr = x, temp.pr = y)))

rownames(hy) <- tpr
colnames(hy) <- ppr
print(hy)
`

## Try the zFactor package in your browser

Any scripts or data that you put into this service are public.

zFactor documentation built on Aug. 1, 2019, 5:04 p.m.