# Dranchuk-Purvis-Robinson correlation In zFactor: Calculate the Compressibility Factor 'z' for Hydrocarbon Gases

```knitr::opts_chunk\$set(echo=T, comment=NA, error=T, warning=F, message = F, fig.align = 'center', results="hold")
```

## Why `DPR` is different

The `Dranchuk-Purvis-Robinson` correlation is different than the other correlations in the sense that the authors were looking at redrawing three of the isotherms of the original work by Standing-Katz: the curves at `Tpr=1.05`, `Tpr=1.1` and `Tpr=1.15`. The authors were expecting that later the corrections to be adopted with experimental data matching those two new curves. We haven't found a paper confirming that yet. [See @Dranchuk1973].

Other than that, the `DPR` correlation behaves pretty well but not much better than the `DAK` correlation.

## Get `z` at selected `Ppr` and `Tpr`

Use the corelation to calculate `z`. From the Standing-Katz chart we obtain `z` at a digitized point at the given `Tpr` and `Ppr`.

```# get a z value
library(zFactor)

ppr <- 1.5
tpr <- 2.0

z.calc <- z.DranchukPurvisRobinson(pres.pr = ppr, temp.pr = tpr)

# get a z value from the SK chart at the same Ppr and Tpr
z.chart <- getStandingKatzMatrix(tpr_vector = tpr,
pprRange = "lp")[1, as.character(ppr)]

# calculate the APE
ape <- abs((z.calc - z.chart) / z.chart) * 100

df <- as.data.frame(list(Ppr = ppr,  z.calc =z.calc, z.chart = z.chart, ape=ape))
rownames(df) <- tpr
df
# HY = 0.9580002; # DAK = 0.9551087
```

## Get `z` at selected `Ppr` and `Tpr=1.1`

From the Standing-Katz chart we obtain `z` at a digitized point:

```library(zFactor)

ppr <- 1.5
tpr <- 1.1
z.calc <- z.DranchukPurvisRobinson(pres.pr = ppr, temp.pr = tpr)

# From the Standing-Katz chart we obtain a digitized point:
z.chart <- getStandingKatzMatrix(tpr_vector = tpr,
pprRange = "lp")[1, as.character(ppr)]

# calculate the APE
ape <- abs((z.calc - z.chart) / z.chart) * 100

df <- as.data.frame(list(Ppr = ppr,  z.calc =z.calc, z.chart = z.chart, ape=ape))
rownames(df) <- tpr
df
```

At lower `Tpr` we can see that there is some difference between the values of z from the DPR calculation and the value read from the Standing-Katz chart.

## Get values of `z` for combinations of `Ppr` and `Tpr`

In this example we provide vectors instead of a single point. With the same `ppr` and `tpr` vectors that we use for the correlation, we do the same for the `Standing-Katz` chart. We want to compare both and find the `absolute percentage error`.

```# test HY with 1st-derivative using the values from paper
ppr <- c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5)
tpr <- c(1.05, 1.1, 1.7, 2)

# calculate using the correlation
z.calc <- z.DranchukPurvisRobinson(ppr, tpr)

# With the same ppr and tpr vector, we do the same for the Standing-Katz chart
z.chart <- getStandingKatzMatrix(ppr_vector = ppr, tpr_vector = tpr)
ape <- abs((z.calc - z.chart) / z.chart) * 100

# calculate the APE
cat("z.correlation \n"); print(z.calc)
cat("\n z.chart \n"); print(z.chart)
cat("\n APE \n"); print(ape)
```

You can see errors of `r round(ape["1.05", "1.5"],2)`% and `r round(ape["1.05", "2.5"],2)`% in the isotherm `Tpr=1.05`. Other errors, greater than one, can also be found at the isotherm `Tpr=1.1`: `r round(ape["1.1", "2.5"],2)`%. Then, the rest of the `Tpr` curves are fine.

## Analyze the error at the `isotherms`

Applying the function `summary` over the transpose of the matrix:

```sum_t_ape <- summary(t(ape))
sum_t_ape
```

We see that the errors at `Tpr=1.05` in `z` are considerable with a `r sum_t_ape[1,1]`% and `r sum_t_ape[6,1]`%, and a `r sum_t_ape[1,2]`% and `r sum_t_ape[6,2]`% for `Tpr=1.10`. We will explore later a comparative tile chart where we confirm these early calculations.

## Analyze the error for greater values of `Tpr`

```library(zFactor)
# enter vectors for Tpr and Ppr
tpr2 <- c(1.2, 1.3, 1.5, 2.0, 3.0)
ppr2 <- c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5)
# get z values from the SK chart
z.chart <- getStandingKatzMatrix(ppr_vector = ppr2, tpr_vector = tpr2, pprRange = "lp")

# We do the same with the DPR correlation:
# calculate z values at lower values of Tpr
z.calc <- z.DranchukPurvisRobinson(pres.pr = ppr2, temp.pr = tpr2)
ape <- abs((z.calc - z.chart) / z.chart) * 100

# calculate the APE
cat("z.correlation \n"); print(z.calc)
cat("\n z.chart \n"); print(z.chart)
cat("\n APE \n"); print(ape)
```

We observe that at `Tpr` above or equal to 1.2 the `DAK` correlation behaves very well.

## Analyze the error at the `isotherms`

Applying the function `summary` over the transpose of the matrix to observe the error of the correlation at each isotherm.

```sum_t_ape <- summary(t(ape))
sum_t_ape
# Hall-Yarborough
#      1.2               1.3              1.5               2
# Min.   :0.05224   Min.   :0.1105   Min.   :0.1021   Min.   :0.0809
# 1st Qu.:0.09039   1st Qu.:0.2080   1st Qu.:0.1623   1st Qu.:0.1814
# Median :0.28057   Median :0.3181   Median :0.1892   Median :0.1975
# Mean   :0.30122   Mean   :0.3899   Mean   :0.2597   Mean   :0.2284
# 3rd Qu.:0.51710   3rd Qu.:0.5355   3rd Qu.:0.3533   3rd Qu.:0.2627
# Max.   :0.57098   Max.   :0.8131   Max.   :0.5162   Max.   :0.4338
#       3
# Min.   :0.09128
# 1st Qu.:0.17466
# Median :0.35252
# Mean   :0.34820
# 3rd Qu.:0.52184
# Max.   :0.59923
```

We see that the errors in `z` with `DPR` are far lower than `Hall-Yarborough` with a `r sum_t_ape[1,1]`% and `r sum_t_ape[6,1]`% for `Tpr=1.2`, and a `r sum_t_ape[1,2]`% and `r sum_t_ape[6,2]`% for `Tpr=1.3`.

## Prepare to plot `SK` vs `DPR` correlation

The error bars represent the difference between the calculated values by the `DPR` corrrelation and the `z` values read from the Standing-Katz chart.

```library(zFactor)
library(tibble)
library(ggplot2)

tpr2 <- c(1.05, 1.1, 1.2, 1.3)
ppr2 <- c(0.5, 1.0, 1.5, 2, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0, 5.5, 6.0, 6.5)

sk_corr_2 <- createTidyFromMatrix(ppr2, tpr2, correlation = "DPR")
as_tibble(sk_corr_2)

p <- ggplot(sk_corr_2, aes(x=Ppr, y=z.calc, group=Tpr, color=Tpr)) +
geom_line() +
geom_point() +
geom_errorbar(aes(ymin=z.calc-dif, ymax=z.calc+dif), width=.4,
position=position_dodge(0.05))
print(p)
```

We observe that with `Dranchuk-Purvis-Robinson` there are still errors or differences between the `z` values in the Standing-Katz chart and the values obtained from the correlation but they are not so bad as in the HY correlation.

## Analysis at the lowest `Tpr`

This is the isotherm where we see the greatest error.

```library(zFactor)

sk_corr_3 <- sk_corr_2[sk_corr_2\$Tpr==1.05,]
sk_corr_3

p <- ggplot(sk_corr_3, aes(x=Ppr, y=z.calc, group=Tpr, color=Tpr)) +
geom_line() +
geom_point() +
geom_errorbar(aes(ymin=z.calc-dif, ymax=z.calc+dif), width=.2,
position=position_dodge(0.05))
print(p)
```

## Analyzing performance of the `DPR` correlation for all the `Tpr` curves

In this last example, we compare the values of `z` at all the isotherms. We use the function `getStandingKatzTpr` to obtain all the isotherms or `Tpr` curves in the Standing-Katz chart that have been digitized. The next function `createTidyFromMatrix` calculate `z` using the correlation and prepares a tidy dataset ready to plot.

```library(ggplot2)
library(tibble)

# get all `lp` Tpr curves
tpr_all <- getStandingKatzTpr(pprRange = "lp")
ppr <- c(0.5, 1.5, 2.5, 3.5, 4.5, 5.5, 6.5)

sk_corr_all <- createTidyFromMatrix(ppr, tpr_all, correlation = "DPR")
as_tibble(sk_corr_all)

p <- ggplot(sk_corr_all, aes(x=Ppr, y=z.calc, group=Tpr, color=Tpr)) +
geom_line() +
geom_point() +
geom_errorbar(aes(ymin=z.calc-dif, ymax=z.calc+dif), width=.4,
position=position_dodge(0.05))
print(p)
```

The greatest errors are localized in two of the Tpr curves: at 1.05 and 1.1.

## Range of applicability of the correlation

```# MSE: Mean Squared Error
# RMSE: Root Mean Squared Error
# RSS: residual sum of square
# ARE:  Average Relative Error, %
# AARE: Average Absolute Relative Error, %
library(dplyr)
grouped <- group_by(sk_corr_all, Tpr, Ppr)

smry_tpr_ppr <- summarise(grouped,
RMSE= sqrt(mean((z.chart-z.calc)^2)),
MPE = sum((z.calc - z.chart) / z.chart) * 100 / n(),
MAPE = sum(abs((z.calc - z.chart) / z.chart)) * 100 / n(),
MSE = sum((z.calc - z.chart)^2) / n(),
MAE = sum(abs(z.calc - z.chart)) / n(),
RMLSE = sqrt(1/n()*sum((log(z.calc +1)-log(z.chart +1))^2))
)

ggplot(smry_tpr_ppr, aes(Ppr, Tpr)) +
geom_tile(data=smry_tpr_ppr, aes(fill=MAPE), color="white") +
scale_fill_gradient2(low="blue", high="red", mid="yellow", na.value = "pink",
midpoint=12.5, limit=c(0, 25), name="MAPE") +
theme(axis.text.x = element_text(angle=45, vjust=1, size=11, hjust=1)) +
coord_equal() +
ggtitle("Dranchuk-Purvis-Robinson", subtitle = "DPR")
```

The greatest errors are localized in two of the Tpr curves: 1.05 and barely at 1.1

## Plotting the `Tpr` and `Ppr` values that show more error

```library(dplyr)

sk_corr_all %>%
filter(Tpr %in% c("1.05", "1.1")) %>%
ggplot(aes(x = z.chart, y=z.calc, group = Tpr, color = Tpr)) +
geom_point(size = 3) +
geom_line(aes(x = z.chart, y = z.chart), color = "black") +
facet_grid(. ~ Tpr) +
geom_errorbar(aes(ymin=z.calc-abs(dif), ymax=z.calc+abs(dif)),
position=position_dodge(0.5))
```

## Looking numerically at the errors

Finally, the dataframe with the calculated errors between the `z` from the correlation and the `z` read from the chart:

```as_tibble(smry_tpr_ppr)
```

## Try the zFactor package in your browser

Any scripts or data that you put into this service are public.

zFactor documentation built on Aug. 1, 2019, 5:04 p.m.