ccc: Concordance correlation coefficient
In yardstick: Tidy Characterizations of Model Performance

ccc	R Documentation

Concordance correlation coefficient

Description

Calculate the concordance correlation coefficient.

Usage

ccc(data, ...)

## S3 method for class 'data.frame'
ccc(
  data,
  truth,
  estimate,
  bias = FALSE,
  na_rm = TRUE,
  case_weights = NULL,
  ...
)

ccc_vec(truth, estimate, bias = FALSE, na_rm = TRUE, case_weights = NULL, ...)

Arguments

`data`	A `data.frame` containing the columns specified by the `truth` and `estimate` arguments.
`...`	Not currently used.
`truth`	The column identifier for the true results (that is `numeric`). This should be an unquoted column name, although this argument is passed by expression and supports quasiquotation (you can unquote column names). For `_vec()` functions, a `numeric` vector.
`estimate`	The column identifier for the predicted results (that is also `numeric`). As with `truth`, this can be specified in different ways, but the primary method is to use an unquoted variable name. For `_vec()` functions, a `numeric` vector.
`bias`	A `logical`; should the biased estimate of variance be used (as in Lin (1989))?
`na_rm`	A `logical` value indicating whether `NA` values should be stripped before the computation proceeds.
`case_weights`	The optional column identifier for case weights. This should be an unquoted column name that evaluates to a numeric column in `data`. For `_vec()` functions, a numeric vector, `hardhat::importance_weights()`, or `hardhat::frequency_weights()`.

Details

ccc() is a metric of both consistency/correlation and accuracy, while metrics such as rmse() are strictly for accuracy and metrics such as rsq() are strictly for consistency/correlation.

CCC is a metric that should be maximized. The output ranges from -1 to 1, with 1 indicating perfect agreement.
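
The distinction between consistency and accuracy can be illustrated with a small made-up example. The sketch below assumes yardstick is installed and uses toy vectors that are not part of the package: the predictions track the truth perfectly but are offset by a constant of 2. It compares the three metrics through their vector interfaces.

```r
library(yardstick)

# Toy data (illustrative only): predictions are perfectly correlated
# with the truth but shifted upward by a constant of 2.
truth <- c(1, 2, 3, 4, 5)
estimate <- truth + 2

rsq_shifted <- rsq_vec(truth, estimate)   # consistency only
rmse_shifted <- rmse_vec(truth, estimate) # accuracy only
ccc_offset <- ccc_vec(truth, estimate)    # both at once

c(rsq = rsq_shifted, rmse = rmse_shifted, ccc = ccc_offset)
```

Here rsq_vec() returns 1 because the correlation is perfect, rmse_vec() returns 2 because every prediction is off by 2, and ccc_vec() returns 5/9 (about 0.556), since the offset is penalized even though the correlation is perfect.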

The formula for CCC is:

\text{CCC} = \frac{2 \cdot \text{cov}(\text{truth}, \text{estimate})}{\text{var}(\text{truth}) + \text{var}(\text{estimate}) + (\bar{\text{truth}} - \bar{\text{estimate}})^2}
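
As a rough check on the formula, it can be transcribed into base R. This is a minimal sketch, not the package's implementation: ccc_manual() is a hypothetical helper, it ignores missing values and case weights, and it assumes that bias = TRUE simply rescales the sample variances and covariance from an n - 1 denominator to an n denominator, following Lin (1989).

```r
# Hypothetical helper (not part of yardstick): a direct transcription of
# the CCC formula for two numeric vectors of equal length.
ccc_manual <- function(truth, estimate, bias = FALSE) {
  stopifnot(
    is.numeric(truth),
    is.numeric(estimate),
    length(truth) == length(estimate)
  )

  n <- length(truth)
  var_truth <- var(truth)
  var_estimate <- var(estimate)
  covariance <- cov(truth, estimate)

  # Assumption: the biased version divides by n instead of n - 1.
  if (bias) {
    rescale <- (n - 1) / n
    var_truth <- var_truth * rescale
    var_estimate <- var_estimate * rescale
    covariance <- covariance * rescale
  }

  mean_gap <- mean(truth) - mean(estimate)
  2 * covariance / (var_truth + var_estimate + mean_gap^2)
}

x <- c(1, 2, 3)
ccc_identical <- ccc_manual(x, x)                        # perfect agreement
ccc_reversed <- ccc_manual(x, rev(x))                    # perfect disagreement
ccc_shifted <- ccc_manual(x, x + 1)                      # correlated but offset
ccc_shifted_biased <- ccc_manual(x, x + 1, bias = TRUE)
```

For these toy vectors the results are 1, -1, 2/3, and 4/7 respectively. The last two show that the choice of variance estimator changes the value whenever the means differ, because the squared mean difference in the denominator is not rescaled.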

Value

A tibble with columns .metric, .estimator, and .estimate and 1 row of values.

For grouped data frames, the number of rows returned will be the same as the number of groups.

For ccc_vec(), a single numeric value (or NA).

Author(s)

Max Kuhn

References

Lin, L. (1989). A concordance correlation coefficient to evaluate reproducibility. Biometrics, 45(1), 255-268.

Nickerson, C. (1997). A note on "A concordance correlation coefficient to evaluate reproducibility". Biometrics, 53(4), 1503-1507.

Examples

# Supply truth and predictions as bare column names
ccc(solubility_test, solubility, prediction)

library(dplyr)

set.seed(1234)
size <- 100
times <- 10

# create 10 resamples
solubility_resampled <- bind_rows(
  replicate(
    n = times,
    expr = sample_n(solubility_test, size, replace = TRUE),
    simplify = FALSE
  ),
  .id = "resample"
)

# Compute the metric by group
metric_results <- solubility_resampled |>
  group_by(resample) |>
  ccc(solubility, prediction)

metric_results

# Resampled mean estimate
metric_results |>
  summarise(avg_estimate = mean(.estimate))

# Using a different value of 'bias'... if you are adding the metric to a
# metric set, you can create a new metric function with the updated argument
# value:

ccc_bias <- metric_tweak("ccc_bias", ccc, bias = TRUE)
multi_metrics <- metric_set(ccc, ccc_bias)
multi_metrics(solubility_test, solubility, prediction)

yardstick documentation built on April 8, 2026, 1:06 a.m.