mean_dscore: Calculate the mean divergence scores across event reports
In eventreport: Diagnose, Visualize, and Aggregate Event Report Level Data

mean_dscore

R Documentation

Calculate the mean divergence scores across event reports

Description

This function calculates the mean divergence score for one or more variables, grouped by an event identifier. The divergence score captures how much the values of a given variable differ across event reports describing the same event.

Usage

mean_dscore(data, group_var, variables, normalize = FALSE, plot = FALSE)

Arguments

`data`	A data frame containing event report level data.
`group_var`	A character string naming the column that uniquely identifies events (e.g., "event_id").
`variables`	A character vector of column names to compute divergence scores for.
`normalize`	Logical, indicating whether to normalize the scores by the total number of unique values for each variable. Defaults to FALSE.
`plot`	Logical, indicating whether to return a ggplot object visualizing the scores instead of a tibble. Defaults to FALSE.

Details

For each variable and event, the function counts the unique values reported, subtracts one, and averages the results across all events. A score of 0 therefore means that every report of an event agrees on that variable, while higher scores indicate more inconsistency across sources. Optionally, the scores can be normalized by the total number of unique values observed for each variable across the dataset. The result is a long-format tibble showing which variables are most sensitive to aggregation. A plotting option is also available.
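
The computation described above can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper name sketch_mean_dscore is hypothetical, it returns a plain data frame rather than a tibble, and it assumes that normalization divides the mean score by the number of unique values observed for the variable in the whole dataset.

```r
# Hypothetical helper illustrating the documented definition;
# not the code used inside eventreport.
sketch_mean_dscore <- function(data, group_var, variables, normalize = FALSE) {
  scores <- vapply(variables, function(v) {
    # Per event: number of unique reported values, minus one
    per_event <- tapply(
      data[[v]], data[[group_var]],
      function(x) length(unique(x)) - 1
    )
    score <- mean(per_event)
    # Assumption: normalize by the variable's unique values across the dataset
    if (normalize) score <- score / length(unique(data[[v]]))
    score
  }, numeric(1))
  data.frame(variable = variables, dscore = unname(scores),
             stringsAsFactors = FALSE)
}

df <- data.frame(
  event_id = c(1, 1, 2, 2, 3),
  country = c("US", "US", "UK", "UK", "CA"),
  actor1 = c("Actor A", "Actor B", "Actor B", "Actor C", "Actor D"),
  deaths_best = c(10, 20, 5, 15, 10)
)
res <- sketch_mean_dscore(df, "event_id", c("country", "actor1", "deaths_best"))
res
```

In this data, country never varies within an event, so its score is 0. actor1 and deaths_best each take two distinct values in events 1 and 2 and one value in event 3, giving (1 + 1 + 0) / 3 = 2/3. Note that unique() counts NA as a distinct value in this sketch; how the package treats missing values is not stated here.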

Value

Either a tibble or a ggplot object, depending on the value of plot. If plot = FALSE, returns a tibble with two columns:

variable: The name of each variable.
dscore: The mean divergence score, or the normalized score if normalize = TRUE.

If plot = TRUE, returns a lollipop-style plot showing divergence scores by variable.
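
For readers unfamiliar with the term, a lollipop-style plot can be approximated with ggplot2 as follows. This is only a sketch of the general plot type: the scores data frame holds illustrative values, and the layout, ordering, and labels are assumptions that may differ from what mean_dscore actually draws.

```r
library(ggplot2)

# Illustrative scores, not output of mean_dscore()
scores <- data.frame(
  variable = c("country", "actor1", "deaths_best"),
  dscore = c(0, 2/3, 2/3)
)

# One stem (segment) and one head (point) per variable
p <- ggplot(scores, aes(x = dscore, y = reorder(variable, dscore))) +
  geom_segment(aes(x = 0, xend = dscore, yend = reorder(variable, dscore))) +
  geom_point(size = 3) +
  labs(x = "Mean divergence score", y = NULL)
p
```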

Examples

# Five event reports describing three events
df <- data.frame(
  event_id = c(1, 1, 2, 2, 3),
  country = c("US", "US", "UK", "UK", "CA"),
  actor1 = c("Actor A", "Actor B", "Actor B", "Actor C", "Actor D"),
  deaths_best = c(10, 20, 5, 15, 10)
)

# Normalized mean divergence scores, returned as a plot
mean_dscore(
  df, "event_id", c("country", "actor1", "deaths_best"),
  normalize = TRUE, plot = TRUE
)

eventreport documentation built on March 11, 2026, 1:07 a.m.