mean_sd: Calculate the mean within-event standard deviation across...
In eventreport: Diagnose, Visualize, and Aggregate Event Report Level Data

mean_sd

R Documentation

Calculate the mean within-event standard deviation across event reports for numeric variables

Description

This function calculates the mean within-event standard deviation for one or more numeric variables, grouped by an event identifier. It is useful for diagnosing aggregation sensitivity: it shows how much the numeric values reported for the same event vary across event reports.

Usage

mean_sd(data, group_var, variables)

Arguments

`data`	A data frame containing event report level data.
`group_var`	A character string naming the column that uniquely identifies events (e.g., "event_id").
`variables`	A character vector of column names to compute standard deviations for. All specified variables must be numeric.

Details

For each variable and event, the function computes the standard deviation of the values reported across that event's reports. These values are then averaged across all events to produce a single score per variable. The result is a long-format data frame that shows which numeric variables exhibit the most disagreement at the event report level.
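The two steps described above can be sketched in base R as follows. This is an illustrative re-implementation, not the package's source: the helper name `sketch_mean_sd` is hypothetical, and dropping events with a single report (whose standard deviation is `NA`) before averaging is an assumption that may not match how `mean_sd` treats them.

```r
# Hypothetical sketch of the computation; not the eventreport implementation.
sketch_mean_sd <- function(data, group_var, variables) {
  rows <- lapply(variables, function(v) {
    # Step 1: standard deviation of v within each event.
    # Events with only one report yield NA.
    sd_by_event <- tapply(data[[v]], data[[group_var]], sd)
    # Step 2: average the per-event values.
    # Assumption: NA values from single-report events are dropped.
    data.frame(variable = v, mean_sd = mean(sd_by_event, na.rm = TRUE))
  })
  # One row per variable (long format).
  do.call(rbind, rows)
}

reports <- data.frame(
  event_id = c(1, 1, 2, 2, 3),
  deaths_best = c(10, 20, 5, 15, 10)
)
result <- sketch_mean_sd(reports, "event_id", "deaths_best")
print(result)
```

Under these assumptions, events 1 and 2 each have a standard deviation of `sqrt(50)` (about 7.07), event 3 is skipped, and the resulting score for `deaths_best` is about 7.07.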

Value

A tibble with two columns:

variable: The name of each variable.
mean_sd: The mean standard deviation across events for that variable.

Examples

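library(eventreport)

# Five event reports covering three events; event 3 has a single report.
df <- data.frame(
  event_id = c(1, 1, 2, 2, 3),
  country = c("US", "US", "UK", "UK", "CA"),
  actor1 = c("Actor A", "Actor B", "Actor B", "Actor C", "Actor D"),
  deaths_best = c(10, 20, 5, 15, 10)
)

# Mean within-event standard deviation of the reported death counts.
mean_sd(
  df,
  group_var = "event_id",
  variables = c("deaths_best")
)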

eventreport documentation built on March 11, 2026, 1:07 a.m.