View source: R/share_disagreement.R
| share_disagreement | R Documentation |
This function calculates the proportion of events for which two or more distinct values are reported for each specified variable. It is useful for identifying which variables are most commonly inconsistent across event reports describing the same event.
share_disagreement(data, group_var, variables)
| data | A data frame containing event report level data. |
| group_var | A character string naming the column that uniquely identifies events (e.g., "event_id"). |
| variables | A character vector of column names to check for disagreement. |
For each event and variable, the function checks whether all values reported across event reports are identical. It then calculates the share of events for which at least two different values are reported. The result is a long-format data frame with one row per variable, showing which variables most frequently exhibit inter-source disagreement.
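The logic described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper name `share_disagreement_sketch`, the output column names `variable` and `share`, and the choice to count `NA` as a distinct value are all hypothetical.

```r
# Hypothetical helper illustrating the described logic (not the package's code).
share_disagreement_sketch <- function(data, group_var, variables) {
  shares <- vapply(variables, function(v) {
    # Count distinct reported values per event.
    # Assumption: NA is treated as a value like any other.
    n_distinct <- tapply(
      data[[v]], data[[group_var]],
      function(x) length(unique(x))
    )
    # Share of events with at least two distinct values.
    mean(n_distinct > 1)
  }, numeric(1))

  # Assumed column names; one row per checked variable.
  data.frame(
    variable = variables,
    share = unname(shares),
    stringsAsFactors = FALSE
  )
}

# Five event reports covering three events.
reports <- data.frame(
  event_id = c(1, 1, 2, 2, 3),
  actor1 = c("Actor A", "Actor B", "Actor B", "Actor B", "Actor C"),
  deaths_best = c(10, 10, 5, 15, 10)
)

result <- share_disagreement_sketch(
  reports,
  group_var = "event_id",
  variables = c("actor1", "deaths_best")
)
print(result)
```

In this toy data, only event 1 has conflicting `actor1` values and only event 2 has conflicting `deaths_best` values, so the sketch yields a share of 1/3 for each variable. An event with a single report, such as event 3, can never show disagreement but still counts in the denominator.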
A tibble with two columns:
The name of each variable.
The proportion of events with disagreement for that variable.
df <- data.frame(
  event_id = c(1, 1, 2, 2, 3),
  actor1 = c("Actor A", "Actor B", "Actor B", "Actor B", "Actor C"),
  deaths_best = c(10, 10, 5, 15, 10)
)
share_disagreement(
  df,
  group_var = "event_id",
  variables = c("actor1", "deaths_best")
)