\FloatBarrier
This file analyzes data derived from publicly archived data files (Campbell, 2019) to produce outputs for a descriptive paper about the criminal histories of suspected serial sexual perpetrators.^[In this document, we use the terms perpetrator and offender interchangeably. Both terms should be interpreted as referring to individuals suspected of committing a crime but we occasionally omit that qualifier for the sake of brevity.] The data come from Detroit, MI.
\FloatBarrier
Numerous sections below contain output for GEE models and estimates derived from them, such as estimated marginal means and contrasts for ratios of those means. We performed those analyses because we think the contrasts provide useful insights into how much offenders' rates of criminal activity differed between the periods before versus after the testing window. Reviewers found some of them less relevant than we did. We have left those results in the research compendium despite having removed some of them from the final paper because readers may still find them interesting and relevant. Preserving them facilitates achieving transparency regarding all the analyses we conducted.
In particular note that we ran models that compared incident counts with and without adjustments for period duration. Only the adjusted comparisons remain in the published paper.
Set global R chunk options (local chunk options will over-ride global options).
The method for creating a size option that controls font size in code chunks and
their text output is based on an answer to a question posted on
stackoverflow.com.
``` {r global_options, cfsize = "footnotesize"}
def.chunk.hook <- knitr::knit_hooks$get("chunk") knitr::knit_hooks$set(chunk = function(x, options) { x <- def.chunk.hook(x, options) ifelse(options$cfsize != "normalsize", paste0("\n \", options$cfsize,"\n\n", x, "\n\n \normalsize"), x) })
knitr::opts_chunk$set(include = TRUE, echo = TRUE, error = TRUE, message = TRUE, warning = TRUE, cfsize = "footnotesize")
here::i_am(path = "inst/Step_02_Analysis.Rmd")
Load R packages that we need to get additional functions. ``` {r load_packages} library(here) # for here() library(plyr) # For mapvalues() library(dplyr) # for %>%, filter(), group_by(), & summarise() library(tidyr) # for arrange(), filter(), group_by(), mutate(), # spread(), summarise(), %>%, etc. library(rmarkdown) # for render() library(knitr) # for kable() options(kableExtra.latex.load_packages = FALSE) library(kableExtra) # for kable_styling(), add_header_above(), column_spec(), # collapse_rows(), and landscape() library(descr) # For freq(). options(descr.plot=FALSE) # Make freq() & crosstab() skip plots by default. library(lubridate) # For date conversion, eg. ymd(), time_length(). library(sjlabelled) # For set_label(), get_label() library(haven) # for read_spss() library(lattice) # For xyplot(), bwplot(), etc. library(latticeExtra) # for layer() library(psych) # For describe() library(car) # For recode() library(SSACHR) # for git_report(), rvlabel(), which_latex() library(geepack) # for geeglm() library(emmeans) # for emmeans() library(texreg) # for texreg() library(ggplot2) # for ggplot() library(ggdist) # for stat_halfeye()
Create objects to hold settings we plan to reuse often.
# Custom settings for use with bwplot(). my.boxes <- list(box.rectangle = list(col = "black", lwd = 2), box.umbrella = list(col = "black", lwd = 2))
\FloatBarrier
The chunk below loads the data saved by the prior script Step_01_Data_Mgt.Rmd.
load(file = here::here("data/CHR_Data.RData"))
\FloatBarrier
We have few offender-level demographic variables that are worth reporting (we will omit summarizing height, weight, hair color, & eye color). After examining sex, race, and age, we switch focus to looking at descriptive data about the criminal history records for arrests, prosecutor charges, and adjudicated charges.
# Store objects with sample sizes we will reuse often later. N_All <- IDNEW %>% nrow() N_Before <- IDNEW %>% filter(Years_Before > 0) %>% nrow() N_During <- IDNEW %>% filter(Years_During > 0) %>% nrow() N_After <- IDNEW %>% filter(Years_After > 0) %>% nrow() data.frame(N_All, N_Before, N_During, N_After)
All the CHR summaries in this section focus on variables that have been
aggregated to the offender level. That is why the sample size for all of these
variables is N = r format(N_All, big.mark = ",")
(the number of offenders for
whom Michigan CHR data were available and the start date for the earliest SAK
testing window could be established).
\FloatBarrier
kable(freq(as_factor(IDNEW$Sex)), format = "latex", booktabs = TRUE, digits = 2, caption = "Offender Sex")
\FloatBarrier
kable(freq(as_factor(IDNEW$Race), user.missing = "Unknown"), format = "latex", booktabs = TRUE, digits = 2, caption = "Offender Race")
\FloatBarrier
Table \ref{tab:examine_Age} presents descriptive statistics for offender age at the start of the earliest testing window and at the time of CHR collection. Figure \ref{fig:boxplot_Age} shows boxplots of the distributions for both of these age variables.
CNames <- c("Variable", "N", "Mean", "SD", "Min", "Max", "Skew", "Kurtosis", "Q25", "Q50", "Q75") IDNEW %>% select(AgeWDate, Age) %>% describe(., quant=c(.25, .50, .75)) %>% mutate(Variable = c("Age at earliest SAK", "Age at CHR collection")) %>% select(Variable, n, mean, sd, min, max, skew, kurtosis, Q0.25, Q0.5, Q0.75) %>% kable(., format = "latex", booktabs = TRUE, digits = 2, col.names = CNames, row.names = FALSE, caption = "Offender Age")
# Figure caption. FCap <- paste("\\label{fig:boxplot_Age}", "Box Plots for Offender Age. Only offenders with valid testing", "window start dates are included. The red line marks age 16", "years (the start of observed adult CHR data). CHR collection", "occurred on 04/15/2016 for all offfenders.", "CHR, criminal history record; SAK, sexual assault kit.") IDNEW %>% select(OID, AgeWDate, Age) %>% # Rearrange data so we can easily plot multiple variables. pivot_longer(cols = -OID, names_to = "VName", values_to = "Age") %>% mutate(VName = ordered(VName, levels = c("AgeWDate", "Age"), labels = c("At earliest SAK", "At CHR collection"))) %>% bwplot(VName ~ Age, data = ., factor = 1, xlim = c(-2, 82), panel = panel.mybox, par.settings = my.boxes, xlab = "Offender Age") + latticeExtra::layer(panel.abline(v = 16, col.line = "red"))
\FloatBarrier
The observed portions of these offenders' adult criminal histories include incident, arrest, prosecutor charge, and judicial charge records associated with crimes that occurred between the offender's 16th birthday and April 15, 2016 (when the CHR data were collected from the official statewide CHR database). That time frame can be divided into three important periods for the purposes of this study: before, during, and after the testing window associated with the offender's earliest sexual assault kit (SAK). Here we provide descriptive data about those periods. Note that the relative timing of an offender's earliest SAK relative to their 16th birthday controls how much of the before and during periods were observed versus unobserved. Figures \ref{fig:boxplot_Age16Date} and \ref{fig:boxplot_WDate} respectively show boxplots for the date of the offenders 16th birthday and the start date of the earliest testing window (which is the date the earliest SAK was collected).
# Figure caption. FCap <- paste("\\label{fig:boxplot_Age16Date}", "Box Plot for Date of Offender's 16th Birthday. Only offenders", "with valid testing window start dates are included.", "These dates mark the start of the observed adult criminal", "history record for each offender.") bwplot(~ Age16Date, data = IDNEW, factor = 9, xlim = c(ymd("1948-01-01"), ymd("2012-12-31")), panel = panel.mybox, par.settings = my.boxes, xlab = "Offender's 16th Birthday")
# Figure caption. FCap <- paste("\\label{fig:boxplot_WDate}", "Box Plot for Start Date of Earliest Testing Window. Only", "offenders with valid testing window start dates are included.") bwplot(~ WDate, data = IDNEW, factor = 9, xlim = c(ymd("1984-01-01"), ymd("2016-12-31")), panel = panel.mybox, par.settings = my.boxes, xlab = "Start of Earliest Testing Window")
Those dates were used in the previous data management script to compute the durations of the observed portions of adult CHR that correspond to the periods before, during, and after the earliest testing window. Table \ref{tab:describe_windows} shows descriptive statistics for the these durations and Figure \ref{fig:boxplot_windows} shows corresponding boxplots. These provide crucial context for interpreting the numbers of crimes observed in each of those periods and provide the denominators for translating them into incidence rates.
# Table caption. TCap <- paste("Observed Period Durations (Years)") # Footnote text. FN <- paste0("Only offenders with valid testing window start dates are ", "included. ", "TW, earliest testing window.") rlabs <- c("Before TW", "During TW", "After TW") IDNEW %>% select(Years_Before, Years_During, Years_After) %>% describe(., quant=c(.25, .50, .75)) %>% bind_cols(data.frame(Variable = rlabs), .) %>% select(Variable, n, mean, sd, min, max, skew, kurtosis, Q0.25, Q0.5, Q0.75) %>% kable(., format = "latex", booktabs = TRUE, digits = 2, row.names = FALSE, col.names = CNames, caption = TCap) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
# Figure caption. FCap <- paste("\\label{fig:boxplot_windows}", "Box Plots for Duration of Observed Portions of Periods.", "Only offenders with valid testing", "window start dates are included. The red line marks age 16", "years, which is the start of observed adult CHR, which was ", "collected on 04/15/2016.", "TW, earliest testing window.") IDNEW %>% select(OID, Years_Before, Years_During, Years_After) %>% # Rearrange data so we can easily plot multiple variables. pivot_longer(cols = -OID, names_to = "VName", values_to = "Duration") %>% mutate(VName = ordered(VName, levels = rev(c("Years_Before", "Years_During", "Years_After")), labels = rev(rlabs))) %>% bwplot(VName ~ Duration, data = ., factor = 1, xlim = c(-2, 42), panel = panel.mybox, par.settings = my.boxes, xlab = "Observed Period Duration (Years)")
We aggregated data to count how many CHR records of each type are associated with each offender. Here we are counting only records associated with incidents for which the offender was arrested for, charged with, or convicted of at least one offense from the 12 main crime categories. We summarize the resulting variables (NINC12, NARR12, NCHG12, NJUD12, & NCON12) below. Table \ref{tab:describe_rcvars} provides overall descriptive statistics for all record types, plus summaries broken down by when the associated incidents occurred relative to the testing window.
Only the data about incident records counts from Table \ref{tab:describe_rcvars} is directly reported in the manuscript. The numbers about other types of records are contextual information about the volume of data that was available.
# Table caption. TCap <- paste("Offender-Level Record Counts (All Crime Categories, Overall and", "By When Incident Occurred Relative to Testing Window)") # Footnote text. FN <- paste("Only offenders with valid testing window start dates are included.", "Only records associated with incidents where the offender was", "arrested for, charged with, or convicted of at least one offense", "from one of the 12 main crime categories were counted.", "Before, during, and after refer to when the incident associated", "with the record occurred, not when the arrest date, charge date,", "adjudication date, or conviction date occurred relative to the", "testing window.") BDA <- c("...Before testing window", "...During testing window", "...After testing window") rclabs <- c("No. of incidents", BDA, "No. of arrest offenses", BDA, "No. of prosecutor charges", BDA, "No. of adjudicated charges (all)", BDA, "No. of adjudicated charges (convictions)", BDA) # Summarize CHR record counts (both overall and broken down by IWhen). IDNEW %>% rename(NINC12 = HXCat12_Any, NINC12_Before = HXCat12_Any_Before, NINC12_During = HXCat12_Any_During, NINC12_After = HXCat12_Any_After) %>% select(NINC12, NINC12_Before, NINC12_During, NINC12_After, NARR12, NARR12_Before, NARR12_During, NARR12_After, NCHG12, NCHG12_Before, NCHG12_During, NCHG12_After, NJUD12, NJUD12_Before, NJUD12_During, NJUD12_After, NCON12, NCON12_Before, NCON12_During, NCON12_After) %>% describe(., quant=c(.25, .50, .75)) %>% bind_cols(data.frame(Variable = rclabs), .) %>% select(Variable, n, mean, sd, min, max, skew, kurtosis, Q0.25, Q0.5, Q0.75) %>% kable(., format = "latex", booktabs = TRUE, digits = 2, row.names = FALSE, col.names = CNames, caption = TCap) %>% group_rows(" ", 1, 4) %>% group_rows(" ", 5, 8) %>% group_rows(" ", 9, 12) %>% group_rows(" ", 13, 16) %>% group_rows(" ", 17, 20) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
# Vectors of variables names and labels. rcvars <- c("NINC12", "NARR12", "NCHG12", "NJUD12", "NCON12") rctypes <- c("Incidents", "Arrest Offenses", "Prosecutor Charges", "Adjudicated Charges", "Convictions") IDNEW %>% rename(NINC12 = HXCat12_Any) %>% select(OID, NINC12, NARR12, NCHG12, NJUD12, NCON12) %>% # Rearrange data so we can easily plot multiple variables. pivot_longer(cols = -OID, names_to = "VName", values_to = "Count") %>% mutate(VName = ordered(VName, levels = rcvars, labels = rctypes)) -> RCCOUNTS
# Figure caption. FCap <- paste("\\label{fig:boxplot_RCCOUNTS}", "Box Plots for Overall Record Counts by Criminal History Record", "Type. Only offenders with valid testing window start dates are", "included.", "Only records associated with incidents where the offender was", "arrested for, charged with, or convicted of at least one offense", "from one of the 12 main crime categories were counted.") bwplot(VName ~ Count, data = RCCOUNTS, factor = 1, xlim = c(-4, 64), panel = panel.mybox, par.settings = my.boxes, xlab = "Overall Number of Records for Individual Offender")
``` {r densityplot_RCCOUNTS, fig.height = 3, fig.width = 8.5, fig.cap=FCap}
FCap <- paste("\label{fig:densityplot_RCCOUNTS}", "Density Plots for Overall Record Counts by Criminal History", "Record Type. Only offenders with valid testing window start", "dates are included.", "Only records associated with incidents where the offender was", "arrested for, charged with, or convicted of at least one offense", "from one of the 12 main crime categories were counted.", "Dashed red lines are normal distribution curves (for comparison).")
densityplot(~ Count | VName, data = RCCOUNTS, layout = c(5,1), panel = panel.mydensity, jitter.amount = 0.005, strip = strip.custom(par.strip.text = list(cex = .8)), xlab = "Overall Number of Records for Offender")
In the subsections below, Tables \ref{tab:freq_NINC12}, \ref{tab:freq_NARR12}, \ref{tab:freq_NCHG12}, \ref{tab:freq_NJUD12}, and \ref{tab:freq_NCON12} respectively show the frequency distributions underlying those statistics for incidents, arrest offenses, prosecutor charges, all adjudicated charges, and adjudicated charges that were convictions. \FloatBarrier ### Incident Records ```r # Table caption. TCap <- paste("Frequency Distributions for Number of Incidents", "Overall and By When Incident Occurred") # Footnote text. FN <- paste0("Only offenders with valid testing window start dates are ", "included. Only incidents where the offender was arrested for, ", "charged with, or convicted of at least one offense from one of ", "the 12 main crime categories were included. ", "N = ", format(N_All, big.mark = ","), ". TW, earliest ", "testing window.") # Get frequency distributions. IDNEW %>% rename(NINC12 = HXCat12_Any, NINC12_Before = HXCat12_Any_Before, NINC12_During = HXCat12_Any_During, NINC12_After = HXCat12_Any_After) %>% select(OID, NINC12, NINC12_Before, NINC12_During, NINC12_After) %>% rename(All = NINC12, Before = NINC12_Before, During = NINC12_During, After = NINC12_After) %>% pivot_longer(cols = -OID, names_to = "When", values_to = "Value") %>% group_by(When, Value) %>% count() %>% arrange(Value) %>% pivot_wider(names_from = When, values_from = n, values_fill = 0) %>% mutate(All_p = 100*All/N_All, All_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*All/N_All), Before_p = 100*Before/N_All, Before_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*Before/N_Before), During_p = 100*During/N_All, During_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*During/N_During), After_p = 100*After/N_All, After_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*After/N_After)) %>% select(Value, All, All_p, All_v, Before, Before_p, Before_v, During, During_p, During_v, After, After_p, After_v) %>% kable(format = "latex", booktabs = TRUE, digits = 2, col.names = c("Value", rep(c("N", "%", "Valid %"), 4)), caption = TCap) %>% kable_styling() %>% add_header_above(c(" " = 1, "Overall" = 3, "Before TW" = 3, "During TW" = 3, "After TW" = 3)) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
To more closely examine the incident counts, we fitted two generalized estimating equation (GEE) models to the long-format IDNEWL data, after dropping rows for unobserved periods (Years = 0) because the counts are missing for those observations.
Model 1 used When as a predictor of the period-specific counts, but did not adjust for the period durations. Model 2 supplemented When as a predictor with an offset term for log(Years) to adjust for the period durations. Both models used Poisson distributions, an exchangeable correlation structure, a log link function, and robust standard errors based on the sandwich estimator.
HXCat12_Any_gee1 <- IDNEWL %>% # Keep only observations for observed periods (i.e., those with Years > 0). filter(Years > 0) %>% geeglm(HXCat12_Any ~ When, family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se") HXCat12_Any_gee2 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_Any ~ When + offset(log(Years)), family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se")
Table \ref{tab:HXCat12_Any_texreg} below shows the parameter estimates for both GEE models.
TCap <- paste("Generalized Estimating Equation Models for Incident Counts", "Predicted by When Incidents Occurred") FN <- paste ("\\item * p < .05, null hypothesis value outside 95\\% confidence", "interval based on z-score.", "\\\\\n\\item Note.", "The data contain up to 3 longitudinal observations (one per", "period) for each offender (i.e., cluster).", "The reference level for When was the period after the testing", "window.", "Both models used a log link function, exchangeable correlation", "structure, a Poisson distribution, and robust standard errors", "(sandwich estimator).", "Model 1 did not use an offset.", "Model 2 used log(Years) as an offset term.") texreg(list(HXCat12_Any_gee1, HXCat12_Any_gee2), booktabs = TRUE, dcolumn = TRUE, threeparttable = TRUE, fontsize = "normalsize", table = TRUE, use.packages = FALSE, ci.force = TRUE, label = "tab:HXCat12_Any_texreg", stars = 0.05, caption = TCap, custom.note = FN)
\FloatBarrier
Table \ref{tab:HXCat12_Any_emmeans} provides the estimated marginal means obtained from Models 1 and 2, along with corresponding confidence intervals. The units of measurement for those means are counts per offender for Model 1 (which ignore period duration) and counts per offender-year for Model 2 (which adjust for duration and are thus considered annual incidence rates).
# Table caption. TCap <- paste("Marginal Means of Incident Counts and Incidence Rates", "By When Incidents Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "Confidence intervals are based on z-scores.") HXCat12_Any_emmeans1 <- emmeans(HXCat12_Any_gee1, specs = pairwise ~ When, type = "response", ref = "During") HXCat12_Any_emmeans2 <- emmeans(HXCat12_Any_gee2, specs = pairwise ~ When, type = "response", ref = "During") as_tibble(HXCat12_Any_emmeans1$emmeans) %>% full_join(x = ., y = as_tibble(HXCat12_Any_emmeans2$emmeans)) %>% mutate(Model = c(1, 1, 1, 2, 2, 2), When = factor(When, levels = c("Before", "During", "After"))) %>% arrange(Model, When) %>% select(When, rate, SE, df, asymp.LCL, asymp.UCL) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("When", "Rate", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Mean per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Mean per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Finally, we also estimated a set of contrasts to compare the marginal means from Models 1 and 2 (Table \ref{tab:HXCat12_Any_ratios}). Each of these contrasts estimates the ratios of a pair of means rather than a raw difference between means. This is because of the nature of the underlying Poisson GEE model.
TCap <- paste("Contrasts Estimating Ratios of Marginal Means for Incident", "Counts and Incidence Rates By When Incidents Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "These contrasts estimate ratios of those quantities and used", "Tukey's method to adjust for multiple comparisons.", "Confidence intervals are based on z-scores.") as_tibble(confint(HXCat12_Any_emmeans1$contrasts)) %>% full_join(x = ., y = as_tibble(confint(HXCat12_Any_emmeans2$contrasts))) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("Contrast", "Ratio", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Ratios of means per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Ratios of means per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Output in this section is not directly used in the manuscript. It is here because it helped us to understand the volume of raw data that was being processed along the way to constructing the data sets used for the manuscript.
# Table caption. TCap <- paste("Frequency Distributions for Number of Arrest Offenses (All", "Crime Categories) Overall and By When Incident Occurred") # Footnote text. FN <- paste0("N = ", format(N_All, big.mark = ","), ". TW, earliest ", "testing window. ", "Only arrest records for offenders with valid testing window start ", "dates and with offenses from the 12 main crime categories were ", "included. ", "Before, during, and after refer to when the incident associated ", "with the record occurred, not when the arrest date occurred ", "relative to the testing window. For example, an offender could be ", "arrested during the testing window for an incident that occurred ", "before it began. That would show up here in the before column.") # Get frequency distributions. IDNEW %>% select(OID, NARR12, NARR12_Before, NARR12_During, NARR12_After) %>% rename(All = NARR12, Before = NARR12_Before, During = NARR12_During, After = NARR12_After) %>% pivot_longer(cols = -OID, names_to = "When", values_to = "Value") %>% group_by(When, Value) %>% count() %>% arrange(Value) %>% pivot_wider(names_from = When, values_from = n, values_fill = 0) %>% mutate(All_p = 100*All/N_All, All_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*All/N_All), Before_p = 100*Before/N_All, Before_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*Before/N_Before), During_p = 100*During/N_All, During_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*During/N_During), After_p = 100*After/N_All, After_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*After/N_After)) %>% select(Value, All, All_p, All_v, Before, Before_p, Before_v, During, During_p, During_v, After, After_p, After_v) %>% kable(format = "latex", booktabs = TRUE, digits = 2, col.names = c("Value", rep(c("N", "%", "Valid %"), 4)), caption = TCap) %>% kable_styling() %>% add_header_above(c(" " = 1, "Overall" = 3, "Before TW" = 3, "During TW" = 3, "After TW" = 3)) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Output in this section is not directly used in the manuscript. It is here because it helped us to understand the volume of raw data that was being processed along the way to constructing the data sets used for the manuscript.
# Table caption. TCap <- paste("Frequency Distributions for Number of Prosecutor Charges (All", "Crime Categories) Overall and By When Incident Occurred") # Footnote text. FN <- paste0("N = ", format(N_All, big.mark = ","), ". TW, earliest ", "testing window. ", "Only charge records for offenders with valid testing window start ", "dates and with charges from the 12 main crime categories were ", "included. ", "Before, during, and after refer to when the incident associated ", "with the record occurred, not when the charge date occurred ", "relative to the testing window. For example, an offender could be ", "charged during the testing window for an incident that occurred ", "before it began. That would show up here in the before column.") # Get frequency distributions. IDNEW %>% select(OID, NCHG12, NCHG12_Before, NCHG12_During, NCHG12_After) %>% rename(All = NCHG12, Before = NCHG12_Before, During = NCHG12_During, After = NCHG12_After) %>% pivot_longer(cols = -OID, names_to = "When", values_to = "Value") %>% group_by(When, Value) %>% count() %>% arrange(Value) %>% pivot_wider(names_from = When, values_from = n, values_fill = 0) %>% mutate(All_p = 100*All/N_All, All_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*All/N_All), Before_p = 100*Before/N_All, Before_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*Before/N_Before), During_p = 100*During/N_All, During_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*During/N_During), After_p = 100*After/N_All, After_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*After/N_After)) %>% select(Value, All, All_p, All_v, Before, Before_p, Before_v, During, During_p, During_v, After, After_p, After_v) %>% kable(format = "latex", booktabs = TRUE, digits = 2, col.names = c("Value", rep(c("N", "%", "Valid %"), 4)), caption = TCap) %>% kable_styling() %>% add_header_above(c(" " = 1, "Overall" = 3, "Before TW" = 3, "During TW" = 3, "After TW" = 3)) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Output in this section is not directly used in the manuscript. It is here because it helped us to understand the volume of raw data that was being processed along the way to constructing the data sets used for the manuscript.
# Table caption. TCap <- paste("Frequency Distributions for Number of Adjudicated Charges (All", "Crime Categories) Overall and By When Incident Occurred") # Footnote text. FN <- paste0("N = ", format(N_All, big.mark = ","), ". TW, earliest ", "testing window. ", "Only records for offenders with valid testing window start dates ", "and with charges from one of the 12 main crime categories were ", "included. ", "Before, during, and after refer to when the incident associated ", "with the record occurred, not when the adjudication date occurred ", "relative to the testing window. For example, a charge could be ", "adjudicated during the testing window for an incident that ", "occurred before it began. That would show up here in the before ", "column.") # Get frequency distributions. IDNEW %>% select(OID, NJUD12, NJUD12_Before, NJUD12_During, NJUD12_After) %>% rename(All = NJUD12, Before = NJUD12_Before, During = NJUD12_During, After = NJUD12_After) %>% pivot_longer(cols = -OID, names_to = "When", values_to = "Value") %>% group_by(When, Value) %>% count() %>% arrange(Value) %>% pivot_wider(names_from = When, values_from = n, values_fill = 0) %>% mutate(All_p = 100*All/N_All, All_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*All/N_All), Before_p = 100*Before/N_All, Before_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*Before/N_Before), During_p = 100*During/N_All, During_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*During/N_During), After_p = 100*After/N_All, After_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*After/N_After)) %>% select(Value, All, All_p, All_v, Before, Before_p, Before_v, During, During_p, During_v, After, After_p, After_v) %>% kable(format = "latex", booktabs = TRUE, digits = 2, col.names = c("Value", rep(c("N", "%", "Valid %"), 4)), caption = TCap) %>% kable_styling() %>% add_header_above(c(" " = 1, "Overall" = 3, "Before TW" = 3, "During TW" = 3, "After TW" = 3)) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Output in this section is not directly used in the manuscript. It is here because it helped us to understand the volume of raw data that was being processed along the way to constructing the data sets used for the manuscript.
# Table caption. TCap <- paste("Frequency Distributions for Number of Adjudicated Charges (All", "Crime Categories, Convictions) Overall and By When Incident", "Occurred") # Footnote text. FN <- paste0("N = ", format(N_All, big.mark = ","), ". TW, earliest ", "testing window. ", "Only conviction records for offenders with valid testing window ", "start dates and with charges from one of the 12 main crime ", "categories were included. ", "Before, during, and after refer to when the incident associated ", "with the record occurred, not when the adjudication date occurred ", "relative to the testing window. For example, an offender could be ", "convicted during the testing window for an incident that occurred ", "before it began. That would show up here in the before column.") # Get frequency distributions. IDNEW %>% select(OID, NCON12, NCON12_Before, NCON12_During, NCON12_After) %>% rename(All = NCON12, Before = NCON12_Before, During = NCON12_During, After = NCON12_After) %>% pivot_longer(cols = -OID, names_to = "When", values_to = "Value") %>% group_by(When, Value) %>% count() %>% arrange(Value) %>% pivot_wider(names_from = When, values_from = n, values_fill = 0) %>% mutate(All_p = 100*All/N_All, All_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*All/N_All), Before_p = 100*Before/N_All, Before_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*Before/N_Before), During_p = 100*During/N_All, During_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*During/N_During), After_p = 100*After/N_All, After_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*After/N_After)) %>% select(Value, All, All_p, All_v, Before, Before_p, Before_v, During, During_p, During_v, After, After_p, After_v) %>% kable(format = "latex", booktabs = TRUE, digits = 2, col.names = c("Value", rep(c("N", "%", "Valid %"), 4)), caption = TCap) %>% kable_styling() %>% add_header_above(c(" " = 1, "Overall" = 3, "Before TW" = 3, "During TW" = 3, "After TW" = 3)) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Table \ref{tab:describe_cccvars} summarizes how many of 12 different crime categories each offender was arrested for, charged with, or convicted of according to the CHR data. This sums a set of binary transforms on the incident count variables associated with each crime category in HXCat12. It is possible to have multiple arrest offense records, charge records, or conviction records for a specific category, but we only count each distinct category once at most. The Excluded user-missing value is not counted as a category, so the possible range of values for these counts is 0-12. We report both overall results, and results broken down by when the incidents associated with the arrest, charge, and conviction records occurred relative to the earliest testing window. Table \ref{tab:freq_HXCat12_Count} shows the frequency distributions underlying those summaries.
# Table caption. TCap <- paste("Offender-Level Crime Category Counts (Overall and By When", "Incident Occurred)") # Footnote text. FN <- paste("Only offenders with valid testing window start dates are included.", "Only records associated with incidents where the offender was ", "arrested for, charged with, or convicted of at least one offense ", "from one of the 12 main crime categories were counted. ", "Before, during, and after refer to when the incident associated", "with the record occurred, not when the arrest date, charge date,", "adjudication date, or conviction date occurred relative to the", "testing window.") cclabs <- c("No. of crime categories (arrested, charged, or convicted)", BDA) # Summarize crime category counts (both overall and broken down by IWhen). IDNEW %>% select(HXCat12_Count, HXCat12_Count_Before, HXCat12_Count_During, HXCat12_Count_After) %>% describe(., quant=c(.25, .50, .75)) %>% bind_cols(data.frame(Variable = cclabs), .) %>% select(Variable, n, mean, sd, min, max, skew, kurtosis, Q0.25, Q0.5, Q0.75) %>% kable(., format = "latex", booktabs = TRUE, digits = 2, row.names = FALSE, col.names = CNames, caption = TCap) %>% column_spec(column = 1, width = "6 cm") %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
# Table caption. TCap <- paste("Frequency Distributions for Number of Crime Categories for", "Which Offender Was Arrested, Charged, or Convicted (Overall", "and By When Incident Occurred)") # Footnote text. FN <- paste0("N = ", format(N_All, big.mark = ","), ". TW, earliest ", "testing window. ", "Only records associated with incidents where the offender was ", "arrested for, charged with, or convicted of at least one offense ", "from one of the 12 main crime categories were counted. ", "Before, during, and after refer to when the incident associated ", "with the record occurred, not when the charge date occurred ", "relative to the testing window. For example, an offender could be ", "convicted during the testing window for an incident that occurred ", "before it began. That would show up here in the before column.") # Get frequency distributions. IDNEW %>% select(OID, HXCat12_Count, HXCat12_Count_Before, HXCat12_Count_During, HXCat12_Count_After) %>% rename(All = HXCat12_Count, Before = HXCat12_Count_Before, During = HXCat12_Count_During, After = HXCat12_Count_After) %>% pivot_longer(cols = -OID, names_to = "When", values_to = "Value") %>% group_by(When, Value) %>% count() %>% arrange(Value) %>% pivot_wider(names_from = When, values_from = n, values_fill = 0) %>% mutate(All_p = 100*All/N_All, All_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*All/N_All), Before_p = 100*Before/N_All, Before_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*Before/N_Before), During_p = 100*During/N_All, During_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*During/N_During), After_p = 100*After/N_All, After_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*After/N_After)) %>% select(Value, All, All_p, All_v, Before, Before_p, Before_v, During, During_p, During_v, After, After_p, After_v) %>% kable(format = "latex", booktabs = TRUE, digits = 2, col.names = c("Value", rep(c("N", "%", "Valid %"), 4)), caption = TCap) %>% kable_styling() %>% add_header_above(c(" " = 1, "Overall" = 3, "Before TW" = 3, "During TW" = 3, "After TW" = 3)) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
To more closely examine the crime category counts, we fitted two generalized estimating equation (GEE) models to the long-format IDNEWL data, after dropping rows for unobserved periods (Years = 0) because the counts are missing for those observations.
Model 1 used When as a predictor of the period-specific counts, but did not adjust for the period durations. Model 2 supplemented When as a predictor with an offset term for log(Years) to adjust for the period durations. Both models used Poisson distributions, an exchangeable correlation structure, a log link function, and robust standard errors based on the sandwich estimator.
cccvar_gee1 <- IDNEWL %>% # Keep only observations for observed periods (i.e., those with Years > 0). filter(Years > 0) %>% geeglm(HXCat12_Count ~ When, family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se") cccvar_gee2 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_Count ~ When + offset(log(Years)), family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se")
Table \ref{tab:cccvar_texreg} below shows the parameter estimates for both GEE models.
TCap <- paste("Generalized Estimating Equation Models for Crime Category Counts", "Predicted by When Incidents Occurred") FN <- paste ("\\item * p < .05, null hypothesis value outside 95\\% confidence", "interval based on z-score.", "\\\\\n\\item Note.", "The data contain up to 3 longitudinal observations (one per", "period) for each offender (i.e., cluster).", "The reference level for When was the period after the testing", "window.", "Both models used a log link function, exchangeable correlation", "structure, a Poisson distribution, and robust standard errors", "(sandwich estimator).", "Model 1 did not use an offset.", "Model 2 used log(Years) as an offset term.") texreg(list(cccvar_gee1, cccvar_gee2), booktabs = TRUE, dcolumn = TRUE, threeparttable = TRUE, fontsize = "normalsize", table = TRUE, use.packages = FALSE, ci.force = TRUE, label = "tab:cccvar_texreg", stars = 0.05, caption = TCap, custom.note = FN)
\FloatBarrier
Table \ref{tab:cccvar_emmeans} provides the estimated marginal means obtained from Models 1 and 2, along with corresponding confidence intervals. The units of measurement for those means are counts per offender for Model 1 (which ignore period duration) and counts per offender-year for Model 2 (which adjust for duration and are thus considered annual incidence rates).
# Table caption. TCap <- paste("Marginal Means of Crime Category Counts and Incidence Rates", "By When Incidents Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "Confidence intervals are based on z-scores.") cccvar_emmeans1 <- emmeans(cccvar_gee1, specs = pairwise ~ When, type = "response", ref = "During") cccvar_emmeans2 <- emmeans(cccvar_gee2, specs = pairwise ~ When, type = "response", ref = "During") as_tibble(cccvar_emmeans1$emmeans) %>% full_join(x = ., y = as_tibble(cccvar_emmeans2$emmeans)) %>% mutate(Model = c(1, 1, 1, 2, 2, 2), When = factor(When, levels = c("Before", "During", "After"))) %>% arrange(Model, When) %>% select(When, rate, SE, df, asymp.LCL, asymp.UCL) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("When", "Rate", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Mean per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Mean per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Finally, we also estimated a set of contrasts to compare the marginal means from Models 1 and 2 (Table \ref{tab:cccvar_ratios}). Each of these contrasts estimates the ratios of a pair of means rather than a raw difference between means. This is because of the nature of the underlying Poisson GEE model.
TCap <- paste("Contrasts Estimating Ratios of Marginal Means for Crime", "Category Counts and Incidence Rates By When Incidents", "Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "These contrasts estimate ratios of those quantities and used", "Tukey's method to adjust for multiple comparisons.", "Confidence intervals are based on z-scores.") as_tibble(confint(cccvar_emmeans1$contrasts)) %>% full_join(x = ., y = as_tibble(confint(cccvar_emmeans2$contrasts))) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("Contrast", "Ratio", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Ratios of means per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Ratios of means per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
In this section, we analyze variables that contain the number of unique incidents in the offender's criminal history for a given type of crime. For each category of crime, we report include counts of incidents based on a combined criterion where the offender was arrested for, charged with, or convicted of (any of the three events, or any combination of them) crimes classified into the relevant category.
Note that these variables were created by first aggregating ARR, CHG, and JUD records to the incident level to flag incidents that fit the relevant category, then aggregated again to get to the offender level incident counts. It is possible to have multiple arrest offense records, charge records, or adjudication records for convictions on a single incident, but these variables ignore that and only count each incident once.
It may be helpful to note that the order of subsections below follows the numerical coding of the categories (which was influenced by alphabetical order of the labels we chose). We grouped and presented results in a different order in the manuscript that we think makes the material easier to present to readers. So, in comparing these materials to the published paper, pay attention to the crime category labels and section headings.
The chunk below creates some objects we use to dynamically construct text later in the subsections below.
# Save a formatted value for total number of offenders. N_IDNEW <- format(N_All, big.mark = ",") # Check how many offenders had >= 1 CSC arrest, charge, or conviction in # their CHR data for each crime category. IDNEW %>% filter(HXCat12_1 >= 1) %>% nrow() -> N_HXCat12_1 IDNEW %>% filter(HXCat12_2 >= 1) %>% nrow() -> N_HXCat12_2 IDNEW %>% filter(HXCat12_3 >= 1) %>% nrow() -> N_HXCat12_3 IDNEW %>% filter(HXCat12_4 >= 1) %>% nrow() -> N_HXCat12_4 IDNEW %>% filter(HXCat12_5 >= 1) %>% nrow() -> N_HXCat12_5 IDNEW %>% filter(HXCat12_6 >= 1) %>% nrow() -> N_HXCat12_6 IDNEW %>% filter(HXCat12_7 >= 1) %>% nrow() -> N_HXCat12_7 IDNEW %>% filter(HXCat12_8 >= 1) %>% nrow() -> N_HXCat12_8 IDNEW %>% filter(HXCat12_9 >= 1) %>% nrow() -> N_HXCat12_9 IDNEW %>% filter(HXCat12_10 >= 1) %>% nrow() -> N_HXCat12_10 IDNEW %>% filter(HXCat12_11 >= 1) %>% nrow() -> N_HXCat12_11 IDNEW %>% filter(HXCat12_12 >= 1) %>% nrow() -> N_HXCat12_12 IDNEW %>% filter(HXCat12_Sexual >= 1) %>% nrow() -> N_HXCat12_Sexual IDNEW %>% filter(HXCat12_Violent >= 1) %>% nrow() -> N_HXCat12_Violent IDNEW %>% filter(HXCat12_Property >= 1) %>% nrow() -> N_HXCat12_Property IDNEW %>% filter(HXCat12_Other >= 1) %>% nrow() -> N_HXCat12_Other # Convert those new variables to percentages. P_HXCat12_1 <- round(100*N_HXCat12_1/N_All, digits = 0) P_HXCat12_2 <- round(100*N_HXCat12_2/N_All, digits = 0) P_HXCat12_3 <- round(100*N_HXCat12_3/N_All, digits = 0) P_HXCat12_4 <- round(100*N_HXCat12_4/N_All, digits = 0) P_HXCat12_5 <- round(100*N_HXCat12_5/N_All, digits = 0) P_HXCat12_6 <- round(100*N_HXCat12_6/N_All, digits = 0) P_HXCat12_7 <- round(100*N_HXCat12_7/N_All, digits = 0) P_HXCat12_8 <- round(100*N_HXCat12_8/N_All, digits = 0) P_HXCat12_9 <- round(100*N_HXCat12_9/N_All, digits = 0) P_HXCat12_10 <- round(100*N_HXCat12_10/N_All, digits = 0) P_HXCat12_11 <- round(100*N_HXCat12_11/N_All, digits = 0) P_HXCat12_12 <- round(100*N_HXCat12_12/N_All, digits = 0) P_HXCat12_Sexual <- round(100*N_HXCat12_Sexual/N_All, digits = 0) P_HXCat12_Violent <- round(100*N_HXCat12_Violent/N_All, digits = 0) P_HXCat12_Property <- round(100*N_HXCat12_Property/N_All, digits = 0) P_HXCat12_Other <- round(100*N_HXCat12_Other/N_All, digits = 0)
This next chunk creates some objects to store things like standardized footnotes that we will re-use.
# Footnote text: Descriptives tables. FN1 <- paste("Only offenders with valid testing window start dates are", "included.", "Only incidents where the offender was arrested for, charged with,", "or convicted of at least one offense from one of the 12 main crime", "categories were counted.", "Before, during, and after refer to when the incident", "associated with the record occurred, not when the arrest date,", "charge date, or conviction date occurred relative to the testing", "window.") # Footnote text. FN5 <- paste0("N = ", N_IDNEW, ". TW, earliest testing window. ", "Only offenders with valid testing window start dates are ", "included. Before, during, and after refer to when the incident ", "associated with the record occurred, not when the arrest, ", "charge, or conviction date occurred relative to the testing ", "window. For example, an offender could be arrested during the ", "testing window for an incident that occurred before it began. ", "That would show up here in the before column. ", "An incident is counted if there are any arrest, charge, or ", "conviction records (any one type, or any combination of them ", "will suffice) for the specified crime category associated with ", "it.") # Vector of count types counttypes <- c("Arrested", "Charged", "Convicted", "Arrested, Charged, or Convicted") # Vector of row labels for descriptives tables. TRowLabs <- c("Incident count (arrested, charged, or convicted)", BDA)
\FloatBarrier
Among these r N_IDNEW
suspected sexual offenders, there were r N_HXCat12_1
(r P_HXCat12_1
\%) offenders who had criminal histories containing at least one
incident associated with an arrest, charge, or conviction for arson. Table
\ref{tab:describe_arson} summarizes the counts of unique arson incidents
overall and by when the incident occurred, while Table \ref{tab:freq_HXCat12_1}
shows the frequency distributions underlying those summaries.
# Table caption. TCap <- paste("Offender-Level Arson Incident Counts (Overall and By When", "Incident Occurred)") # Summarize crime category counts (both overall and broken down by IWhen). IDNEW %>% select(HXCat12_1, HXCat12_1_Before, HXCat12_1_During, HXCat12_1_After) %>% describe(., quant=c(.25, .50, .75)) %>% bind_cols(data.frame(Variable = TRowLabs), .) %>% select(Variable, n, mean, sd, min, max, skew, kurtosis, Q0.25, Q0.5, Q0.75) %>% kable(., format = "latex", booktabs = TRUE, digits = 2, row.names = FALSE, col.names = CNames, caption = TCap, linesep = c('', '', '', '\\addlinespace')) %>% column_spec(column = 1, width = "6 cm") %>% footnote(general = FN1, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
# Table caption. TCap <- paste("Frequency Distributions for Number of Arson Incidents for Which", "Offender Was Arrested, Charged, or Convicted (Overall and By", "When Incident Occurred)") # Get frequency distributions. IDNEW %>% select(OID, HXCat12_1, HXCat12_1_Before, HXCat12_1_During, HXCat12_1_After) %>% rename(All = HXCat12_1, Before = HXCat12_1_Before, During = HXCat12_1_During, After = HXCat12_1_After) %>% pivot_longer(cols = -OID, names_to = "When", values_to = "Value") %>% group_by(When, Value) %>% count() %>% arrange(Value) %>% pivot_wider(names_from = When, values_from = n, values_fill = 0) %>% mutate(All_p = 100*All/N_All, All_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*All/N_All), Before_p = 100*Before/N_All, Before_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*Before/N_Before), During_p = 100*During/N_All, During_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*During/N_During), After_p = 100*After/N_All, After_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*After/N_After)) %>% select(Value, All, All_p, All_v, Before, Before_p, Before_v, During, During_p, During_v, After, After_p, After_v) %>% kable(format = "latex", booktabs = TRUE, digits = 2, col.names = c("Value", rep(c("N", "%", "Valid %"), 4)), caption = TCap) %>% kable_styling() %>% add_header_above(c(" " = 1, "Overall" = 3, "Before TW" = 3, "During TW" = 3, "After TW" = 3)) %>% footnote(general = FN5, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
To more closely examine the arson incident counts, we fitted two generalized estimating equation (GEE) models to the long-format IDNEWL data, after dropping rows for unobserved periods (Years = 0) because the counts are missing for those observations.
Model 1 used When as a predictor of the period-specific counts, but did not adjust for the period durations. Model 2 supplemented When as a predictor with an offset term for log(Years) to adjust for the period durations. Both models used Poisson distributions, an exchangeable correlation structure, a log link function, and robust standard errors based on the sandwich estimator.
HXCat12_1_gee1 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_1 ~ When, family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se") HXCat12_1_gee2 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_1 ~ When + offset(log(Years)), family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se")
Table \ref{tab:HXCat12_1_texreg} below shows the parameter estimates for both GEE models.
TCap <- paste("Generalized Estimating Equation Models for Arson Incident Counts", "Predicted by When Incidents Occurred") FN <- paste ("\\item * p < .05, null hypothesis value outside 95\\% confidence", "interval based on z-score.", "\\\\\n\\item Note.", "The data contain up to 3 longitudinal observations (one per", "period) for each offender (i.e., cluster).", "The reference priod for When was after the testing window.", "Both models used a log link function, exchangeable correlation", "structure, a Poisson distribution, and robust standard errors", "(sandwich estimator).", "Model 1 did not use an offset.", "Model 2 used log(Years) as an offset term.") texreg(list(HXCat12_1_gee1, HXCat12_1_gee2), booktabs = TRUE, dcolumn = TRUE, threeparttable = TRUE, fontsize = "normalsize", table = TRUE, use.packages = FALSE, ci.force = TRUE, label = "tab:HXCat12_1_texreg", stars = 0.05, caption = TCap, custom.note = FN)
\FloatBarrier
Table \ref{tab:HXCat12_1_emmeans} provides the estimated marginal means obtained from Models 1 and 2, along with corresponding confidence intervals. The units of measurement for those means are counts per offender for Model 1 (which ignore period duration) and counts per offender-year for Model 2 (which adjust for duration and are thus considered annual incidence rates).
# Table caption. TCap <- paste("Marginal Means of Arson Incident Counts and Incidence Rates", "By When Incidents Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "Confidence intervals are based on z-scores.") HXCat12_1_emmeans1 <- emmeans(HXCat12_1_gee1, specs = pairwise ~ When, type = "response", ref = "During") HXCat12_1_emmeans2 <- emmeans(HXCat12_1_gee2, specs = pairwise ~ When, type = "response", ref = "During") as_tibble(HXCat12_1_emmeans1$emmeans) %>% full_join(x = ., y = as_tibble(HXCat12_1_emmeans2$emmeans)) %>% mutate(Model = c(1, 1, 1, 2, 2, 2), When = factor(When, levels = c("Before", "During", "After"))) %>% arrange(Model, When) %>% select(When, rate, SE, df, asymp.LCL, asymp.UCL) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("When", "Rate", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Mean per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Mean per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Finally, we also estimated a set of contrasts to compare the marginal means from Models 1 and 2 (Table \ref{tab:HXCat12_1_ratios}). Each of these contrasts estimates the ratios of a pair of means rather than a raw difference between means. This is because of the nature of the underlying Poisson GEE model.
TCap <- paste("Contrasts Estimating Ratios of Marginal Means for Arson", "Incident Counts and Incidence Rates By When Incidents", "Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "These contrasts estimate ratios of those quantities and used", "Tukey's method to adjust for multiple comparisons.", "Confidence intervals are based on z-scores.") as_tibble(confint(HXCat12_1_emmeans1$contrasts)) %>% full_join(x = ., y = as_tibble(confint(HXCat12_1_emmeans2$contrasts))) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("Contrast", "Ratio", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Ratios of means per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Ratios of means per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Among these r N_IDNEW
suspected sexual offenders, there were r N_HXCat12_2
(r P_HXCat12_2
\%) offenders who had criminal histories containing at least one
incident associated with an arrest, charge, or conviction for assault involving
domestic violence and/or stalking. Table \ref{tab:describe_assaultdv}
summarizes the counts of unique assault incidents overall and by when the
incident occurred, while Table \ref{tab:freq_HXCat12_2} shows the frequency
distributions underlying those summaries.
# Table caption. TCap <- paste("Offender-Level Assault Involving Domestc Violence and/or", "Stalking Incident Counts (Overall and By When Incident Occurred)") # Summarize crime category counts (both overall and broken down by IWhen). IDNEW %>% select(HXCat12_2, HXCat12_2_Before, HXCat12_2_During, HXCat12_2_After) %>% describe(., quant=c(.25, .50, .75)) %>% bind_cols(data.frame(Variable = TRowLabs), .) %>% select(Variable, n, mean, sd, min, max, skew, kurtosis, Q0.25, Q0.5, Q0.75) %>% kable(., format = "latex", booktabs = TRUE, digits = 2, row.names = FALSE, col.names = CNames, caption = TCap, linesep = c('', '', '', '\\addlinespace')) %>% column_spec(column = 1, width = "6 cm") %>% footnote(general = FN1, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
# Table caption. TCap <- paste("Frequency Distributions for Number of Assault Involving", "Domestic Violence and/or Stalking Incidents for Which", "Offender Was Arrested, Charged, or Convicted (Overall and By", "When Incident Occurred)") # Get frequency distributions. IDNEW %>% select(OID, HXCat12_2, HXCat12_2_Before, HXCat12_2_During, HXCat12_2_After) %>% rename(All = HXCat12_2, Before = HXCat12_2_Before, During = HXCat12_2_During, After = HXCat12_2_After) %>% pivot_longer(cols = -OID, names_to = "When", values_to = "Value") %>% group_by(When, Value) %>% count() %>% arrange(Value) %>% pivot_wider(names_from = When, values_from = n, values_fill = 0) %>% mutate(All_p = 100*All/N_All, All_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*All/N_All), Before_p = 100*Before/N_All, Before_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*Before/N_Before), During_p = 100*During/N_All, During_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*During/N_During), After_p = 100*After/N_All, After_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*After/N_After)) %>% select(Value, All, All_p, All_v, Before, Before_p, Before_v, During, During_p, During_v, After, After_p, After_v) %>% kable(format = "latex", booktabs = TRUE, digits = 2, col.names = c("Value", rep(c("N", "%", "Valid %"), 4)), caption = TCap) %>% kable_styling() %>% add_header_above(c(" " = 1, "Overall" = 3, "Before TW" = 3, "During TW" = 3, "After TW" = 3)) %>% footnote(general = FN5, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
To more closely examine the assault involving domestic violence and/or stalking incident counts, we fitted two generalized estimating equation (GEE) models to the long-format IDNEWL data, after dropping rows for unobserved periods (Years = 0) because the counts are missing for those observations.
Model 1 used When as a predictor of the period-specific counts, but did not adjust for the period durations. Model 2 supplemented When as a predictor with an offset term for log(Years) to adjust for the period durations. Both models used Poisson distributions, an exchangeable correlation structure, a log link function, and robust standard errors based on the sandwich estimator.
HXCat12_2_gee1 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_2 ~ When, family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se") HXCat12_2_gee2 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_2 ~ When + offset(log(Years)), family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se")
Table \ref{tab:HXCat12_2_texreg} below shows the parameter estimates for both GEE models.
TCap <- paste("Generalized Estimating Equation Models for Assault Involving", "Domestic Violence and/or Stalking Incident Counts", "Predicted by When Incidents Occurred") FN <- paste ("\\item * p < .05, null hypothesis value outside 95\\% confidence", "interval based on z-score.", "\\\\\n\\item Note.", "The data contain up to 3 longitudinal observations (one per", "period) for each offender (i.e., cluster).", "The reference priod for When was after the testing window.", "Both models used a log link function, exchangeable correlation", "structure, a Poisson distribution, and robust standard errors", "(sandwich estimator).", "Model 1 did not use an offset.", "Model 2 used log(Years) as an offset term.") texreg(list(HXCat12_2_gee1, HXCat12_2_gee2), booktabs = TRUE, dcolumn = TRUE, threeparttable = TRUE, fontsize = "normalsize", table = TRUE, use.packages = FALSE, ci.force = TRUE, label = "tab:HXCat12_2_texreg", stars = 0.05, caption = TCap, custom.note = FN)
\FloatBarrier
Table \ref{tab:HXCat12_2_emmeans} provides the estimated marginal means obtained from Models 1 and 2, along with corresponding confidence intervals. The units of measurement for those means are counts per offender for Model 1 (which ignore period duration) and counts per offender-year for Model 2 (which adjust for duration and are thus considered annual incidence rates).
# Table caption. TCap <- paste("Marginal Means of Assault Involving Domestic Violenace and/or", "Stalking Incident Counts and Incidence Rates", "By When Incidents Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "Confidence intervals are based on z-scores.") HXCat12_2_emmeans1 <- emmeans(HXCat12_2_gee1, specs = pairwise ~ When, type = "response", ref = "During") HXCat12_2_emmeans2 <- emmeans(HXCat12_2_gee2, specs = pairwise ~ When, type = "response", ref = "During") as_tibble(HXCat12_2_emmeans1$emmeans) %>% full_join(x = ., y = as_tibble(HXCat12_2_emmeans2$emmeans)) %>% mutate(Model = c(1, 1, 1, 2, 2, 2), When = factor(When, levels = c("Before", "During", "After"))) %>% arrange(Model, When) %>% select(When, rate, SE, df, asymp.LCL, asymp.UCL) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("When", "Rate", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Mean per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Mean per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Finally, we also estimated a set of contrasts to compare the marginal means from Models 1 and 2 (Table \ref{tab:HXCat12_2_ratios}). Each of these contrasts estimates the ratios of a pair of means rather than a raw difference between means. This is because of the nature of the underlying Poisson GEE model.
TCap <- paste("Contrasts Estimating Ratios of Marginal Means for Assault", "Involving Domestic Violence and/or Stalking", "Incident Counts and Incidence Rates By When Incidents", "Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "These contrasts estimate ratios of those quantities and used", "Tukey's method to adjust for multiple comparisons.", "Confidence intervals are based on z-scores.") as_tibble(confint(HXCat12_2_emmeans1$contrasts)) %>% full_join(x = ., y = as_tibble(confint(HXCat12_2_emmeans2$contrasts))) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("Contrast", "Ratio", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Ratios of means per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Ratios of means per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Among these r N_IDNEW
suspected sexual offenders, there were
r N_HXCat12_3
(r P_HXCat12_3
\%) offenders who had criminal histories
containing at least one incident associated with an arrest, charge, or
conviction for a non-sexual, non-domestic violence assault. Table
\ref{tab:describe_assaultndv} summarizes the counts of unique assault incidents
overall and by when the incident occurred, while Table \ref{tab:freq_HXCat12_3}
shows the frequency distributions underlying those summaries.
# Table caption. TCap <- paste("Offender-Level Assault (Non-Sexual, Non-Domestic Violence)", "Incident Counts (Overall and By When Incident Occurred)") # Summarize crime category counts (both overall and broken down by IWhen). IDNEW %>% select(HXCat12_3, HXCat12_3_Before, HXCat12_3_During, HXCat12_3_After) %>% describe(., quant=c(.25, .50, .75)) %>% bind_cols(data.frame(Variable = TRowLabs), .) %>% select(Variable, n, mean, sd, min, max, skew, kurtosis, Q0.25, Q0.5, Q0.75) %>% kable(., format = "latex", booktabs = TRUE, digits = 2, row.names = FALSE, col.names = CNames, caption = TCap, linesep = c('', '', '', '\\addlinespace')) %>% column_spec(column = 1, width = "6 cm") %>% footnote(general = FN1, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
# Table caption. TCap <- paste("Frequency Distributions for Number of Assault (Non-Sexual,", "Non-Domestic Violence) Incidents for Which Offender Was", "Arrested, Charged, or Convicted (Overall and By When Incident", "Occurred)") # Get frequency distributions. IDNEW %>% select(OID, HXCat12_3, HXCat12_3_Before, HXCat12_3_During, HXCat12_3_After) %>% rename(All = HXCat12_3, Before = HXCat12_3_Before, During = HXCat12_3_During, After = HXCat12_3_After) %>% pivot_longer(cols = -OID, names_to = "When", values_to = "Value") %>% group_by(When, Value) %>% count() %>% arrange(Value) %>% pivot_wider(names_from = When, values_from = n, values_fill = 0) %>% mutate(All_p = 100*All/N_All, All_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*All/N_All), Before_p = 100*Before/N_All, Before_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*Before/N_Before), During_p = 100*During/N_All, During_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*During/N_During), After_p = 100*After/N_All, After_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*After/N_After)) %>% select(Value, All, All_p, All_v, Before, Before_p, Before_v, During, During_p, During_v, After, After_p, After_v) %>% kable(format = "latex", booktabs = TRUE, digits = 2, col.names = c("Value", rep(c("N", "%", "Valid %"), 4)), caption = TCap) %>% kable_styling() %>% add_header_above(c(" " = 1, "Overall" = 3, "Before TW" = 3, "During TW" = 3, "After TW" = 3)) %>% footnote(general = FN5, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
To more closely examine the assault (non-sexual, non-domestic violence) incident counts, we fitted two generalized estimating equation (GEE) models to the long-format IDNEWL data, after dropping rows for unobserved periods (Years = 0) because the counts are missing for those observations.
Model 1 used When as a predictor of the period-specific counts, but did not adjust for the period durations. Model 2 supplemented When as a predictor with an offset term for log(Years) to adjust for the period durations. Both models used Poisson distributions, an exchangeable correlation structure, a log link function, and robust standard errors based on the sandwich estimator.
HXCat12_3_gee1 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_3 ~ When, family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se") HXCat12_3_gee2 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_3 ~ When + offset(log(Years)), family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se")
Table \ref{tab:HXCat12_3_texreg} below shows the parameter estimates for both GEE models.
TCap <- paste("Generalized Estimating Equation Models for Assault (Non-Sexual,", "Non-Domestic Violance) Incident Counts", "Predicted by When Incidents Occurred") FN <- paste ("\\item * p < .05, null hypothesis value outside 95\\% confidence", "interval based on z-score.", "\\\\\n\\item Note.", "The data contain up to 3 longitudinal observations (one per", "period) for each offender (i.e., cluster).", "The reference priod for When was after the testing window.", "Both models used a log link function, exchangeable correlation", "structure, a Poisson distribution, and robust standard errors", "(sandwich estimator).", "Model 1 did not use an offset.", "Model 2 used log(Years) as an offset term.") texreg(list(HXCat12_3_gee1, HXCat12_3_gee2), booktabs = TRUE, dcolumn = TRUE, threeparttable = TRUE, fontsize = "normalsize", table = TRUE, use.packages = FALSE, ci.force = TRUE, label = "tab:HXCat12_3_texreg", stars = 0.05, caption = TCap, custom.note = FN)
\FloatBarrier
Table \ref{tab:HXCat12_3_emmeans} provides the estimated marginal means obtained from Models 1 and 2, along with corresponding confidence intervals. The units of measurement for those means are counts per offender for Model 1 (which ignore period duration) and counts per offender-year for Model 2 (which adjust for duration and are thus considered annual incidence rates).
# Table caption. TCap <- paste("Marginal Means of Assault (Non-Sexual, Non-Domestic Violence)", "Incident Counts and Incidence Rates By When Incidents Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "Confidence intervals are based on z-scores.") HXCat12_3_emmeans1 <- emmeans(HXCat12_3_gee1, specs = pairwise ~ When, type = "response", ref = "During") HXCat12_3_emmeans2 <- emmeans(HXCat12_3_gee2, specs = pairwise ~ When, type = "response", ref = "During") as_tibble(HXCat12_3_emmeans1$emmeans) %>% full_join(x = ., y = as_tibble(HXCat12_3_emmeans2$emmeans)) %>% mutate(Model = c(1, 1, 1, 2, 2, 2), When = factor(When, levels = c("Before", "During", "After"))) %>% arrange(Model, When) %>% select(When, rate, SE, df, asymp.LCL, asymp.UCL) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("When", "Rate", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Mean per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Mean per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Finally, we also estimated a set of contrasts to compare the marginal means from Models 1 and 2 (Table \ref{tab:HXCat12_3_ratios}). Each of these contrasts estimates the ratios of a pair of means rather than a raw difference between means. This is because of the nature of the underlying Poisson GEE model.
TCap <- paste("Contrasts Estimating Ratios of Marginal Means for Assault", "(Non-Sexual, Non-Domestic Violence)", "Incident Counts and Incidence Rates By When Incidents", "Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "These contrasts estimate ratios of those quantities and used", "Tukey's method to adjust for multiple comparisons.", "Confidence intervals are based on z-scores.") as_tibble(confint(HXCat12_3_emmeans1$contrasts)) %>% full_join(x = ., y = as_tibble(confint(HXCat12_3_emmeans2$contrasts))) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("Contrast", "Ratio", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Ratios of means per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Ratios of means per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Among these r N_IDNEW
suspected sexual offenders, there were r N_HXCat12_4
(r P_HXCat12_4
\%) offenders who had criminal histories containing at least one
incident associated with an arrest, charge, or conviction for burglary. Table
\ref{tab:describe_burglary} summarizes the counts of unique burglary incidents
overall and by when the incident occurred, while Table \ref{tab:freq_HXCat12_4}
shows the frequency distributions underlying those summaries.
# Table caption. TCap <- paste("Offender-Level Burglary Incident Counts (Overall and By When", "Incident Occurred)") # Summarize crime category counts (both overall and broken down by IWhen). IDNEW %>% select(HXCat12_4, HXCat12_4_Before, HXCat12_4_During, HXCat12_4_After) %>% describe(., quant=c(.25, .50, .75)) %>% bind_cols(data.frame(Variable = TRowLabs), .) %>% select(Variable, n, mean, sd, min, max, skew, kurtosis, Q0.25, Q0.5, Q0.75) %>% kable(., format = "latex", booktabs = TRUE, digits = 2, row.names = FALSE, col.names = CNames, caption = TCap, linesep = c('', '', '', '\\addlinespace')) %>% column_spec(column = 1, width = "6 cm") %>% footnote(general = FN1, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
# Table caption. TCap <- paste("Frequency Distributions for Number of Burglary Incidents for", "Which Offender Was Arrested, Charged, or Convicted (Overall and", "By When Incident Occurred)") # Get frequency distributions. IDNEW %>% select(OID, HXCat12_4, HXCat12_4_Before, HXCat12_4_During, HXCat12_4_After) %>% rename(All = HXCat12_4, Before = HXCat12_4_Before, During = HXCat12_4_During, After = HXCat12_4_After) %>% pivot_longer(cols = -OID, names_to = "When", values_to = "Value") %>% group_by(When, Value) %>% count() %>% arrange(Value) %>% pivot_wider(names_from = When, values_from = n, values_fill = 0) %>% mutate(All_p = 100*All/N_All, All_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*All/N_All), Before_p = 100*Before/N_All, Before_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*Before/N_Before), During_p = 100*During/N_All, During_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*During/N_During), After_p = 100*After/N_All, After_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*After/N_After)) %>% select(Value, All, All_p, All_v, Before, Before_p, Before_v, During, During_p, During_v, After, After_p, After_v) %>% kable(format = "latex", booktabs = TRUE, digits = 2, col.names = c("Value", rep(c("N", "%", "Valid %"), 4)), caption = TCap) %>% kable_styling() %>% add_header_above(c(" " = 1, "Overall" = 3, "Before TW" = 3, "During TW" = 3, "After TW" = 3)) %>% footnote(general = FN5, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
To more closely examine the burglary incident counts, we fitted two generalized estimating equation (GEE) models to the long-format IDNEWL data, after dropping rows for unobserved periods (Years = 0) because the counts are missing for those observations.
Model 1 used When as a predictor of the period-specific counts, but did not adjust for the period durations. Model 2 supplemented When as a predictor with an offset term for log(Years) to adjust for the period durations. Both models used Poisson distributions, an exchangeable correlation structure, a log link function, and robust standard errors based on the sandwich estimator.
HXCat12_4_gee1 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_4 ~ When, family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se") HXCat12_4_gee2 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_4 ~ When + offset(log(Years)), family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se")
Table \ref{tab:HXCat12_4_texreg} below shows the parameter estimates for both GEE models.
TCap <- paste("Generalized Estimating Equation Models for Burglary Incident Counts", "Predicted by When Incidents Occurred") FN <- paste ("\\item * p < .05, null hypothesis value outside 95\\% confidence", "interval based on z-score.", "\\\\\n\\item Note.", "The data contain up to 3 longitudinal observations (one per", "period) for each offender (i.e., cluster).", "The reference priod for When was after the testing window.", "Both models used a log link function, exchangeable correlation", "structure, a Poisson distribution, and robust standard errors", "(sandwich estimator).", "Model 1 did not use an offset.", "Model 2 used log(Years) as an offset term.") texreg(list(HXCat12_4_gee1, HXCat12_4_gee2), booktabs = TRUE, dcolumn = TRUE, threeparttable = TRUE, fontsize = "normalsize", table = TRUE, use.packages = FALSE, ci.force = TRUE, label = "tab:HXCat12_4_texreg", stars = 0.05, caption = TCap, custom.note = FN)
\FloatBarrier
Table \ref{tab:HXCat12_4_emmeans} provides the estimated marginal means obtained from Models 1 and 2, along with corresponding confidence intervals. The units of measurement for those means are counts per offender for Model 1 (which ignore period duration) and counts per offender-year for Model 2 (which adjust for duration and are thus considered annual incidence rates).
# Table caption. TCap <- paste("Marginal Means of Burglary Incident Counts and Incidence Rates", "By When Incidents Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "Confidence intervals are based on z-scores.") HXCat12_4_emmeans1 <- emmeans(HXCat12_4_gee1, specs = pairwise ~ When, type = "response", ref = "During") HXCat12_4_emmeans2 <- emmeans(HXCat12_4_gee2, specs = pairwise ~ When, type = "response", ref = "During") as_tibble(HXCat12_4_emmeans1$emmeans) %>% full_join(x = ., y = as_tibble(HXCat12_4_emmeans2$emmeans)) %>% mutate(Model = c(1, 1, 1, 2, 2, 2), When = factor(When, levels = c("Before", "During", "After"))) %>% arrange(Model, When) %>% select(When, rate, SE, df, asymp.LCL, asymp.UCL) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("When", "Rate", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Mean per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Mean per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Finally, we also estimated a set of contrasts to compare the marginal means from Models 1 and 2 (Table \ref{tab:HXCat12_4_ratios}). Each of these contrasts estimates the ratios of a pair of means rather than a raw difference between means. This is because of the nature of the underlying Poisson GEE model.
TCap <- paste("Contrasts Estimating Ratios of Marginal Means for Burglary", "Incident Counts and Incidence Rates By When Incidents", "Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "These contrasts estimate ratios of those quantities and used", "Tukey's method to adjust for multiple comparisons.", "Confidence intervals are based on z-scores.") as_tibble(confint(HXCat12_4_emmeans1$contrasts)) %>% full_join(x = ., y = as_tibble(confint(HXCat12_4_emmeans2$contrasts))) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("Contrast", "Ratio", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Ratios of means per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Ratios of means per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Among these r N_IDNEW
suspected sexual offenders, there were r N_HXCat12_5
(r P_HXCat12_5
\%) offenders who had criminal histories containing at least one
incident associated with an arrest, charge, or conviction for criminal sexual
conduct. Table \ref{tab:describe_csc} summarizes the counts of unique criminal
sexual conduct incidents overall and by when the incident occurred, while Table
\ref{tab:freq_HXCat12_5} shows the frequency distributions underlying those
summaries.
# Table caption. TCap <- paste("Offender-Level Criminal Sexual Conduct Incidents (Overall and", "By When Incident Occurred)") # Summarize crime category counts (both overall and broken down by IWhen). IDNEW %>% select(HXCat12_5, HXCat12_5_Before, HXCat12_5_During, HXCat12_5_After) %>% describe(., quant=c(.25, .50, .75)) %>% bind_cols(data.frame(Variable = TRowLabs), .) %>% select(Variable, n, mean, sd, min, max, skew, kurtosis, Q0.25, Q0.5, Q0.75) %>% kable(., format = "latex", booktabs = TRUE, digits = 2, row.names = FALSE, col.names = CNames, caption = TCap, linesep = c('', '', '', '\\addlinespace')) %>% column_spec(column = 1, width = "6 cm") %>% footnote(general = FN1, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
# Table caption. TCap <- paste("Frequency Distributions for Number of Criminal Sexual Conduct", "Incidents for Which Offender Was Arrested, Charged, or", "Convicted (Overall and By When Incident Occurred)") # Get frequency distributions. IDNEW %>% select(OID, HXCat12_5, HXCat12_5_Before, HXCat12_5_During, HXCat12_5_After) %>% rename(All = HXCat12_5, Before = HXCat12_5_Before, During = HXCat12_5_During, After = HXCat12_5_After) %>% pivot_longer(cols = -OID, names_to = "When", values_to = "Value") %>% group_by(When, Value) %>% count() %>% arrange(Value) %>% pivot_wider(names_from = When, values_from = n, values_fill = 0) %>% mutate(All_p = 100*All/N_All, All_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*All/N_All), Before_p = 100*Before/N_All, Before_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*Before/N_Before), During_p = 100*During/N_All, During_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*During/N_During), After_p = 100*After/N_All, After_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*After/N_After)) %>% select(Value, All, All_p, All_v, Before, Before_p, Before_v, During, During_p, During_v, After, After_p, After_v) %>% kable(format = "latex", booktabs = TRUE, digits = 2, col.names = c("Value", rep(c("N", "%", "Valid %"), 4)), caption = TCap) %>% kable_styling() %>% add_header_above(c(" " = 1, "Overall" = 3, "Before TW" = 3, "During TW" = 3, "After TW" = 3)) %>% footnote(general = FN5, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
The earliest known SAKs come from a different data source than the criminal history records. Those SAKs were tested long after most of the offenders’ criminal history records had accumulated. It is possible that some or even all of the criminal sexual conduct incidents found in the offenders' criminal histories during the testing window represent the same event as the earliest known SAK that starts the testing window. It is also possible that some or all of the criminal sexual conduct incidents in the criminal histories are entirely different events that show repeat offending behavior within a short period of time. Therefore, below we assess how much overlap there is between the earliest SAKs that define when the testing windows start and the CSC incidents from the criminal histories that occur during that period.
We start by examining criminal sexual conduct incident records. On these records, the IID variable is a primary key used for uniquely identifying each incident, the ESAK_IID variable on an incident record contains the IID associated with the offender's earliest known SAK, and IWhen classifies each incident as occurring before, during, or after the testing window associated with the offender's earliest known SAK. When IID = ESAK_IID, the incident record overlaps with the earliest known SAK. So, Table \ref{tab:crosstab_overlap} shows how many CSC incident records there are in each IWhen period that overlap with the earliest SAK.
# Create tibble with only CSC incident records. INCEW %>% filter(HXCat12_5 == 1) %>% mutate(Overlap = (ESAK_IID == IID)) -> INCEW_CSC # Create tibble with only CSC incident records during the testing window. INCEW_CSC %>% filter(IWhen == "During") -> INCEW_CSCD TCap <- paste("Number of Criminal Sexual Conduct Incidents By When Incident", "Occurred and Whether The Incident Overlaps The Offender's", "Earliest SAK") # Crosstab xtabs(~IWhen + Overlap, data = INCEW_CSC, addNA = TRUE) %>% addmargins() %>% as.data.frame() %>% pivot_wider(names_from = Overlap, values_from = Freq) %>% kable(., format = "latex", booktabs = TRUE, caption = TCap, col.names = c("IWhen", "False", "True", "Sum"), format.args = list(big.mark = ",")) %>% add_header_above(c("", "Overlap Earliest SAK" = 2, "")) %>% column_spec(2:4, width = "1.6cm")
All of the overlap incidents should occur during the testing window and that is consistent with what we see in Table \ref{tab:crosstab_overlap}. Therefore, we move on to extract a few key numbers and do some additional computations to quantify the degree of overlap.
# No. of CSC incidents observed (all periods). INCEW_CSC %>% nrow() -> N_ICSC N_ICSC # No. of CSC incidents observed during testing window. INCEW_CSCD %>% nrow() -> N_ICSCD N_ICSCD # No. of unique offenders with >=1 CSC incident in CHR INCEW_CSC %>% pull(OID) %>% unique() %>% length() -> N_OCSC N_OCSC # No. of unique offenders with >=1 CSC incident in CHR during testing window INCEW_CSCD %>% pull(OID) %>% unique() %>% length() -> N_OCSCD N_OCSCD # No. of unique offenders whose earliest SAK overlaps a CSC incident in CHR # There is a maximum of one overlap incident per offender, so this is also the # number of unique overlap CSC incidents. INCEW_CSC %>% filter(ESAK_IID == IID) %>% pull(OID) %>% unique() %>% length() -> N_Overlap N_Overlap
From an incident perspective, there were a total of r N_ICSC
CSC incidents
in the criminal history data, but only r N_ICSCD
(r round(100*N_ICSCD/N_ICSC, digits = 2)
%) of them occurred during an
offender's testing window. The r N_Overlap
incidents that overlap with the
offender's earliest known SAK comprise
r round(100*N_Overlap/N_ICSC, digits = 2)
% of all the CSC incidents and
r round(100*N_Overlap/N_ICSCD, digits = 2)
% of the CSC incidents that occurred
during testing windows. So, most of the CSC incidents that occurred during an
offender’s testing window are linked to the very SAK that started the testing
window itself.
Meanwhile, from an offender perspective, there are r N_OCSC
offenders
(r round(100*N_OCSC/N_All, digits = 2)
% of the sample) with at least one CSC
incident in their criminal history records. However, only r N_OCSCD
(r round(100*N_OCSCD/N_All, digits = 2)
%) offenders have one or more CSC
incidents in their criminal history that specifically fall during their testing
window. Further subsetting shows that there are r N_Overlap
offenders for whom
we believe the earliest known SAK can be linked to a specific CSC incident from
their criminal history data. They comprise
r round(100*N_Overlap/N_All, digits = 2)
% of the full sample and
r round(100*N_Overlap/N_OCSCD, digits = 2)
% of the subset of offenders who had
at least one CSC incident during the testing window according to the criminal
history records.
\FloatBarrier
To more closely examine the criminal sexual conduct incident counts, we fitted two generalized estimating equation (GEE) models to the long-format IDNEWL data, after dropping rows for unobserved periods (Years = 0) because the counts are missing for those observations.
Model 1 used When as a predictor of the period-specific counts, but did not adjust for the period durations. Model 2 supplemented When as a predictor with an offset term for log(Years) to adjust for the period durations. Both models used Poisson distributions, an exchangeable correlation structure, a log link function, and robust standard errors based on the sandwich estimator.
HXCat12_5_gee1 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_5 ~ When, family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se") HXCat12_5_gee2 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_5 ~ When + offset(log(Years)), family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se")
Table \ref{tab:HXCat12_5_texreg} below shows the parameter estimates for both GEE models.
TCap <- paste("Generalized Estimating Equation Models for Criminal Sexual", "Conduct Incident Counts", "Predicted by When Incidents Occurred") FN <- paste ("\\item * p < .05, null hypothesis value outside 95\\% confidence", "interval based on z-score.", "\\\\\n\\item Note.", "The data contain up to 3 longitudinal observations (one per", "period) for each offender (i.e., cluster).", "The reference priod for When was after the testing window.", "Both models used a log link function, exchangeable correlation", "structure, a Poisson distribution, and robust standard errors", "(sandwich estimator).", "Model 1 did not use an offset.", "Model 2 used log(Years) as an offset term.") texreg(list(HXCat12_5_gee1, HXCat12_5_gee2), booktabs = TRUE, dcolumn = TRUE, threeparttable = TRUE, fontsize = "normalsize", table = TRUE, use.packages = FALSE, ci.force = TRUE, label = "tab:HXCat12_5_texreg", stars = 0.05, caption = TCap, custom.note = FN)
\FloatBarrier
Table \ref{tab:HXCat12_5_emmeans} provides the estimated marginal means obtained from Models 1 and 2, along with corresponding confidence intervals. The units of measurement for those means are counts per offender for Model 1 (which ignore period duration) and counts per offender-year for Model 2 (which adjust for duration and are thus considered annual incidence rates).
# Table caption. TCap <- paste("Marginal Means of Criminal Sexual Conduct Incident Counts and", "Incidence Rates By When Incidents Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "Confidence intervals are based on z-scores.") HXCat12_5_emmeans1 <- emmeans(HXCat12_5_gee1, specs = pairwise ~ When, type = "response", ref = "During") HXCat12_5_emmeans2 <- emmeans(HXCat12_5_gee2, specs = pairwise ~ When, type = "response", ref = "During") as_tibble(HXCat12_5_emmeans1$emmeans) %>% full_join(x = ., y = as_tibble(HXCat12_5_emmeans2$emmeans)) %>% mutate(Model = c(1, 1, 1, 2, 2, 2), When = factor(When, levels = c("Before", "During", "After"))) %>% arrange(Model, When) %>% select(When, rate, SE, df, asymp.LCL, asymp.UCL) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("When", "Rate", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Mean per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Mean per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Finally, we also estimated a set of contrasts to compare the marginal means from Models 1 and 2 (Table \ref{tab:HXCat12_5_ratios}). Each of these contrasts estimates the ratios of a pair of means rather than a raw difference between means. This is because of the nature of the underlying Poisson GEE model.
TCap <- paste("Contrasts Estimating Ratios of Marginal Means for Criminal", "Sexual Conduct Incident Counts and Incidence Rates By When", "Incidents Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "These contrasts estimate ratios of those quantities and used", "Tukey's method to adjust for multiple comparisons.", "Confidence intervals are based on z-scores.") as_tibble(confint(HXCat12_5_emmeans1$contrasts)) %>% full_join(x = ., y = as_tibble(confint(HXCat12_5_emmeans2$contrasts))) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("Contrast", "Ratio", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Ratios of means per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Ratios of means per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Among these r N_IDNEW
suspected sexual offenders, there were r N_HXCat12_6
(r P_HXCat12_6
\%) offenders who had criminal histories containing at least one
incident associated with an arrest, charge, or conviction for drug crime. Table
\ref{tab:describe_drug} summarizes the counts of unique drug crime incidents
overall and by and when the incident occurred, while Table
\ref{tab:freq_HXCat12_6} shows the frequency distributions underlying those
summaries.
# Table caption. TCap <- paste("Offender-Level Drug Crime Incident Counts (Overall and By When", "Incident Occurred)") # Summarize crime category counts (both overall and broken down by IWhen). IDNEW %>% select(HXCat12_6, HXCat12_6_Before, HXCat12_6_During, HXCat12_6_After) %>% describe(., quant=c(.25, .50, .75)) %>% bind_cols(data.frame(Variable = TRowLabs), .) %>% select(Variable, n, mean, sd, min, max, skew, kurtosis, Q0.25, Q0.5, Q0.75) %>% kable(., format = "latex", booktabs = TRUE, digits = 2, row.names = FALSE, col.names = CNames, caption = TCap, linesep = c('', '', '', '\\addlinespace')) %>% column_spec(column = 1, width = "6 cm") %>% footnote(general = FN1, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
# Table caption. TCap <- paste("Frequency Distributions for Number of Drug Crime Incidents for", "Which Offender Was Arrested, Charged, or Convicted (Overall and", "By When Incident Occurred)") # Get frequency distributions. IDNEW %>% select(OID, HXCat12_6, HXCat12_6_Before, HXCat12_6_During, HXCat12_6_After) %>% rename(All = HXCat12_6, Before = HXCat12_6_Before, During = HXCat12_6_During, After = HXCat12_6_After) %>% pivot_longer(cols = -OID, names_to = "When", values_to = "Value") %>% group_by(When, Value) %>% count() %>% arrange(Value) %>% pivot_wider(names_from = When, values_from = n, values_fill = 0) %>% mutate(All_p = 100*All/N_All, All_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*All/N_All), Before_p = 100*Before/N_All, Before_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*Before/N_Before), During_p = 100*During/N_All, During_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*During/N_During), After_p = 100*After/N_All, After_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*After/N_After)) %>% select(Value, All, All_p, All_v, Before, Before_p, Before_v, During, During_p, During_v, After, After_p, After_v) %>% kable(format = "latex", booktabs = TRUE, digits = 2, col.names = c("Value", rep(c("N", "%", "Valid %"), 4)), caption = TCap) %>% kable_styling() %>% add_header_above(c(" " = 1, "Overall" = 3, "Before TW" = 3, "During TW" = 3, "After TW" = 3)) %>% footnote(general = FN5, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
To more closely examine the drug crime incident counts, we fitted two generalized estimating equation (GEE) models to the long-format IDNEWL data, after dropping rows for unobserved periods (Years = 0) because the counts are missing for those observations.
Model 1 used When as a predictor of the period-specific counts, but did not adjust for the period durations. Model 2 supplemented When as a predictor with an offset term for log(Years) to adjust for the period durations. Both models used Poisson distributions, an exchangeable correlation structure, a log link function, and robust standard errors based on the sandwich estimator.
HXCat12_6_gee1 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_6 ~ When, family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se") HXCat12_6_gee2 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_6 ~ When + offset(log(Years)), family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se")
Table \ref{tab:HXCat12_6_texreg} below shows the parameter estimates for both GEE models.
TCap <- paste("Generalized Estimating Equation Models for Drug Crime Incident", "Counts Predicted by When Incidents Occurred") FN <- paste ("\\item * p < .05, null hypothesis value outside 95\\% confidence", "interval based on z-score.", "\\\\\n\\item Note.", "The data contain up to 3 longitudinal observations (one per", "period) for each offender (i.e., cluster).", "The reference priod for When was after the testing window.", "Both models used a log link function, exchangeable correlation", "structure, a Poisson distribution, and robust standard errors", "(sandwich estimator).", "Model 1 did not use an offset.", "Model 2 used log(Years) as an offset term.") texreg(list(HXCat12_6_gee1, HXCat12_6_gee2), booktabs = TRUE, dcolumn = TRUE, threeparttable = TRUE, fontsize = "normalsize", table = TRUE, use.packages = FALSE, ci.force = TRUE, label = "tab:HXCat12_6_texreg", stars = 0.05, caption = TCap, custom.note = FN)
\FloatBarrier
Table \ref{tab:HXCat12_6_emmeans} provides the estimated marginal means obtained from Models 1 and 2, along with corresponding confidence intervals. The units of measurement for those means are counts per offender for Model 1 (which ignore period duration) and counts per offender-year for Model 2 (which adjust for duration and are thus considered annual incidence rates).
# Table caption. TCap <- paste("Marginal Means of Drug Crime Incident Counts and Incidence Rates", "By When Incidents Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "Confidence intervals are based on z-scores.") HXCat12_6_emmeans1 <- emmeans(HXCat12_6_gee1, specs = pairwise ~ When, type = "response", ref = "During") HXCat12_6_emmeans2 <- emmeans(HXCat12_6_gee2, specs = pairwise ~ When, type = "response", ref = "During") as_tibble(HXCat12_6_emmeans1$emmeans) %>% full_join(x = ., y = as_tibble(HXCat12_6_emmeans2$emmeans)) %>% mutate(Model = c(1, 1, 1, 2, 2, 2), When = factor(When, levels = c("Before", "During", "After"))) %>% arrange(Model, When) %>% select(When, rate, SE, df, asymp.LCL, asymp.UCL) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("When", "Rate", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Mean per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Mean per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Finally, we also estimated a set of contrasts to compare the marginal means from Models 1 and 2 (Table \ref{tab:HXCat12_6_ratios}). Each of these contrasts estimates the ratios of a pair of means rather than a raw difference between means. This is because of the nature of the underlying Poisson GEE model.
TCap <- paste("Contrasts Estimating Ratios of Marginal Means for Drug Crime", "Incident Counts and Incidence Rates By When Incidents", "Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "These contrasts estimate ratios of those quantities and used", "Tukey's method to adjust for multiple comparisons.", "Confidence intervals are based on z-scores.") as_tibble(confint(HXCat12_6_emmeans1$contrasts)) %>% full_join(x = ., y = as_tibble(confint(HXCat12_6_emmeans2$contrasts))) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("Contrast", "Ratio", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Ratios of means per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Ratios of means per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Among these r N_IDNEW
suspected sexual offenders, there were r N_HXCat12_7
(r P_HXCat12_7
\%) offenders who had criminal histories containing at least one
incident associated with an arrest, charge, or conviction for homicide. Table
\ref{tab:describe_homicide} summarizes the counts of unique homicide incidents
overall and by when the incident occurred, while Table \ref{tab:freq_HXCat12_7}
shows the frequency distributions underlying those summaries.
# Table caption. TCap <- paste("Offender-Level Homicide Incident Counts (Overall and By When", "Incident Occurred)") # Summarize crime category counts (both overall and broken down by IWhen). IDNEW %>% select(HXCat12_7, HXCat12_7_Before, HXCat12_7_During, HXCat12_7_After) %>% describe(., quant=c(.25, .50, .75)) %>% bind_cols(data.frame(Variable = TRowLabs), .) %>% select(Variable, n, mean, sd, min, max, skew, kurtosis, Q0.25, Q0.5, Q0.75) %>% kable(., format = "latex", booktabs = TRUE, digits = 2, row.names = FALSE, col.names = CNames, caption = TCap, linesep = c('', '', '', '\\addlinespace')) %>% column_spec(column = 1, width = "6 cm") %>% footnote(general = FN1, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
# Table caption. TCap <- paste("Frequency Distributions for Number of Homicide Incidents for", "Which Offender Was Arrested, Charged, or Convicted (Overall and", "By When Incident Occurred)") # Get frequency distributions. IDNEW %>% select(OID, HXCat12_7, HXCat12_7_Before, HXCat12_7_During, HXCat12_7_After) %>% rename(All = HXCat12_7, Before = HXCat12_7_Before, During = HXCat12_7_During, After = HXCat12_7_After) %>% pivot_longer(cols = -OID, names_to = "When", values_to = "Value") %>% group_by(When, Value) %>% count() %>% arrange(Value) %>% pivot_wider(names_from = When, values_from = n, values_fill = 0) %>% mutate(All_p = 100*All/N_All, All_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*All/N_All), Before_p = 100*Before/N_All, Before_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*Before/N_Before), During_p = 100*During/N_All, During_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*During/N_During), After_p = 100*After/N_All, After_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*After/N_After)) %>% select(Value, All, All_p, All_v, Before, Before_p, Before_v, During, During_p, During_v, After, After_p, After_v) %>% kable(format = "latex", booktabs = TRUE, digits = 2, col.names = c("Value", rep(c("N", "%", "Valid %"), 4)), caption = TCap) %>% kable_styling() %>% add_header_above(c(" " = 1, "Overall" = 3, "Before TW" = 3, "During TW" = 3, "After TW" = 3)) %>% footnote(general = FN5, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
To more closely examine the homicide incident counts, we fitted two generalized estimating equation (GEE) models to the long-format IDNEWL data, after dropping rows for unobserved periods (Years = 0) because the counts are missing for those observations.
Model 1 used When as a predictor of the period-specific counts, but did not adjust for the period durations. Model 2 supplemented When as a predictor with an offset term for log(Years) to adjust for the period durations. Both models used Poisson distributions, an exchangeable correlation structure, a log link function, and robust standard errors based on the sandwich estimator.
HXCat12_7_gee1 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_7 ~ When, family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se") HXCat12_7_gee2 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_7 ~ When + offset(log(Years)), family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se")
Table \ref{tab:HXCat12_7_texreg} below shows the parameter estimates for both GEE models.
TCap <- paste("Generalized Estimating Equation Models for Homicide Incident Counts", "Predicted by When Incidents Occurred") FN <- paste ("\\item * p < .05, null hypothesis value outside 95\\% confidence", "interval based on z-score.", "\\\\\n\\item Note.", "The data contain up to 3 longitudinal observations (one per", "period) for each offender (i.e., cluster).", "The reference priod for When was after the testing window.", "Both models used a log link function, exchangeable correlation", "structure, a Poisson distribution, and robust standard errors", "(sandwich estimator).", "Model 1 did not use an offset.", "Model 2 used log(Years) as an offset term.") texreg(list(HXCat12_7_gee1, HXCat12_7_gee2), booktabs = TRUE, dcolumn = TRUE, threeparttable = TRUE, fontsize = "normalsize", table = TRUE, use.packages = FALSE, ci.force = TRUE, label = "tab:HXCat12_7_texreg", stars = 0.05, caption = TCap, custom.note = FN)
\FloatBarrier
Table \ref{tab:HXCat12_7_emmeans} provides the estimated marginal means obtained from Models 1 and 2, along with corresponding confidence intervals. The units of measurement for those means are counts per offender for Model 1 (which ignore period duration) and counts per offender-year for Model 2 (which adjust for duration and are thus considered annual incidence rates).
# Table caption. TCap <- paste("Marginal Means of Homicide Incident Counts and Incidence Rates", "By When Incidents Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "Confidence intervals are based on z-scores.") HXCat12_7_emmeans1 <- emmeans(HXCat12_7_gee1, specs = pairwise ~ When, type = "response", ref = "During") HXCat12_7_emmeans2 <- emmeans(HXCat12_7_gee2, specs = pairwise ~ When, type = "response", ref = "During") as_tibble(HXCat12_7_emmeans1$emmeans) %>% full_join(x = ., y = as_tibble(HXCat12_7_emmeans2$emmeans)) %>% mutate(Model = c(1, 1, 1, 2, 2, 2), When = factor(When, levels = c("Before", "During", "After"))) %>% arrange(Model, When) %>% select(When, rate, SE, df, asymp.LCL, asymp.UCL) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("When", "Rate", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Mean per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Mean per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Finally, we also estimated a set of contrasts to compare the marginal means from Models 1 and 2 (Table \ref{tab:HXCat12_7_ratios}). Each of these contrasts estimates the ratios of a pair of means rather than a raw difference between means. This is because of the nature of the underlying Poisson GEE model.
TCap <- paste("Contrasts Estimating Ratios of Marginal Means for Homicide", "Incident Counts and Incidence Rates By When Incidents", "Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "These contrasts estimate ratios of those quantities and used", "Tukey's method to adjust for multiple comparisons.", "Confidence intervals are based on z-scores.") as_tibble(confint(HXCat12_7_emmeans1$contrasts)) %>% full_join(x = ., y = as_tibble(confint(HXCat12_7_emmeans2$contrasts))) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("Contrast", "Ratio", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Ratios of means per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Ratios of means per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Among these r N_IDNEW
suspected sexual offenders, there were r N_HXCat12_8
(r P_HXCat12_8
\%) offenders who had criminal histories containing at least one
incident associated with an arrest, charge, or conviction for larceny, theft, or
fraud. Table \ref{tab:describe_larceny} summarizes the counts of unique larceny,
theft, or fraud incidents overall and by when the incident occurred, while
Table \ref{tab:freq_HXCat12_8} shows the frequency distributions underlying
those summaries.
# Table caption. TCap <- paste("Offender-Level Larceny, Theft, or Fraud Incidents (Overall and", "By When Incident Occurred)") # Summarize crime category counts (both overall and broken down by IWhen). IDNEW %>% select(HXCat12_8, HXCat12_8_Before, HXCat12_8_During, HXCat12_8_After) %>% describe(., quant=c(.25, .50, .75)) %>% bind_cols(data.frame(Variable = TRowLabs), .) %>% select(Variable, n, mean, sd, min, max, skew, kurtosis, Q0.25, Q0.5, Q0.75) %>% kable(., format = "latex", booktabs = TRUE, digits = 2, row.names = FALSE, col.names = CNames, caption = TCap, linesep = c('', '', '', '\\addlinespace')) %>% column_spec(column = 1, width = "6 cm") %>% footnote(general = FN1, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
# Table caption. TCap <- paste("Frequency Distributions for Number of Larceny, Theft, or Fraud", "Incidents for Which Offender Was Arrested, Charged, or", "Convicted (Overall and By When Incident Occurred)") # Get frequency distributions. IDNEW %>% select(OID, HXCat12_8, HXCat12_8_Before, HXCat12_8_During, HXCat12_8_After) %>% rename(All = HXCat12_8, Before = HXCat12_8_Before, During = HXCat12_8_During, After = HXCat12_8_After) %>% pivot_longer(cols = -OID, names_to = "When", values_to = "Value") %>% group_by(When, Value) %>% count() %>% arrange(Value) %>% pivot_wider(names_from = When, values_from = n, values_fill = 0) %>% mutate(All_p = 100*All/N_All, All_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*All/N_All), Before_p = 100*Before/N_All, Before_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*Before/N_Before), During_p = 100*During/N_All, During_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*During/N_During), After_p = 100*After/N_All, After_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*After/N_After)) %>% select(Value, All, All_p, All_v, Before, Before_p, Before_v, During, During_p, During_v, After, After_p, After_v) %>% kable(format = "latex", booktabs = TRUE, digits = 2, col.names = c("Value", rep(c("N", "%", "Valid %"), 4)), caption = TCap) %>% kable_styling() %>% add_header_above(c(" " = 1, "Overall" = 3, "Before TW" = 3, "During TW" = 3, "After TW" = 3)) %>% footnote(general = FN5, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
To more closely examine the larceny, theft or fraud incident counts, we fitted two generalized estimating equation (GEE) models to the long-format IDNEWL data, after dropping rows for unobserved periods (Years = 0) because the counts are missing for those observations.
Model 1 used When as a predictor of the period-specific counts, but did not adjust for the period durations. Model 2 supplemented When as a predictor with an offset term for log(Years) to adjust for the period durations. Both models used Poisson distributions, an exchangeable correlation structure, a log link function, and robust standard errors based on the sandwich estimator.
HXCat12_8_gee1 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_8 ~ When, family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se") HXCat12_8_gee2 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_8 ~ When + offset(log(Years)), family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se")
Table \ref{tab:HXCat12_8_texreg} below shows the parameter estimates for both GEE models.
TCap <- paste("Generalized Estimating Equation Models for Larceny, Theft, or", "Fraud Incident Counts", "Predicted by When Incidents Occurred") FN <- paste ("\\item * p < .05, null hypothesis value outside 95\\% confidence", "interval based on z-score.", "\\\\\n\\item Note.", "The data contain up to 3 longitudinal observations (one per", "period) for each offender (i.e., cluster).", "The reference priod for When was after the testing window.", "Both models used a log link function, exchangeable correlation", "structure, a Poisson distribution, and robust standard errors", "(sandwich estimator).", "Model 1 did not use an offset.", "Model 2 used log(Years) as an offset term.") texreg(list(HXCat12_8_gee1, HXCat12_8_gee2), booktabs = TRUE, dcolumn = TRUE, threeparttable = TRUE, fontsize = "normalsize", table = TRUE, use.packages = FALSE, ci.force = TRUE, label = "tab:HXCat12_8_texreg", stars = 0.05, caption = TCap, custom.note = FN)
\FloatBarrier
Table \ref{tab:HXCat12_8_emmeans} provides the estimated marginal means obtained from Models 1 and 2, along with corresponding confidence intervals. The units of measurement for those means are counts per offender for Model 1 (which ignore period duration) and counts per offender-year for Model 2 (which adjust for duration and are thus considered annual incidence rates).
# Table caption. TCap <- paste("Marginal Means of Larceny, Theft, or Fraud Incident Counts and", "Incidence Rates By When Incidents Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "Confidence intervals are based on z-scores.") HXCat12_8_emmeans1 <- emmeans(HXCat12_8_gee1, specs = pairwise ~ When, type = "response", ref = "During") HXCat12_8_emmeans2 <- emmeans(HXCat12_8_gee2, specs = pairwise ~ When, type = "response", ref = "During") as_tibble(HXCat12_8_emmeans1$emmeans) %>% full_join(x = ., y = as_tibble(HXCat12_8_emmeans2$emmeans)) %>% mutate(Model = c(1, 1, 1, 2, 2, 2), When = factor(When, levels = c("Before", "During", "After"))) %>% arrange(Model, When) %>% select(When, rate, SE, df, asymp.LCL, asymp.UCL) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("When", "Rate", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Mean per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Mean per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Finally, we also estimated a set of contrasts to compare the marginal means from Models 1 and 2 (Table \ref{tab:HXCat12_8_ratios}). Each of these contrasts estimates the ratios of a pair of means rather than a raw difference between means. This is because of the nature of the underlying Poisson GEE model.
TCap <- paste("Contrasts Estimating Ratios of Marginal Means for Larceny,", "Theft, or Fraud Incident Counts and Incidence Rates By When", "Incidents Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "These contrasts estimate ratios of those quantities and used", "Tukey's method to adjust for multiple comparisons.", "Confidence intervals are based on z-scores.") as_tibble(confint(HXCat12_8_emmeans1$contrasts)) %>% full_join(x = ., y = as_tibble(confint(HXCat12_8_emmeans2$contrasts))) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("Contrast", "Ratio", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Ratios of means per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Ratios of means per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Among these r N_IDNEW
suspected sexual offenders, there were r N_HXCat12_9
(r P_HXCat12_9
\%) offenders who had criminal histories containing at least one
incident associated with an arrest, charge, or conviction for robbery. Table
\ref{tab:describe_robbery} summarizes the counts of unique robbery incidents
overall and by when the incident occurred, while Table \ref{tab:freq_HXCat12_9}
shows the frequency distributions underlying those summaries.
# Table caption. TCap <- paste("Offender-Level Robbery Incident Counts (Overall and By When", "Incident Occurred)") # Summarize crime category counts (both overall and broken down by IWhen). IDNEW %>% select(HXCat12_9, HXCat12_9_Before, HXCat12_9_During, HXCat12_9_After) %>% describe(., quant=c(.25, .50, .75)) %>% bind_cols(data.frame(Variable = TRowLabs), .) %>% select(Variable, n, mean, sd, min, max, skew, kurtosis, Q0.25, Q0.5, Q0.75) %>% kable(., format = "latex", booktabs = TRUE, digits = 2, row.names = FALSE, col.names = CNames, caption = TCap, linesep = c('', '', '', '\\addlinespace')) %>% column_spec(column = 1, width = "6 cm") %>% footnote(general = FN1, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
# Table caption. TCap <- paste("Frequency Distributions for Number of Robbery Incidents for", "Which Offender Was Arrested, Charged, or Convicted (Overall and", "By When Incident Occurred)") # Get frequency distributions. IDNEW %>% select(OID, HXCat12_9, HXCat12_9_Before, HXCat12_9_During, HXCat12_9_After) %>% rename(All = HXCat12_9, Before = HXCat12_9_Before, During = HXCat12_9_During, After = HXCat12_9_After) %>% pivot_longer(cols = -OID, names_to = "When", values_to = "Value") %>% group_by(When, Value) %>% count() %>% arrange(Value) %>% pivot_wider(names_from = When, values_from = n, values_fill = 0) %>% mutate(All_p = 100*All/N_All, All_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*All/N_All), Before_p = 100*Before/N_All, Before_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*Before/N_Before), During_p = 100*During/N_All, During_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*During/N_During), After_p = 100*After/N_All, After_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*After/N_After)) %>% select(Value, All, All_p, All_v, Before, Before_p, Before_v, During, During_p, During_v, After, After_p, After_v) %>% kable(format = "latex", booktabs = TRUE, digits = 2, col.names = c("Value", rep(c("N", "%", "Valid %"), 4)), caption = TCap) %>% kable_styling() %>% add_header_above(c(" " = 1, "Overall" = 3, "Before TW" = 3, "During TW" = 3, "After TW" = 3)) %>% footnote(general = FN5, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
To more closely examine the robbery incident counts, we fitted two generalized estimating equation (GEE) models to the long-format IDNEWL data, after dropping rows for unobserved periods (Years = 0) because the counts are missing for those observations.
Model 1 used When as a predictor of the period-specific counts, but did not adjust for the period durations. Model 2 supplemented When as a predictor with an offset term for log(Years) to adjust for the period durations. Both models used Poisson distributions, an exchangeable correlation structure, a log link function, and robust standard errors based on the sandwich estimator.
HXCat12_9_gee1 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_9 ~ When, family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se") HXCat12_9_gee2 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_9 ~ When + offset(log(Years)), family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se")
Table \ref{tab:HXCat12_9_texreg} below shows the parameter estimates for both GEE models.
TCap <- paste("Generalized Estimating Equation Models for Robbery Incident Counts", "Predicted by When Incidents Occurred") FN <- paste ("\\item * p < .05, null hypothesis value outside 95\\% confidence", "interval based on z-score.", "\\\\\n\\item Note.", "The data contain up to 3 longitudinal observations (one per", "period) for each offender (i.e., cluster).", "The reference priod for When was after the testing window.", "Both models used a log link function, exchangeable correlation", "structure, a Poisson distribution, and robust standard errors", "(sandwich estimator).", "Model 1 did not use an offset.", "Model 2 used log(Years) as an offset term.") texreg(list(HXCat12_9_gee1, HXCat12_9_gee2), booktabs = TRUE, dcolumn = TRUE, threeparttable = TRUE, fontsize = "normalsize", table = TRUE, use.packages = FALSE, ci.force = TRUE, label = "tab:HXCat12_9_texreg", stars = 0.05, caption = TCap, custom.note = FN)
\FloatBarrier
Table \ref{tab:HXCat12_9_emmeans} provides the estimated marginal means obtained from Models 1 and 2, along with corresponding confidence intervals. The units of measurement for those means are counts per offender for Model 1 (which ignore period duration) and counts per offender-year for Model 2 (which adjust for duration and are thus considered annual incidence rates).
# Table caption. TCap <- paste("Marginal Means of Robbery Incident Counts and Incidence Rates", "By When Incidents Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "Confidence intervals are based on z-scores.") HXCat12_9_emmeans1 <- emmeans(HXCat12_9_gee1, specs = pairwise ~ When, type = "response", ref = "During") HXCat12_9_emmeans2 <- emmeans(HXCat12_9_gee2, specs = pairwise ~ When, type = "response", ref = "During") as_tibble(HXCat12_9_emmeans1$emmeans) %>% full_join(x = ., y = as_tibble(HXCat12_9_emmeans2$emmeans)) %>% mutate(Model = c(1, 1, 1, 2, 2, 2), When = factor(When, levels = c("Before", "During", "After"))) %>% arrange(Model, When) %>% select(When, rate, SE, df, asymp.LCL, asymp.UCL) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("When", "Rate", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Mean per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Mean per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Finally, we also estimated a set of contrasts to compare the marginal means from Models 1 and 2 (Table \ref{tab:HXCat12_9_ratios}). Each of these contrasts estimates the ratios of a pair of means rather than a raw difference between means. This is because of the nature of the underlying Poisson GEE model.
TCap <- paste("Contrasts Estimating Ratios of Marginal Means for Robbery", "Incident Counts and Incidence Rates By When Incidents", "Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "These contrasts estimate ratios of those quantities and used", "Tukey's method to adjust for multiple comparisons.", "Confidence intervals are based on z-scores.") as_tibble(confint(HXCat12_9_emmeans1$contrasts)) %>% full_join(x = ., y = as_tibble(confint(HXCat12_9_emmeans2$contrasts))) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("Contrast", "Ratio", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Ratios of means per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Ratios of means per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Among these r N_IDNEW
suspected sexual offenders, there were r N_HXCat12_10
(r P_HXCat12_10
\%) offenders who had criminal histories containing at least one
incident associated with an arrest, charge, or conviction for sex crimes. Table
\ref{tab:describe_sexcrime} summarizes the counts of unique sex crime incidents
overall and by when the incident occurred, while Table \ref{tab:freq_HXCat12_10}
shows the frequency distributions underlying those summaries.
# Table caption. TCap <- paste("Offender-Level Sex Crime Incident Counts (Overall and By When", "Incident Occurred)") # Summarize crime category counts (both overall and broken down by IWhen). IDNEW %>% select(HXCat12_10, HXCat12_10_Before, HXCat12_10_During, HXCat12_10_After) %>% describe(., quant=c(.25, .50, .75)) %>% bind_cols(data.frame(Variable = TRowLabs), .) %>% select(Variable, n, mean, sd, min, max, skew, kurtosis, Q0.25, Q0.5, Q0.75) %>% kable(., format = "latex", booktabs = TRUE, digits = 2, row.names = FALSE, col.names = CNames, caption = TCap, linesep = c('', '', '', '\\addlinespace')) %>% column_spec(column = 1, width = "6 cm") %>% footnote(general = FN1, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
# Table caption. TCap <- paste("Frequency Distributions for Number of Sex Crime Incidents for", "Which Offender Was Arrested, Charged, or Convicted (Overall and", "By When Incident Occurred)") # Get frequency distributions. IDNEW %>% select(OID, HXCat12_10, HXCat12_10_Before, HXCat12_10_During, HXCat12_10_After) %>% rename(All = HXCat12_10, Before = HXCat12_10_Before, During = HXCat12_10_During, After = HXCat12_10_After) %>% pivot_longer(cols = -OID, names_to = "When", values_to = "Value") %>% group_by(When, Value) %>% count() %>% arrange(Value) %>% pivot_wider(names_from = When, values_from = n, values_fill = 0) %>% mutate(All_p = 100*All/N_All, All_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*All/N_All), Before_p = 100*Before/N_All, Before_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*Before/N_Before), During_p = 100*During/N_All, During_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*During/N_During), After_p = 100*After/N_All, After_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*After/N_After)) %>% select(Value, All, All_p, All_v, Before, Before_p, Before_v, During, During_p, During_v, After, After_p, After_v) %>% kable(format = "latex", booktabs = TRUE, digits = 2, col.names = c("Value", rep(c("N", "%", "Valid %"), 4)), caption = TCap) %>% kable_styling() %>% add_header_above(c(" " = 1, "Overall" = 3, "Before TW" = 3, "During TW" = 3, "After TW" = 3)) %>% footnote(general = FN5, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
To more closely examine the sex crime incident counts, we fitted two generalized estimating equation (GEE) models to the long-format IDNEWL data, after dropping rows for unobserved periods (Years = 0) because the counts are missing for those observations.
Model 1 used When as a predictor of the period-specific counts, but did not adjust for the period durations. Model 2 supplemented When as a predictor with an offset term for log(Years) to adjust for the period durations. Both models used Poisson distributions, an exchangeable correlation structure, a log link function, and robust standard errors based on the sandwich estimator.
HXCat12_10_gee1 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_10 ~ When, family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se") HXCat12_10_gee2 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_10 ~ When + offset(log(Years)), family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se")
Table \ref{tab:HXCat12_10_texreg} below shows the parameter estimates for both GEE models.
TCap <- paste("Generalized Estimating Equation Models for Sex Crime Incident", "Counts Predicted by When Incidents Occurred") FN <- paste ("\\item * p < .05, null hypothesis value outside 95\\% confidence", "interval based on z-score.", "\\\\\n\\item Note.", "The data contain up to 3 longitudinal observations (one per", "period) for each offender (i.e., cluster).", "The reference priod for When was after the testing window.", "Both models used a log link function, exchangeable correlation", "structure, a Poisson distribution, and robust standard errors", "(sandwich estimator).", "Model 1 did not use an offset.", "Model 2 used log(Years) as an offset term.") texreg(list(HXCat12_10_gee1, HXCat12_10_gee2), booktabs = TRUE, dcolumn = TRUE, threeparttable = TRUE, fontsize = "normalsize", table = TRUE, use.packages = FALSE, ci.force = TRUE, label = "tab:HXCat12_10_texreg", stars = 0.05, caption = TCap, custom.note = FN)
\FloatBarrier
Table \ref{tab:HXCat12_10_emmeans} provides the estimated marginal means obtained from Models 1 and 2, along with corresponding confidence intervals. The units of measurement for those means are counts per offender for Model 1 (which ignore period duration) and counts per offender-year for Model 2 (which adjust for duration and are thus considered annual incidence rates).
# Table caption. TCap <- paste("Marginal Means of Sex Crime Incident Counts and Incidence Rates", "By When Incidents Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "Confidence intervals are based on z-scores.") HXCat12_10_emmeans1 <- emmeans(HXCat12_10_gee1, specs = pairwise ~ When, type = "response", ref = "During") HXCat12_10_emmeans2 <- emmeans(HXCat12_10_gee2, specs = pairwise ~ When, type = "response", ref = "During") as_tibble(HXCat12_10_emmeans1$emmeans) %>% full_join(x = ., y = as_tibble(HXCat12_10_emmeans2$emmeans)) %>% mutate(Model = c(1, 1, 1, 2, 2, 2), When = factor(When, levels = c("Before", "During", "After"))) %>% arrange(Model, When) %>% select(When, rate, SE, df, asymp.LCL, asymp.UCL) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("When", "Rate", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Mean per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Mean per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Finally, we also estimated a set of contrasts to compare the marginal means from Models 1 and 2 (Table \ref{tab:HXCat12_10_ratios}). Each of these contrasts estimates the ratios of a pair of means rather than a raw difference between means. This is because of the nature of the underlying Poisson GEE model.
TCap <- paste("Contrasts Estimating Ratios of Marginal Means for Sex Crime", "Incident Counts and Incidence Rates By When Incidents", "Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "These contrasts estimate ratios of those quantities and used", "Tukey's method to adjust for multiple comparisons.", "Confidence intervals are based on z-scores.") as_tibble(confint(HXCat12_10_emmeans1$contrasts)) %>% full_join(x = ., y = as_tibble(confint(HXCat12_10_emmeans2$contrasts))) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("Contrast", "Ratio", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Ratios of means per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Ratios of means per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Among these r N_IDNEW
suspected sexual offenders, there were r N_HXCat12_11
(r P_HXCat12_11
\%) offenders who had criminal histories containing at least one
incident associated with an arrest, charge, or conviction for traffic and
ordinance violations. Table \ref{tab:describe_traffic} summarizes the counts of
unique traffic and ordinance incidents overall and by when the incident
occurred, while Table \ref{tab:freq_HXCat12_11} shows the frequency
distributions underlying those summaries.
# Table caption. TCap <- paste("Offender-Level Traffic and Ordinance Incidents (Overall and", "By When Incident Occurred)") # Summarize crime category counts (both overall and broken down by IWhen). IDNEW %>% select(HXCat12_11, HXCat12_11_Before, HXCat12_11_During, HXCat12_11_After) %>% describe(., quant=c(.25, .50, .75)) %>% bind_cols(data.frame(Variable = TRowLabs), .) %>% select(Variable, n, mean, sd, min, max, skew, kurtosis, Q0.25, Q0.5, Q0.75) %>% kable(., format = "latex", booktabs = TRUE, digits = 2, row.names = FALSE, col.names = CNames, caption = TCap, linesep = c('', '', '', '\\addlinespace')) %>% column_spec(column = 1, width = "6 cm") %>% footnote(general = FN1, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
# Table caption. TCap <- paste("Frequency Distributions for Number of Traffic and Ordinance", "Incidents for Which Offender Was Arrested, Charged, or", "Convicted (Overall and By When Incident Occurred)") # Get frequency distributions. IDNEW %>% select(OID, HXCat12_11, HXCat12_11_Before, HXCat12_11_During, HXCat12_11_After) %>% rename(All = HXCat12_11, Before = HXCat12_11_Before, During = HXCat12_11_During, After = HXCat12_11_After) %>% pivot_longer(cols = -OID, names_to = "When", values_to = "Value") %>% group_by(When, Value) %>% count() %>% arrange(Value) %>% pivot_wider(names_from = When, values_from = n, values_fill = 0) %>% mutate(All_p = 100*All/N_All, All_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*All/N_All), Before_p = 100*Before/N_All, Before_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*Before/N_Before), During_p = 100*During/N_All, During_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*During/N_During), After_p = 100*After/N_All, After_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*After/N_After)) %>% select(Value, All, All_p, All_v, Before, Before_p, Before_v, During, During_p, During_v, After, After_p, After_v) %>% kable(format = "latex", booktabs = TRUE, digits = 2, col.names = c("Value", rep(c("N", "%", "Valid %"), 4)), caption = TCap) %>% kable_styling() %>% add_header_above(c(" " = 1, "Overall" = 3, "Before TW" = 3, "During TW" = 3, "After TW" = 3)) %>% footnote(general = FN5, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
To more closely examine the traffic and ordinance incident counts, we fitted two generalized estimating equation (GEE) models to the long-format IDNEWL data, after dropping rows for unobserved periods (Years = 0) because the counts are missing for those observations.
Model 1 used When as a predictor of the period-specific counts, but did not adjust for the period durations. Model 2 supplemented When as a predictor with an offset term for log(Years) to adjust for the period durations. Both models used Poisson distributions, an exchangeable correlation structure, a log link function, and robust standard errors based on the sandwich estimator.
HXCat12_11_gee1 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_11 ~ When, family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se") HXCat12_11_gee2 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_11 ~ When + offset(log(Years)), family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se")
Table \ref{tab:HXCat12_11_texreg} below shows the parameter estimates for both GEE models.
TCap <- paste("Generalized Estimating Equation Models for Traffic and", "Ordinance Incident Counts", "Predicted by When Incidents Occurred") FN <- paste ("\\item * p < .05, null hypothesis value outside 95\\% confidence", "interval based on z-score.", "\\\\\n\\item Note.", "The data contain up to 3 longitudinal observations (one per", "period) for each offender (i.e., cluster).", "The reference priod for When was after the testing window.", "Both models used a log link function, exchangeable correlation", "structure, a Poisson distribution, and robust standard errors", "(sandwich estimator).", "Model 1 did not use an offset.", "Model 2 used log(Years) as an offset term.") texreg(list(HXCat12_11_gee1, HXCat12_11_gee2), booktabs = TRUE, dcolumn = TRUE, threeparttable = TRUE, fontsize = "normalsize", table = TRUE, use.packages = FALSE, ci.force = TRUE, label = "tab:HXCat12_11_texreg", stars = 0.05, caption = TCap, custom.note = FN)
\FloatBarrier
Table \ref{tab:HXCat12_11_emmeans} provides the estimated marginal means obtained from Models 1 and 2, along with corresponding confidence intervals. The units of measurement for those means are counts per offender for Model 1 (which ignore period duration) and counts per offender-year for Model 2 (which adjust for duration and are thus considered annual incidence rates).
# Table caption. TCap <- paste("Marginal Means of Traffic and Ordinance Incident Counts and", "Incidence Rates By When Incidents Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "Confidence intervals are based on z-scores.") HXCat12_11_emmeans1 <- emmeans(HXCat12_11_gee1, specs = pairwise ~ When, type = "response", ref = "During") HXCat12_11_emmeans2 <- emmeans(HXCat12_11_gee2, specs = pairwise ~ When, type = "response", ref = "During") as_tibble(HXCat12_11_emmeans1$emmeans) %>% full_join(x = ., y = as_tibble(HXCat12_11_emmeans2$emmeans)) %>% mutate(Model = c(1, 1, 1, 2, 2, 2), When = factor(When, levels = c("Before", "During", "After"))) %>% arrange(Model, When) %>% select(When, rate, SE, df, asymp.LCL, asymp.UCL) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("When", "Rate", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Mean per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Mean per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Finally, we also estimated a set of contrasts to compare the marginal means from Models 1 and 2 (Table \ref{tab:HXCat12_11_ratios}). Each of these contrasts estimates the ratios of a pair of means rather than a raw difference between means. This is because of the nature of the underlying Poisson GEE model.
TCap <- paste("Contrasts Estimating Ratios of Marginal Means for Traffic and", "Ordinance Incident Counts and Incidence Rates By When", "Incidents Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "These contrasts estimate ratios of those quantities and used", "Tukey's method to adjust for multiple comparisons.", "Confidence intervals are based on z-scores.") as_tibble(confint(HXCat12_11_emmeans1$contrasts)) %>% full_join(x = ., y = as_tibble(confint(HXCat12_11_emmeans2$contrasts))) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("Contrast", "Ratio", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Ratios of means per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Ratios of means per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Among these r N_IDNEW
suspected sexual offenders, there were r N_HXCat12_12
(r P_HXCat12_12
\%) offenders who had criminal histories containing at least one
incident associated with an arrest, charge, or conviction for weapons crimes.
Table \ref{tab:describe_weapons} summarizes the counts of unique weapons crime
incidents overall and by when the incident occurred, while Table
\ref{tab:freq_HXCat12_12} shows the frequency distributions underlying those
summaries.
# Table caption. TCap <- paste("Offender-Level Weapons Crime Incident Counts (Overall and By", "When Incident Occurred)") # Summarize crime category counts (both overall and broken down by IWhen). IDNEW %>% select(HXCat12_12, HXCat12_12_Before, HXCat12_12_During, HXCat12_12_After) %>% describe(., quant=c(.25, .50, .75)) %>% bind_cols(data.frame(Variable = TRowLabs), .) %>% select(Variable, n, mean, sd, min, max, skew, kurtosis, Q0.25, Q0.5, Q0.75) %>% kable(., format = "latex", booktabs = TRUE, digits = 2, row.names = FALSE, col.names = CNames, caption = TCap, linesep = c('', '', '', '\\addlinespace')) %>% column_spec(column = 1, width = "6 cm") %>% footnote(general = FN1, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
# Table caption. TCap <- paste("Frequency Distributions for Number of Weapons Crime Incidents for", "Which Offender Was Arrested, Charged, or Convicted (Overall and", "By When Incident Occurred)") # Get frequency distributions. IDNEW %>% select(OID, HXCat12_12, HXCat12_12_Before, HXCat12_12_During, HXCat12_12_After) %>% rename(All = HXCat12_12, Before = HXCat12_12_Before, During = HXCat12_12_During, After = HXCat12_12_After) %>% pivot_longer(cols = -OID, names_to = "When", values_to = "Value") %>% group_by(When, Value) %>% count() %>% arrange(Value) %>% pivot_wider(names_from = When, values_from = n, values_fill = 0) %>% mutate(All_p = 100*All/N_All, All_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*All/N_All), Before_p = 100*Before/N_All, Before_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*Before/N_Before), During_p = 100*During/N_All, During_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*During/N_During), After_p = 100*After/N_All, After_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*After/N_After)) %>% select(Value, All, All_p, All_v, Before, Before_p, Before_v, During, During_p, During_v, After, After_p, After_v) %>% kable(format = "latex", booktabs = TRUE, digits = 2, col.names = c("Value", rep(c("N", "%", "Valid %"), 4)), caption = TCap) %>% kable_styling() %>% add_header_above(c(" " = 1, "Overall" = 3, "Before TW" = 3, "During TW" = 3, "After TW" = 3)) %>% footnote(general = FN5, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
To more closely examine the weapon incident counts, we fitted two generalized estimating equation (GEE) models to the long-format IDNEWL data, after dropping rows for unobserved periods (Years = 0) because the counts are missing for those observations.
Model 1 used When as a predictor of the period-specific counts, but did not adjust for the period durations. Model 2 supplemented When as a predictor with an offset term for log(Years) to adjust for the period durations. Both models used Poisson distributions, an exchangeable correlation structure, a log link function, and robust standard errors based on the sandwich estimator.
HXCat12_12_gee1 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_12 ~ When, family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se") HXCat12_12_gee2 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_12 ~ When + offset(log(Years)), family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se")
Table \ref{tab:HXCat12_12_texreg} below shows the parameter estimates for both GEE models.
TCap <- paste("Generalized Estimating Equation Models for Weapon Incident Counts", "Predicted by When Incidents Occurred") FN <- paste ("\\item * p < .05, null hypothesis value outside 95\\% confidence", "interval based on z-score.", "\\\\\n\\item Note.", "The data contain up to 3 longitudinal observations (one per", "period) for each offender (i.e., cluster).", "The reference priod for When was after the testing window.", "Both models used a log link function, exchangeable correlation", "structure, a Poisson distribution, and robust standard errors", "(sandwich estimator).", "Model 1 did not use an offset.", "Model 2 used log(Years) as an offset term.") texreg(list(HXCat12_12_gee1, HXCat12_12_gee2), booktabs = TRUE, dcolumn = TRUE, threeparttable = TRUE, fontsize = "normalsize", table = TRUE, use.packages = FALSE, ci.force = TRUE, label = "tab:HXCat12_12_texreg", stars = 0.05, caption = TCap, custom.note = FN)
\FloatBarrier
Table \ref{tab:HXCat12_12_emmeans} provides the estimated marginal means obtained from Models 1 and 2, along with corresponding confidence intervals. The units of measurement for those means are counts per offender for Model 1 (which ignore period duration) and counts per offender-year for Model 2 (which adjust for duration and are thus considered annual incidence rates).
# Table caption. TCap <- paste("Marginal Means of Weapon Incident Counts and Incidence Rates", "By When Incidents Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "Confidence intervals are based on z-scores.") HXCat12_12_emmeans1 <- emmeans(HXCat12_12_gee1, specs = pairwise ~ When, type = "response", ref = "During") HXCat12_12_emmeans2 <- emmeans(HXCat12_12_gee2, specs = pairwise ~ When, type = "response", ref = "During") as_tibble(HXCat12_12_emmeans1$emmeans) %>% full_join(x = ., y = as_tibble(HXCat12_12_emmeans2$emmeans)) %>% mutate(Model = c(1, 1, 1, 2, 2, 2), When = factor(When, levels = c("Before", "During", "After"))) %>% arrange(Model, When) %>% select(When, rate, SE, df, asymp.LCL, asymp.UCL) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("When", "Rate", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Mean per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Mean per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Finally, we also estimated a set of contrasts to compare the marginal means from Models 1 and 2 (Table \ref{tab:HXCat12_12_ratios}). Each of these contrasts estimates the ratios of a pair of means rather than a raw difference between means. This is because of the nature of the underlying Poisson GEE model.
TCap <- paste("Contrasts Estimating Ratios of Marginal Means for Weapon", "Incident Counts and Incidence Rates By When Incidents", "Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "These contrasts estimate ratios of those quantities and used", "Tukey's method to adjust for multiple comparisons.", "Confidence intervals are based on z-scores.") as_tibble(confint(HXCat12_12_emmeans1$contrasts)) %>% full_join(x = ., y = as_tibble(confint(HXCat12_12_emmeans2$contrasts))) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("Contrast", "Ratio", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Ratios of means per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Ratios of means per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
In this section, we analyze variables that contain the number of unique incidents in the offender's criminal history for a given broader category of crime. The numerical results for broader categories were omitted from the paper for the sake of brevity, but we retain them here for future reference.
For each broader crime category, we report include counts of incidents based on a combined criterion where the offender was arrested for, charged with, or convicted of (any of the three events, or any combination of them) any of the more specific types of crimes classified into the relevant category.
Note that these variables were created by first aggregating ARR, CHG, and JUD records to the incident level to flag incidents that fit the relevant category, then aggregated again to get to the offender level incident counts. It is possible to have multiple arrest offense records, charge records, or adjudication records for convictions on a single incident, but these variables ignore that and only count each incident once.
\FloatBarrier
Among these r N_IDNEW
suspected sexual offenders, there were
r N_HXCat12_Sexual
(r P_HXCat12_Sexual
\%) offenders who had criminal
histories containing at least one incident associated with an arrest, charge, or
conviction for a sexual crime (either a CSC or a non-CSC sex crime). Table
\ref{tab:describe_sexual} summarizes the counts of unique sexual crimes incidents
overall and by when the incident occurred, while Table
\ref{tab:freq_HXCat12_Sexual} shows the frequency distributions underlying those
summaries.
# Table caption. TCap <- paste("Offender-Level Sexual Crimes Incident Counts (Overall and By When", "Incident Occurred)") # Summarize crime category counts (both overall and broken down by IWhen). IDNEW %>% select(HXCat12_Sexual, HXCat12_Sexual_Before, HXCat12_Sexual_During, HXCat12_Sexual_After) %>% describe(., quant=c(.25, .50, .75)) %>% bind_cols(data.frame(Variable = TRowLabs), .) %>% select(Variable, n, mean, sd, min, max, skew, kurtosis, Q0.25, Q0.5, Q0.75) %>% kable(., format = "latex", booktabs = TRUE, digits = 2, row.names = FALSE, col.names = CNames, caption = TCap, linesep = c('', '', '', '\\addlinespace')) %>% column_spec(column = 1, width = "6 cm") %>% footnote(general = FN1, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
# Table caption. TCap <- paste("Frequency Distributions for Number of Sexual Crimes Incidents ", "for Which Offender Was Arrested, Charged, or Convicted (Overall", "and By When Incident Occurred)") # Get frequency distributions. IDNEW %>% select(OID, HXCat12_Sexual, HXCat12_Sexual_Before, HXCat12_Sexual_During, HXCat12_Sexual_After) %>% rename(All = HXCat12_Sexual, Before = HXCat12_Sexual_Before, During = HXCat12_Sexual_During, After = HXCat12_Sexual_After) %>% pivot_longer(cols = -OID, names_to = "When", values_to = "Value") %>% group_by(When, Value) %>% count() %>% arrange(Value) %>% pivot_wider(names_from = When, values_from = n, values_fill = 0) %>% mutate(All_p = 100*All/N_All, All_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*All/N_All), Before_p = 100*Before/N_All, Before_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*Before/N_Before), During_p = 100*During/N_All, During_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*During/N_During), After_p = 100*After/N_All, After_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*After/N_After)) %>% select(Value, All, All_p, All_v, Before, Before_p, Before_v, During, During_p, During_v, After, After_p, After_v) %>% kable(format = "latex", booktabs = TRUE, digits = 2, col.names = c("Value", rep(c("N", "%", "Valid %"), 4)), caption = TCap) %>% kable_styling() %>% add_header_above(c(" " = 1, "Overall" = 3, "Before TW" = 3, "During TW" = 3, "After TW" = 3)) %>% footnote(general = FN5, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
To more closely examine the sexual crimes incident counts, we fitted two generalized estimating equation (GEE) models to the long-format IDNEWL data, after dropping rows for unobserved periods (Years = 0) because the counts are missing for those observations.
Model 1 used When as a predictor of the period-specific counts, but did not adjust for the period durations. Model 2 supplemented When as a predictor with an offset term for log(Years) to adjust for the period durations. Both models used Poisson distributions, an exchangeable correlation structure, a log link function, and robust standard errors based on the sandwich estimator.
HXCat12_Sexual_gee1 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_Sexual ~ When, family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se") HXCat12_Sexual_gee2 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_Sexual ~ When + offset(log(Years)), family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se")
Table \ref{tab:HXCat12_Sexual_texreg} below shows the parameter estimates for both GEE models.
TCap <- paste("Generalized Estimating Equation Models for Sexual Crimes Incident Counts", "Predicted by When Incidents Occurred") FN <- paste ("\\item * p < .05, null hypothesis value outside 95\\% confidence", "interval based on z-score.", "\\\\\n\\item Note.", "The data contain up to 3 longitudinal observations (one per", "period) for each offender (i.e., cluster).", "The reference priod for When was after the testing window.", "Both models used a log link function, exchangeable correlation", "structure, a Poisson distribution, and robust standard errors", "(sandwich estimator).", "Model 1 did not use an offset.", "Model 2 used log(Years) as an offset term.") texreg(list(HXCat12_Sexual_gee1, HXCat12_Sexual_gee2), booktabs = TRUE, dcolumn = TRUE, threeparttable = TRUE, fontsize = "normalsize", table = TRUE, use.packages = FALSE, ci.force = TRUE, label = "tab:HXCat12_Sexual_texreg", stars = 0.05, caption = TCap, custom.note = FN)
\FloatBarrier
Table \ref{tab:HXCat12_Sexual_emmeans} provides the estimated marginal means obtained from Models 1 and 2, along with corresponding confidence intervals. The units of measurement for those means are counts per offender for Model 1 (which ignore period duration) and counts per offender-year for Model 2 (which adjust for duration and are thus considered annual incidence rates).
# Table caption. TCap <- paste("Marginal Means of Sexual Crimes Incident Counts and Incidence Rates", "By When Incidents Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "Confidence intervals are based on z-scores.") HXCat12_Sexual_emmeans1 <- emmeans(HXCat12_Sexual_gee1, specs = pairwise ~ When, type = "response", ref = "During") HXCat12_Sexual_emmeans2 <- emmeans(HXCat12_Sexual_gee2, specs = pairwise ~ When, type = "response", ref = "During") as_tibble(HXCat12_Sexual_emmeans1$emmeans) %>% full_join(x = ., y = as_tibble(HXCat12_Sexual_emmeans2$emmeans)) %>% mutate(Model = c(1, 1, 1, 2, 2, 2), When = factor(When, levels = c("Before", "During", "After"))) %>% arrange(Model, When) %>% select(When, rate, SE, df, asymp.LCL, asymp.UCL) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("When", "Rate", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Mean per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Mean per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Finally, we also estimated a set of contrasts to compare the marginal means from Models 1 and 2 (Table \ref{tab:HXCat12_Sexual_ratios}). Each of these contrasts estimates the ratios of a pair of means rather than a raw difference between means. This is because of the nature of the underlying Poisson GEE model.
TCap <- paste("Contrasts Estimating Ratios of Marginal Means for Sexual Crimes", "Incident Counts and Incidence Rates By When Incidents", "Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "These contrasts estimate ratios of those quantities and used", "Tukey's method to adjust for multiple comparisons.", "Confidence intervals are based on z-scores.") as_tibble(confint(HXCat12_Sexual_emmeans1$contrasts)) %>% full_join(x = ., y = as_tibble(confint(HXCat12_Sexual_emmeans2$contrasts))) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("Contrast", "Ratio", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Ratios of means per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Ratios of means per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Among these r N_IDNEW
suspected sexual offenders, there were
r N_HXCat12_Violent
(r P_HXCat12_Violent
\%) offenders who had criminal
histories containing at least one incident associated with an arrest, charge, or
conviction for a violent, non-sexual crime (homicide; assault involving domestic
violence and/or stalking; non-sexual, non-domestic violence assault; robbery; or
weapons). Table \ref{tab:describe_violent} summarizes the counts of unique sexual
crimes incidents overall and by when the incident occurred, while Table
\ref{tab:freq_HXCat12_Violent} shows the frequency distributions underlying those
summaries.
# Table caption. TCap <- paste("Offender-Level Violent, Non-Sexual Crimes Incident Counts (Overall and By When", "Incident Occurred)") # Summarize crime category counts (both overall and broken down by IWhen). IDNEW %>% select(HXCat12_Violent, HXCat12_Violent_Before, HXCat12_Violent_During, HXCat12_Violent_After) %>% describe(., quant=c(.25, .50, .75)) %>% bind_cols(data.frame(Variable = TRowLabs), .) %>% select(Variable, n, mean, sd, min, max, skew, kurtosis, Q0.25, Q0.5, Q0.75) %>% kable(., format = "latex", booktabs = TRUE, digits = 2, row.names = FALSE, col.names = CNames, caption = TCap, linesep = c('', '', '', '\\addlinespace')) %>% column_spec(column = 1, width = "6 cm") %>% footnote(general = FN1, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
# Table caption. TCap <- paste("Frequency Distributions for Number of Violent, Non-Sexual Crimes Incidents ", "for Which Offender Was Arrested, Charged, or Convicted (Overall", "and By When Incident Occurred)") # Get frequency distributions. IDNEW %>% select(OID, HXCat12_Violent, HXCat12_Violent_Before, HXCat12_Violent_During, HXCat12_Violent_After) %>% rename(All = HXCat12_Violent, Before = HXCat12_Violent_Before, During = HXCat12_Violent_During, After = HXCat12_Violent_After) %>% pivot_longer(cols = -OID, names_to = "When", values_to = "Value") %>% group_by(When, Value) %>% count() %>% arrange(Value) %>% pivot_wider(names_from = When, values_from = n, values_fill = 0) %>% mutate(All_p = 100*All/N_All, All_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*All/N_All), Before_p = 100*Before/N_All, Before_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*Before/N_Before), During_p = 100*During/N_All, During_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*During/N_During), After_p = 100*After/N_All, After_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*After/N_After)) %>% select(Value, All, All_p, All_v, Before, Before_p, Before_v, During, During_p, During_v, After, After_p, After_v) %>% kable(format = "latex", booktabs = TRUE, digits = 2, col.names = c("Value", rep(c("N", "%", "Valid %"), 4)), caption = TCap) %>% kable_styling() %>% add_header_above(c(" " = 1, "Overall" = 3, "Before TW" = 3, "During TW" = 3, "After TW" = 3)) %>% footnote(general = FN5, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
To more closely examine the violent, non-sexual crimes incident counts, we fitted two generalized estimating equation (GEE) models to the long-format IDNEWL data, after dropping rows for unobserved periods (Years = 0) because the counts are missing for those observations.
Model 1 used When as a predictor of the period-specific counts, but did not adjust for the period durations. Model 2 supplemented When as a predictor with an offset term for log(Years) to adjust for the period durations. Both models used Poisson distributions, an exchangeable correlation structure, a log link function, and robust standard errors based on the sandwich estimator.
HXCat12_Violent_gee1 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_Violent ~ When, family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se") HXCat12_Violent_gee2 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_Violent ~ When + offset(log(Years)), family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se")
Table \ref{tab:HXCat12_Violent_texreg} below shows the parameter estimates for both GEE models.
TCap <- paste("Generalized Estimating Equation Models for Violent, Non-Sexual Crimes Incident Counts", "Predicted by When Incidents Occurred") FN <- paste ("\\item * p < .05, null hypothesis value outside 95\\% confidence", "interval based on z-score.", "\\\\\n\\item Note.", "The data contain up to 3 longitudinal observations (one per", "period) for each offender (i.e., cluster).", "The reference priod for When was after the testing window.", "Both models used a log link function, exchangeable correlation", "structure, a Poisson distribution, and robust standard errors", "(sandwich estimator).", "Model 1 did not use an offset.", "Model 2 used log(Years) as an offset term.") texreg(list(HXCat12_Violent_gee1, HXCat12_Violent_gee2), booktabs = TRUE, dcolumn = TRUE, threeparttable = TRUE, fontsize = "normalsize", table = TRUE, use.packages = FALSE, ci.force = TRUE, label = "tab:HXCat12_Violent_texreg", stars = 0.05, caption = TCap, custom.note = FN)
\FloatBarrier
Table \ref{tab:HXCat12_Violent_emmeans} provides the estimated marginal means obtained from Models 1 and 2, along with corresponding confidence intervals. The units of measurement for those means are counts per offender for Model 1 (which ignore period duration) and counts per offender-year for Model 2 (which adjust for duration and are thus considered annual incidence rates).
# Table caption. TCap <- paste("Marginal Means of Violent, Non-Sexual Crimes Incident Counts and Incidence Rates", "By When Incidents Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "Confidence intervals are based on z-scores.") HXCat12_Violent_emmeans1 <- emmeans(HXCat12_Violent_gee1, specs = pairwise ~ When, type = "response", ref = "During") HXCat12_Violent_emmeans2 <- emmeans(HXCat12_Violent_gee2, specs = pairwise ~ When, type = "response", ref = "During") as_tibble(HXCat12_Violent_emmeans1$emmeans) %>% full_join(x = ., y = as_tibble(HXCat12_Violent_emmeans2$emmeans)) %>% mutate(Model = c(1, 1, 1, 2, 2, 2), When = factor(When, levels = c("Before", "During", "After"))) %>% arrange(Model, When) %>% select(When, rate, SE, df, asymp.LCL, asymp.UCL) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("When", "Rate", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Mean per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Mean per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Finally, we also estimated a set of contrasts to compare the marginal means from Models 1 and 2 (Table \ref{tab:HXCat12_Violent_ratios}). Each of these contrasts estimates the ratios of a pair of means rather than a raw difference between means. This is because of the nature of the underlying Poisson GEE model.
TCap <- paste("Contrasts Estimating Ratios of Marginal Means for Violent, Non-Sexual Crimes", "Incident Counts and Incidence Rates By When Incidents", "Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "These contrasts estimate ratios of those quantities and used", "Tukey's method to adjust for multiple comparisons.", "Confidence intervals are based on z-scores.") as_tibble(confint(HXCat12_Violent_emmeans1$contrasts)) %>% full_join(x = ., y = as_tibble(confint(HXCat12_Violent_emmeans2$contrasts))) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("Contrast", "Ratio", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Ratios of means per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Ratios of means per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Among these r N_IDNEW
suspected sexual offenders, there were
r N_HXCat12_Property
(r P_HXCat12_Property
\%) offenders who had criminal
histories containing at least one incident associated with an arrest, charge, or
conviction for a property crime (arson, burglary, or larceny/theft/fraud). Table
\ref{tab:describe_property} summarizes the counts of unique sexual crimes incidents
overall and by when the incident occurred, while Table
\ref{tab:freq_HXCat12_Property} shows the frequency distributions underlying those
summaries.
# Table caption. TCap <- paste("Offender-Level Property Crimes Incident Counts (Overall and By When", "Incident Occurred)") # Summarize crime category counts (both overall and broken down by IWhen). IDNEW %>% select(HXCat12_Property, HXCat12_Property_Before, HXCat12_Property_During, HXCat12_Property_After) %>% describe(., quant=c(.25, .50, .75)) %>% bind_cols(data.frame(Variable = TRowLabs), .) %>% select(Variable, n, mean, sd, min, max, skew, kurtosis, Q0.25, Q0.5, Q0.75) %>% kable(., format = "latex", booktabs = TRUE, digits = 2, row.names = FALSE, col.names = CNames, caption = TCap, linesep = c('', '', '', '\\addlinespace')) %>% column_spec(column = 1, width = "6 cm") %>% footnote(general = FN1, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
# Table caption. TCap <- paste("Frequency Distributions for Number of Property Crimes Incidents ", "for Which Offender Was Arrested, Charged, or Convicted (Overall", "and By When Incident Occurred)") # Get frequency distributions. IDNEW %>% select(OID, HXCat12_Property, HXCat12_Property_Before, HXCat12_Property_During, HXCat12_Property_After) %>% rename(All = HXCat12_Property, Before = HXCat12_Property_Before, During = HXCat12_Property_During, After = HXCat12_Property_After) %>% pivot_longer(cols = -OID, names_to = "When", values_to = "Value") %>% group_by(When, Value) %>% count() %>% arrange(Value) %>% pivot_wider(names_from = When, values_from = n, values_fill = 0) %>% mutate(All_p = 100*All/N_All, All_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*All/N_All), Before_p = 100*Before/N_All, Before_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*Before/N_Before), During_p = 100*During/N_All, During_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*During/N_During), After_p = 100*After/N_All, After_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*After/N_After)) %>% select(Value, All, All_p, All_v, Before, Before_p, Before_v, During, During_p, During_v, After, After_p, After_v) %>% kable(format = "latex", booktabs = TRUE, digits = 2, col.names = c("Value", rep(c("N", "%", "Valid %"), 4)), caption = TCap) %>% kable_styling() %>% add_header_above(c(" " = 1, "Overall" = 3, "Before TW" = 3, "During TW" = 3, "After TW" = 3)) %>% footnote(general = FN5, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
To more closely examine the property crimes incident counts, we fitted two generalized estimating equation (GEE) models to the long-format IDNEWL data, after dropping rows for unobserved periods (Years = 0) because the counts are missing for those observations.
Model 1 used When as a predictor of the period-specific counts, but did not adjust for the period durations. Model 2 supplemented When as a predictor with an offset term for log(Years) to adjust for the period durations. Both models used Poisson distributions, an exchangeable correlation structure, a log link function, and robust standard errors based on the sandwich estimator.
HXCat12_Property_gee1 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_Property ~ When, family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se") HXCat12_Property_gee2 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_Property ~ When + offset(log(Years)), family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se")
Table \ref{tab:HXCat12_Property_texreg} below shows the parameter estimates for both GEE models.
TCap <- paste("Generalized Estimating Equation Models for Property Crimes Incident Counts", "Predicted by When Incidents Occurred") FN <- paste ("\\item * p < .05, null hypothesis value outside 95\\% confidence", "interval based on z-score.", "\\\\\n\\item Note.", "The data contain up to 3 longitudinal observations (one per", "period) for each offender (i.e., cluster).", "The reference priod for When was after the testing window.", "Both models used a log link function, exchangeable correlation", "structure, a Poisson distribution, and robust standard errors", "(sandwich estimator).", "Model 1 did not use an offset.", "Model 2 used log(Years) as an offset term.") texreg(list(HXCat12_Property_gee1, HXCat12_Property_gee2), booktabs = TRUE, dcolumn = TRUE, threeparttable = TRUE, fontsize = "normalsize", table = TRUE, use.packages = FALSE, ci.force = TRUE, label = "tab:HXCat12_Property_texreg", stars = 0.05, caption = TCap, custom.note = FN)
\FloatBarrier
Table \ref{tab:HXCat12_Property_emmeans} provides the estimated marginal means obtained from Models 1 and 2, along with corresponding confidence intervals. The units of measurement for those means are counts per offender for Model 1 (which ignore period duration) and counts per offender-year for Model 2 (which adjust for duration and are thus considered annual incidence rates).
# Table caption. TCap <- paste("Marginal Means of Property Crimes Incident Counts and Incidence Rates", "By When Incidents Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "Confidence intervals are based on z-scores.") HXCat12_Property_emmeans1 <- emmeans(HXCat12_Property_gee1, specs = pairwise ~ When, type = "response", ref = "During") HXCat12_Property_emmeans2 <- emmeans(HXCat12_Property_gee2, specs = pairwise ~ When, type = "response", ref = "During") as_tibble(HXCat12_Property_emmeans1$emmeans) %>% full_join(x = ., y = as_tibble(HXCat12_Property_emmeans2$emmeans)) %>% mutate(Model = c(1, 1, 1, 2, 2, 2), When = factor(When, levels = c("Before", "During", "After"))) %>% arrange(Model, When) %>% select(When, rate, SE, df, asymp.LCL, asymp.UCL) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("When", "Rate", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Mean per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Mean per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Finally, we also estimated a set of contrasts to compare the marginal means from Models 1 and 2 (Table \ref{tab:HXCat12_Property_ratios}). Each of these contrasts estimates the ratios of a pair of means rather than a raw difference between means. This is because of the nature of the underlying Poisson GEE model.
TCap <- paste("Contrasts Estimating Ratios of Marginal Means for Property Crimes", "Incident Counts and Incidence Rates By When Incidents", "Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "These contrasts estimate ratios of those quantities and used", "Tukey's method to adjust for multiple comparisons.", "Confidence intervals are based on z-scores.") as_tibble(confint(HXCat12_Property_emmeans1$contrasts)) %>% full_join(x = ., y = as_tibble(confint(HXCat12_Property_emmeans2$contrasts))) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("Contrast", "Ratio", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Ratios of means per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Ratios of means per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Among these r N_IDNEW
suspected sexual offenders, there were
r N_HXCat12_Property
(r P_HXCat12_Property
\%) offenders who had criminal
histories containing at least one incident associated with an arrest, charge, or
conviction for other crimes (drug crimes or traffic and ordinances). Table
\ref{tab:describe_other} summarizes the counts of unique sexual crimes incidents
overall and by when the incident occurred, while Table
\ref{tab:freq_HXCat12_Other} shows the frequency distributions underlying those
summaries.
# Table caption. TCap <- paste("Offender-Level Other Crimes Incident Counts (Overall and By When", "Incident Occurred)") # Summarize crime category counts (both overall and broken down by IWhen). IDNEW %>% select(HXCat12_Other, HXCat12_Other_Before, HXCat12_Other_During, HXCat12_Other_After) %>% describe(., quant=c(.25, .50, .75)) %>% bind_cols(data.frame(Variable = TRowLabs), .) %>% select(Variable, n, mean, sd, min, max, skew, kurtosis, Q0.25, Q0.5, Q0.75) %>% kable(., format = "latex", booktabs = TRUE, digits = 2, row.names = FALSE, col.names = CNames, caption = TCap, linesep = c('', '', '', '\\addlinespace')) %>% column_spec(column = 1, width = "6 cm") %>% footnote(general = FN1, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
# Table caption. TCap <- paste("Frequency Distributions for Number of Other Crimes Incidents ", "for Which Offender Was Arrested, Charged, or Convicted (Overall", "and By When Incident Occurred)") # Get frequency distributions. IDNEW %>% select(OID, HXCat12_Other, HXCat12_Other_Before, HXCat12_Other_During, HXCat12_Other_After) %>% rename(All = HXCat12_Other, Before = HXCat12_Other_Before, During = HXCat12_Other_During, After = HXCat12_Other_After) %>% pivot_longer(cols = -OID, names_to = "When", values_to = "Value") %>% group_by(When, Value) %>% count() %>% arrange(Value) %>% pivot_wider(names_from = When, values_from = n, values_fill = 0) %>% mutate(All_p = 100*All/N_All, All_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*All/N_All), Before_p = 100*Before/N_All, Before_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*Before/N_Before), During_p = 100*During/N_All, During_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*During/N_During), After_p = 100*After/N_All, After_v = if_else(is.na(Value), true = as.numeric(NA), false = 100*After/N_After)) %>% select(Value, All, All_p, All_v, Before, Before_p, Before_v, During, During_p, During_v, After, After_p, After_v) %>% kable(format = "latex", booktabs = TRUE, digits = 2, col.names = c("Value", rep(c("N", "%", "Valid %"), 4)), caption = TCap) %>% kable_styling() %>% add_header_above(c(" " = 1, "Overall" = 3, "Before TW" = 3, "During TW" = 3, "After TW" = 3)) %>% footnote(general = FN5, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
To more closely examine the other crimes incident counts, we fitted two generalized estimating equation (GEE) models to the long-format IDNEWL data, after dropping rows for unobserved periods (Years = 0) because the counts are missing for those observations.
Model 1 used When as a predictor of the period-specific counts, but did not adjust for the period durations. Model 2 supplemented When as a predictor with an offset term for log(Years) to adjust for the period durations. Both models used Poisson distributions, an exchangeable correlation structure, a log link function, and robust standard errors based on the sandwich estimator.
HXCat12_Other_gee1 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_Other ~ When, family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se") HXCat12_Other_gee2 <- IDNEWL %>% filter(Years > 0) %>% geeglm(HXCat12_Other ~ When + offset(log(Years)), family = poisson(link = "log"), data = ., id = OID, corstr = "exchangeable", std.err = "san.se")
Table \ref{tab:HXCat12_Other_texreg} below shows the parameter estimates for both GEE models.
TCap <- paste("Generalized Estimating Equation Models for Other Crimes Incident Counts", "Predicted by When Incidents Occurred") FN <- paste ("\\item * p < .05, null hypothesis value outside 95\\% confidence", "interval based on z-score.", "\\\\\n\\item Note.", "The data contain up to 3 longitudinal observations (one per", "period) for each offender (i.e., cluster).", "The reference priod for When was after the testing window.", "Both models used a log link function, exchangeable correlation", "structure, a Poisson distribution, and robust standard errors", "(sandwich estimator).", "Model 1 did not use an offset.", "Model 2 used log(Years) as an offset term.") texreg(list(HXCat12_Other_gee1, HXCat12_Other_gee2), booktabs = TRUE, dcolumn = TRUE, threeparttable = TRUE, fontsize = "normalsize", table = TRUE, use.packages = FALSE, ci.force = TRUE, label = "tab:HXCat12_Other_texreg", stars = 0.05, caption = TCap, custom.note = FN)
\FloatBarrier
Table \ref{tab:HXCat12_Other_emmeans} provides the estimated marginal means obtained from Models 1 and 2, along with corresponding confidence intervals. The units of measurement for those means are counts per offender for Model 1 (which ignore period duration) and counts per offender-year for Model 2 (which adjust for duration and are thus considered annual incidence rates).
# Table caption. TCap <- paste("Marginal Means of Other Crimes Incident Counts and Incidence Rates", "By When Incidents Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "Confidence intervals are based on z-scores.") HXCat12_Other_emmeans1 <- emmeans(HXCat12_Other_gee1, specs = pairwise ~ When, type = "response", ref = "During") HXCat12_Other_emmeans2 <- emmeans(HXCat12_Other_gee2, specs = pairwise ~ When, type = "response", ref = "During") as_tibble(HXCat12_Other_emmeans1$emmeans) %>% full_join(x = ., y = as_tibble(HXCat12_Other_emmeans2$emmeans)) %>% mutate(Model = c(1, 1, 1, 2, 2, 2), When = factor(When, levels = c("Before", "During", "After"))) %>% arrange(Model, When) %>% select(When, rate, SE, df, asymp.LCL, asymp.UCL) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("When", "Rate", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Mean per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Mean per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Finally, we also estimated a set of contrasts to compare the marginal means from Models 1 and 2 (Table \ref{tab:HXCat12_Other_ratios}). Each of these contrasts estimates the ratios of a pair of means rather than a raw difference between means. This is because of the nature of the underlying Poisson GEE model.
TCap <- paste("Contrasts Estimating Ratios of Marginal Means for Other Crimes", "Incident Counts and Incidence Rates By When Incidents", "Occurred") # Table footnote. FN <- paste("Model 1 did not use an offset, so means are counts.", "Model 2 used log(Years) as an offset term, so means are annual", "incidence rates.", "These contrasts estimate ratios of those quantities and used", "Tukey's method to adjust for multiple comparisons.", "Confidence intervals are based on z-scores.") as_tibble(confint(HXCat12_Other_emmeans1$contrasts)) %>% full_join(x = ., y = as_tibble(confint(HXCat12_Other_emmeans2$contrasts))) %>% kable(format = "latex", booktabs = TRUE, caption = TCap, digits = 2, col.names = c("Contrast", "Ratio", "SE", "df", "LCL", "UCL")) %>% kable_styling() %>% add_header_above(c(" " = 4, "95% CI" = 2)) %>% group_rows(group_label = "Model 1: Ratios of means per offender", start_row = 1, end_row = 3, italic = TRUE) %>% group_rows(group_label = "Model 2: Ratios of means per offender-year", start_row = 4, end_row = 6, italic = TRUE) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
The content of this section was not reported in the manuscript for the sake of brevity. Table \ref{tab:before_summary} remains here to document an analysis performed and provide it in a form consistent with the one used for some of the after period results.
TCap <- paste("Amount of Crime That Occurred Before the Testing Window") FN <- paste0("Separately for each crime category, we counted offenders ", "arrested, charged, or convicted for at least one incident ", "occurring before the testing window. Total incidents are summed ", "across those offenders. The bottom four italicized broader ", "categories include offenders who were arrested, charged or ", "convicted for any of the more specific crime categories ", "nested below them, while the Any Crimes category includes ", "incidents associated with at least one of the 12 specific ", "crime categories. ", "The numerator for each percentage is the listed value for n; ", "the denominator is the sample size for the before period (N = ", format(N_Before, big.mark = ","), "). ", "Incidents associated with more than one kind of crime are ", "counted in all applicable specific categories, but only once ", "per applicable broader category. For example, an incident with ", "both a homicide and a robbery only counts once toward the total ", "incident count for the broader violent crimes category. TW, ", "testing window.") IDNEWB %>% # Drops rows for crime category counts and individuals with no incidents filter(Variable != "HXCat12_Count") %>% # Drops rows for offenders with no incidents of a given type occuring after # the testing window. filter(Count > 0) %>% group_by(Variable, VLabel, VOrder) %>% summarise(N = n(), Pct = 100*N/N_Before, Incidents = sum(Count)) %>% ungroup() %>% arrange(VOrder) %>% select(VLabel, N, Pct, Incidents) %>% kable(., format = "latex", booktabs = TRUE, digits = 1, row.names = FALSE, col.names = c("Crime Category", "n", "%", "Total Incidents"), format.args = list(big.mark = ','), caption = TCap, linesep = "") %>% kable_styling() %>% add_header_above(c(" " = 1, "Offenders" = 2, " " = 1)) %>% row_spec(., row = c(1, 2, 5, 11, 15), italic = TRUE) %>% pack_rows(" ", start_row = 1, end_row = 1) %>% pack_rows(" ", start_row = 2, end_row = 4) %>% pack_rows(" ", start_row = 5, end_row = 10) %>% pack_rows(" ", start_row = 11, end_row = 14) %>% pack_rows(" ", start_row = 15, end_row = 17) %>% add_indent(., positions = c(3:4), level_of_indent = 1) %>% add_indent(., positions = c(6:10), level_of_indent = 1) %>% add_indent(., positions = c(12:14), level_of_indent = 1) %>% add_indent(., positions = c(16:17), level_of_indent = 1) %>% column_spec(., column = 2, width = "1.15cm") %>% column_spec(., column = 3, width = "1.15cm") %>% column_spec(., column = 4, width = "2cm") %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
The content of this section was not reported in the manuscript for the sake of brevity. Table \ref{tab:during_summary} remains here to document an analysis performed and provide it in a form consistent with the one used for some of the after period results.
TCap <- paste("Amount of Crime That Occurred During the Testing Window") FN <- paste0("Separately for each crime category, we counted offenders ", "arrested, charged, or convicted for at least one incident ", "occurring during the testing window. Total incidents are summed ", "across those offenders. The bottom four italicized broader ", "categories include offenders who were arrested, charged or ", "convicted for any of the more specific crime categories ", "nested below them, while the Any Crimes category includes ", "incidents associated with at least one of the 12 specific ", "crime categories. ", "The numerator for each percentage is the listed value for n; ", "the denominator is the sample size for the during period (N = ", format(N_During, big.mark = ","), "). ", "Incidents associated with more than one kind of crime are ", "counted in all applicable specific categories, but only once ", "per applicable broader category. For example, an incident with ", "both a homicide and a robbery only counts once toward the total ", "incident count for the broader violent crimes category. TW, ", "testing window.") IDNEWD %>% # Drops rows for crime category counts and individuals with no incidents filter(Variable != "HXCat12_Count") %>% # Drops rows for offenders with no incidents of a given type occuring after # the testing window. filter(Count > 0) %>% group_by(Variable, VLabel, VOrder) %>% summarise(N = n(), Pct = 100*N/N_During, Incidents = sum(Count)) %>% ungroup() %>% arrange(VOrder) %>% select(VLabel, N, Pct, Incidents) %>% kable(., format = "latex", booktabs = TRUE, digits = 1, row.names = FALSE, col.names = c("Crime Category", "n", "%", "Total Incidents"), format.args = list(big.mark = ','), caption = TCap, linesep = "") %>% kable_styling() %>% add_header_above(c(" " = 1, "Offenders" = 2, " " = 1)) %>% row_spec(., row = c(1, 2, 5, 11, 15), italic = TRUE) %>% pack_rows(" ", start_row = 1, end_row = 1) %>% pack_rows(" ", start_row = 2, end_row = 4) %>% pack_rows(" ", start_row = 5, end_row = 10) %>% pack_rows(" ", start_row = 11, end_row = 14) %>% pack_rows(" ", start_row = 15, end_row = 17) %>% add_indent(., positions = c(3:4), level_of_indent = 1) %>% add_indent(., positions = c(6:10), level_of_indent = 1) %>% add_indent(., positions = c(12:14), level_of_indent = 1) %>% add_indent(., positions = c(16:17), level_of_indent = 1) %>% column_spec(., column = 2, width = "1.15cm") %>% column_spec(., column = 3, width = "1.15cm") %>% column_spec(., column = 4, width = "2cm") %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Next we summarize the amount of crime that occurred after the testing window. Separately for each crime category, Table \ref{tab:after_summary} shows the number of offenders with one or more incidents from the given category. It also translates those offender counts into percentages relative to the total sample size and reports the total number of incidents associated with those offenders that occurred after the testing window ended.
TCap <- paste("Amount of Crime That Occurred After the Testing Window") FN <- paste0("Separately for each crime category, we counted offenders ", "arrested, charged, or convicted for at least one incident " , "occurring after the testing window. Total incidents are summed ", "across those offenders. The bottom four italicized broader ", "categories include offenders who were arrested, charged or ", "convicted for any of the more specific crime categories ", "nested below them, while the Any Crimes category includes ", "incidents associated with at least one of the 12 specific ", "crime categories. ", "The numerator for each percentage is the listed value for n; ", "the denominator is the sample size for the after period (N = ", format(N_After, big.mark = ","), "). ", "Incidents associated with more than one kind of crime are ", "counted in all applicable specific categories, but only once ", "per applicable broader category. For example, an incident with ", "both a homicide and a robbery only counts once toward the total ", "incident count for the broader violent crimes category. TW, ", "testing window.") IDNEWA %>% # Drops rows for crime category counts and individuals with no incidents filter(Variable != "HXCat12_Count") %>% # Drops rows for offenders with no incidents of a given type occuring after # the testing window. filter(Count > 0) %>% group_by(Variable, VLabel, VOrder) %>% summarise(N = n(), Pct = 100*N/N_After, Incidents = sum(Count)) %>% ungroup() %>% arrange(VOrder) %>% select(VLabel, N, Pct, Incidents) %>% kable(., format = "latex", booktabs = TRUE, digits = 2, row.names = FALSE, col.names = c("Crime Category", "n", "%", "Total Incidents"), format.args = list(big.mark = ','), caption = TCap, linesep = "") %>% kable_styling() %>% add_header_above(c(" " = 1, "Offenders" = 2, " " = 1)) %>% row_spec(., row = c(1, 2, 5, 11, 15), italic = TRUE) %>% pack_rows(" ", start_row = 1, end_row = 1) %>% pack_rows(" ", start_row = 2, end_row = 4) %>% pack_rows(" ", start_row = 5, end_row = 10) %>% pack_rows(" ", start_row = 11, end_row = 14) %>% pack_rows(" ", start_row = 15, end_row = 17) %>% add_indent(., positions = c(3:4), level_of_indent = 1) %>% add_indent(., positions = c(6:10), level_of_indent = 1) %>% add_indent(., positions = c(12:14), level_of_indent = 1) %>% add_indent(., positions = c(16:17), level_of_indent = 1) %>% column_spec(., column = 2, width = "1.15cm") %>% column_spec(., column = 3, width = "1.15cm") %>% column_spec(., column = 4, width = "2cm") %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
Figure \ref{fig:plot_Crime_full_width} plots the distributions of the incident count variables for each of the 12 main crime categories during the period after the testing window. Because a few large outlier values make the details harder to see for the majority of the data (nearly all values are for 12 or fewer incidents), we also produced Figure \ref{fig:plot_Crime_partial_width}, which zooms in on the 0-12 range of incident counts.
FCap <- paste("\\label{fig:plot_Crime_full_width}", "Distributions for Number of Incidents After the Testing Window", "for Which Offender Was Arrested, Charged, or Convicted,", "By Crime Category.", "Each panel shows a histogram above a strip plot with an", "overlaid boxplot. Histogram bar heights show the percentage of", "offenders who had a given number of incidents.", "The strip plots show a dot for each offender, with positions", "jittered to reduce overlap.") # Graph after period frequency distributions. IDNEWA %>% # Keep only the main 12 crime categories (drops the broader crime categories). filter(!is.na(Crime)) %>% # use y = -... to position boxplot and jitterplot below the histogram ggplot(data = ., aes(x = Count, y = -.4)) + # Add histogram with bar heights scaled to percentages via after_stat. geom_histogram(aes(y = after_stat(count/N_After)), binwidth = .5, fill = "black") + # Add a jittered strip plot. geom_point(aes(), position = position_jitter(width = .25, height = .25), size = 1, color = "gray50") + # Overlay a boxplot on the strip plot from ggstance geom_boxplot(lwd = 1, width = .2, outlier.shape = NA) + # Merged those calls to one display guides(scale = "none") + # Split each crime category into a separate panel (facet) facet_wrap(~Crime, nrow = 6, strip.position = "top") + xlab("Incident Count After Testing Window") + # Remove y axis title label from the y axis theme(axis.title.y = element_blank()) + # Format the y axis ticks. scale_y_continuous(limits = c(-.65, 1.00), breaks = c(0, .5, 1.0), labels = c("0%", "50%", "100%")) -> Figure_Crime_full_width Figure_Crime_full_width
FCap <- paste("\\label{fig:plot_Crime_partial_width}", "Distributions for Number of Incidents After the Testing Window", "for Which Offender Was Arrested, Charged, or Convicted,", "By Crime Category.", "Each panel shows a histogram above a strip plot with an", "overlaid boxplot. Histogram bar heights show the percentage of", "offenders who had a given number of incidents.", "The strip plots show a dot for each offender, with positions", "jittered to reduce overlap. This plot has been zoomed", "in to show incident counts in the range 0-12, omitting a few", "outliers for (a) larceny, theft, fraud,", "(b) drug crimes, and (c) traffic and ordinances.") Figure_Crime_full_width + scale_x_continuous(limits = c(-.6, 12.5)) -> Figure_Crime_partial_width Figure_Crime_partial_width
Figure \ref{fig:plot_BCrime_full_width} plots the distributions of the incident count variables for each of the four broader crime categories during the period after the testing window. Because a few large outlier values make the details harder to see for the majority of the data (nearly all values are for 15 or fewer incidents), we also produced Figure \ref{fig:plot_BCrime_partial_width}, which zooms in on the 0-15 range of incident counts.
FCap <- paste("\\label{fig:plot_BCrime_full_width}", "Distributions for Number of Incidents After the Testing Window", "for Which Offender Was Arrested, Charged, or Convicted,", "By Broader Crime Category.", "Each panel shows a histogram above a strip plot with an", "overlaid boxplot. Histogram bar heights show the percentage of", "offenders who had a given number of incidents.", "The strip plots show a dot for each offender, with positions", "jittered to reduce overlap.") # Graph after period frequency distributions. IDNEWA %>% # Keep only the 4 broader crime categories (drops 12 main crime categories). filter(!is.na(BCrime)) %>% # use y = -... to position boxplot and jitterplot below the histogram ggplot(data = ., aes(x = Count, y = -.4)) + # Add histogram with bar heights scaled to percentages via after_stat. geom_histogram(aes(y = after_stat(count/N_After)), binwidth = .5, fill = "black") + # Add a jittered strip plot. geom_point(aes(), position = position_jitter(width = .25, height = .25), size = 1, color = "gray50") + # Overlay a boxplot on the strip plot from ggstance geom_boxplot(lwd = 1, width = .2, outlier.shape = NA) + # Merged those calls to one display guides(scale = "none") + # Split each crime category into a separate panel (facet) facet_wrap(~BCrime, nrow = 4, strip.position = "top") + xlab("Incident Count After Testing Window") + # Remove y axis title label from the y axis theme(axis.title.y = element_blank()) + # Format the y axis ticks. scale_y_continuous(limits = c(-.65, 1.00), breaks = c(0, .5, 1.0), labels = c("0%", "50%", "100%")) -> Figure_BCrime_full_width Figure_BCrime_full_width
FCap <- paste("\\label{fig:plot_BCrime_partial_width}", "Distributions for Number of Incidents After the Testing Window", "for Which Offender Was Arrested, Charged, or Convicted,", "By Broader Crime Category.", "Each panel shows a histogram above a strip plot with an", "overlaid boxplot. Histogram bar heights show the percentage of", "offenders who had a given number of incidents.", "The strip plots show a dot for each offender, with positions", "jittered to reduce overlap. This plot has been zoomed", "in to show incident counts in the range 0-15, omitting some", "larger values in the any crimes panel and some outliers for", "(a) property crimes and (b) other crimes.") Figure_BCrime_full_width + scale_x_continuous(limits = c(-.6, 15.5)) -> Figure_BCrime_partial_width Figure_BCrime_partial_width
\FloatBarrier
Table \ref{tab:describe_IYearsAfterTW} shows descriptive statistics about the distribution for IYearsAfterTW, which records how many years after the end of the testing window an incident actually occurred, broken down by crime category. This is particularly useful for identifying the median (Q50) and quartiles (Q25 and Q75), which clearly vary across types of crime.
# Table caption. TCap <- paste("Years to Incident Date By Crime Category", "for Incidents After the Testing Window Where Offender Was", "Arrested, Charged, or Convicted.") FN2 <- paste("Only offenders with valid testing window start dates are", "included.", "Only incidents that occurred after the testing window ended and", "where the offender was arrested for, charged with, or convicted", "of at least one offense from the relevant crime category were", "included.", "The bottom four italicized broader categories include offenders", "who were arrested, charged or convicted for any of the more", "specific crime categories nested below them, while the Any", "Crimes category includes incidents associated with at least one", "of the 12 specific crime categories.", "Incidents associated with more than one kind of crime are", "included in all applicable specific categories, but only once per", "applicable broader category. For example, an incident with both a", "homicide and a robbery only contributes one observation to the", "years summary for the broader violent crimes category.") VLLevels <- c("Any Crimes (12 categories)", "Sexual Crimes (2 categories)", "Criminal Sexual Conduct (CSC)", "Sex Crimes (non-CSC crimes)", "Violent Non-Sexual Crimes (5 categories)", "Assault, Domestic Violence", "Assault, non-Domestic Violence", "Homicide", "Robbery", "Weapons", "Property Crimes (3 categories)", "Arson", "Burglary", "Larceny, Theft, Fraud", "Other Crimes (2 categories)", "Drug Crimes", "Traffic & Ordinances") CNames2 <- c("Crime Category", "N", "Mean", "SD", "Min", "Max", "Skew", "Kurtosis", "Q25", "Q50", "Q75") # Summarize crime category counts (both overall and broken down by IWhen). INCEWA %>% filter(!is.na(VLabel)) %>% select(VLabel, IYearsAfterTW) %>% describeBy(., group = "VLabel", quant=c(.25, .50, .75), mat = TRUE) %>% filter(vars == 2) %>% rename(Category = group1) %>% mutate(Category = factor(Category, levels = VLLevels, labels = VLLevels)) %>% arrange(Category) %>% select(Category, n, mean, sd, min, max, skew, kurtosis, Q0.25, Q0.5, Q0.75) %>% kable(., format = "latex", booktabs = TRUE, digits = 2, row.names = FALSE, col.names = CNames2, caption = TCap, linesep = "") %>% row_spec(., row = c(1, 2, 5, 11, 15), italic = TRUE) %>% pack_rows(" ", start_row = 1, end_row = 1) %>% pack_rows(" ", start_row = 2, end_row = 4) %>% pack_rows(" ", start_row = 5, end_row = 10) %>% pack_rows(" ", start_row = 11, end_row = 14) %>% pack_rows(" ", start_row = 15, end_row = 17) %>% add_indent(., positions = c(3:4), level_of_indent = 1) %>% add_indent(., positions = c(6:10), level_of_indent = 1) %>% add_indent(., positions = c(12:14), level_of_indent = 1) %>% add_indent(., positions = c(16:17), level_of_indent = 1) %>% column_spec(column = 1, width = "6.8 cm") %>% footnote(general = FN2, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
Figure \ref{fig:plot_IYearsAfterTW_Crime} shows the density plots (i.e., smoothed histograms) for the distribution of IYearsAfterTW for each crime category. The shapes of these plots help us better understand Table \ref{tab:describe_IYearsAfterTW}.
FCap <- paste("\\label{fig:plot_IYearsAfterTW_Crime}", "Distributions for Years to Incident Date", "for Incidents After the Testing Window Where Offender Was", "Arrested, Charged, or Convicted, By Crime Category.", "Each panel shows a densityplot above a strip plot with an", "overlaid boxplot.", "The strip plots show a dot for each incident, with positions", "jittered to reduce overlap.") # Graph after period frequency distributions. INCEWA %>% # Keep only the main 12 crime categories (drops the broader crime categories). filter(!is.na(Crime)) %>% select(OID, IID, Crime, IYearsAfterTW) %>% # use y = -... to position boxplot and jitterplot below the histogram ggplot(data = ., aes(x = IYearsAfterTW, y = -.4)) + # Add density plot. stat_halfeye(slab_fill = "black", width = .6, justification = -.42, .width = 0, point_colour = "NA") + # Add a jittered strip plot. geom_point(aes(), position = position_jitter(width = 0, height = .25), size = 1, color = "gray50") + # Overlay a boxplot on the strip plot from ggstance geom_boxplot(lwd = 1, width = .20, outlier.shape = NA) + # Merged those calls to one display guides(scale = "none") + # Split each crime category into a separate panel (facet) facet_wrap(~Crime, nrow = 6, strip.position = "top") + xlab("Years to Incident Date") + ylab("Density") + # Remove y axis title label from the y axis theme(axis.title.y = element_blank(), axis.text.y = element_blank(), axis.ticks.y = element_blank()) + # Format the y axis ticks. scale_x_continuous(limits = c(-.5, 30.00), breaks = c(0, 5, 10, 15, 20, 25, 30), labels = c(0, 5, 10, 15, 20, 25, 30)) -> Figure_Crime_Years Figure_Crime_Years
Figure \ref{fig:plot_IYearsAfterTW_Crime_ecdf} shows the cumulative distribution functions for the number of years elapsed between the end of the testing window and each incident that occurred after the window ended, broken down by crime category. This is useful because the number of preventable incidents depends on when after the end of the testing window the offender reference sample gets uploaded to CODIS. This figure is an alternative way to visualize the same data shown in Figure \ref{fig:plot_IYearsAfterTW_Crime} as a cumulative density up to a given number of years rather than the point density at each time point. The medians (Q50) shown in Table \ref{tab:describe_IYearsAfterTW} are number of years on the horizontal axis of \ref{fig:plot_IYearsAfterTW_Crime_ecdf} where the line rises to the 50% mark on the vertical axis. Similarly, the 25th and 75th quantiles (Q25 and Q75) are where the line rises to the 25% and 75% marks on the vertical axis.
FCap <- paste("\\label{fig:plot_IYearsAfterTW_Crime_ecdf}", "Cumulative Distribution Function for Years to Incident Date", "for Incidents After the Testing Window Where Offender Was", "Arrested, Charged, or Convicted, By Crime Category.", "Each panel shows a cumulative density function plot above a ", "strip plot with an overlaid boxplot.", "The strip plots show a dot for each incident, with positions", "jittered to reduce overlap.") # Graph after period frequency distributions. INCEWA %>% # Keep only the main 12 crime categories (drops the broader crime categories). filter(!is.na(Crime)) %>% select(OID, IID, Crime, IYearsAfterTW) %>% # use y = -... to position boxplot and jitterplot below the histogram ggplot(data = ., aes(x = IYearsAfterTW, y = -.4)) + # Add empirical cumulative distribution function plot. stat_ecdf(geom = "step", pad = FALSE) + # Add a jittered strip plot. geom_point(aes(), position = position_jitter(width = 0, height = .25), size = 1, color = "gray50") + # Overlay a boxplot on the strip plot from ggstance geom_boxplot(lwd = 1, width = .20, outlier.shape = NA) + # Merged those calls to one display guides(scale = "none") + # Split each crime category into a separate panel (facet) facet_wrap(~Crime, nrow = 6, strip.position = "top") + xlab("Years to Incident Date") + ylab("Cumulative Density") + # Format the y axis ticks. scale_y_continuous(limits = c(-.65, 1.00), breaks = c(0, .5, 1.0), labels = c("0%", "50%", "100%")) + scale_x_continuous(limits = c(-.5, 30.00), breaks = c(0, 5, 10, 15, 20, 25, 30), labels = c(0, 5, 10, 15, 20, 25, 30))
Figure \ref{fig:plot_IYearsAfterTW_BCrime} shows the distributions for the number of years elapsed between the end of the testing window and each incident that occurred after the window ended broken down by broader crime category.
FCap <- paste("\\label{fig:plot_IYearsAfterTW_BCrime}", "Distributions for Years to Incident Date", "for Incidents After the Testing Window Where Offender Was", "Arrested, Charged, or Convicted, By Broader Crime Category.", "Each panel shows a densityplot above a strip plot with an", "overlaid boxplot.", "The strip plots show a dot for each incident, with positions", "jittered to reduce overlap.") # Graph after period frequency distributions. INCEWA %>% # Keep only the 4 broader crime categories (drops 12 main crime categories). filter(!is.na(BCrime)) %>% select(OID, IID, BCrime, IYearsAfterTW) %>% # use y = -... to position boxplot and jitterplot below the histogram ggplot(data = ., aes(x = IYearsAfterTW, y = -.4)) + # Add density plot. stat_halfeye(slab_fill = "black", width = .6, justification = -.42, .width = 0, point_colour = "NA") + # Add a jittered strip plot. geom_point(aes(), position = position_jitter(width = 0, height = .25), size = 1, color = "gray50") + # Overlay a boxplot on the strip plot from ggstance geom_boxplot(lwd = 1, width = .20, outlier.shape = NA) + # Merged those calls to one display guides(scale = "none") + # Split each crime category into a separate panel (facet) facet_wrap(~BCrime, nrow = 4, strip.position = "top") + xlab("Years to Incident Date") + ylab("Density") + # Remove y axis title label from the y axis theme(axis.title.y = element_blank(), axis.text.y = element_blank(), axis.ticks.y = element_blank()) + # Format the y axis ticks. scale_x_continuous(limits = c(-.5, 30.00), breaks = c(0, 5, 10, 15, 20, 25, 30), labels = c(0, 5, 10, 15, 20, 25, 30)) -> Figure_BCrime_Years Figure_BCrime_Years
Figure \ref{fig:plot_IYearsAfterTW_Crime_ecdf} was the original version of a
figure we submitted in a revision to our manuscript. Reviewers found it
confusing, so we created Figure \ref{fig:plot_IYearsAfterTW_Crime_ecdf2}
as a replacement when resubmitting the manuscript. The cumulative density at any
given number of years after the end of the testing window is the proportion of
incidents that had already occurred by that time. Switching to a focus on the
proportion remaining instead more closely aligns with the number of potentially
preventable incidents. We also improved the labeling of the axes and the figure
caption to address reviewer comments and make the figure easier to understand.
Subsequent peer review feedback caused us to eventually remove the figure even though we believe is is useful in communicating the results.
FCap <- paste("\\label{fig:plot_IYearsAfterTW_Crime_ecdf2}", "Percentage of Incidents Remaining As a Function of Years From", "End of Testing Window to Incident Date, By Crime Category.", "Only incidents occurring after the testing window where", "the offender was arrested for, charged with, or convicted of at", "least one offense from the relevant crime category were", "included.", "The lines show how the percentage of incidents remaining", "declines over time (they are the complements of the cumulative", "density functions).", "The strip plots below the lines show a dot for when each", "incident occurred: these are the data being summarized.", "Dot positions were jittered to reduce overlap.", "The overlaid boxplots provide an additional summary that ", "highlight when 75%, 50%, and 25% of the incidents remain.", "Bold x-axis values (at 1, 2, 5, and 10) are time points we", "examined to quantify implications of hypothetical delayed CODIS", "matches for how many crimes might still be preventable.") x.ticks <- c(0, 1, 2, 5, 10, 15, 20, 25, 30) x.face <- c("0", expression(bold("1")), expression(bold("2")), expression(bold("5")), expression(bold("10")), "15", "20", "25", "30") # Graph after period frequency distributions. INCEWA %>% # Keep only the main 12 crime categories (drops the broader crime categories). filter(!is.na(Crime)) %>% select(OID, IID, Crime, IYearsAfterTW) %>% # use y = -... to position boxplot and jitterplot below the histogram ggplot(data = ., aes(x = IYearsAfterTW, y = -.4)) + # Add the complement of the empirical cumulative distribution function plot. stat_ecdf(mapping = aes(y = 1 - ..y..), geom = "step", pad = FALSE) + # Add a jittered strip plot. geom_point(aes(), position = position_jitter(width = 0, height = .25), size = 1, color = "gray50") + # Overlay a boxplot on the strip plot from ggstance geom_boxplot(lwd = 1, width = .20, outlier.shape = NA) + # Merged those calls to one display guides(scale = "none") + # Split each crime category into a separate panel (facet) facet_wrap(~Crime, nrow = 6, strip.position = "top") + xlab("Years From End of Testing Window to Incident Date") + ylab("Percentage of Incidents Remaining") + # Format the y axis ticks. scale_y_continuous(limits = c(-.65, 1.00), breaks = c(0, .5, 1.0), labels = c("0%", "50%", "100%")) + scale_x_continuous(limits = c(-.5, 30.00), breaks = x.ticks, labels = x.face) -> #theme(axis.text.x = element_text(face=x.face)) -> Figure_Crime_Years Figure_Crime_Years
\FloatBarrier
Now we extract the empirical cumulative distribution functions data underlying Figure \ref{fig:plot_IYearsAfterTW_Crime_ecdf} and use them to prepare Table \ref{tab:table_IYearsAfterTW_ecdf}. A reviewer noted that uploading an offender DNA profile from a SAK to CODIS can only produce immediate identification of the offender if the corresponding offender reference sample was already present in CODIS beforehand. However, that may not always happen. The offender reference sample that allows identifying the offender associated with SAK may not arrive in CODIS until later. That means there could be a delay between uploading the SAK forensic testing results to CODIS as a forensic sample and getting a match to an offender. Incidents occurring between the end of the testing window and the arrival of a matching offender reference sample cannot be prevented through timely response to CODIS search results that are not yet available. Only crimes occurring after a match is available could be prevented that way.
It would be ideal if we had the CODIS upload date for the reference sample that matched to each SAK. That would allow us to examine and describe the distribution of the delay between uploading the SAK sample and receiving an offender match. There would be no delay ($\le 0$ years) when the offender reference sample was already present in CODIS, but it could be anywhere from 1 day to many years before one is available. Knowing more about that distribution would allow us to find the subset of potentially preventable incidents that occurred both after the testing window ended and after the matching reference sample was uploaded to CODIS.
Unfortunately, we do not have access to those offender reference sample upload dates. Therefore, we examine the implications of assuming a fixed delay in when offender reference samples arrive in CODIS relative to the end of the testing window. We do that by using the cumulative distribution functions graphed in Figure \ref{fig:plot_IYearsAfterTW_Crime_ecdf} to compute how many incidents occurred 1, 2, 5, and 10 years after the end of the testing window. This makes the crude assumption that all SAKs experience the same delay.
# Table caption. TCap <- paste("Number and Percentage of Incidents Remaining Given Hypothetical", "Delays Between End of Testing Window and Matching to an ", "Offender Reference Sample") FN2 <- paste("Only offenders with valid testing window start dates are", "included.", "Only incidents that occurred after the testing window ended and", "where the offender was arrested for, charged with, or convicted", "of at least one offense from the relevant crime category were", "included.", "The bottom four italicized broader categories include offenders", "who were arrested, charged or convicted for any of the more", "specific crime categories nested below them, while the Any", "Crimes category includes incidents associated with at least one", "of the 12 specific crime categories. ", "Incidents associated with more than one kind of crime are", "included in all applicable specific categories, but only once per", "applicable broader category. For example, an incident with both a", "homicide and a robbery only contributes one observation to the", "years summary for the broader violent crimes category.", "N, total number of incidents after the end of testing window;", "n, number of incidents remaining after specified delay." ) INCEWA %>% filter(!is.na(VLabel)) %>% mutate(Category = factor(VLabel, levels = VLLevels, labels = VLLevels)) %>% select(Category, IYearsAfterTW) %>% arrange(Category, IYearsAfterTW) %>% group_by(Category) %>% summarize(N = n(), # If CODIS reference sample uploaded 1 year after SAK PRemain_1 = 1 - ecdf(IYearsAfterTW)(1), NRemain_1 = round(N*PRemain_1, digits = 0), # If CODIS reference sample uploaded 2 years after SAK PRemain_2 = 1 - ecdf(IYearsAfterTW)(2), NRemain_2 = round(N*PRemain_2, digits = 0), # If CODIS reference sample uploaded 5 years after SAK PRemain_5 = 1 - ecdf(IYearsAfterTW)(5), NRemain_5 = round(N*PRemain_5, digits = 0), # If CODIS reference sample uploaded 10 years after SAK PRemain_10 = 1 - ecdf(IYearsAfterTW)(10), NRemain_10 = round(N*PRemain_10, digits = 0)) %>% # Rescale proportions to percentages mutate(PRemain_1 = PRemain_1*100, PRemain_2 = PRemain_2*100, PRemain_5 = PRemain_5*100, PRemain_10 = PRemain_10*100) %>% select(Category, N, NRemain_1, PRemain_1, NRemain_2, PRemain_2, NRemain_5, PRemain_5, NRemain_10, PRemain_10) %>% kable(format = "latex", digits = 2, booktabs = TRUE, caption = TCap, format.args = list(big.mark = ','), linesep = "", col.names = c("Category", "N", rep(c("n", "%"), times = 4))) %>% kable_styling() %>% add_header_above(., c(" ", " ", "1 Year" = 2, "2 Years" = 2, "5 Years" = 2, "10 Years" = 2)) %>% add_header_above(., c(" ", " ", "Delay in Matching to Offender Reference Sample" = 8)) %>% row_spec(., row = c(1, 2, 5, 11, 15), italic = TRUE) %>% pack_rows(" ", start_row = 1, end_row = 1) %>% pack_rows(" ", start_row = 2, end_row = 4) %>% pack_rows(" ", start_row = 5, end_row = 10) %>% pack_rows(" ", start_row = 11, end_row = 14) %>% pack_rows(" ", start_row = 15, end_row = 17) %>% add_indent(., positions = c(3:4), level_of_indent = 1) %>% add_indent(., positions = c(6:10), level_of_indent = 1) %>% add_indent(., positions = c(12:14), level_of_indent = 1) %>% add_indent(., positions = c(16:17), level_of_indent = 1) %>% column_spec(column = 1, width = "6.8 cm") %>% footnote(general = FN2, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
The American Psychological Association's general instructions regarding manuscript preparation refer authors to KnowledgeWorks Global Digital Art Support General Guidelines for guidance on preparing figures. We are complying with those guidelines for preparing the following Figure files that we export.
We can choose between submitting TIFF versus EPS file. Because EPS files are orders of magnitude smaller, we choose those for submission. We also export PNG files for convenience (they are easier to view without specialized software).
We wrote the code below to export a version of Figure \ref{fig:plot_IYearsAfterTW_Crime_ecdf2}, but have subsequently disabled the code chunk below because the final manuscript will no longer contain the figure.
ggsave(filename = here::here("inst/Figure_Crime_Years.eps"), plot = Figure_Crime_Years, device = "eps", dpi = 600, height = 6.0, width = 7.2, units = "in") ggsave(filename = here::here("inst/Figure_Crime_Years.png"), plot = Figure_Crime_Years, device = "png", dpi = 600, height = 6.0, width = 7.2, units = "in")
\FloatBarrier
All summaries in this section directly summarize the ARR data without first aggregating to the incident or offender level. These outputs are not directly used in the manuscript.
\FloatBarrier
FN <- paste("Only arrest records for offenders with valid testing window start", "dates and with offenses from the 12 main crime categories were", "included.") kable(freq(as_factor(ARREW$ACat12), user.missing = "Excluded user-missing"), format = "latex", booktabs = TRUE, digits = 2, format.args = list(big.mark = ','), caption = "Number of Arrest Offense Records by Crime Category (12 levels)") %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
# Table caption. TCap <- paste("Offender Age at Arrest for Criminal Sexual Conduct (Overall)") # Footnote text. FN <- paste("This is an arrest offense record-level summary (offenders with", "multiple arrests contribute one age value per arrest record.", "Only arrest records for offenders with valid testing window start", "dates and where the arrest was for criminal sexual conduct are", "included.") ARREW %>% filter(ACat12_5 == 1) %>% select(OAgeA, ALag) %>% describe(., quant=c(.25, .50, .75)) %>% filter(vars == 1) %>% bind_cols(data.frame(Variable = "Age at arrest"), .) %>% select(Variable, n, mean, sd, min, max, skew, kurtosis, Q0.25, Q0.5, Q0.75) %>% kable(., format = "latex", booktabs = TRUE, digits = 2, row.names = FALSE, col.names = CNames, caption = TCap) %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
This section directly summarizes the CHG data without first aggregating to the incident or offender level. These outputs are not directly used in the manuscript.
\FloatBarrier
FN <- paste("Only charge records for offenders with valid testing window start", "dates and with charges from the 12 main crime categories were", "included.") kable(freq(as_factor(CHGEW$CCat12), user.missing = "Excluded user-missing"), format = "latex", booktabs = TRUE, digits = 2, format.args = list(big.mark = ','), caption = "Number of Prosecutor Charge Records by Crime Category (12 levels)") %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
All summaries in this section directly summarize the JUD data without first aggregating to the incident or offender level. While some of these records are convictions, others will have other dispositions. These outputs are not directly used in the manuscript.
\FloatBarrier
FN <- paste("Only records for offenders with valid testing window start dates", "and with charges from one of the 12 main crime categories were", "included.") kable(freq(as_factor(JUDEW$JCat12), user.missing = "Excluded user-missing"), format = "latex", booktabs = TRUE, digits = 2, format.args = list(big.mark = ','), caption = "Number of Adjudicated Charge Records (All Dispositions) by Crime Category (12 levels)")
\FloatBarrier
FN <- paste("Only conviction records for offenders with valid testing window", "start dates and with charges from one of the 12 main crime", "categories were included.") kable(freq(as_factor(CONEW$JCat12), user.missing = "Excluded user-missing"), format = "latex", booktabs = TRUE, digits = 2, format.args = list(big.mark = ','), caption = "Number of Adjudicated Charge Records (Convictions Only) by Crime Category (12 levels)") %>% footnote(general = FN, general_title = "Note: ", footnote_as_chunk = TRUE, threeparttable = TRUE)
\FloatBarrier
These materials are scholarly products based on research funded by the following grant.
Campbell, R., Pierce, S. J., & Sharma, D. (2015–2018). Serial sexual assaults: A longitudinal examination of offending patterns using DNA evidence. (NIJ Award # 2014-NE-BX-0006) [Grant]. National Institute of Justice.
\FloatBarrier
Campbell, R. (2019). Serial sexual assaults: A longitudinal examination of offending patterns using DNA evidence, Detroit, Michigan, 2009 [Data files, codebooks, computer programs, and statistical output]. ICPSR37134-v1. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2019-02-28. Retrieved from: https://doi.org/10.3886/ICPSR37134.v1
\FloatBarrier
We use R Markdown to enhance reproducibility. Knitting the source R Markdown
script r knitr:::current_input()
generates this PDF file containing
explanatory text, R code, plus R output (text and graphics).
r pandoc_version()
for this
document. This document was generated using the following computational environment and dependencies:
``` {r show_citations}
which_latex()
devtools::session_info()
The current Git commit details and status are: ```r git_report()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.